MediaWiki 1.23.0
XMPReader Class Reference

Class for reading XMP data containing properties relevant to images, and producing an array that FormatExif accepts.

Public Member Functions

 __construct ()
 Constructor.
 
 __destruct ()
 Destroy the XML parser.
 
 char ( $parser, $data)
 Character data handler. Called whenever character data is found in the XMP document.
 
 endElement ( $parser, $elm)
 Handler for hitting a closing element.
 
 getResults ()
 Get the result array.
 
 parse ( $content, $allOfIt=true, $reset=false)
 Main function to call to parse XMP.
 
 parseExtended ( $content)
 Entry point for XMPExtended blocks in JPEG files.
 
 startElement ( $parser, $elm, $attribs)
 Hits an opening element.
 

Public Attributes

const MODE_ALT = 15
 
const MODE_BAG = 13
 
const MODE_BAGSTRUCT = 16
 
const MODE_IGNORE = 1
 
const MODE_INITIAL = 0
 These are various mode constants.
 
const MODE_LANG = 14
 
const MODE_LI = 2
 
const MODE_LI_LANG = 3
 
const MODE_QDESC = 4
 
const MODE_SEQ = 12
 
const MODE_SIMPLE = 10
 
const MODE_STRUCT = 11
 
const NS_RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
 
const NS_XML = 'http://www.w3.org/XML/1998/namespace'
 

Protected Attributes

array $items
 XMP item configuration array.
 

Private Member Functions

 doAttribs ( $attribs)
 Process attributes.
 
 endElementModeIgnore ( $elm)
 Hit a closing element in MODE_IGNORE. Checks whether this is the element we started to ignore, in which case we leave MODE_IGNORE.
 
 endElementModeLi ( $elm)
 Hit a closing element in MODE_LI (either rdf:Seq or rdf:Bag). Adds information about what type of element this is.
 
 endElementModeQDesc ( $elm)
 End element while in MODE_QDESC. Mostly happens when ending an element that has a simple value with qualifiers.
 
 endElementModeSimple ( $elm)
 Hit a closing element when in MODE_SIMPLE.
 
 endElementNested ( $elm)
 Hit a closing element in MODE_STRUCT, MODE_SEQ or MODE_BAG. Generally means we've finished processing a nested structure.
 
 resetXMLParser ()
 Resets the XML parser. The main use is when a single item has multiple XMP documents describing it.
 
 saveValue ( $ns, $tag, $val)
 Given an extracted value, save it to the results array.
 
 startElementModeBag ( $elm)
 Start element in MODE_BAG (unordered array). This should always be <rdf:Bag>.
 
 startElementModeIgnore ( $elm)
 Hit an opening element while in MODE_IGNORE.
 
 startElementModeInitial ( $ns, $tag, $attribs)
 Starting an element when in MODE_INITIAL. This usually happens when we hit an element inside the outer rdf:Description.
 
 startElementModeLang ( $elm)
 Start element in MODE_LANG (language alternative). This should always be <rdf:Alt>.
 
 startElementModeLi ( $elm, $attribs)
 Opening element in MODE_LI. Processes elements of arrays.
 
 startElementModeLiLang ( $elm, $attribs)
 Opening element in MODE_LI_LANG.
 
 startElementModeQDesc ( $elm)
 Start an element when in MODE_QDESC.
 
 startElementModeSeq ( $elm)
 Start element in MODE_SEQ (ordered array). This should always be <rdf:Seq>.
 
 startElementModeSimple ( $elm, $attribs)
 Handle an opening element when in MODE_SIMPLE.
 
 startElementModeStruct ( $ns, $tag, $attribs)
 Hit an opening element when in a struct (MODE_STRUCT). This is generally for fields of a compound property.
 

Private Attributes

bool|string $ancestorStruct = false
 The structure name when processing nested structures.
 
bool|string $charContent = false
 Temporary holder for character data that appears in the XMP document.
 
bool|string $charset = false
 Character set, like 'UTF-8'.
 
array $curItem = array()
 Array to hold the current element (and previous element, and so on).
 
int $extendedXMPOffset = 0
 
bool|string $itemLang = false
 Used for lang alts only.
 
array $mode = array()
 Stores the state the XMPReader is in (see MODE_FOO constants).
 
bool $processingArray = false
 If we're doing a seq or bag.
 
array $results = array()
 Array to hold results.
 
resource $xmlParser
 A resource handle for the XML parser.
 

Detailed Description

Class for reading XMP data containing properties relevant to images, and producing an array that FormatExif accepts.

Note that this is not meant to recognize every possible thing you can encode in XMP. It should recognize all the properties we want. For example, it doesn't have support for structures with multiple nesting levels, as none of the properties we're supporting use that feature. If it comes across properties it doesn't recognize, it should ignore them.

The public methods one would call in this class are:

  • parse( $content ) Reads in XMP content. Can potentially be called multiple times with partial data each time.
  • parseExtended( $content ) Reads XMPExtended blocks (JPEG files only).
  • getResults() Outputs a results array.

Note that XMP looks somewhat like RDF, but they are not the same thing: XMP is encoded as a specific subset of RDF. This class can read XMP. It cannot read arbitrary RDF.

Definition at line 49 of file XMP.php.
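
The call sequence described above can be sketched as follows. This is a minimal sketch, assuming it runs inside a MediaWiki installation where XMPReader is loaded; readXmpPacket() is a hypothetical helper name, not part of MediaWiki, and it returns null when the class is unavailable so that the snippet can also be run on its own.

```php
<?php
/**
 * Hypothetical helper (not part of MediaWiki): parse one complete XMP
 * packet and return the result array, false on a parse failure, or
 * null when XMPReader is not available (for example outside MediaWiki).
 */
function readXmpPacket( $xmp ) {
	if ( !class_exists( 'XMPReader' ) ) {
		return null;
	}
	$reader = new XMPReader();
	// $allOfIt defaults to true: the string holds the whole document.
	if ( !$reader->parse( $xmp ) ) {
		return false;
	}
	return $reader->getResults();
}
```

Only the documented public methods (the constructor, parse() and getResults()) are used here.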

Constructor & Destructor Documentation

◆ __construct()

XMPReader::__construct ( )

Constructor.

Its primary job is to initialize the XML parser.

Definition at line 105 of file XMP.php.

References XMPInfo\getItems(), and resetXMLParser().

◆ __destruct()

XMPReader::__destruct ( )

Destroy the xml parser.

Not sure if this is actually needed.

Definition at line 143 of file XMP.php.

Member Function Documentation

◆ char()

XMPReader::char (   $parser,
  $data 
)

Character data handler. Called whenever character data is found in the XMP document.

Does nothing if we're in MODE_IGNORE or if the data is whitespace. Throws an error if we're not in MODE_SIMPLE (as we're not allowed to have character data in the other modes).

As an example, this happens when we encounter XMP like <exif:DigitalZoomRatio>0/10</exif:DigitalZoomRatio> and are processing the 0/10 bit.

Parameters
  XMLParser $parser  XMLParser reference to the XML parser
  string $data  Character data
Exceptions
  MWException  on invalid data

Definition at line 400 of file XMP.php.

Referenced by doAttribs().
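
To make the handler arguments concrete, here is a minimal standalone sketch using PHP's XML extension directly rather than XMPReader. It assumes a namespace-aware parser with a space as separator, which matches the "Namespace . ' ' . element name" form of $elm documented on this page; the $events array and the closures are illustrative only.

```php
<?php
// Collect every parser event so the name format is visible.
$events = array();

// ' ' is the separator placed between namespace URI and local name.
$parser = xml_parser_create_ns( 'UTF-8', ' ' );
// Keep element names as written instead of upper-casing them.
xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );

xml_set_element_handler(
	$parser,
	function ( $p, $elm, $attribs ) use ( &$events ) {
		$events[] = 'start: ' . $elm;
	},
	function ( $p, $elm ) use ( &$events ) {
		$events[] = 'end: ' . $elm;
	}
);
xml_set_character_data_handler(
	$parser,
	function ( $p, $data ) use ( &$events ) {
		// Whitespace-only runs are skipped, as described above.
		if ( trim( $data ) !== '' ) {
			$events[] = 'char: ' . $data;
		}
	}
);

$xmp = '<exif:DigitalZoomRatio xmlns:exif="http://ns.adobe.com/exif/1.0/">'
	. '0/10</exif:DigitalZoomRatio>';
$ok = xml_parse( $parser, $xmp, true );

foreach ( $events as $event ) {
	echo $event, "\n";
}
```

The character event for 0/10 arrives between the start and end events of the property element, which is the situation char() handles in MODE_SIMPLE.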

◆ doAttribs()

XMPReader::doAttribs (   $attribs)
private

Process attributes.

Simple values can be stored as either a tag or an attribute.

Often the initial "<rdf:Description>" tag just has all the simple properties as attributes.

Example:
<rdf:Description rdf:about="" xmlns:exif="http://ns.adobe.com/exif/1.0/" exif:DigitalZoomRatio="0/10">
Parameters
  array $attribs  Array attribute=>value
Exceptions
  MWException

Definition at line 1131 of file XMP.php.

References $attribs, $name, as, char(), list, MODE_QDESC, saveValue(), and wfDebugLog().

Referenced by startElementModeInitial(), startElementModeLi(), and startElementModeStruct().
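
The attribute form can be illustrated with a small standalone sketch. It is not the doAttribs() implementation: it uses PHP's XML extension directly, the $simple array is a hypothetical stand-in for the results array, and an xmlns:rdf declaration is added to the example so that the fragment parses on its own.

```php
<?php
// Simple properties found as attributes, keyed by "namespace tag".
$simple = array();

$parser = xml_parser_create_ns( 'UTF-8', ' ' );
xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );
xml_set_element_handler(
	$parser,
	function ( $p, $elm, $attribs ) use ( &$simple ) {
		foreach ( $attribs as $name => $val ) {
			if ( strpos( $name, ' ' ) === false ) {
				// Attribute without a namespace: not an XMP property.
				continue;
			}
			list( $ns, $tag ) = explode( ' ', $name, 2 );
			if ( $ns === 'http://www.w3.org/1999/02/22-rdf-syntax-ns#' ) {
				// rdf:about and similar attributes carry no property value.
				continue;
			}
			$simple[$ns . ' ' . $tag] = $val;
		}
	},
	function ( $p, $elm ) {
		// Closing elements are irrelevant for this sketch.
	}
);

$xmp = '<rdf:Description rdf:about=""'
	. ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
	. ' xmlns:exif="http://ns.adobe.com/exif/1.0/"'
	. ' exif:DigitalZoomRatio="0/10"/>';
xml_parse( $parser, $xmp, true );

print_r( $simple );
```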

◆ endElement()

XMPReader::endElement (   $parser,
  $elm 
)

Handler for hitting a closing element.

Generally just calls a helper function depending on what mode we're in.

Ignores the outer wrapping elements that are optional in XMP and have no meaning.

Parameters
  XMLParser $parser
  string $elm  Namespace . ' ' . element name
Exceptions
  MWException

Definition at line 624 of file XMP.php.

References endElementModeIgnore(), endElementModeLi(), endElementModeQDesc(), endElementModeSimple(), endElementNested(), MODE_BAG, MODE_BAGSTRUCT, MODE_IGNORE, MODE_INITIAL, MODE_LANG, MODE_LI, MODE_LI_LANG, MODE_QDESC, MODE_SEQ, MODE_SIMPLE, MODE_STRUCT, and wfDebugLog().

◆ endElementModeIgnore()

XMPReader::endElementModeIgnore (   $elm)
private

Hit a closing element in MODE_IGNORE. Checks whether this is the element we started to ignore, in which case we leave MODE_IGNORE.

Parameters
  string $elm  Namespace of element followed by a space and then tag name of element.

Definition at line 435 of file XMP.php.

Referenced by endElement().

◆ endElementModeLi()

XMPReader::endElementModeLi (   $elm)
private

Hit a closing element in MODE_LI (either rdf:Seq or rdf:Bag). Adds information about what type of element this is.

Note that we still have to hit the outer "</property>".

For example, when processing:
<exif:ISOSpeedRatings> <rdf:Seq> <rdf:li>64</rdf:li>
</rdf:Seq> </exif:ISOSpeedRatings>

This method is called when we hit the "</rdf:Seq>". (For comparison, we call endElementModeSimple when we hit the "</rdf:li>".)

Parameters
  string $elm  Namespace . ' ' . element name
Exceptions
  MWException

Definition at line 559 of file XMP.php.

References list, and wfDebugLog().

Referenced by endElement().

◆ endElementModeQDesc()

XMPReader::endElementModeQDesc (   $elm)
private

End element while in MODE_QDESC. Mostly happens when ending an element that has a simple value with qualifiers.

Qualifiers aren't all that common, and we don't do anything with them.

Parameters
  string $elm  Namespace and element

Definition at line 598 of file XMP.php.

References list, and saveValue().

Referenced by endElement().

◆ endElementModeSimple()

XMPReader::endElementModeSimple (   $elm)
private

Hit a closing element when in MODE_SIMPLE.

This generally means that we finished processing a property value, and now have to save the result to the results array.

For example, when processing <exif:DigitalZoomRatio>0/10</exif:DigitalZoomRatio>, this deals with when we hit </exif:DigitalZoomRatio>.

It could also be that we hit the end element of a property of a compound data structure (like a member of an array).

Parameters
  string $elm  Namespace, space, and tag name.

Definition at line 457 of file XMP.php.

References list, and saveValue().

Referenced by endElement().

◆ endElementNested()

XMPReader::endElementNested (   $elm)
private

Hit a closing element in MODE_STRUCT, MODE_SEQ or MODE_BAG. This generally means we've finished processing a nested structure, so some internal variables are reset to indicate that.

Note that this means we hit the closing element of the property itself, not the "</rdf:Seq>".

For example, when processing:
<exif:ISOSpeedRatings> <rdf:Seq> <rdf:li>64</rdf:li>
</rdf:Seq> </exif:ISOSpeedRatings>

This method is called when we hit the "</exif:ISOSpeedRatings>" tag.

Parameters
  string $elm  Namespace . space . tag name.
Exceptions
  MWException

Definition at line 492 of file XMP.php.

References array(), list, and wfDebugLog().

Referenced by endElement().
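
The order in which the closing tags of the ISOSpeedRatings example arrive can be checked with a short standalone sketch. It uses PHP's XML extension directly, not XMPReader, and the $closed array is illustrative only; namespace declarations are added so that the fragment parses on its own.

```php
<?php
// Local names of closing elements, in the order the parser reports them.
$closed = array();

$parser = xml_parser_create_ns( 'UTF-8', ' ' );
xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );
xml_set_element_handler(
	$parser,
	function ( $p, $elm, $attribs ) {
		// Opening elements are irrelevant for this sketch.
	},
	function ( $p, $elm ) use ( &$closed ) {
		// $elm is "namespace URI, space, local name"; keep the local name.
		list( $ns, $tag ) = explode( ' ', $elm, 2 );
		$closed[] = $tag;
	}
);

$xmp = '<exif:ISOSpeedRatings'
	. ' xmlns:exif="http://ns.adobe.com/exif/1.0/"'
	. ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
	. '<rdf:Seq><rdf:li>64</rdf:li></rdf:Seq>'
	. '</exif:ISOSpeedRatings>';
xml_parse( $parser, $xmp, true );

echo implode( ', ', $closed ), "\n";
```

Going by the descriptions on this page, the three closing tags correspond to endElementModeSimple() for rdf:li, endElementModeLi() for rdf:Seq, and endElementNested() for exif:ISOSpeedRatings.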

◆ getResults()

XMPReader::getResults ( )

Get the result array.

Do some post-processing before returning the array, and transform any metadata that is special-cased.

Returns
  array  Results as an array of arrays suitable for FormatMetadata::getFormattedData().

Definition at line 154 of file XMP.php.

References $results, array(), as, list, and wfRunHooks().

◆ parse()

XMPReader::parse (   $content,
  $allOfIt = true,
  $reset = false 
)

Main function to call to parse XMP.

Use getResults() to get the results.

Also catches any errors during processing, writes them to the debug log, blanks the result array and returns false.

Parameters
  string $content  XMP data
  bool $allOfIt  Whether this is all the data (true) or it is split up (false). Default true
  bool $reset  Whether the XML parser needs to be reset. Default false
Exceptions
  MWException
Returns
  bool  Success.

Definition at line 252 of file XMP.php.

References $e, $error, $ok, array(), resetXMLParser(), wfDebugLog(), wfRestoreWarnings(), and wfSuppressWarnings().

Referenced by parseExtended(), and XMPTest\testXMPParse().
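
The $allOfIt flag presumably corresponds to the final-chunk flag of PHP's xml_parse(); that mapping is an assumption, not something this page states. The following minimal sketch uses the XML extension directly to show one document being fed to a single parser in two chunks. The dc:format element and its value are made up for illustration.

```php
<?php
// Accumulates all character data seen by the parser.
$text = '';

$parser = xml_parser_create_ns( 'UTF-8', ' ' );
xml_set_character_data_handler(
	$parser,
	function ( $p, $data ) use ( &$text ) {
		$text .= $data;
	}
);

// One document, arbitrarily split in the middle of the element content.
$chunks = array(
	'<dc:format xmlns:dc="http://purl.org/dc/elements/1.1/">image',
	'/jpeg</dc:format>',
);

$ok = true;
$last = count( $chunks ) - 1;
foreach ( $chunks as $i => $chunk ) {
	// Only the last call is marked as final.
	$ok = $ok && xml_parse( $parser, $chunk, $i === $last ) === 1;
}

echo $ok ? "parsed: $text\n" : "parse error\n";
```

With XMPReader itself, the equivalent would be calling parse( $chunk, false ) for each partial chunk and parse( $chunk, true ) for the last one, going by the parameter description above.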

◆ parseExtended()

XMPReader::parseExtended (   $content)

Entry point for XMPExtended blocks in JPEG files.

Todo:
  In serious need of testing
See also
  http://www.adobe.ge/devnet/xmp/pdfs/XMPSpecificationPart3.pdf XMP spec part 3 page 20
Parameters
  string $content  XMPExtended block minus the namespace signature
Returns
  bool  If it succeeded.

Definition at line 325 of file XMP.php.

References parse(), resetXMLParser(), and wfDebugLog().
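
For orientation, the header of such a block can be split off as sketched below. The layout assumed here (a 32-byte ASCII hex GUID, a 32-bit big-endian total length, a 32-bit big-endian offset of this portion, then the XMP portion) comes from a reading of XMP Specification Part 3 and is not stated on this page, so treat it as an assumption. splitExtendedXmpBlock() is a hypothetical helper, not part of MediaWiki.

```php
<?php
/**
 * Hypothetical helper: split an XMPExtended block (without the namespace
 * signature) into its header fields and payload.
 * Assumed layout: 32-byte ASCII hex GUID, 32-bit big-endian total length,
 * 32-bit big-endian offset of this portion, then the XMP portion itself.
 */
function splitExtendedXmpBlock( $content ) {
	if ( strlen( $content ) < 40 ) {
		return false; // Too short to contain the assumed header.
	}
	$guid = substr( $content, 0, 32 );
	// 'N' is an unsigned 32-bit integer in big-endian byte order.
	$header = unpack( 'Nlength/Noffset', substr( $content, 32, 8 ) );
	return array(
		'guid' => $guid,
		'length' => $header['length'],
		'offset' => $header['offset'],
		'data' => substr( $content, 40 ),
	);
}

// Synthetic block for illustration only; not real image data.
$block = str_repeat( 'A', 32 ) . pack( 'NN', 11, 0 ) . '<x:xmpmeta>';
print_r( splitExtendedXmpBlock( $block ) );
```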

◆ resetXMLParser()

XMPReader::resetXMLParser ( )
private

Resets the XML parser. The main use is when a single item has multiple XMP documents describing it.

For example, in JPEGs with extended XMP.

Definition at line 121 of file XMP.php.

References array().

Referenced by __construct(), parse(), and parseExtended().

◆ saveValue()

XMPReader::saveValue (   $ns,
  $tag,
  $val 
)
private

Given an extracted value, save it to the results array.

Note that this also uses $this->ancestorStruct and $this->processingArray (in addition to $tag) to determine what name to save the value under.

Parameters
  string $ns  Namespace of tag this is for
  string $tag  Tag name
  string $val  Value to save

Definition at line 1180 of file XMP.php.

References $ancestorStruct, $itemLang, array(), and wfDebugLog().

Referenced by doAttribs(), endElementModeQDesc(), endElementModeSimple(), and startElementModeSimple().

◆ startElement()

XMPReader::startElement (   $parser,
  $elm,
  $attribs 
)

Hits an opening element.

Generally just calls a helper based on what mode we're in. Also does some initial setup for the wrapper element.

Parameters
  $parser  XMLParser
  string $elm  Namespace "<space>" element
  array $attribs  Attribute name => value
Exceptions
  MWException

Definition at line 1038 of file XMP.php.

References $attribs, list, MODE_BAG, MODE_BAGSTRUCT, MODE_IGNORE, MODE_INITIAL, MODE_LANG, MODE_LI, MODE_LI_LANG, MODE_QDESC, MODE_SEQ, MODE_SIMPLE, MODE_STRUCT, startElementModeBag(), startElementModeIgnore(), startElementModeInitial(), startElementModeLang(), startElementModeLi(), startElementModeLiLang(), startElementModeQDesc(), startElementModeSeq(), startElementModeSimple(), startElementModeStruct(), and wfDebugLog().

◆ startElementModeBag()

XMPReader::startElementModeBag (   $elm)
private

Start element in MODE_BAG (unordered array). This should always be <rdf:Bag>.

Parameters
  string $elm  Namespace . ' ' . tag
Exceptions
  MWException  if we have an element that's not <rdf:Bag>

Definition at line 720 of file XMP.php.

Referenced by startElement().

◆ startElementModeIgnore()

XMPReader::startElementModeIgnore (   $elm)
private

Hit an opening element while in MODE_IGNORE.

XMP is extensible, so ignore any tag we don't understand.

Mostly ignores the element, unless it is the same element as the one we are ignoring, in which case we add it to the item stack so that nested occurrences are ignored correctly.

Parameters
  string $elm  Namespace . ' ' . tag name

Definition at line 706 of file XMP.php.

Referenced by startElement().
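
The nesting rule described above can be sketched in isolation. The IgnoreTracker class below is a hypothetical, simplified stand-in written for this illustration; it is not the MediaWiki implementation, and the urn:example names are made up.

```php
<?php
/**
 * Hypothetical, simplified tracker (not MediaWiki code) for skipping an
 * unknown element together with everything nested inside it.
 */
class IgnoreTracker {
	/** @var string[] Pending occurrences of the ignored element */
	private $stack = array();

	public function isIgnoring() {
		return count( $this->stack ) > 0;
	}

	/** Start ignoring $elm ("namespace URI, space, tag name"). */
	public function begin( $elm ) {
		$this->stack = array( $elm );
	}

	public function startElement( $elm ) {
		// Only a nested element with the same name is remembered, so its
		// closing tag is not mistaken for the end of the ignored element.
		if ( $this->isIgnoring() && $elm === $this->stack[0] ) {
			$this->stack[] = $elm;
		}
	}

	public function endElement( $elm ) {
		if ( $this->isIgnoring() && $elm === $this->stack[0] ) {
			array_pop( $this->stack );
		}
	}
}

$t = new IgnoreTracker();
$t->begin( 'urn:example foo' );        // <foo>
$t->startElement( 'urn:example bar' ); //   <bar>
$t->startElement( 'urn:example foo' ); //     <foo>
$t->endElement( 'urn:example foo' );   //     </foo>
$stillIgnoring = $t->isIgnoring();     // outer <foo> is still open
$t->endElement( 'urn:example bar' );   //   </bar>
$t->endElement( 'urn:example foo' );   // </foo>
var_dump( $stillIgnoring, $t->isIgnoring() );
```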

◆ startElementModeInitial()

XMPReader::startElementModeInitial (   $ns,
  $tag,
  $attribs 
)
private

Starting an element when in MODE_INITIAL. This usually happens when we hit an element inside the outer rdf:Description.

This is generally where most properties start.

Parameters
  string $ns  Namespace
  string $tag  Tag name (without namespace prefix)
  array $attribs  Array of attributes
Exceptions
  MWException

Definition at line 847 of file XMP.php.

References $attribs, $mode, doAttribs(), and wfDebugLog().

Referenced by startElement().

◆ startElementModeLang()

XMPReader::startElementModeLang (   $elm)
private

Start element in MODE_LANG (language alternative). This should always be <rdf:Alt>.

This tag tends to be used for metadata such as a description of the picture, which can be translated into multiple languages.

XMP also supports non-linguistic alternative selections, which are really only used for thumbnails, which we don't care about.

Parameters
  string $elm  Namespace . ' ' . tag
Exceptions
  MWException  if we have an element that's not <rdf:Alt>

Definition at line 762 of file XMP.php.

Referenced by startElement().

◆ startElementModeLi()

XMPReader::startElementModeLi (   $elm,
  $attribs 
)
private

Opening element in MODE_LI. Processes elements of arrays.

Example: <exif:ISOSpeedRatings> <rdf:Seq> <rdf:li>64</rdf:li> </rdf:Seq> </exif:ISOSpeedRatings>

This method is called when we hit the <rdf:li> element.

Parameters
  string $elm  Namespace . ' ' . tagname
  array $attribs  Attributes. (needed for BAGSTRUCTS)
Exceptions
  MWException  if it gets a tag other than <rdf:li>

Definition at line 957 of file XMP.php.

References $attribs, doAttribs(), and list.

Referenced by startElement().

◆ startElementModeLiLang()

XMPReader::startElementModeLiLang (   $elm,
  $attribs 
)
private

Opening element in MODE_LI_LANG.

Processes elements of language alternatives.

Example: <dc:title> <rdf:Alt> <rdf:li xml:lang="x-default">My house </rdf:li> </rdf:Alt> </dc:title>

This method is called when we hit the <rdf:li> element.

Parameters
  string $elm  Namespace . ' ' . tag
  array $attribs  Array of elements (most importantly xml:lang)
Exceptions
  MWException  if it gets a tag other than <rdf:li> or if there is no xml:lang

Definition at line 1007 of file XMP.php.

References $attribs.

Referenced by startElement().
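
How the xml:lang attribute reaches a handler can be seen in the following standalone sketch. It uses PHP's XML extension directly, not XMPReader; the $titles and $curLang variables are illustrative, and the second rdf:li entry (French) is added to the example for demonstration.

```php
<?php
// Value of XMPReader::NS_XML as listed on this page.
$nsXml = 'http://www.w3.org/XML/1998/namespace';

$titles = array(); // language code => text
$curLang = null;   // language of the element being read, if any

$parser = xml_parser_create_ns( 'UTF-8', ' ' );
xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );
xml_set_element_handler(
	$parser,
	function ( $p, $elm, $attribs ) use ( &$curLang, $nsXml ) {
		// xml:lang is reported under the XML namespace URI.
		$key = $nsXml . ' lang';
		$curLang = isset( $attribs[$key] ) ? $attribs[$key] : null;
	},
	function ( $p, $elm ) use ( &$curLang ) {
		$curLang = null;
	}
);
xml_set_character_data_handler(
	$parser,
	function ( $p, $data ) use ( &$titles, &$curLang ) {
		if ( $curLang !== null && trim( $data ) !== '' ) {
			$prev = isset( $titles[$curLang] ) ? $titles[$curLang] : '';
			$titles[$curLang] = $prev . $data;
		}
	}
);

$xmp = '<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/"'
	. ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
	. '<rdf:Alt><rdf:li xml:lang="x-default">My house</rdf:li>'
	. '<rdf:li xml:lang="fr">Ma maison</rdf:li></rdf:Alt></dc:title>';
xml_parse( $parser, $xmp, true );

print_r( $titles );
```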

◆ startElementModeQDesc()

XMPReader::startElementModeQDesc (   $elm)
private

Start an element when in MODE_QDESC.

This generally happens when a simple element has an inner rdf:Description to hold qualifier elements.

For example, in: <exif:DigitalZoomRatio><rdf:Description><rdf:value>0/10</rdf:value> <foo:someQualifier>Bar</foo:someQualifier> </rdf:Description> </exif:DigitalZoomRatio>

This method is called when processing the <rdf:value> or <foo:someQualifier> element.

Parameters
  string $elm  Namespace and tag name separated by a space.

Definition at line 825 of file XMP.php.

Referenced by startElement().

◆ startElementModeSeq()

XMPReader::startElementModeSeq (   $elm)
private

Start element in MODE_SEQ (ordered array). This should always be <rdf:Seq>.

Parameters
  string $elm  Namespace . ' ' . tag
Exceptions
  MWException  if we have an element that's not <rdf:Seq>

Definition at line 735 of file XMP.php.

References wfDebugLog().

Referenced by startElement().

◆ startElementModeSimple()

XMPReader::startElementModeSimple (   $elm,
  $attribs 
)
private

Handle an opening element when in MODE_SIMPLE.

This should not happen often. It covers the case where a simple element that is already open has a child element, which could happen for a qualified element.

For example: <exif:DigitalZoomRatio><rdf:Description><rdf:value>0/10</rdf:value> <foo:someQualifier>Bar</foo:someQualifier> </rdf:Description> </exif:DigitalZoomRatio>

This method is called when processing the <rdf:Description> element.

Parameters
  string $elm  Namespace and tag names separated by space.
  array $attribs  Attributes of the element.
Exceptions
  MWException

Definition at line 788 of file XMP.php.

References $attribs, list, saveValue(), and wfDebugLog().

Referenced by startElement().

◆ startElementModeStruct()

XMPReader::startElementModeStruct (   $ns,
  $tag,
  $attribs 
)
private

Hit an opening element when in a struct (MODE_STRUCT). This is generally for fields of a compound property.

Example of a struct (abbreviated; Flash has more properties):

<exif:Flash> <rdf:Description> <exif:Fired>True</exif:Fired> <exif:Mode>1</exif:Mode></rdf:Description></exif:Flash>

or:

<exif:Flash rdf:parseType='Resource'> <exif:Fired>True</exif:Fired> <exif:Mode>1</exif:Mode></exif:Flash>

Parameters
  string $ns  Namespace
  string $tag  Tag name (no ns)
  array $attribs  Array of attribs w/ values.
Exceptions
  MWException

Definition at line 909 of file XMP.php.

References $attribs, and doAttribs().

Referenced by startElement().
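
That the two struct spellings above carry the same fields can be checked with a small standalone sketch. collectLeafValues() is a hypothetical helper built on PHP's XML extension; it is far simpler than XMPReader and only serves to compare the two forms. Namespace declarations are added so that each fragment parses on its own.

```php
<?php
/**
 * Hypothetical helper: collect the text of leaf elements keyed by local
 * name. It ignores attributes and assumes no mixed content.
 */
function collectLeafValues( $xml ) {
	$values = array();
	$text = '';

	$parser = xml_parser_create_ns( 'UTF-8', ' ' );
	xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );
	xml_set_element_handler(
		$parser,
		function ( $p, $elm, $attribs ) use ( &$text ) {
			$text = ''; // Start collecting text for the new element.
		},
		function ( $p, $elm ) use ( &$values, &$text ) {
			// Only leaf elements have non-whitespace text at this point.
			if ( trim( $text ) !== '' ) {
				list( $ns, $tag ) = explode( ' ', $elm, 2 );
				$values[$tag] = trim( $text );
			}
			$text = '';
		}
	);
	xml_set_character_data_handler(
		$parser,
		function ( $p, $data ) use ( &$text ) {
			$text .= $data;
		}
	);
	xml_parse( $parser, $xml, true );
	return $values;
}

$nsDecl = ' xmlns:exif="http://ns.adobe.com/exif/1.0/"'
	. ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"';
$a = collectLeafValues(
	"<exif:Flash$nsDecl> <rdf:Description> <exif:Fired>True</exif:Fired>"
	. ' <exif:Mode>1</exif:Mode></rdf:Description></exif:Flash>'
);
$b = collectLeafValues(
	"<exif:Flash$nsDecl rdf:parseType='Resource'> <exif:Fired>True</exif:Fired>"
	. ' <exif:Mode>1</exif:Mode></exif:Flash>'
);
var_dump( $a === $b );
```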

Member Data Documentation

◆ $ancestorStruct

bool|string XMPReader::$ancestorStruct = false
private

The structure name when processing nested structures.

Definition at line 54 of file XMP.php.

Referenced by saveValue().

◆ $charContent

bool|string XMPReader::$charContent = false
private

Temporary holder for character data that appears in the XMP document.

Definition at line 56 of file XMP.php.

◆ $charset

bool|string XMPReader::$charset = false
private

Character set, like 'UTF-8'.

Definition at line 68 of file XMP.php.

◆ $curItem

array XMPReader::$curItem = array()
private

Array to hold the current element (and previous element, and so on).

Definition at line 52 of file XMP.php.

◆ $extendedXMPOffset

int XMPReader::$extendedXMPOffset = 0
private

Definition at line 70 of file XMP.php.

◆ $itemLang

bool|string XMPReader::$itemLang = false
private

Used for lang alts only.

Definition at line 64 of file XMP.php.

Referenced by saveValue().

◆ $items

array XMPReader::$items
protected

XMP item configuration array.

Definition at line 50 of file XMP.php.

◆ $mode

array XMPReader::$mode = array()
private

Stores the state the XMPReader is in (see MODE_FOO constants).

Definition at line 58 of file XMP.php.

Referenced by startElementModeInitial().

◆ $processingArray

bool XMPReader::$processingArray = false
private

If we're doing a seq or bag.

Definition at line 62 of file XMP.php.

◆ $results

array XMPReader::$results = array()
private

Array to hold results.

Definition at line 60 of file XMP.php.

Referenced by getResults().

◆ $xmlParser

resource XMPReader::$xmlParser
private

A resource handle for the XML parser.

Definition at line 66 of file XMP.php.

◆ MODE_ALT

const XMPReader::MODE_ALT = 15

Definition at line 94 of file XMP.php.

◆ MODE_BAG

const XMPReader::MODE_BAG = 13

Definition at line 92 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_BAGSTRUCT

const XMPReader::MODE_BAGSTRUCT = 16

Definition at line 95 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_IGNORE

const XMPReader::MODE_IGNORE = 1

Definition at line 82 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_INITIAL

const XMPReader::MODE_INITIAL = 0

These are various mode constants.

They are used to figure out what to do with an element when it's encountered.

For example, MODE_IGNORE is used when processing a property we're not interested in. So if a new element pops up when we're in that mode, we ignore it.

Definition at line 81 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_LANG

const XMPReader::MODE_LANG = 14

Definition at line 93 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_LI

const XMPReader::MODE_LI = 2

Definition at line 83 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_LI_LANG

const XMPReader::MODE_LI_LANG = 3

Definition at line 84 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_QDESC

const XMPReader::MODE_QDESC = 4

Definition at line 85 of file XMP.php.

Referenced by doAttribs(), endElement(), and startElement().

◆ MODE_SEQ

const XMPReader::MODE_SEQ = 12

Definition at line 91 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_SIMPLE

const XMPReader::MODE_SIMPLE = 10

Definition at line 89 of file XMP.php.

Referenced by endElement(), and startElement().

◆ MODE_STRUCT

const XMPReader::MODE_STRUCT = 11

Definition at line 90 of file XMP.php.

Referenced by endElement(), and startElement().

◆ NS_RDF

const XMPReader::NS_RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

Definition at line 97 of file XMP.php.

◆ NS_XML

const XMPReader::NS_XML = 'http://www.w3.org/XML/1998/namespace'

Definition at line 98 of file XMP.php.


The documentation for this class was generated from the following file: