Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Core\ContentMetadataCollector Interface Reference

Interface for collecting the results of a parse.


Public Member Functions

 addCategory ( $c, $sort='')
 Add a category, with the given sort key.
 
 addLink (LinkTarget $link, $id=null)
 Record a local or interwiki inline link for saving in future link tables.
 
 addImage (LinkTarget $name, $timestamp=null, $sha1=null)
 Register a file dependency for this output.
 
 addLanguageLink (LinkTarget $lt)
 Add a language link.
 
 addWarningMsg (string $msg, ...$args)
 Add a warning to the output for this page.
 
 addExternalLink (string $url)
 
 setOutputFlag (string $name, bool $val=true)
 Provides a uniform interface to various boolean flags stored in the content metadata.
 
 appendOutputStrings (string $name, array $value)
 Provides a uniform interface to various appendable lists of strings stored in the content metadata.
 
 setPageProperty (string $name, $value)
 Set a page property to be stored in the page_props database table.
 
 setNumericPageProperty (string $propName, $numericValue)
 Set a numeric page property whose value is intended to be sorted and indexed.
 
 setUnsortedPageProperty (string $propName, string $value='')
 Set a page property whose value is not intended to be sorted and indexed.
 
 setExtensionData (string $key, $value)
 Attaches arbitrary data to this content.
 
 appendExtensionData (string $key, $value, string $strategy=self::MERGE_STRATEGY_UNION)
 Appends arbitrary data to this content.
 
 setJsConfigVar (string $key, $value)
 Add a variable to be set in mw.config in JavaScript.
 
 appendJsConfigVar (string $key, string $value, string $strategy=self::MERGE_STRATEGY_UNION)
 Append a value to a variable to be set in mw.config in JavaScript.
 
 addModules (array $modules)
 
 addModuleStyles (array $modules)
 
 setLimitReportData (string $key, $value)
 Sets parser limit report data for a key.
 
 setTOCData (TOCData $tocData)
 Sets Table of Contents data for this page.
 
 setIndicator ( $name, $content)
 Set the content for an indicator.
 
 getIndicators ()
 

Public Attributes

const MERGE_STRATEGY_UNION = 'union'
 Merge strategy to use for ContentMetadataCollector accumulators: "union" means that values are strings, stored as a set, and exposed as a PHP associative array mapping from values to true.
 

Detailed Description

Interface for collecting the results of a parse.

This interface is used by Parsoid to record metainformation about a particular piece of parsed content, extracted during the parse. This includes, for example, table of contents information and lists of the links, categories, templates, and images present in the content. The expected cache lifetime of the parsed content is also recorded here, since it is influenced by certain things that may be encountered during the parse.

In core this is implemented by ParserOutput. Core uses ParserOutput to record the rendered HTML (and rendered table of contents HTML), but on the Parsoid side we're going to keep rendered HTML DOM out of this interface (we use PageBundle for this).

Member Function Documentation

◆ addCategory()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addCategory ( $c,
$sort = '' )

Add a category, with the given sort key.

Parameters
LinkTarget $c: Category name
string $sort: Sort key (pass the empty string to use the default)

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addExternalLink()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addExternalLink ( string $url)
Parameters
string $url: External link URL

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addImage()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addImage ( LinkTarget $name,
$timestamp = null,
$sha1 = null )

Register a file dependency for this output.

Parameters
LinkTarget $name: Title dbKey
string | false | null $timestamp: MW timestamp of file creation (or false if non-existing)
string | false | null $sha1: Base 36 SHA-1 of file (or false if non-existing)

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addLanguageLink()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addLanguageLink ( LinkTarget $lt)

Add a language link.

Parameters
LinkTarget $lt

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addLink()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addLink ( LinkTarget $link,
$id = null )

Record a local or interwiki inline link for saving in future link tables.

Parameters
LinkTarget $link: (used to require Title until 1.38)
int | null $id: Optional known page_id so we can skip the lookup (generally not used by Parsoid)

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addModules()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addModules ( array $modules)
See also
OutputPage::addModules
Parameters
string[] $modules

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addModuleStyles()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addModuleStyles ( array $modules)
See also
OutputPage::addModuleStyles
Parameters
string[] $modules

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ addWarningMsg()

Wikimedia\Parsoid\Core\ContentMetadataCollector::addWarningMsg ( string $msg,
...$args )

Add a warning to the output for this page.

Parameters
string $msg: The localization message key for the warning
mixed ...$args: Optional arguments for the message

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ appendExtensionData()

Wikimedia\Parsoid\Core\ContentMetadataCollector::appendExtensionData ( string $key,
$value,
string $strategy = self::MERGE_STRATEGY_UNION )

Appends arbitrary data to this content.

This can be used to store some information in the ParserOutput object for later use during page output. The data will be cached along with the ParserOutput object, but unlike data set using setPageProperty(), it is not recorded in the database.

See ::setExtensionData() for more details on rationale and use.

In order to provide for out-of-order/asynchronous/incremental parsing, this method appends values to a set. See ::setExtensionData() for the flag-like version of this method.

Note
Only values which can be array keys are currently supported as values. Be aware that array keys which 'look like' numbers are converted to ints by PHP, and so if you put in "0" as a value you will get [0=>true] out.
Parameters
string $key: The key for accessing the data. Extensions should take care to avoid conflicts in naming keys. It is suggested to use the extension's name as a prefix.
int | string $value: The value to append to the list.
string $strategy: Merge strategy: only MERGE_STRATEGY_UNION is currently supported, and external callers should treat this parameter as internal and omit it.

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
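
The note above on array keys can be reproduced in plain PHP. The following is a minimal sketch, not Parsoid code: `appendUnion()` is a hypothetical helper that mimics the "union" behaviour described here (values stored as a set, exposed as an associative array mapping values to true).

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical helper (not part of Parsoid): append a value to a
 * "union" accumulator, i.e. an associative array mapping values to true.
 */
function appendUnion( array $set, $value ): array {
	$set[$value] = true;
	return $set;
}

$set = [];
foreach ( [ 'foo', 'bar', 'foo', '0', '42' ] as $value ) {
	$set = appendUnion( $set, $value );
}

// Duplicates collapse, and PHP converts the numeric-looking strings
// '0' and '42' to the integer keys 0 and 42.
var_dump( array_keys( $set ) === [ 'foo', 'bar', 0, 42 ] ); // bool(true)
```

Callers that need the original strings back can cast each key with `(string)` when reading the set.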

◆ appendJsConfigVar()

Wikimedia\Parsoid\Core\ContentMetadataCollector::appendJsConfigVar ( string $key,
string $value,
string $strategy = self::MERGE_STRATEGY_UNION )

Append a value to a variable to be set in mw.config in JavaScript.

In order to ensure the result is independent of the parse order, the value of this key will be an associative array, mapping all of the values set under that key to true. (The array is implicitly ordered in PHP, but you should treat it as unordered.) If you want a non-array type for the key, and can ensure that only a single value will be set, you should use ::setJsConfigVar() instead.

Note
Only values which can be array keys are currently supported as values. Be aware that array keys which 'look like' numbers are converted to ints by PHP, and so if you put in "0" as a value you will get [0=>true] out.
Parameters
string $key: Key to use under mw.config
string $value: Value to append to the configuration variable.
string $strategy: Merge strategy: only MERGE_STRATEGY_UNION is currently supported, and external callers should treat this parameter as internal and omit it.

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ appendOutputStrings()

Wikimedia\Parsoid\Core\ContentMetadataCollector::appendOutputStrings ( string $name,
array $value )

Provides a uniform interface to various appendable lists of strings stored in the content metadata.

Strings internal to MediaWiki core should have names which are constants in ParserOutputStrings. Extensions should use ::setExtensionData() rather than creating new keys here in order to prevent namespace conflicts.

Parameters
string $name: A string name
string[] $value

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ getIndicators()

Wikimedia\Parsoid\Core\ContentMetadataCollector::getIndicators ( )
Returns
array<string,string>

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ setExtensionData()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setExtensionData ( string $key,
$value )

Attaches arbitrary data to this content.

This can be used to store some information for later use during page output. The data will be cached along with the parsed page, but unlike data set using setPageProperty(), it is not recorded in the database.

To use setExtensionData() to pass extension information from a hook inside the parser to a hook in the page output, use this in the parser hook:

Example:
$parser->getOutput()->setExtensionData( 'my_ext_foo', '...' );

And then later, in OutputPageParserOutput or similar:

Example:
$output->getExtensionData( 'my_ext_foo' );
Note
Only scalar values (e.g. numbers, strings), arrays, or MediaWiki\Json\JsonUnserializable instances are supported as values. Attempting to set an instance of any other class as extension data will break ParserCache for the page.
As with ::setJsConfigVar(), setting the same key to multiple conflicting values during the parse is not supported.
Parameters
string $key: The key for accessing the data. Extensions should take care to avoid conflicts in naming keys. It is suggested to use the extension's name as a prefix. Keys beginning with mw- are reserved for use by MediaWiki core.
mixed $value: The value to set. Setting a value to null is equivalent to removing the value.

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ setIndicator()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setIndicator ( $name,
$content )

Set the content for an indicator.

Parameters
string $name
string $content
param-taint $content exec_html

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ setJsConfigVar()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setJsConfigVar ( string $key,
$value )

Add a variable to be set in mw.config in JavaScript.

In order to ensure the result is independent of the parse order, the values set here must be unique – that is, you can pass the same $key multiple times but ONLY if the $value is identical each time. If you want to collect multiple pieces of data under a single key, use ::appendJsConfigVar().

Parameters
string $key: Key to use under mw.config
mixed | null $value: Value of the configuration variable.

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
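
The uniqueness rule for setJsConfigVar() can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: `JsConfigVarSketch` is hypothetical, and throwing an exception on conflict is this sketch's choice; the documentation above does not say how a real implementation reacts to conflicting values.

```php
<?php
declare( strict_types=1 );

/** Hypothetical stand-in, not the Parsoid or MediaWiki core implementation. */
final class JsConfigVarSketch {
	/** @var array<string,mixed> */
	private array $vars = [];

	public function setJsConfigVar( string $key, $value ): void {
		// The same key may be set again only with an identical value,
		// so the result does not depend on parse order.
		if ( array_key_exists( $key, $this->vars ) && $this->vars[$key] !== $value ) {
			throw new InvalidArgumentException( "Conflicting values for '$key'" );
		}
		$this->vars[$key] = $value;
	}

	public function getJsConfigVars(): array {
		return $this->vars;
	}
}

$collector = new JsConfigVarSketch();
$collector->setJsConfigVar( 'wgMyExtFoo', 1 );
$collector->setJsConfigVar( 'wgMyExtFoo', 1 ); // identical value: accepted

$conflictDetected = false;
try {
	$collector->setJsConfigVar( 'wgMyExtFoo', 2 ); // different value: rejected
} catch ( InvalidArgumentException $e ) {
	$conflictDetected = true;
}
```

The key `wgMyExtFoo` is an illustrative name only.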

◆ setLimitReportData()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setLimitReportData ( string $key,
$value )

Sets parser limit report data for a key.

The key is used as the prefix for various messages used for formatting:

  • $key: The label for the field in the limit report
  • $key-value-text: Message used to format the value in the "NewPP limit report" HTML comment. If missing, uses $key-value.
  • $key-value-html: Message used to format the value in the preview limit report table. If missing, uses $key-value.
  • $key-value: Message used to format the value. If missing, uses "$1".

Note that all values are interpreted as wikitext, and so should be encoded with htmlspecialchars() as necessary, but should avoid complex HTML for sanity of display in the "NewPP limit report" comment.

Parameters
string $key: Message key
mixed $value: Appropriate for Message::params()

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
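
As a small illustration of the naming scheme listed above, the message keys derived from a limit report key can be spelled out as follows. `limitReportMessageKeys()` and the key `myext-limit` are hypothetical names used only for this sketch.

```php
<?php
declare( strict_types=1 );

/** Hypothetical helper: the message keys derived from a limit report key. */
function limitReportMessageKeys( string $key ): array {
	return [
		'label' => $key,                   // label for the field
		'text'  => "{$key}-value-text",    // "NewPP limit report" HTML comment
		'html'  => "{$key}-value-html",    // preview limit report table
		'value' => "{$key}-value",         // generic value format
	];
}

$keys = limitReportMessageKeys( 'myext-limit' );
```

An extension would define localization messages under these names for whatever key it passes to setLimitReportData().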

◆ setNumericPageProperty()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setNumericPageProperty ( string $propName,
$numericValue )

Set a numeric page property whose value is intended to be sorted and indexed.

The sort key used for the property will be the value, coerced to a number.

See ::setPageProperty() for details.

In the future, we may allow the value to be specified independent of sort key (T357783).

Parameters
string $propName: The name of the page property
int | float | string $numericValue: The numeric value
Since
1.42

◆ setOutputFlag()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setOutputFlag ( string $name,
bool $val = true )

Provides a uniform interface to various boolean flags stored in the content metadata.

Flags internal to MediaWiki core should have names which are constants in ParserOutputFlags. Extensions should use ::setExtensionData() rather than creating new flags with ::setOutputFlag() in order to prevent namespace conflicts.

Parameters
string $name: A flag name
bool $val

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ setPageProperty()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setPageProperty ( string $name,
$value )

Set a page property to be stored in the page_props database table.

page_props is a key-value store indexed by the page ID. This allows the parser to set a property on a page which can then be quickly retrieved given the page ID or via a DB join when given the page title.

Since 1.23, page_props are also indexed by numeric value, to allow for efficient "top k" queries of pages with respect to a given property. This only works if the value is passed as an int, float, or bool. Since 1.42 you should use ::setNumericPageProperty() if you want your page property value to be indexed, which will ensure that the value is of the proper type.

setPageProperty() is thus used to propagate properties from the parsed page to request contexts other than a page view of the currently parsed article.

Some example applications:

  • To implement hidden categories, hiding pages from category listings by storing a page property.
  • Overriding the displayed article title (ParserOutput::setDisplayTitle()).
  • To implement image tagging, for example displaying an icon on an image thumbnail to indicate that it is listed for deletion on Wikimedia Commons. This is not actually implemented yet, but would be pretty cool.
Note
Use of non-scalar values (anything other than string|int|float|bool) has been deprecated in 1.42. Although any JSON-serializable value can be stored/fetched in ParserOutput, when the values are stored to the database (in deferred/LinksUpdate/PagePropsTable.php) they will be converted: booleans will be converted to '0' and '1', null will become '', and everything else will be cast to string (not JSON-serialized). Page properties obtained from the PageProps service will thus always be strings.
The sort key stored in the database will be NULL unless the value passed here is an int|float|bool. If you do not want your property value indexed and sorted (for example, the value is a title string which can be numeric but only incidentally, like when it gets retrieved from an array key) be sure to cast to string or use ::setUnsortedPageProperty(). If you do want your property value indexed and sorted, you should use ::setNumericPageProperty() instead as this will ensure the value type is correct. Note that either way it is possible to efficiently look up all the pages with a certain property; we are only talking about sorting the values assigned to the property, for example for a "top N values of the property" query.
Note that ::getPageProperty()/::setPageProperty() do not do any conversions themselves; you should therefore be careful to distinguish values returned from the PageProps service (always strings) from values retrieved from a ParserOutput.
Do not use setPageProperty() to set a property which is only used in a context where the ParserOutput object itself is already available, for example a normal page view. There is no need to save such a property in the database since the text is already parsed; use ::setExtensionData() instead.
Example:
$parser->getOutput()->setExtensionData( 'my_ext_foo', '...' );

And then later, in OutputPageParserOutput or similar:

Example:
$output->getExtensionData( 'my_ext_foo' );
Note
The use of null as a value is deprecated since 1.42; use the empty string instead if you need a placeholder value, or ::unsetPageProperty() if you mean to remove a page property.
The use of non-string values is deprecated since 1.42; if you need a page property value with a sort index use ::setNumericPageProperty().
Parameters
string $name
int | float | string | bool | null $value
Since
1.38

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
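
The conversion rules described in the notes above can be summarized in a short sketch. Both functions are hypothetical illustrations, not the code in PagePropsTable.php: `pagePropStoredValue()` follows the stated string conversion for scalar and null values, and `pagePropSortKey()` assumes the sort key is a float, which the text above does not specify beyond "coerced to a number".

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical: string form of a scalar or null page property value
 * once it has been stored to the database, per the note above.
 */
function pagePropStoredValue( $value ): string {
	if ( is_bool( $value ) ) {
		return $value ? '1' : '0';
	}
	if ( $value === null ) {
		return '';
	}
	// Everything else is cast to string, not JSON-serialized.
	return (string)$value;
}

/**
 * Hypothetical: sort key for a value. NULL unless the value is
 * int|float|bool; representing it as a float is an assumption.
 */
function pagePropSortKey( $value ): ?float {
	if ( is_int( $value ) || is_float( $value ) || is_bool( $value ) ) {
		return (float)$value;
	}
	return null;
}
```

Note that a numeric-looking string such as '42' gets no sort key in this sketch, which is why the text recommends ::setNumericPageProperty() when sorting is wanted and a string cast or ::setUnsortedPageProperty() when it is not.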

◆ setTOCData()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setTOCData ( TOCData $tocData)

Sets Table of Contents data for this page.

Note that merging of TOCData is not supported; exactly one fragment should set TOCData.

Parameters
TOCData $tocData

Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.

◆ setUnsortedPageProperty()

Wikimedia\Parsoid\Core\ContentMetadataCollector::setUnsortedPageProperty ( string $propName,
string $value = '' )

Set a page property whose value is not intended to be sorted and indexed.

See ::setPageProperty() for details. It is recommended to use the empty string if you need a placeholder value (i.e., if it is the presence of the property which is important, not the value the property is set to).

It is still possible to efficiently look up all the pages with a certain property (the "presence" of it is indexed; see Special:PagesWithProp, list=pageswithprop).

Parameters
string $propName: The name of the page property
string $value: Optional value; defaults to the empty string.
Since
1.42

Member Data Documentation

◆ MERGE_STRATEGY_UNION

const Wikimedia\Parsoid\Core\ContentMetadataCollector::MERGE_STRATEGY_UNION = 'union'

Merge strategy to use for ContentMetadataCollector accumulators: "union" means that values are strings, stored as a set, and exposed as a PHP associative array mapping from values to true.

This constant should be treated as internal.


The documentation for this interface was generated from the following file: