Parsoid
A bidirectional parser between wikitext and HTML5
Interface for collecting the results of a parse.
Public Member Functions
addCategory ( $c, $sort = '' )
Add a category, with the given sort key.
addLink ( LinkTarget $link, $id = null )
Record a local or interwiki inline link for saving in future link tables.
addImage ( LinkTarget $name, $timestamp = null, $sha1 = null )
Register a file dependency for this output.
addLanguageLink ( LinkTarget $lt )
Add a language link.
addWarningMsg ( string $msg, ...$args )
Add a warning to the output for this page.
addExternalLink ( string $url )
setOutputFlag ( string $name, bool $val = true )
Provides a uniform interface to various boolean flags stored in the content metadata.
appendOutputStrings ( string $name, array $value )
Provides a uniform interface to various appendable lists of strings stored in the content metadata.
setNumericPageProperty ( string $propName, $numericValue )
Set a numeric page property whose value is intended to be sorted and indexed.
setUnsortedPageProperty ( string $propName, string $value = '' )
Set a page property whose value is not intended to be sorted and indexed.
setExtensionData ( string $key, $value )
Attaches arbitrary data to this content.
appendExtensionData ( string $key, $value, string $strategy = self::MERGE_STRATEGY_UNION )
Appends arbitrary data to this content.
setJsConfigVar ( string $key, $value )
Add a variable to be set in mw.config in JavaScript.
appendJsConfigVar ( string $key, string $value, string $strategy = self::MERGE_STRATEGY_UNION )
Append a value to a variable to be set in mw.config in JavaScript.
addModules ( array $modules )
addModuleStyles ( array $modules )
setLimitReportData ( string $key, $value )
Sets parser limit report data for a key.
setTOCData ( TOCData $tocData )
Sets Table of Contents data for this page.
setIndicator ( $name, $content )
Set the content for an indicator.
Public Attributes
const MERGE_STRATEGY_UNION = 'union'
Merge strategy to use for ContentMetadataCollector accumulators: "union" means that values are strings, stored as a set, and exposed as a PHP associative array mapping from values to true.
Interface for collecting the results of a parse.
This interface is used by Parsoid to record metadata about a particular piece of parsed content, extracted during the parse. This includes (for example) table of contents information, and lists of links/categories/templates/images present in the content. The expected cache lifetime of the parsed content is also recorded here, because it is influenced by certain things which may be encountered during the parse.
In core this is implemented by ParserOutput. Core uses ParserOutput to record the rendered HTML (and rendered table of contents HTML), but on the Parsoid side we're going to keep rendered HTML DOM out of this interface (we use PageBundle for this).
Wikimedia\Parsoid\Core\ContentMetadataCollector::addCategory ( $c, $sort = '' )
Add a category, with the given sort key.
LinkTarget | $c | Category name |
string | $sort | Sort key (pass the empty string to use the default) |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addExternalLink ( string $url )
string | $url | External link URL |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addImage ( LinkTarget $name, $timestamp = null, $sha1 = null )
Register a file dependency for this output.
LinkTarget | $name | Title dbKey |
string | false | null | $timestamp | MW timestamp of file creation (or false if non-existing) |
string | false | null | $sha1 | Base 36 SHA-1 of file (or false if non-existing) |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addLanguageLink ( LinkTarget $lt )
Add a language link.
LinkTarget | $lt |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addLink ( LinkTarget $link, $id = null )
Record a local or interwiki inline link for saving in future link tables.
LinkTarget | $link | (used to require Title until 1.38) |
int | null | $id | Optional known page_id so we can skip the lookup (generally not used by Parsoid) |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addModules ( array $modules )
string[] | $modules |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addModuleStyles ( array $modules )
string[] | $modules |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::addWarningMsg ( string $msg, ...$args )
Add a warning to the output for this page.
string | $msg | The localization message key for the warning |
mixed | ...$args | Optional arguments for the message |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::appendExtensionData ( string $key, $value, string $strategy = self::MERGE_STRATEGY_UNION )
Appends arbitrary data to this content.
This can be used to store some information in the ParserOutput object for later use during page output. The data will be cached along with the ParserOutput object, but unlike data set using set*PageProperty(), it is not recorded in the database.
See ::setExtensionData() for more details on rationale and use.
In order to provide for out-of-order/asynchronous/incremental parsing, this method appends values to a set. See ::setExtensionData() for the flag-like version of this method.
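Note: values are used as PHP array keys, and PHP converts keys that look like integers to ints, so if you put in "0" as a value you will get [0=>true] out.
string | $key | The key for accessing the data. Extensions should take care to avoid conflicts in naming keys. It is suggested to use the extension's name as a prefix. |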
int | string | $value | The value to append to the list. |
string | $strategy | Merge strategy: only MW_MERGE_STRATEGY_UNION is currently supported and external callers should treat this parameter as |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
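The set-like behaviour of appendExtensionData() described above can be sketched with a small stand-in class. This is a minimal illustration only, not the StubMetadataCollector or ParserOutput code: the class name UnionAccumulatorSketch, the read accessor getExtensionData() on it, and the key 'myext-seen' are assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a ContentMetadataCollector implementation.
 * Only the "union" accumulation is modelled here.
 */
final class UnionAccumulatorSketch {
    /** @var array<string, array<int|string, true>> */
    private array $extensionData = [];

    /** Append $value to the set stored under $key. */
    public function appendExtensionData( string $key, $value ): void {
        // Using the value as an array key gives set semantics:
        // appending the same value twice has no further effect.
        $this->extensionData[$key][$value] = true;
    }

    /** Hypothetical read accessor, added for illustration only. */
    public function getExtensionData( string $key ): ?array {
        return $this->extensionData[$key] ?? null;
    }
}

$collector = new UnionAccumulatorSketch();
$collector->appendExtensionData( 'myext-seen', 'alpha' );
$collector->appendExtensionData( 'myext-seen', 'alpha' ); // duplicate, no effect
$collector->appendExtensionData( 'myext-seen', '0' );     // numeric-looking string

$seen = $collector->getExtensionData( 'myext-seen' );
// 'alpha' stays a string key, while '0' comes back as the int key 0.
var_dump( array_keys( $seen ) );
```

Because the accumulated value is a map from values to true, callers would typically read it with array_keys() or isset() rather than relying on the stored values.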
Wikimedia\Parsoid\Core\ContentMetadataCollector::appendJsConfigVar ( string $key, string $value, string $strategy = self::MERGE_STRATEGY_UNION )
Append a value to a variable to be set in mw.config in JavaScript.
In order to ensure the result is independent of the parse order, the value of this key will be an associative array, mapping all of the values set under that key to true. (The array is implicitly ordered in PHP, but you should treat it as unordered.) If you want a non-array type for the key, and can ensure that only a single value will be set, you should use ::setJsConfigVar() instead.
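Note: values are used as PHP array keys, and PHP converts keys that look like integers to ints, so if you put in "0" as a value you will get [0=>true] out.
string | $key | Key to use under mw.config |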
string | $value | Value to append to the configuration variable. |
string | $strategy | Merge strategy: only MW_MERGE_STRATEGY_UNION is currently supported and external callers should treat this parameter as |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
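The advice for appendJsConfigVar() to treat the resulting array as unordered can be illustrated in plain PHP. The sketch below is an assumption-laden example: unionOf() is a hypothetical helper that mimics the "union" accumulation, and the gadget names are invented.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: build the union set from values appended in order. */
function unionOf( array $values ): array {
    $set = [];
    foreach ( $values as $value ) {
        $set[$value] = true;
    }
    return $set;
}

// The same two values, appended in two different parse orders.
$parseOrderA = unionOf( [ 'gadget-a', 'gadget-b' ] );
$parseOrderB = unionOf( [ 'gadget-b', 'gadget-a' ] );

// Loose equality ignores key order, so both parses yield the same set.
$sameSet = ( $parseOrderA == $parseOrderB );
// Strict identity also compares key order, which callers should not rely on.
$sameOrder = ( $parseOrderA === $parseOrderB );

var_dump( $sameSet, $sameOrder );
```

Code that consumes such a value is safer if it only tests membership, since the iteration order depends on the order in which fragments were parsed.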
Wikimedia\Parsoid\Core\ContentMetadataCollector::appendOutputStrings ( string $name, array $value )
Provides a uniform interface to various appendable lists of strings stored in the content metadata.
Strings internal to MediaWiki core should have names which are constants in ParserOutputStrings. Extensions should use ::setExtensionData() rather than creating new keys here in order to prevent namespace conflicts.
string | $name | A string name |
string[] | $value |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::setExtensionData ( string $key, $value )
Attaches arbitrary data to this content.
This can be used to store some information for later use during page output. The data will be cached along with the parsed page, but unlike data set using set*PageProperty(), it is not recorded in the database.
To use setExtensionData() to pass extension information from a hook inside the parser to a hook in the page output, use this in the parser hook:
And then later, in OutputPageParserOutput or similar:
string | $key | The key for accessing the data. Extensions should take care to avoid conflicts in naming keys. It is suggested to use the extension's name as a prefix. Keys beginning with mw- are reserved for use by MediaWiki core. |
mixed | $value | The value to set. Setting a value to null is equivalent to removing the value. |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
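The hand-off that setExtensionData() is meant for, where one hook stores a value during the parse and another reads it during page output, can be sketched as follows. This is a self-contained illustration under stated assumptions: ExtensionDataSketch is a hypothetical stand-in, its getExtensionData() reader is included only so the example can run (this interface declares no getter), and the key 'myext-foo' is invented.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in that models only set/remove semantics. */
final class ExtensionDataSketch {
    /** @var array<string, mixed> */
    private array $data = [];

    /** Store $value under $key; null removes the entry, as documented. */
    public function setExtensionData( string $key, $value ): void {
        if ( $value === null ) {
            unset( $this->data[$key] );
        } else {
            $this->data[$key] = $value;
        }
    }

    /** Hypothetical reader used by the "later" hook in this sketch. */
    public function getExtensionData( string $key ) {
        return $this->data[$key] ?? null;
    }
}

$metadata = new ExtensionDataSketch();

// During the parse: the key is prefixed with the (invented) extension name.
$metadata->setExtensionData( 'myext-foo', [ 'widgets' => 3 ] );

// Later, when building the page output: read the value back.
$foo = $metadata->getExtensionData( 'myext-foo' );

// Setting null is equivalent to removing the value.
$metadata->setExtensionData( 'myext-foo', null );
$afterRemoval = $metadata->getExtensionData( 'myext-foo' );

var_dump( $foo, $afterRemoval );
```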
Wikimedia\Parsoid\Core\ContentMetadataCollector::setIndicator ( $name, $content )
Set the content for an indicator.
string | $name | |
string | $content | |
param-taint | $content | exec_html |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::setJsConfigVar ( string $key, $value )
Add a variable to be set in mw.config in JavaScript.
In order to ensure the result is independent of the parse order, the values set here must be unique – that is, you can pass the same $key multiple times but ONLY if the $value is identical each time. If you want to collect multiple pieces of data under a single key, use ::appendJsConfigVar().
string | $key | Key to use under mw.config |
mixed | null | $value | Value of the configuration variable. |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
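The uniqueness rule for setJsConfigVar() (the same key may be set repeatedly, but only with an identical value) can be made concrete with a small sketch. The text above does not say how an implementation reacts to conflicting values, so throwing a LogicException is purely an assumption of this example, as are the class JsConfigSketch, its getJsConfigVars() reader, and the key 'wgMyExtEnabled'.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in that enforces "same key, same value". */
final class JsConfigSketch {
    /** @var array<string, mixed> */
    private array $vars = [];

    public function setJsConfigVar( string $key, $value ): void {
        if ( array_key_exists( $key, $this->vars ) && $this->vars[$key] !== $value ) {
            // Assumption of this sketch: a conflicting value is reported loudly.
            throw new LogicException( "Conflicting values for mw.config key '$key'" );
        }
        $this->vars[$key] = $value;
    }

    /** Hypothetical reader, for illustration only. */
    public function getJsConfigVars(): array {
        return $this->vars;
    }
}

$config = new JsConfigSketch();
$config->setJsConfigVar( 'wgMyExtEnabled', true );
$config->setJsConfigVar( 'wgMyExtEnabled', true ); // identical value: allowed

$conflictDetected = false;
try {
    // A different value under the same key would make the result
    // depend on which fragment happened to be parsed last.
    $config->setJsConfigVar( 'wgMyExtEnabled', false );
} catch ( LogicException $e ) {
    $conflictDetected = true;
}

var_dump( $conflictDetected );
```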
Wikimedia\Parsoid\Core\ContentMetadataCollector::setLimitReportData ( string $key, $value )
Sets parser limit report data for a key.
The key is used as the prefix for various messages used for formatting:
Note that all values are interpreted as wikitext, and so should be encoded with htmlspecialchars() as necessary, but should avoid complex HTML for sanity of display in the "NewPP limit report" comment.
string | $key | Message key |
mixed | $value | Appropriate for Message::params() |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::setNumericPageProperty ( string $propName, $numericValue )
Set a numeric page property whose value is intended to be sorted and indexed.
The sort key used for the property will be the value, coerced to a number. It is also possible to efficiently look up all the pages with a certain property (the "presence" of the property is also indexed; see Special:PagesWithProp, list=pageswithprop).
The page property is stored in the page_props database table. The page_props table is a key-value store indexed by the page ID. This allows the parser to set a property on a page whose value can then be quickly retrieved given the page ID or via a DB join when given the page title. The page_props table is also indexed on the numeric sort key passed as $numericValue to this method. This allows for efficient "top k" queries of pages with respect to a given property.
In the future, we may allow the value to be specified independent of sort key (T357783).
The setNumericPageProperty() method is thus used to propagate properties from the parsed page to request contexts other than a page view of the currently parsed article.
Some application examples:
Setting proofread_page_quality_level as a numeric property to allow efficient retrieval of pages of a certain quality level.
If you need a placeholder value, you likely should be using ::setUnsortedPageProperty() instead.
And then later, in the OutputPageParserOutput hook or similar:
string | $propName | The name of the page property |
int | float | string | $numericValue | the numeric value |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
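The statement for setNumericPageProperty() that the sort key is the value coerced to a number, and that this supports "top k" queries, can be sketched in plain PHP. This does not touch the page_props table. The helper numericSortKey(), its choice to reject non-numeric input, and the page IDs and values are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: coerce a property value to the number used as sort key.
 * Rejecting non-numeric input is an assumption of this sketch.
 *
 * @param int|float|string $value
 * @return int|float
 */
function numericSortKey( $value ) {
    if ( !is_numeric( $value ) ) {
        throw new InvalidArgumentException( 'Expected a numeric value' );
    }
    // Adding 0 turns numeric strings into int or float and leaves numbers as they are.
    return $value + 0;
}

// Made-up page IDs mapped to the value each parse passed for one property.
$qualityByPageId = [ 101 => '3', 102 => 4, 103 => '1.5' ];

// array_map() with a single array preserves the page ID keys.
$sortKeys = array_map( 'numericSortKey', $qualityByPageId );
arsort( $sortKeys ); // highest sort key first, keys preserved
$topTwo = array_slice( $sortKeys, 0, 2, true );

var_dump( $topTwo );
```

In a real wiki the ordering would come from the database index on the sort key rather than from sorting in PHP. The in-memory sort here only shows what the numeric coercion makes possible.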
Wikimedia\Parsoid\Core\ContentMetadataCollector::setOutputFlag ( string $name, bool $val = true )
Provides a uniform interface to various boolean flags stored in the content metadata.
Flags internal to MediaWiki core should have names which are constants in ParserOutputFlags. Extensions should use ::setExtensionData() rather than creating new flags with ::setOutputFlag() in order to prevent namespace conflicts.
string | $name | A flag name |
bool | $val |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::setTOCData ( TOCData $tocData )
Sets Table of Contents data for this page.
Note that merging of TOCData is not supported; exactly one fragment should set TOCData.
TOCData | $tocData |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
Wikimedia\Parsoid\Core\ContentMetadataCollector::setUnsortedPageProperty ( string $propName, string $value = '' )
Set a page property whose value is not intended to be sorted and indexed.
It is still possible to efficiently look up all the pages with a certain property (the "presence" of the property is indexed; see Special:PagesWithProp, list=pageswithprop).
The page property is stored in the page_props database table. The page_props table is a key-value store indexed by the page ID. This allows the parser to set a property on a page whose value can then be quickly retrieved given the page ID or via a DB join when given the page title.
The setUnsortedPageProperty() method is thus used to propagate properties from the parsed page to request contexts other than a page view of the currently parsed article.
Some application examples:
It is recommended to use the empty string if you need a placeholder value (i.e., if it is the presence of the property which is important, not the value the property is set to).
And then later, in the OutputPageParserOutput hook or similar:
string | $propName | The name of the page property |
string | $value | Optional value; defaults to the empty string. |
Implemented in Wikimedia\Parsoid\Config\StubMetadataCollector.
const Wikimedia\Parsoid\Core\ContentMetadataCollector::MERGE_STRATEGY_UNION = 'union'
Merge strategy to use for ContentMetadataCollector accumulators: "union" means that values are strings, stored as a set, and exposed as a PHP associative array mapping from values to true.
This constant should be treated as