Parsoid
A bidirectional parser between wikitext and HTML5
Section metadata for generating TOC.
Public Member Functions | |
__construct (int $tocLevel=0, int $hLevel=-1, string $line='', string $number='', string $index='', ?string $fromTitle=null, ?int $codepointOffset=null, string $anchor='', string $linkAnchor='', ?array $extensionData=null) | |
setExtensionData (string $key, $value) | |
Attaches arbitrary data to this SectionMetadata object. | |
appendExtensionData (string $key, $value) | |
Appends arbitrary data to this SectionMetadata. | |
getExtensionData ( $key) | |
Gets extension data previously attached to this SectionMetadata. | |
toArray () | |
Alias for ::toLegacy(), for backward compatibility only. | |
toLegacy () | |
Return as associative array, in the format returned by the action API (including the order of fields and the value types). | |
jsonSerialize () | |
toJsonArray () | |
prettyPrint (int $indent=0) | |
For use in parser tests and wherever else humans might appreciate some formatting in the JSON encoded output. | |
Static Public Member Functions | |
static | fromArray (array $data) |
Alias for ::fromLegacy(), for backward compatibility only. | |
static | fromLegacy (array $data) |
Create a new SectionMetadata object from an array in the legacy format returned by the action API. | |
static | newFromJsonArray (array $json) |
Public Attributes | |
int | $hLevel |
The heading tag level: a 1 here means an <h1> tag. | |
int | $tocLevel |
The one-indexed TOC level, which is also the nesting level. | |
string | $line |
HTML heading of the section. | |
string | $number |
TOC number string (3.1.3, 4.5.2, etc.) | |
string | $index |
Section id (integer, assigned in depth-first traversal order). Template-generated sections get a "T-" prefix. | |
string | $fromTitle |
The title of the page that generated this heading. | |
int | $codepointOffset |
Codepoint offset where the section shows up in wikitext; this is null if this section comes from a template, if it comes from a literal HTML <h_> tag, or otherwise doesn't correspond to a "preprocessor section". | |
string | $anchor |
Anchor attribute. | |
string | $linkAnchor |
Anchor URL fragment. | |
Section metadata for generating TOC.
This is not the complete data for the article section, just the information needed to generate the table of contents.
For now, this schema matches whatever is generated by Parser.php, and Parsoid attempts to match that output.
Parser.php::finalizeHeadings() is the authoritative source for how some of these properties are computed right now, especially for the $line, $anchor, and $linkAnchor properties below.
Linker.php::tocLine() and ::makeHeadline() demonstrate how these properties are used to create headings and table of contents lines.
Wikimedia\Parsoid\Core\SectionMetadata::__construct | ( | int | $tocLevel = 0, |
int | $hLevel = -1, | ||
string | $line = '', | ||
string | $number = '', | ||
string | $index = '', | ||
?string | $fromTitle = null, | ||
?int | $codepointOffset = null, | ||
string | $anchor = '', | ||
string | $linkAnchor = '', | ||
?array | $extensionData = null ) |
int | $tocLevel | One-indexed TOC level and the nesting level |
int | $hLevel | The heading tag level |
string | $line | Stripped headline text |
string | $number | TOC number string (3.1.3, 4.5.2, etc) |
string | $index | Section id |
?string | $fromTitle | The title of the page or template that generated this heading, or null. |
?int | $codepointOffset | Codepoint offset (number of characters) where the section shows up in wikitext, or null if this doesn't correspond to a "preprocessor section". (Be careful if using JavaScript, as JavaScript strings are indexed by UTF-16 code units, which don't correspond directly to code points.) |
string | $anchor | "True" value of the ID attribute |
string | $linkAnchor | URL-escaped value of the anchor, for use in constructing a URL fragment link |
?array | $extensionData | Extension data passed in as an associative array |
Wikimedia\Parsoid\Core\SectionMetadata::appendExtensionData | ( | string | $key, |
$value ) |
Appends arbitrary data to this SectionMetadata.
This can be used to store some information about the section in the ParserOutput object for later use during page output.
See ::setExtensionData() for more details on rationale and use.
string | $key | The key for accessing the data. Extensions should take care to avoid conflicts in naming keys. It is suggested to use the extension's name as a prefix. |
int|string | $value | The value to append to the list. |
static Wikimedia\Parsoid\Core\SectionMetadata::fromArray | ( | array | $data | ) |
Alias for ::fromLegacy(), for backward compatibility only.
array | $data |
static Wikimedia\Parsoid\Core\SectionMetadata::fromLegacy | ( | array | $data | ) |
Create a new SectionMetadata object from an array in the legacy format returned by the action API.
This is useful for backward-compatibility, but is expected to be replaced by conversion to/from JSON in the future.
array | $data | Associative array with section metadata |
Wikimedia\Parsoid\Core\SectionMetadata::getExtensionData | ( | $key | ) |
Gets extension data previously attached to this SectionMetadata.
string | $key | The key to look up |
Wikimedia\Parsoid\Core\SectionMetadata::prettyPrint | ( | int | $indent = 0 | ) |
For use in parser tests and wherever else humans might appreciate some formatting in the JSON encoded output.
For now, this applies no special formatting.
int | $indent | Additional indentation to apply (defaults to zero) |
Wikimedia\Parsoid\Core\SectionMetadata::setExtensionData | ( | string | $key, |
$value ) |
Attaches arbitrary data to this SectionMetadata object.
This can be used to store some information about this section in the ParserOutput object for later use during page output. The data will be cached along with the ParserOutput object.
This method is provided to overcome the unsafe practice of attaching extra information to a section by directly assigning member variables.
See ParserOutput::setExtensionData() in core for further information about typical usage in hooks.
Setting conflicting values for the same key is not allowed. If you call ::setExtensionData() multiple times with the same key on a SectionMetadata, it is expected that the value will be identical each time. If you want to collect multiple pieces of data under a single key, use ::appendExtensionData().
string | $key | The key for accessing the data. Extensions should take care to avoid conflicts in naming keys. It is suggested to use the extension's name as a prefix. Using the prefix mw: is reserved for core. |
mixed | $value | The value to set. Setting a value to null is equivalent to removing the value. |
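The rules above (identical repeats allowed, null removes, appending collects) can be tried out with a small stand-in. This is a minimal sketch, not Parsoid's implementation: the class name `ExtensionDataBag` is hypothetical, and both the exception raised on a conflict and the plain-list storage used for appended values are assumptions made for illustration.

```php
<?php
// Hypothetical stand-in, NOT Parsoid's SectionMetadata. It only mimics the
// contract documented above so the rules can be exercised in isolation.
final class ExtensionDataBag {
    /** @var array<string,mixed> */
    private array $data = [];

    public function setExtensionData( string $key, $value ): void {
        if ( $value === null ) {
            // Documented: setting a value to null removes it.
            unset( $this->data[$key] );
            return;
        }
        if ( array_key_exists( $key, $this->data ) && $this->data[$key] !== $value ) {
            // Assumption: a conflicting value is reported with an exception.
            throw new InvalidArgumentException( "Conflicting data for $key" );
        }
        $this->data[$key] = $value;
    }

    public function appendExtensionData( string $key, $value ): void {
        // Assumption: appended values are kept as a plain list.
        $this->data[$key][] = $value;
    }

    public function getExtensionData( string $key ) {
        return $this->data[$key] ?? null;
    }
}

$bag = new ExtensionDataBag();
// Keys use the extension's name as a prefix, as suggested above.
$bag->setExtensionData( 'MyExtension-color', 'blue' );
$bag->setExtensionData( 'MyExtension-color', 'blue' ); // identical value: allowed
$bag->appendExtensionData( 'MyExtension-tags', 'intro' );
$bag->appendExtensionData( 'MyExtension-tags', 'summary' );
```

With the real class, the same calls would be made on a SectionMetadata object; only the method names and parameters listed on this page are taken from the source.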
Wikimedia\Parsoid\Core\SectionMetadata::toArray | ( | ) |
Wikimedia\Parsoid\Core\SectionMetadata::toLegacy | ( | ) |
Return as associative array, in the format returned by the action API (including the order of fields and the value types).
This is helpful as backward-compatibility support while we transition to objects.
string Wikimedia\Parsoid\Core\SectionMetadata::$anchor |
Anchor attribute.
This property is the "true" value of the ID attribute, and should be used when looking up a heading or setting an attribute, for example using Document.getElementById() or Element.setAttribute('id',...).
This value is not HTML-entity escaped; if you are writing HTML as a literal string, you should still entity-escape ampersands and single/double quotes as appropriate.
This value is not URL-escaped either; instead use the linkAnchor property if you are constructing a URL to target this section.
The anchor attribute is based on the $line property, but does extra processing to turn it into a valid attribute:
int Wikimedia\Parsoid\Core\SectionMetadata::$codepointOffset |
Codepoint offset where the section shows up in wikitext; this is null if this section comes from a template, if it comes from a literal HTML <h_> tag, or otherwise doesn't correspond to a "preprocessor section".
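Because the offset counts code points, it cannot be passed directly to byte-based string functions or to JavaScript string indexing once the wikitext contains non-ASCII characters. The following sketch shows one way to convert it; the helper names and the sample wikitext are hypothetical, and the code assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helpers for converting a codepoint offset (such as
// $codepointOffset) into other offset units. Requires ext-mbstring.

/** Byte offset, suitable for PHP's byte-based substr()/strpos(). */
function codepointToByteOffset( string $wikitext, int $codepointOffset ): int {
    return strlen( mb_substr( $wikitext, 0, $codepointOffset, 'UTF-8' ) );
}

/** UTF-16 code unit offset, as used by JavaScript string indexing. */
function codepointToUtf16Offset( string $wikitext, int $codepointOffset ): int {
    $prefix = mb_substr( $wikitext, 0, $codepointOffset, 'UTF-8' );
    // Every UTF-16 code unit occupies two bytes.
    return intdiv( strlen( mb_convert_encoding( $prefix, 'UTF-16LE', 'UTF-8' ) ), 2 );
}

// Sample text: U+1F600 is one code point, four UTF-8 bytes, and two UTF-16
// code units, so the three kinds of offset diverge after it.
$wikitext = "\u{1F600} intro\n== Section ==\n";
$codepointOffset = mb_strpos( $wikitext, '==', 0, 'UTF-8' );

$byteOffset = codepointToByteOffset( $wikitext, $codepointOffset );
$utf16Offset = codepointToUtf16Offset( $wikitext, $codepointOffset );
```

For this sample the heading starts at code point 8, byte 11, and UTF-16 code unit 9.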
string Wikimedia\Parsoid\Core\SectionMetadata::$fromTitle |
The title of the page that generated this heading.
For template-generated sections, this will be the template title. This string is in "prefixed DB key" format.
int Wikimedia\Parsoid\Core\SectionMetadata::$hLevel |
The heading tag level: a 1 here means an <h1> tag.
string Wikimedia\Parsoid\Core\SectionMetadata::$line |
HTML heading of the section.
Only a narrow set of HTML tags are allowed here.
This starts with the parsed headline seen in wikitext and
string Wikimedia\Parsoid\Core\SectionMetadata::$linkAnchor |
Anchor URL fragment.
This is very similar to the $anchor property, but is URL-escaped so that it can be used in constructing a URL fragment link. You should almost always prepend a # symbol to linkAnchor if you are using it correctly. You are still responsible for HTML-escaping the resulting URL if you are emitting this as an HTML attribute.
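The division of labour between $anchor and $linkAnchor when writing HTML as literal strings can be sketched as follows. The two sample values are made-up placeholders, not real Parsoid output, and the actual URL-escaping applied to linkAnchor may differ from what is shown here.

```php
<?php
// Placeholder values for illustration only. Real values come from a
// SectionMetadata object and may be escaped differently.
$anchor = 'Tom_&_Jerry';
$linkAnchor = 'Tom_&_Jerry';

// $anchor is the raw ID: entity-escape it when writing an id attribute,
// but do not URL-escape it.
$heading = '<h2 id="' . htmlspecialchars( $anchor, ENT_QUOTES, 'UTF-8' ) . '">Tom &amp; Jerry</h2>';

// $linkAnchor is already URL-escaped: prepend '#', then entity-escape the
// whole URL because it is emitted inside an HTML attribute.
$href = htmlspecialchars( '#' . $linkAnchor, ENT_QUOTES, 'UTF-8' );
$link = '<a href="' . $href . '">Tom &amp; Jerry</a>';
```

When setting the attribute through a DOM API instead, such as Element.setAttribute('id', ...), the entity-escaping step is unnecessary because the DOM serializer handles it.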
string Wikimedia\Parsoid\Core\SectionMetadata::$number |
TOC number string (3.1.3, 4.5.2, etc.)
int Wikimedia\Parsoid\Core\SectionMetadata::$tocLevel |
The one-indexed TOC level, which is also the nesting level.
For example, if a page's headings are H2, H4, H6, then those heading levels 2, 4, 6 correspond to TOC levels 1, 2, 3.
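The relationship between heading levels and TOC levels can be illustrated with a small stack-based sketch. The function name is hypothetical and this is not the code used by Parser.php or Parsoid; it reproduces the H2-H4-H6 example but may differ from the real implementation on unusual heading sequences.

```php
<?php
/**
 * Map heading levels (in document order) to one-indexed TOC levels.
 * Hypothetical helper for illustration only.
 *
 * @param int[] $hLevels Heading tag levels, e.g. [ 2, 4, 6 ]
 * @return int[] TOC nesting level for each heading
 */
function tocLevelsFromHeadingLevels( array $hLevels ): array {
    $open = [];      // heading levels of the currently open ancestor sections
    $tocLevels = [];
    foreach ( $hLevels as $hLevel ) {
        // Close every open section that is not shallower than this heading.
        while ( $open && end( $open ) >= $hLevel ) {
            array_pop( $open );
        }
        $open[] = $hLevel;
        // The nesting depth is the number of sections still open.
        $tocLevels[] = count( $open );
    }
    return $tocLevels;
}

echo implode( ',', tocLevelsFromHeadingLevels( [ 2, 4, 6 ] ) ), "\n"; // prints 1,2,3
```

Skipped heading levels do not create empty TOC levels: the TOC level depends only on how many enclosing sections are open, not on the numeric gap between heading tags.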