These utilities pertain to querying, extracting, and modifying wikitext information from the DOM.
|
static | hasLiteralHTMLMarker (DataParsoid $dp) |
| Check whether a node's data-parsoid object includes an indicator that the original wikitext was a literal HTML element (like table or p)
|
|
static | isLiteralHTMLNode (?Node $node) |
| Run a node through hasLiteralHTMLMarker.
|
|
static | isZeroWidthWikitextElt (Node $node) |
|
static | isBlockNodeWithVisibleWT (Node $node) |
| Is $node a block node that is also visible in wikitext? Examples of invisible block nodes are a <p> tag that Parsoid generated, or a <ul> or <ol> tag.
|
|
static | isATagFromWikiLinkSyntax (Element $node) |
| Helper function to detect when an A-node uses [[..]]/[..]/... style syntax (for wikilinks, ext links, url links).
|
|
static | isATagFromExtLinkSyntax (Element $node) |
| Helper function to detect when an A-node uses ext-link syntax.
|
|
static | isATagFromURLLinkSyntax (Element $node) |
| Helper function to detect when an A-node uses url-link syntax.
|
|
static | isATagFromMagicLinkSyntax (Element $node) |
| Helper function to detect when an A-node uses magic-link syntax.
|
|
static | matchTplType (Element $node) |
| Check whether a node's typeof indicates that it is a template expansion.
|
|
static | hasExpandedAttrsType (Element $node) |
| Check whether a typeof indicates that it signifies an expanded attribute.
|
|
static | isTplMarkerMeta (Node $node) |
| Check whether a node is a meta tag that signifies a template expansion.
|
|
static | isTplStartMarkerMeta (Node $node) |
| Check whether a node is a meta signifying the start of a template expansion.
|
|
static | isTplEndMarkerMeta (Node $node) |
| Check whether a node is a meta signifying the end of a template expansion.
|
|
static | isNewElt (Node $node) |
| This tests whether a DOM node is a new node added during an edit session or an existing node from parsed wikitext.
|
|
static | isIndentPre (Node $node) |
| Check whether a pre is caused by indentation in the original wikitext.
|
|
static | isInlineMedia (Node $node) |
|
static | isGeneratedFigure (Node $node) |
|
static | indentPreDSRCorrection (Node $textNode) |
| Find how much offset is necessary for the DSR of an indent-originated pre tag.
|
|
static | isEncapsulatedDOMForestRoot (Node $node) |
| Check if $node is a root in an encapsulated DOM forest.
|
|
static | isRedirectLink (?Node $node) |
| Does $node represent a redirect link?
|
|
static | isCategoryLink (?Node $node) |
| Does $node represent a category link?
|
|
static | isSolTransparentLink (?Node $node) |
| Does $node represent a link that is sol-transparent?
|
|
static | emitsSolTransparentSingleLineWT (Node $node) |
| Check whether $node emits wikitext that is sol-transparent.
|
|
static | isFallbackIdSpan (Node $node) |
| This is the span added to headings to add fallback ids for when legacy and HTML5 ids don't match up.
|
|
static | isRenderingTransparentNode (Node $node) |
| These are primarily 'metadata'-like nodes that don't show up in output rendering.
|
|
static | inHTMLTableTag (Node $node) |
| Is $node nested inside a table tag that uses HTML instead of native wikitext?
|
|
static | isFirstEncapsulationWrapperNode (Node $node) |
| Is $node the first wrapper element of encapsulated content?
|
|
static | isFirstExtensionWrapperNode (Node $node) |
| Is $node the first wrapper element of extension content?
|
|
static | isExtensionOutputtingCoreMwDomSpec (Node $node, Env $env) |
| Checks whether a first encapsulation wrapper node is encapsulating an extension that outputs MediaWiki Core DOM Spec HTML (https://www.mediawiki.org/wiki/Specs/HTML)
|
|
static | isEncapsulationWrapper (Node $node) |
| Is $node an encapsulation wrapper elt?
|
|
static | isDOMFragmentWrapper (Node $node) |
| Is $node a DOMFragment wrapper?
|
|
static | isSealedFragmentOfType (Node $node, string $type) |
| Is $node a sealed DOMFragment of a specific type?
|
|
static | isParsoidSectionTag (Node $node) |
| Is $node a Parsoid-generated <section> tag?
|
|
static | fromExtensionContent (Node $node, ?string $extType=null) |
| Is the $node from extension content?
|
|
static | fromEncapsulatedContent (Node $node) |
| Is $node from encapsulated (template, extension, etc.) content?
|
|
static | getWTSource (Frame $frame, Element $node) |
| Compute, when possible, the wikitext source for $node within the given $frame.
|
|
static | getAboutSiblings (Node $node, ?string $about) |
| Gets all siblings that follow $node and have $about as their about id.
|
|
static | skipOverEncapsulatedContent (Node $node) |
| This function is only intended to be used on encapsulated nodes (Template/Extension/Param content).
|
|
static | encodeComment (string $comment) |
| Comment encoding/decoding.
|
|
static | decodeComment (string $comment) |
| Map an HTML DOM-escaped comment to a wikitext-escaped comment.
|
|
static | decodedCommentLength ($node) |
| Utility function: we often need to know the wikitext DSR length for an HTML DOM comment value.
|
|
static | getExtTagName (Node $node) |
|
static | getNativeExt (Env $env, Node $node) |
|
static | isIncludeTag (string $name) |
| Is this an include directive?
|
|
static | isAnnotationTag (Env $env, string $name) |
|
static | isAnnOrExtTag (Env $env, string $name) |
| Check if tag is an annotation or extension directive. Adapted from a similar grammar function.
|
|
static | createEmptyLocalizationFragment (Document $doc) |
| Creates a DocumentFragment containing a single span with type "mw:I18n".
|
|
static | createPageContentI18nFragment (Document $doc, string $key, ?array $params=null) |
| Creates an internationalization (i18n) message that will be localized into the page content language.
|
|
static | createInterfaceI18nFragment (Document $doc, string $key, ?array $params=null) |
| Creates an internationalization (i18n) message that will be localized into the user interface language.
|
|
static | createLangI18nFragment (Document $doc, Bcp47Code $lang, string $key, ?array $params=null) |
| Creates an internationalization (i18n) message that will be localized into an arbitrary language.
|
|
static | addPageContentI18nAttribute (Element $element, string $name, string $key, ?array $params=null) |
| Adds to $element the internationalization information needed for the attribute $name to be localized in a later pass into the page content language.
|
|
static | addInterfaceI18nAttribute (Element $element, string $name, string $key, ?array $params=null) |
| Adds to $element the internationalization information needed for the attribute $name to be localized in a later pass into the user interface language.
|
|
static | addLangI18nAttribute (Element $element, Bcp47Code $lang, string $name, string $key, ?array $params=null) |
| Adds to $element the internationalization information needed for the attribute $name to be localized in a later pass into the provided language.
|
|
static | matchAnnotationMeta (Node $node) |
| Check whether a node is an annotation meta; if yes, returns its type.
|
|
static | extractAnnotationType (Node $node, bool &$isStart=false) |
| Extract the annotation type, excluding potential "/End" suffix; returns null if not a valid annotation meta.
|
|
static | isAnnotationStartMarkerMeta (Node $node) |
| Check whether a node is a meta signifying the start of an annotated part of the DOM.
|
|
static | isAnnotationEndMarkerMeta (Node $node) |
| Check whether a node is a meta signifying the end of an annotated part of the DOM.
|
|
static | isMovedMetaTag (Node $node) |
| Check whether the meta tag was moved from its initial position.
|
|
static | isMarkerAnnotation (?Node $n) |
| Returns true if a node is a (start or end) annotation meta tag.
|
|
static | getMediaFormat (Element $node) |
| Extracts the media format from the attribute string.
|
|
static | hasVisibleCaption (Element $node) |
|
static | textContentFromCaption (Node $node) |
| Ref dom post-processing happens after adding media info, so the linkbacks aren't available in the textContent added to the alt.
|
|
These utilities pertain to querying, extracting, and modifying wikitext information from the DOM.
- Note
- Many of these methods are not safe to use unless the DOM has been loaded and prepared, as they consult DataParsoid from the NodeData.
static Wikimedia\Parsoid\Utils\WTUtils::encodeComment (string $comment)
Comment encoding/decoding.
- Some relevant phab tickets: T94055, T70146, T60184, T95039
The wikitext comment rule is very simple: <!-- starts a comment, and --> ends a comment. This means we can have almost anything as the contents of a comment (except the string "-->", but see below), including several things that are not valid in HTML5 comments:
- For one, the HTML5 comment parsing algorithm [0] leniently accepts --!> as a closing comment tag, which differs from the php+tidy combo.
- If the comment's data matches /^-?>/, HTML5 will end the comment. For example, <!-->stuff<--> breaks up as <!--> (the comment) followed by stuff<--> (as text).
- Finally, comment data shouldn't contain two consecutive hyphen-minus characters (--), nor end in a hyphen-minus character (/-$/), as defined in the spec [1].
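The hazards listed above can be checked mechanically. The following is a minimal Python sketch, not Parsoid code; the function name `html5_comment_hazards` and the wording of the messages are assumptions made for illustration.

```python
import re


def html5_comment_hazards(data):
    """Return the reasons `data` is unsafe as raw HTML5 comment data.

    Hypothetical helper for illustration; an empty list means none of
    the hazards described above applies.
    """
    hazards = []
    if "--!>" in data:
        # The HTML5 parser accepts --!> as a comment close.
        hazards.append("contains --!>")
    if re.match(r"-?>", data):
        # Data matching /^-?>/ ends the comment immediately.
        hazards.append("starts with > or ->")
    if "--" in data:
        hazards.append("contains two consecutive hyphen-minus characters")
    if data.endswith("-"):
        hazards.append("ends in a hyphen-minus character")
    return hazards
```

For instance, `html5_comment_hazards(">stuff<")` reports only the second hazard, while a string such as `a--!>b` trips both the first and the third check.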
We work around all these problems by using HTML entity encoding inside the comment body. The characters -, >, and & must be encoded in order to prevent premature termination of the comment by one of the cases above. Encoding other characters is optional; all entities will be decoded during wikitext serialization.
In order to allow arbitrary content inside a wikitext comment, including the forbidden string "-->", we also do some minimal entity decoding on the wikitext. We are also limited by our inability to encode DSR attributes on the comment node, so our wikitext entity decoding must be 1-to-1: that is, there must be a unique "decoded" string for every wikitext sequence, and for every decoded string there must be a unique wikitext which creates it.
The basic idea here is to replace every string ab*c with the string with one more b in it. This creates a string with no instance of "ac", so you can use "ac" to encode one more code point. In this case a is "--&", b is "amp;", and c is "gt;", and we use ac to encode "-->" (which is otherwise unspeakable in wikitext).
Note that any user content which does not match the regular expression /--(>|&(amp;)*gt;)/ is unchanged in its wikitext representation, as shown in the first two examples below.
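User-authored comment text | Wikitext | HTML5 DOM |
& - > | & - > | &amp; &#45; &gt; |
Use &gt; here | Use &gt; here | Use &amp;gt; here |
--> | --&gt; | &#45;&#45;&gt; |
--&gt; | --&amp;gt; | &#45;&#45;&amp;gt; |
--&amp;gt; | --&amp;amp;gt; | &#45;&#45;&amp;amp;gt; |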
[0] http://www.w3.org/TR/html5/syntax.html#comment-start-state
[1] http://www.w3.org/TR/html5/syntax.html#comments
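The "one more b" scheme described above can be sketched in a few lines of Python. This is an illustration of the idea under stated assumptions, not the Parsoid source: the names `wt_escape` and `wt_unescape` are hypothetical, and "true value" here means the text the comment author intended.

```python
import re


def wt_escape(true_value):
    """Map the intended comment text to wikitext that never contains '-->'."""
    def add_one_b(match):
        s = match.group(0)
        # "-->" is spelled "ac" ("--&gt;"); any "ab*c" gains one more "amp;".
        return "--&gt;" if s == "-->" else "--&amp;" + s[len("--&"):]
    return re.sub(r"--(?:>|&(?:amp;)*gt;)", add_one_b, true_value)


def wt_unescape(wikitext):
    """Inverse of wt_escape: recover the intended comment text."""
    def drop_one_b(match):
        s = match.group(0)
        # "ac" decodes to "-->"; any "ab+c" loses one "amp;".
        return "-->" if s == "--&gt;" else "--&" + s[len("--&amp;"):]
    return re.sub(r"--&(?:amp;)*gt;", drop_one_b, wikitext)
```

Because each match is rewritten to exactly one longer (or shorter) form, the two functions are inverses of each other, which is the 1-to-1 property the text requires. Text that does not match the regular expression passes through unchanged.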
Map a wikitext-escaped comment to an HTML DOM-escaped comment.
- Parameters
-
string | $comment | Wikitext-escaped comment. |
- Returns
- string DOM-escaped comment.
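As a rough sketch of the mapping this method documents, the two steps (undo the wikitext escaping, then entity-encode -, > and &) might look as follows in Python. This is not the Parsoid implementation: the name `encode_comment`, the use of `html.unescape`, and the particular entity spellings in `ENTITIES` are assumptions chosen for illustration, and the real code may emit a different but equivalent entity form.

```python
import html
import re

# Assumed entity spellings; any valid entity for these characters would do.
ENTITIES = {"-": "&#45;", ">": "&gt;", "&": "&amp;"}


def encode_comment(wikitext_comment):
    """Map a wikitext-escaped comment to an HTML DOM-escaped comment."""
    # Step 1: undo the wikitext escaping to recover the intended text.
    # html.unescape decodes one level of entities in each matched run,
    # so "--&gt;" becomes "-->" and "--&amp;gt;" becomes "--&gt;".
    true_value = re.sub(
        r"--&(?:amp;)*gt;",
        lambda m: html.unescape(m.group(0)),
        wikitext_comment,
    )
    # Step 2: encode the characters that could end an HTML5 comment early.
    return re.sub(r"[->&]", lambda m: ENTITIES[m.group(0)], true_value)
```

Under these assumptions, the wikitext `--&gt;` yields a DOM comment body with no literal hyphen, `>` or bare `&`, so none of the HTML5 comment-termination cases can be triggered.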