These utilities are for processing content that's generated by parsing source input (e.g. wikitext).
static | toXML (Node $node, array $options=[]) | XML serializer. |
static | ppToXML (Node $node, array $options=[]) | Data-object-aware XML serializer, to be used in the DOM post-processing phase. |
static | createDocument (string $html='', bool $validateXMLNames=false) | XXX: Don't use this outside of testing. |
static | createAndLoadDocument (string $html, array $options=[]) | XXX: Don't use this outside of testing. |
static | createAndLoadDocumentFragment (Document $doc, string $html, array $options=[]) | |
static | extractDpAndSerialize (Node $node, array $options=[]) | Pull the data-parsoid script element out of the doc before serializing. |
static | stripUnnecessaryWrappersAndSyntheticNodes (Element $node) | Strip Parsoid-inserted section wrappers, annotation wrappers, and synthetic nodes (fallback id spans with HTML4 ids for headings, auto-generated TOC metas, and possibly others in the future) from the DOM. |
static | processAttributeEmbeddedHTML (ParsoidExtensionAPI $extAPI, Element $elt, Closure $proc) | Extensions might be interested in examining their content embedded in data-mw attributes that don't otherwise show up in the DOM. |
static | shiftDSR (Env $env, Node $rootNode, callable $dsrFunc, ParsoidExtensionAPI $extAPI) | Shift the DOM Source Range (DSR) of a DOM fragment. |
static | convertOffsets (Env $env, Document $doc, string $from, string $to) | Convert DSR offsets in a Document between UTF-8 byte, code point, and UCS-2 (16-bit code unit) indices. |
static | dumpDOM (Node $rootNode, string $title='', array $options=[]) | Dump the DOM with attributes. |
◆ convertOffsets()
static Wikimedia\Parsoid\Utils\ContentUtils::convertOffsets (Env $env, Document $doc, string $from, string $to)
Convert DSR offsets in a Document between UTF-8 byte, code point, and UCS-2 (16-bit code unit) indices.
Offset types are:
- 'byte': Bytes (UTF-8 encoding), e.g. PHP substr() or strlen().
- 'char': Unicode code points (encoding irrelevant), e.g. PHP mb_substr() or mb_strlen().
- 'ucs2': 16-bit code units (UTF-16 encoding), e.g. JavaScript .substring() or .length.
- See also
- TokenUtils::convertTokenOffsets for a related function on tokens.
- Parameters
-
Env | $env | |
Document | $doc | The document to convert |
string | $from | Offset type to convert from. |
string | $to | Offset type to convert to. |
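The three offset types can disagree about the length of the same string, which is why conversion is needed at all. The sketch below is illustrative only: `offsetLengths()` is a hypothetical helper, not part of ContentUtils, and it assumes the mbstring extension is available. It measures one string in each of the units listed above.

```php
<?php
// Hypothetical helper (not part of ContentUtils): measure a UTF-8 string
// in each of the three offset units described above.
function offsetLengths( string $s ): array {
	return [
		// 'byte': UTF-8 bytes, as counted by strlen()
		'byte' => strlen( $s ),
		// 'char': Unicode code points, as counted by mb_strlen()
		'char' => mb_strlen( $s, 'UTF-8' ),
		// 'ucs2': 16-bit code units; UTF-16LE output has no BOM,
		// so the byte count divided by two is the code unit count
		'ucs2' => intdiv( strlen( mb_convert_encoding( $s, 'UTF-16LE', 'UTF-8' ) ), 2 ),
	];
}

// "a" (1 byte), "é" (2 bytes), and U+1F600 (4 bytes, a surrogate pair in UTF-16)
$lengths = offsetLengths( "a\u{00E9}\u{1F600}" );
print_r( $lengths );
```

For this input the byte, code point, and 16-bit code unit counts are 7, 3, and 4 respectively, so an offset that points past the emoji has a different numeric value in each system.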
◆ createAndLoadDocument()
static Wikimedia\Parsoid\Utils\ContentUtils::createAndLoadDocument (string $html, array $options = [])
XXX: Don't use this outside of testing.
It shouldn't be necessary to create new documents when parsing or serializing. The environment already holds a document, and that document can be used to create fragments. The bag added as a dynamic property to the PHP wrapper around the libxml doc is at risk of being garbage-collected.
- Parameters
-
string | $html | |
array | $options | |
- Returns
- Document
◆ createAndLoadDocumentFragment()
static Wikimedia\Parsoid\Utils\ContentUtils::createAndLoadDocumentFragment (Document $doc, string $html, array $options = [])
- Parameters
-
Document | $doc | |
string | $html | |
array | $options | |
- Returns
- DocumentFragment
◆ createDocument()
static Wikimedia\Parsoid\Utils\ContentUtils::createDocument (string $html = '', bool $validateXMLNames = false)
XXX: Don't use this outside of testing.
It shouldn't be necessary to create new documents when parsing or serializing. The environment already holds a document, and that document can be used to create fragments. The bag added as a dynamic property to the PHP wrapper around the libxml doc is at risk of being garbage-collected.
- Parameters
-
string | $html | |
bool | $validateXMLNames | |
- Returns
- Document
◆ dumpDOM()
static Wikimedia\Parsoid\Utils\ContentUtils::dumpDOM (Node $rootNode, string $title = '', array $options = [])
Dump the DOM with attributes.
- Parameters
-
Node | $rootNode | |
string | $title | |
array | $options | Associative array of options:
- dumpFragmentMap: Dump the fragment map from env
- quiet: Suppress separators
storeDataAttribs options:
- discardDataParsoid
- keepTmp
- storeInPageBundle
- storeDiffMark
- env
- idIndex
XMLSerializer options:
- smartQuote
- innerXML
- captureOffsets
- addDoctype |
- Returns
- string The dump result
◆ extractDpAndSerialize()
static Wikimedia\Parsoid\Utils\ContentUtils::extractDpAndSerialize (Node $node, array $options = [])
Pull the data-parsoid script element out of the doc before serializing.
- Parameters
-
Node | $node | |
array | $options | XMLSerializer options. |
- Returns
- array
◆ ppToXML()
static Wikimedia\Parsoid\Utils\ContentUtils::ppToXML (Node $node, array $options = [])
Data-object-aware XML serializer, to be used in the DOM post-processing phase.
- Parameters
-
Node | $node | |
array | $options | |
- Returns
- string
◆ processAttributeEmbeddedHTML()
static Wikimedia\Parsoid\Utils\ContentUtils::processAttributeEmbeddedHTML (ParsoidExtensionAPI $extAPI, Element $elt, Closure $proc)
Extensions might be interested in examining their content embedded in data-mw attributes that don't otherwise show up in the DOM.
Examples: inline media captions that aren't rendered, language variant markup, and attributes that are transcluded. More scenarios might be added later.
- Parameters
-
ParsoidExtensionAPI | $extAPI | |
Element | $elt | The node whose data attributes need to be examined |
Closure | $proc | The processor that will process the embedded HTML. Signature: (string) -> string. The processor is given the HTML string as input and is expected to return a possibly modified string. |
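As a rough illustration of the documented `(string) -> string` signature, the closure below rewrites `<b>` tags to `<strong>`. The transformation itself is an arbitrary example chosen for this sketch, and the `$extAPI` and `$elt` variables named in the comment are assumed to come from the calling extension.

```php
<?php
// A processor with the documented signature: (string) -> string.
// The rewrite it performs (<b> to <strong>) is purely illustrative.
$proc = static function ( string $html ): string {
	return str_replace( [ '<b>', '</b>' ], [ '<strong>', '</strong>' ], $html );
};

// Inside an extension, where $extAPI and $elt are available, the closure
// would be passed along as the third argument:
// ContentUtils::processAttributeEmbeddedHTML( $extAPI, $elt, $proc );

// Standalone check of the closure on its own:
echo $proc( 'A <b>caption</b>' ), "\n";
```

Returning the input unchanged is also valid for a processor that only needs to inspect the embedded HTML.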
◆ shiftDSR()
static Wikimedia\Parsoid\Utils\ContentUtils::shiftDSR (Env $env, Node $rootNode, callable $dsrFunc, ParsoidExtensionAPI $extAPI)
Shift the DOM Source Range (DSR) of a DOM fragment.
- Parameters
-
Env | $env | |
Node | $rootNode | |
callable | $dsrFunc | |
ParsoidExtensionAPI | $extAPI | |
- Returns
- Node Returns the $rootNode passed in to allow chaining.
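The general idea of shifting source ranges with a callable can be sketched without Parsoid. Everything in the block is a stand-in: this reference does not describe how a DSR is represented or what arguments `$dsrFunc` receives, so a range is modelled as a plain `[start, end]` pair, and `shiftRanges()` is a hypothetical helper.

```php
<?php
// Hypothetical stand-in: a DSR is modelled here as a plain [start, end] pair.
// Parsoid's own DSR representation is not described in this reference.
function shiftRanges( array $ranges, callable $dsrFunc ): array {
	// Apply the callable to every range and collect the results.
	return array_map( $dsrFunc, $ranges );
}

// A callable that moves each range 10 units to the right, for example
// because 10 units of source text were inserted in front of the fragment.
$shiftBy10 = static function ( array $dsr ): array {
	return [ $dsr[0] + 10, $dsr[1] + 10 ];
};

$shifted = shiftRanges( [ [ 0, 5 ], [ 5, 12 ] ], $shiftBy10 );
echo json_encode( $shifted ), "\n";
```

Passing the shift as a callable keeps the traversal separate from the arithmetic, so the same walk can apply any adjustment.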
◆ stripUnnecessaryWrappersAndSyntheticNodes()
static Wikimedia\Parsoid\Utils\ContentUtils::stripUnnecessaryWrappersAndSyntheticNodes (Element $node)
Strip Parsoid-inserted section wrappers, annotation wrappers, and synthetic nodes (fallback id spans with HTML4 ids for headings, auto-generated TOC metas, and possibly others in the future) from the DOM.
- Parameters
-
Element | $node | |
◆ toXML()
static Wikimedia\Parsoid\Utils\ContentUtils::toXML (Node $node, array $options = [])
XML Serializer.
- Parameters
-
Node | $node | |
array | $options | XMLSerializer options. |
- Returns
- string
The documentation for this class was generated from the following file:
- src/Utils/ContentUtils.php