Parsoid
A bidirectional parser between wikitext and HTML5
Parsoid\Utils\ContentUtils Class Reference

These utilities process content generated by parsing source input (e.g. wikitext).

Static Public Member Functions

static toXML (DOMNode $node, array $options=[])
 XML serializer.

static ppToXML (DOMNode $node, array $options=[])
 Dataobject-aware XML serializer, for use in the DOM post-processing phase.

static ppToDOM (Env $env, string $html, array $options=[])
 Dataobject-aware HTML parser, for use in the DOM post-processing phase.

static extractDpAndSerialize (DOMNode $node, array $options=[])
 Pull the data-parsoid script element out of the doc before serializing.

static stripSectionTagsAndFallbackIds (DOMElement $node)
 Strip Parsoid-inserted section wrappers and fallback id spans with HTML4 ids for headings from the DOM.

static shiftDSR (Env $env, DOMNode $rootNode, callable $dsrFunc)
 Shift the DSR of a DOM fragment.

static convertOffsets (Env $env, DOMDocument $doc, string $from, string $to)
 Convert DSR offsets in a Document between utf-8/ucs2/codepoint indices.

static dumpDOM (DOMNode $rootNode, string $title, array &$options=[])
 Dump the DOM with attributes.
 

Detailed Description

These utilities process content generated by parsing source input (e.g. wikitext).

Member Function Documentation

◆ convertOffsets()

static Parsoid\Utils\ContentUtils::convertOffsets(Env $env, DOMDocument $doc, string $from, string $to)

Convert DSR (DOM source range) offsets in a Document between utf-8/ucs2/codepoint indices.

Offset types are:

  • 'byte': Bytes (UTF-8 encoding), e.g. PHP substr() or strlen().
  • 'char': Unicode code points (encoding irrelevant), e.g. PHP mb_substr() or mb_strlen().
  • 'ucs2': 16-bit code units (UTF-16 encoding), e.g. JavaScript .substring() or .length.
See also
TokenUtils::convertTokenOffsets for a related function on tokens.
Parameters
  • Env $env
  • DOMDocument $doc: The document to convert.
  • string $from: Offset type to convert from.
  • string $to: Offset type to convert to.
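
The three offset types listed above give different numbers for the same position as soon as the text contains non-ASCII characters. The following is a minimal sketch of that difference using only PHP built-ins (it needs the mbstring extension). The helpers measureOffsets() and convertByteOffset() are hypothetical names introduced for illustration; they are not part of Parsoid and do not show how convertOffsets() is implemented.

```php
<?php
// Hypothetical helper, not part of Parsoid: measures a UTF-8 string
// in each of the three offset units ('byte', 'char', 'ucs2').
function measureOffsets(string $text): array {
    return [
        // Bytes of the UTF-8 encoding.
        'byte' => strlen($text),
        // Unicode code points.
        'char' => mb_strlen($text, 'UTF-8'),
        // UTF-16 code units; each unit occupies two bytes.
        'ucs2' => intdiv(strlen(mb_convert_encoding($text, 'UTF-16LE', 'UTF-8')), 2),
    ];
}

// Hypothetical helper: converts a byte offset into $text to another unit
// by measuring the prefix that precedes the offset. Assumes $byteOffset
// falls on a character boundary.
function convertByteOffset(string $text, int $byteOffset, string $to): int {
    $prefix = substr($text, 0, $byteOffset);
    return measureOffsets($prefix)[$to];
}

// "a", U+00E9 (2 bytes in UTF-8), U+1F600 (4 bytes, a surrogate pair in UTF-16), "b".
$text = "a\u{E9}\u{1F600}b";

print_r(measureOffsets($text));                   // byte => 8, char => 4, ucs2 => 5
echo convertByteOffset($text, 7, 'char'), "\n";   // 3
echo convertByteOffset($text, 7, 'ucs2'), "\n";   // 4
```

The final "b" sits at byte offset 7, code point offset 3, and UTF-16 code unit offset 4, which is why offsets recorded in one unit have to be converted before they are used with string functions that count in another.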

◆ dumpDOM()

static Parsoid\Utils\ContentUtils::dumpDOM(DOMNode $rootNode, string $title, array &$options = [])

Dump the DOM with attributes.

Parameters
  • DOMNode $rootNode
  • string $title
  • array &$options

◆ extractDpAndSerialize()

static Parsoid\Utils\ContentUtils::extractDpAndSerialize(DOMNode $node, array $options = [])

Pull the data-parsoid script element out of the doc before serializing.

Parameters
  • DOMNode $node
  • array $options: XMLSerializer options.
Returns
array

◆ ppToDOM()

static Parsoid\Utils\ContentUtils::ppToDOM(Env $env, string $html, array $options = [])

Dataobject-aware HTML parser, for use in the DOM post-processing phase.

Parameters
  • Env $env
  • string $html
  • array|null $options
Returns
DOMElement

◆ ppToXML()

static Parsoid\Utils\ContentUtils::ppToXML(DOMNode $node, array $options = [])

Dataobject-aware XML serializer, for use in the DOM post-processing phase.

Parameters
  • DOMNode $node
  • array $options
Returns
string

◆ shiftDSR()

static Parsoid\Utils\ContentUtils::shiftDSR(Env $env, DOMNode $rootNode, callable $dsrFunc)

Shift the DSR of a DOM fragment.

Parameters
  • Env $env
  • DOMNode $rootNode
  • callable $dsrFunc
Returns
DOMNode: The $rootNode passed in, to allow chaining.
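
To illustrate the general idea of shifting source offsets over a DOM fragment with a callback, here is a minimal sketch that does not call Parsoid at all. It assumes, purely for illustration, that each element carries a data-parsoid attribute holding JSON with a two-element dsr array of [start, end]; real DSR values may carry more fields, and the value actually passed to $dsrFunc by shiftDSR() may differ. The function name shiftDsrSketch() is hypothetical.

```php
<?php
// Hypothetical sketch, not Parsoid's implementation: walks a DOM subtree and
// applies $dsrFunc to the "dsr" array stored in each element's data-parsoid
// attribute (assumed to be JSON for this example).
function shiftDsrSketch(DOMNode $rootNode, callable $dsrFunc): DOMNode {
    if ($rootNode instanceof DOMElement && $rootNode->hasAttribute('data-parsoid')) {
        $dp = json_decode($rootNode->getAttribute('data-parsoid'), true);
        if (is_array($dp) && isset($dp['dsr'])) {
            $dp['dsr'] = $dsrFunc($dp['dsr']);
            $rootNode->setAttribute('data-parsoid', json_encode($dp));
        }
    }
    // Recurse into children; firstChild is null for leaf nodes such as text.
    for ($child = $rootNode->firstChild; $child !== null; $child = $child->nextSibling) {
        shiftDsrSketch($child, $dsrFunc);
    }
    // Return the root so that calls can be chained.
    return $rootNode;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<p data-parsoid=\'{"dsr":[0,10]}\'><b data-parsoid=\'{"dsr":[2,7]}\'>x</b></p>'
);

// Shift every offset five units to the right, as if five units of source
// text had been inserted before the fragment.
shiftDsrSketch($doc->documentElement, function (array $dsr): array {
    return array_map(fn ($offset) => $offset + 5, $dsr);
});

echo $doc->saveXML($doc->documentElement), "\n";
```

Passing the shift as a callable keeps the tree walk generic: the same traversal can apply a constant shift, a unit conversion, or any other mapping of offsets.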

◆ stripSectionTagsAndFallbackIds()

static Parsoid\Utils\ContentUtils::stripSectionTagsAndFallbackIds(DOMElement $node)

Strip Parsoid-inserted section wrappers and fallback id spans with HTML4 ids for headings from the DOM.

Parameters
  • DOMElement $node

◆ toXML()

static Parsoid\Utils\ContentUtils::toXML(DOMNode $node, array $options = [])

XML serializer.

Parameters
  • DOMNode $node
  • array $options: XMLSerializer options.
Returns
string

The documentation for this class was generated from the following file: