Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Utils\DOMUtils Class Reference

DOM utilities for querying the DOM.

Static Public Member Functions

static parseHTML (string $html, bool $validateXMLNames=false)
 Parse HTML, return the tree.
 
static visitDOM (Node $node, callable $handler,... $args)
 This is a simplified version of the DOMTraverser.
 
static migrateChildren (Node $from, Node $to, ?Node $beforeNode=null)
 Move 'from'.childNodes to 'to', adding them before 'beforeNode'. If 'beforeNode' is null, the nodes are appended at the end.
 
static childNodes (Node $n)
 Many DOM implementations will de-optimize the representation of a Node if $node->childNodes is accessed, converting the linked list of node children to an array which is then expensive to mutate.
 
static migrateChildrenBetweenDocs (Node $from, Node $to, ?Node $beforeNode=null)
 Copy 'from'.childNodes to 'to', adding them before 'beforeNode'. 'from' and 'to' belong to different documents.
 
static assertElt (?Node $node)
 Assert that this is a DOM element node.
 
static isRemexBlockNode (?Node $node)
 
static isWikitextBlockNode (?Node $node)
 
static isFormattingElt (?Node $node)
 Determine whether this is a formatting DOM element.
 
static isQuoteElt (?Node $node)
 Determine whether this is a quote DOM element.
 
static isBody (?Node $node)
 Determine whether this is the <body> DOM element.
 
static isRemoved (?Node $node)
 Determine whether this node has been removed from the DOM while its Node object still exists.
 
static pathToRoot (Node $node)
 Build path from a node to the root of the document.
 
static nodeDepth (Node $node)
 Compute the edge length of the path from $node to the root.
 
static pathToSibling (Node $node, Node $sibling, bool $left)
 Build path from a node to its passed-in sibling.
 
static inSiblingOrder (Node $n1, Node $n2)
 Check whether a node n1 comes before another node n2 in their parent's children list.
 
static isAncestorOf (Node $n1, Node $n2)
 Check that a node 'n1' is an ancestor of another node 'n2' in the DOM.
 
static findAncestorOfName (Node $node, string $name)
 Find an ancestor of $node with nodeName $name.
 
static hasNameOrHasAncestorOfName (Node $node, string $name)
 Check whether $node has $name or has an ancestor named $name.
 
static matchNameAndTypeOf (Node $n, string $name, string $typeRe)
 Determine whether the node matches the given nodeName and attribute value.
 
static hasNameAndTypeOf (Node $n, string $name, string $type)
 Determine whether the node matches the given nodeName and typeof attribute value; the typeof is given as string.
 
static matchTypeOf (Node $n, string $typeRe)
 Determine whether the node matches the given typeof attribute value.
 
static matchRel (Node $n, string $relRe)
 Determine whether the node matches the given rel attribute value.
 
static hasTypeOf (Node $n, string $type)
 Determine whether the node matches the given typeof attribute value.
 
static hasRel (Node $n, string $rel)
 Determine whether the node matches the given rel attribute value.
 
static hasClass (Element $element, string $regex)
 
static addTypeOf (Element $node, string $type, bool $prepend=false)
 Add a type to the typeof attribute.
 
static addRel (Element $node, string $rel)
 Add a type to the rel attribute.
 
static removeTypeOf (Element $node, string $type)
 Remove a type from the typeof attribute.
 
static removeRel (Element $node, string $rel)
 Remove a type from the rel attribute.
 
static isFosterablePosition (?Node $n)
 Check whether node is in a fosterable position.
 
static isHeading (?Node $n)
 Check whether node is a heading.
 
static isList (?Node $n)
 Check whether node is a list.
 
static isListItem (?Node $n)
 Check whether node is a list item.
 
static isListOrListItem (?Node $n)
 Check whether node is a list or list item.
 
static isNestedInListItem (?Node $n)
 Check whether node is nested in a list item.
 
static isNestedListOrListItem (?Node $n)
 Check whether node is a nested list or a list item.
 
static isMarkerMeta (Node $n, string $type)
 Check a node to see whether it's a meta with some typeof.
 
static hasElementChild (Node $node)
 Check whether a node has any children that are elements.
 
static hasBlockElementDescendant (Node $node)
 Check if a node has a block-level element descendant.
 
static isIEW (?Node $node)
 Does a node represent inter-element whitespace?
 
static isDocumentFragment (?Node $node)
 Is a node a document fragment?
 
static atTheTop (?Node $node)
 Is a node at the top?
 
static allChildrenAreTextOrComments (Node $node)
 Are all children of this node text or comment nodes?
 
static treeHasElement (Node $node, string $tagName, bool $checkRoot=false)
 Check whether the DOM subtree rooted at node has an element with tag name 'tagName'. By default, the root node is not checked.
 
static isTableTag (Node $node)
 Is node a table tag (table, tbody, td, tr, etc.)?
 
static selectMediaElt (Element $node)
 Return a media element nested in node.
 
static findHttpEquivHeaders (Document $doc)
 Extract http-equiv headers from the HTML, including content-language and vary headers, if present.
 
static addHttpEquivHeaders (Document $doc, array $headers)
 Add or replace http-equiv headers in the HTML <head>.
 
static extractInlinedContentVersion (Document $doc)
 
static addAttributes (Element $elt, array $attrs)
 Add attributes to a node element.
 
static appendToHead (Document $document, string $tagName, array $attrs=[])
 Create an element in the document head with the given attrs.
 
static getFragmentInnerHTML (DocumentFragment $frag)
 innerHTML and outerHTML are not defined on DocumentFragment.
 
static setFragmentInnerHTML (DocumentFragment $frag, string $html)
 innerHTML and outerHTML are not defined on DocumentFragment.
 
static parseHTMLToFragment (Document $doc, string $html)
 
static isRawTextElement (Node $node)
 
static hasBlockTag (Node $n)
 Is $n a block tag OR does the subtree rooted at $n have a block tag in it?
 
static attributes (Element $element)
 
static isMetaDataTag (Element $node)
 
static stripPWrapper (string $ret)
 Strip a paragraph wrapper, if any, before parsing HTML to DOM.
 

Detailed Description

DOM utilities for querying the DOM.

This is largely independent of Parsoid although some Parsoid details (TokenUtils, inline content version) have snuck in.

Member Function Documentation

◆ addAttributes()

static Wikimedia\Parsoid\Utils\DOMUtils::addAttributes ( Element $elt,
array $attrs )
static

Add attributes to a node element.

Parameters
Element $elt: element
array $attrs: attributes

◆ addHttpEquivHeaders()

static Wikimedia\Parsoid\Utils\DOMUtils::addHttpEquivHeaders ( Document $doc,
array $headers )
static

Add or replace http-equiv headers in the HTML <head>.

This is used for content-language and vary headers, among possible others.

Parameters
Document $doc: The HTML document to update
array<string,string|string[]> $headers: An array mapping HTTP header names (which are case-insensitive) to new values. If an array of values is provided, they will be joined with commas.

◆ addRel()

static Wikimedia\Parsoid\Utils\DOMUtils::addRel ( Element $node,
string $rel )
static

Add a type to the rel attribute.

This method should almost always be used instead of setAttribute, to ensure we don't overwrite existing rel information.

◆ addTypeOf()

static Wikimedia\Parsoid\Utils\DOMUtils::addTypeOf ( Element $node,
string $type,
bool $prepend = false )
static

Add a type to the typeof attribute.

This method should almost always be used instead of setAttribute, to ensure we don't overwrite existing typeof information.

Parameters
Element $node: node
string $type: type
bool $prepend: If true, adds value to start, rather than end. Use of this option in new code is discouraged.
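
The reason to prefer this over setAttribute is that typeof holds a space-separated list of types. The following is a minimal sketch of that idea using PHP's built-in DOM extension, not Parsoid's actual implementation. The function name addTypeOfToken, the de-duplication step, and the example type values are assumptions made for illustration.

```php
<?php
// Sketch only: illustrates appending to a space-separated attribute.
// addTypeOfToken() is a hypothetical name, not part of Parsoid.
function addTypeOfToken(DOMElement $node, string $type, bool $prepend = false): void {
    $current = trim($node->getAttribute('typeof'));
    if ($current === '') {
        // No existing value, so there is nothing to preserve.
        $node->setAttribute('typeof', $type);
        return;
    }
    $types = preg_split('/\s+/', $current);
    if (in_array($type, $types, true)) {
        return; // Assumption: an already-present type is not added twice.
    }
    if ($prepend) {
        array_unshift($types, $type);
    } else {
        $types[] = $type;
    }
    $node->setAttribute('typeof', implode(' ', $types));
}

$doc = new DOMDocument();
$span = $doc->createElement('span');
$span->setAttribute('typeof', 'mw:Transclusion'); // illustrative value
addTypeOfToken($span, 'mw:Image');                // illustrative value
echo $span->getAttribute('typeof'), "\n"; // mw:Transclusion mw:Image
```

A plain setAttribute('typeof', 'mw:Image') call would have discarded the first value, which is the failure mode the note above warns about.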

◆ appendToHead()

static Wikimedia\Parsoid\Utils\DOMUtils::appendToHead ( Document $document,
string $tagName,
array $attrs = [] )
static

Create an element in the document head with the given attrs.

Creates the head element in the document if needed.

Parameters
Document $document
string $tagName
array $attrs
Returns
Element The newly-appended Element

◆ assertElt()

static Wikimedia\Parsoid\Utils\DOMUtils::assertElt ( ?Node $node)
static

Assert that this is a DOM element node.

This is primarily to help phan analyze variable types.

@phan-assert Element $node

Parameters
?Node $node
Returns
true Always returns true

◆ attributes()

static Wikimedia\Parsoid\Utils\DOMUtils::attributes ( Element $element)
static
See also
DOMCompat::attributes()
Deprecated
Use DOMCompat::attributes

◆ childNodes()

static Wikimedia\Parsoid\Utils\DOMUtils::childNodes ( Node $n)
static

Many DOM implementations will de-optimize the representation of a Node if $node->childNodes is accessed, converting the linked list of node children to an array which is then expensive to mutate.

This method returns an array of child nodes, but uses the ->firstChild/->nextSibling accessors to obtain it, avoiding deoptimization. This is also robust against concurrent mutation.

Parameters
Node $n
Returns
list<Node> the child nodes
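
The sibling-walking approach described above can be sketched with PHP's built-in DOM extension. This is an illustration of the technique rather than Parsoid's code; snapshotChildren is a hypothetical name, and DOMNode stands in for Parsoid's Node type.

```php
<?php
// Sketch only: build an array of children via firstChild/nextSibling
// instead of touching $node->childNodes.
// snapshotChildren() is a hypothetical name, not part of Parsoid.
function snapshotChildren(DOMNode $n): array {
    $children = [];
    for ($c = $n->firstChild; $c !== null; $c = $c->nextSibling) {
        $children[] = $c;
    }
    return $children;
}

$doc = new DOMDocument();
$doc->loadXML('<ul><li>a</li><li>b</li><li>c</li></ul>');
$ul = $doc->documentElement;

// The snapshot is a plain array, so removing nodes inside the loop
// does not disturb the iteration.
$snapshot = snapshotChildren($ul);
foreach ($snapshot as $li) {
    $ul->removeChild($li);
}
echo count($snapshot), ' ', $ul->childNodes->length, "\n"; // 3 0
```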

◆ findHttpEquivHeaders()

static Wikimedia\Parsoid\Utils\DOMUtils::findHttpEquivHeaders ( Document $doc)
static

Extract http-equiv headers from the HTML, including content-language and vary headers, if present.

Parameters
Document $doc
Returns
array<string,string>

◆ getFragmentInnerHTML()

static Wikimedia\Parsoid\Utils\DOMUtils::getFragmentInnerHTML ( DocumentFragment $frag)
static

innerHTML and outerHTML are not defined on DocumentFragment.

Defined similarly to DOMCompat::getInnerHTML()

◆ hasClass()

static Wikimedia\Parsoid\Utils\DOMUtils::hasClass ( Element $element,
string $regex )
static
Parameters
Element $element
string $regex: Partial regular expression, e.g. "foo|bar"
Returns
bool

◆ hasNameAndTypeOf()

static Wikimedia\Parsoid\Utils\DOMUtils::hasNameAndTypeOf ( Node $n,
string $name,
string $type )
static

Determine whether the node matches the given nodeName and typeof attribute value; the typeof is given as string.

Parameters
Node $n
string $name: node name to test for
string $type: Expected value of "typeof" attribute (literal string)
Returns
bool True if the node matches.

◆ hasRel()

static Wikimedia\Parsoid\Utils\DOMUtils::hasRel ( Node $n,
string $rel )
static

Determine whether the node matches the given rel attribute value.

Parameters
Node $n
string $rel: Expected value of "rel" attribute, as a literal string.
Returns
bool True if the node matches.

◆ hasTypeOf()

static Wikimedia\Parsoid\Utils\DOMUtils::hasTypeOf ( Node $n,
string $type )
static

Determine whether the node matches the given typeof attribute value.

Parameters
Node $n
string $type: Expected value of "typeof" attribute, as a literal string.
Returns
bool True if the node matches.

◆ inSiblingOrder()

static Wikimedia\Parsoid\Utils\DOMUtils::inSiblingOrder ( Node $n1,
Node $n2 )
static

Check whether a node n1 comes before another node n2 in their parent's children list.

Parameters
Node $n1: The node you expect to come first.
Node $n2: Expected later sibling.
Returns
bool

◆ isAncestorOf()

static Wikimedia\Parsoid\Utils\DOMUtils::isAncestorOf ( Node $n1,
Node $n2 )
static

Check that a node 'n1' is an ancestor of another node 'n2' in the DOM.

Returns true if n1 === n2.

Parameters
Node $n1: the suspected ancestor.
Node $n2: the suspected descendant.
Returns
bool
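
The inclusive behavior (a node counts as its own ancestor) falls out naturally from walking up the parent chain starting at n2 itself. A minimal sketch, assuming PHP's built-in DOM extension and a hypothetical function name isSelfOrAncestor; it is not Parsoid's implementation.

```php
<?php
// Sketch only: inclusive ancestor check by walking parentNode links.
// isSelfOrAncestor() is a hypothetical name, not part of Parsoid.
function isSelfOrAncestor(DOMNode $n1, DOMNode $n2): bool {
    // Start at $n2 itself, so $n1 === $n2 yields true.
    for ($n = $n2; $n !== null; $n = $n->parentNode) {
        if ($n->isSameNode($n1)) {
            return true;
        }
    }
    return false;
}

$doc = new DOMDocument();
$doc->loadXML('<div><p><b>x</b></p><i>y</i></div>');
$div = $doc->documentElement;
$b = $doc->getElementsByTagName('b')->item(0);
$i = $doc->getElementsByTagName('i')->item(0);

var_dump(isSelfOrAncestor($div, $b)); // bool(true)
var_dump(isSelfOrAncestor($b, $b));   // bool(true)
var_dump(isSelfOrAncestor($i, $b));   // bool(false)
```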

◆ matchNameAndTypeOf()

static Wikimedia\Parsoid\Utils\DOMUtils::matchNameAndTypeOf ( Node $n,
string $name,
string $typeRe )
static

Determine whether the node matches the given nodeName and typeof attribute value.

Returns the matching typeof value if the node name matches and the typeof attribute matches $typeRe.

Parameters
Node $n: The node to test
string $name: The expected nodeName of $n
string $typeRe: Regular expression matching the expected value of typeof attribute.
Returns
?string The matching typeof value, or null if there is no match.

◆ matchRel()

static Wikimedia\Parsoid\Utils\DOMUtils::matchRel ( Node $n,
string $relRe )
static

Determine whether the node matches the given rel attribute value.

Parameters
Node $n: The node to test
string $relRe: Regular expression matching the expected value of the rel attribute.
Returns
?string The matching rel value, or null if there is no match.

◆ matchTypeOf()

static Wikimedia\Parsoid\Utils\DOMUtils::matchTypeOf ( Node $n,
string $typeRe )
static

Determine whether the node matches the given typeof attribute value.

Parameters
Node $n: The node to test
string $typeRe: Regular expression matching the expected value of the typeof attribute.
Returns
?string The matching typeof value, or null if there is no match.
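
To show how a regular-expression match can return the matching value rather than a boolean, here is a rough sketch built on PHP's built-in DOM extension. It is not Parsoid's implementation. It assumes that typeof is split on whitespace, that each value is tested separately, and that $typeRe is a complete delimited pattern; matchTypeOfToken and the example type values are hypothetical.

```php
<?php
// Sketch only: return the first typeof value that matches a regex.
// matchTypeOfToken() is a hypothetical name, not part of Parsoid.
function matchTypeOfToken(DOMNode $n, string $typeRe): ?string {
    if (!($n instanceof DOMElement)) {
        return null; // Only elements carry attributes.
    }
    // Assumption: typeof holds whitespace-separated values.
    $types = preg_split('/\s+/', $n->getAttribute('typeof'), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($types as $ty) {
        if (preg_match($typeRe, $ty) === 1) {
            return $ty;
        }
    }
    return null;
}

$doc = new DOMDocument();
$span = $doc->createElement('span');
$span->setAttribute('typeof', 'mw:Transclusion mw:Extension/ref'); // illustrative values

var_dump(matchTypeOfToken($span, '#^mw:Extension/#')); // string(16) "mw:Extension/ref"
var_dump(matchTypeOfToken($span, '#^mw:Image$#'));     // NULL
```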

◆ migrateChildren()

static Wikimedia\Parsoid\Utils\DOMUtils::migrateChildren ( Node $from,
Node $to,
?Node $beforeNode = null )
static

Move 'from'.childNodes to 'to', adding them before 'beforeNode'. If 'beforeNode' is null, the nodes are appended at the end.

Parameters
Node $from: Source node. Children will be removed.
Node $to: Destination node. Children of $from will be added here
?Node $beforeNode: Add the children before this node.
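
The move semantics can be illustrated with a short sketch using PHP's built-in DOM extension, where insertBefore with a null reference node appends. This shows the general technique under those assumptions and is not Parsoid's implementation; moveChildren is a hypothetical name.

```php
<?php
// Sketch only: move every child of $from into $to within one document.
// moveChildren() is a hypothetical name, not part of Parsoid.
function moveChildren(DOMNode $from, DOMNode $to, ?DOMNode $beforeNode = null): void {
    // insertBefore() detaches the node from its old parent, so
    // $from->firstChild advances on every iteration.
    while ($from->firstChild !== null) {
        $to->insertBefore($from->firstChild, $beforeNode);
    }
}

$doc = new DOMDocument();
$doc->loadXML('<r><src><a/><b/></src><dst><z/></dst></r>');
$src = $doc->getElementsByTagName('src')->item(0);
$dst = $doc->getElementsByTagName('dst')->item(0);

moveChildren($src, $dst, $dst->firstChild);
echo $doc->saveXML($dst), "\n"; // <dst><a/><b/><z/></dst>
```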

◆ migrateChildrenBetweenDocs()

static Wikimedia\Parsoid\Utils\DOMUtils::migrateChildrenBetweenDocs ( Node $from,
Node $to,
?Node $beforeNode = null )
static

Copy 'from'.childNodes to 'to', adding them before 'beforeNode'. 'from' and 'to' belong to different documents.

If 'beforeNode' is null, the nodes are appended at the end.

Parameters
Node $from
Node $to
?Node $beforeNode

◆ nodeDepth()

static Wikimedia\Parsoid\Utils\DOMUtils::nodeDepth ( Node $node)
static

Compute the edge length of the path from $node to the root.

Root document is at depth 0, <html> at 1, <body> at 2.
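
The depth convention above can be reproduced by counting parent links. A minimal sketch, assuming PHP's built-in DOM extension and a hypothetical function name depthOf; it is not Parsoid's implementation.

```php
<?php
// Sketch only: depth = number of edges between a node and the document.
// depthOf() is a hypothetical name, not part of Parsoid.
function depthOf(DOMNode $node): int {
    $depth = 0;
    while ($node->parentNode !== null) {
        $depth++;
        $node = $node->parentNode;
    }
    return $depth;
}

$doc = new DOMDocument();
$doc->loadXML('<html><body><p>text</p></body></html>');
$html = $doc->documentElement;
$body = $doc->getElementsByTagName('body')->item(0);

echo depthOf($doc), ' ', depthOf($html), ' ', depthOf($body), "\n"; // 0 1 2
```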

◆ parseHTML()

static Wikimedia\Parsoid\Utils\DOMUtils::parseHTML ( string $html,
bool $validateXMLNames = false )
static

Parse HTML, return the tree.

Note
The resulting document is not "prepared and loaded"; use ContentUtils::prepareAndLoadDocument() instead if that's what you need.

◆ pathToRoot()

static Wikimedia\Parsoid\Utils\DOMUtils::pathToRoot ( Node $node)
static

Build path from a node to the root of the document.

Parameters
Node $node
Returns
Node[] Path including all nodes from $node to the root of the document

◆ pathToSibling()

static Wikimedia\Parsoid\Utils\DOMUtils::pathToSibling ( Node $node,
Node $sibling,
bool $left )
static

Build path from a node to its passed-in sibling.

The returned path does not include the passed-in sibling.

Parameters
Node $node
Node $sibling
bool $left: Whether to go backwards, i.e. use previousSibling instead of nextSibling.
Returns
Node[]
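
A sibling path of this kind can be sketched as a walk along previousSibling or nextSibling that stops before the target. The sketch below uses PHP's built-in DOM extension and is not Parsoid's implementation. That the starting node is included in the result is an assumption, and siblingPath is a hypothetical name.

```php
<?php
// Sketch only: collect nodes from $node up to, but excluding, $sibling.
// siblingPath() is a hypothetical name, not part of Parsoid.
function siblingPath(DOMNode $node, DOMNode $sibling, bool $left): array {
    $path = [];
    $n = $node;
    while ($n !== null && !$n->isSameNode($sibling)) {
        $path[] = $n; // Assumption: the starting node is part of the path.
        $n = $left ? $n->previousSibling : $n->nextSibling;
    }
    return $path;
}

$doc = new DOMDocument();
$doc->loadXML('<r><a/><b/><c/><d/></r>');
$a = $doc->documentElement->firstChild;
$d = $doc->documentElement->lastChild;

$names = array_map(fn (DOMNode $n) => $n->nodeName, siblingPath($a, $d, false));
echo implode(',', $names), "\n"; // a,b,c
```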

◆ setFragmentInnerHTML()

static Wikimedia\Parsoid\Utils\DOMUtils::setFragmentInnerHTML ( DocumentFragment $frag,
string $html )
static

innerHTML and outerHTML are not defined on DocumentFragment.

See also
DOMCompat::setInnerHTML() for the Element version

◆ treeHasElement()

static Wikimedia\Parsoid\Utils\DOMUtils::treeHasElement ( Node $node,
string $tagName,
bool $checkRoot = false )
static

Check whether the DOM subtree rooted at node has an element with tag name 'tagName'. By default, the root node is not checked.

Parameters
Node $node: The DOM node whose tree should be checked
string $tagName: Tag name to look for
bool $checkRoot: Should the root be checked?
Returns
bool
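
The effect of $checkRoot is easiest to see in a small recursive search. The following sketch uses PHP's built-in DOM extension and compares nodeName case-sensitively, which is a simplification; it is not Parsoid's implementation, and subtreeHasElement is a hypothetical name.

```php
<?php
// Sketch only: depth-first search for a tag name below (and optionally at) $node.
// subtreeHasElement() is a hypothetical name, not part of Parsoid.
function subtreeHasElement(DOMNode $node, string $tagName, bool $checkRoot = false): bool {
    if ($checkRoot && $node->nodeName === $tagName) {
        return true;
    }
    for ($c = $node->firstChild; $c !== null; $c = $c->nextSibling) {
        // Every descendant is itself checked, so recurse with $checkRoot = true.
        if (subtreeHasElement($c, $tagName, true)) {
            return true;
        }
    }
    return false;
}

$doc = new DOMDocument();
$doc->loadXML('<div><p><b>x</b></p></div>');
$div = $doc->documentElement;

var_dump(subtreeHasElement($div, 'b'));         // bool(true)
var_dump(subtreeHasElement($div, 'div'));       // bool(false)
var_dump(subtreeHasElement($div, 'div', true)); // bool(true)
```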

◆ visitDOM()

static Wikimedia\Parsoid\Utils\DOMUtils::visitDOM ( Node $node,
callable $handler,
$args )
static

This is a simplified version of the DOMTraverser.

Consider using that before making this more complex.

FIXME: Move to DOMTraverser OR create a new class?

Parameters
Node $node
callable $handler
mixed... $args
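
The calling convention (a handler plus extra arguments forwarded on every call) can be sketched as a pre-order walk. This uses PHP's built-in DOM extension and is an illustration under stated assumptions, not Parsoid's implementation. The pre-order visiting order is an assumption, and visitTree is a hypothetical name.

```php
<?php
// Sketch only: call $handler on a node and then on each descendant,
// forwarding any extra arguments unchanged.
// visitTree() is a hypothetical name, not part of Parsoid.
function visitTree(DOMNode $node, callable $handler, ...$args): void {
    $handler($node, ...$args);
    $child = $node->firstChild;
    while ($child !== null) {
        // Remember the next sibling first, in case the handler removes $child.
        $next = $child->nextSibling;
        visitTree($child, $handler, ...$args);
        $child = $next;
    }
}

$doc = new DOMDocument();
$doc->loadXML('<div><p><b>x</b></p><i/></div>');

$seen = [];
visitTree($doc->documentElement, function (DOMNode $n, string $prefix) use (&$seen) {
    if ($n instanceof DOMElement) {
        $seen[] = $prefix . $n->nodeName;
    }
}, 'elt:');
echo implode(' ', $seen), "\n"; // elt:div elt:p elt:b elt:i
```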
