RemexHtml
Fast HTML 5 parser
Wikimedia\RemexHtml\Serializer\HtmlFormatter Class Reference

A formatter which follows the HTML 5 fragment serialization algorithm.


Public Member Functions

 __construct ( $options=[])
 Constructor.
 
 startDocument ( $fragmentNamespace, $fragmentName)
 Get a string which starts the document.
 
 characters (SerializerNode $parent, $text, $start, $length)
 Encode the given character substring.
 
 element (SerializerNode $parent, SerializerNode $node, $contents)
 Encode the given element.
 
 comment (SerializerNode $parent, $text)
 Encode a comment.
 
 doctype ( $name, $public, $system)
 Encode a doctype.
 
string formatDOMElement ( $node, $contents)
 
Public Member Functions inherited from Wikimedia\RemexHtml\DOM\DOMFormatter
string formatDOMNode (DOMNode $node)
 Recursively format a DOMNode.
 
 formatDOMElement (DOMElement $element, $contents)
 Non-recursively format a DOMElement.
 

Protected Attributes

array< string, bool > $voidElements
 The elements for which a closing tag is omitted.
 
array< string, bool > $prefixLfElements
 The elements which need a leading newline in their contents to be duplicated, since the parser strips a leading newline.
 
array< string, bool > $rawTextElements
 The elements which have unescaped contents.
 
array< string, string > $attributeEscapes
 The escape table for attribute values.
 
array< string, string > $textEscapes
 The escape table for text nodes.
 
array< string, bool > $unqualifiedNamespaces
 Attribute namespaces which have unqualified local names.
 
 $useSourceDoctype
 
 $reverseCoercion
 

Detailed Description

A formatter which follows the HTML 5 fragment serialization algorithm.

Constructor & Destructor Documentation

◆ __construct()

Wikimedia\RemexHtml\Serializer\HtmlFormatter::__construct ( $options = [])

Constructor.

Parameters
array  $options  An associative array of options:
  • scriptingFlag : Set this to false to disable scripting. True by default.
  • useSourceDoctype : Emit the doctype used in the source. If this is false or absent, an HTML doctype will be used.
  • reverseCoercion : When formatting a DOM node, reverse the encoding of invalid names. False by default.

Reimplemented in Wikimedia\RemexHtml\Serializer\DepurateFormatter.

Member Function Documentation

◆ characters()

Wikimedia\RemexHtml\Serializer\HtmlFormatter::characters ( SerializerNode $parent,
$text,
$start,
$length )

Encode the given character substring.

Parameters
SerializerNode  $parent  The parent of the text node (at creation time)
string  $text
int  $start  The offset within $text
int  $length  The number of bytes within $text
Returns
string

Implements Wikimedia\RemexHtml\Serializer\Formatter.
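
The substring encoding described above can be illustrated with a stand-alone sketch. It does not call the library: `sketchCharacters()` is a hypothetical helper, the parent is reduced to a plain element name rather than a SerializerNode, and the use of `substr()` plus `strtr()` is an assumption about how such an encoder could work. The two tables are copied from the documented initial values of $textEscapes and $rawTextElements.

```php
<?php
// Hypothetical stand-alone sketch; not the library's implementation.

// Copied from the documented initial value of $textEscapes.
const TEXT_ESCAPES = [
    '&' => '&amp;',
    "\xc2\xa0" => '&nbsp;',
    '<' => '&lt;',
    '>' => '&gt;',
];

// Copied from the documented initial value of $rawTextElements.
const RAW_TEXT_ELEMENTS = [
    'style' => true, 'script' => true, 'xmp' => true, 'iframe' => true,
    'noembed' => true, 'noframes' => true, 'plaintext' => true,
];

function sketchCharacters(string $parentName, string $text, int $start, int $length): string
{
    // $start and $length are byte counts, so substr() is used, not mb_substr().
    $slice = substr($text, $start, $length);

    // Raw text elements keep their contents unescaped.
    if (isset(RAW_TEXT_ELEMENTS[$parentName])) {
        return $slice;
    }

    // strtr() with an array replaces each key once and never rescans its own output.
    return strtr($slice, TEXT_ESCAPES);
}

echo sketchCharacters('p', 'xx1 < 2 & 3yy', 2, 9), "\n";   // prints "1 &lt; 2 &amp; 3"
echo sketchCharacters('script', 'a < b', 0, 5), "\n";      // prints "a < b"
```

Passing the offset and length separately lets a caller hand over one large buffer without copying each text run first.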

◆ comment()

Wikimedia\RemexHtml\Serializer\HtmlFormatter::comment ( SerializerNode $parent,
$text )

Encode a comment.

Parameters
SerializerNode  $parent  The parent of the node (at creation time)
string  $text  The inner text of the comment
Returns
string

Implements Wikimedia\RemexHtml\Serializer\Formatter.

◆ doctype()

Wikimedia\RemexHtml\Serializer\HtmlFormatter::doctype ( $name,
$public,
$system )

Encode a doctype.

This event occurs when the source document has a doctype. The method can return an empty string if the formatter wants to use its own doctype.

Parameters
string  $name  The doctype name, usually "html"
string  $public  The PUBLIC identifier
string  $system  The SYSTEM identifier
Returns
string

Implements Wikimedia\RemexHtml\Serializer\Formatter.
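
The relationship between the source doctype and the formatter's own doctype might be sketched as follows. Everything here is an assumption made for illustration: the helper names `sketchDoctype()` and `sketchStartDocument()` are hypothetical, the exact output strings are guesses, the PUBLIC and SYSTEM identifiers are ignored, and treating a non-null fragment namespace as "no doctype" is an assumed reading of fragment serialization. Only the general rule comes from this page: the source doctype is emitted when useSourceDoctype is set, and an HTML doctype is used otherwise.

```php
<?php
// Hypothetical sketch of the doctype choice; not the library's implementation.

function sketchDoctype(string $name, bool $useSourceDoctype): string
{
    // An empty string means "the formatter will supply its own doctype".
    return $useSourceDoctype ? "<!DOCTYPE $name>" : '';
}

function sketchStartDocument(?string $fragmentNamespace, bool $useSourceDoctype): string
{
    // Assumption: fragments carry no doctype, and when the source doctype is
    // wanted it is emitted later by the doctype event, not here.
    if ($fragmentNamespace !== null || $useSourceDoctype) {
        return '';
    }
    return '<!DOCTYPE html>';
}

// Default behaviour: the formatter's own HTML doctype starts the document.
echo sketchStartDocument(null, false) . sketchDoctype('html', false), "\n";

// With useSourceDoctype: the doctype event supplies the string instead.
echo sketchStartDocument(null, true) . sketchDoctype('html', true), "\n";
```

In this sketch exactly one of the two calls produces a doctype for a full document, so the output never contains two.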

◆ element()

Wikimedia\RemexHtml\Serializer\HtmlFormatter::element ( SerializerNode $parent,
SerializerNode $node,
$contents )

Encode the given element.

Parameters
SerializerNode  $parent  The parent of the node (when it is closed)
SerializerNode  $node  The element to encode
string | null  $contents  The previously encoded contents, or null for a void element. Void elements can be serialized as self-closing tags.
Returns
string

Implements Wikimedia\RemexHtml\Serializer\Formatter.

Reimplemented in Wikimedia\RemexHtml\Serializer\DepurateFormatter.
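
A minimal sketch of how the $contents parameter and the void element table could determine the output shape. `sketchElement()` is a hypothetical helper that takes an element name instead of a SerializerNode and ignores attributes and namespaces; the table is a subset of the documented $voidElements value.

```php
<?php
// Hypothetical sketch; not the library's implementation.

// Subset of the documented $voidElements table.
const VOID_ELEMENTS = ['br' => true, 'hr' => true, 'img' => true, 'input' => true];

function sketchElement(string $name, ?string $contents): string
{
    $open = "<$name>";

    // Void elements have no contents and no closing tag.
    if ($contents === null || isset(VOID_ELEMENTS[$name])) {
        return $open;
    }

    // Everything else wraps its already-encoded contents in a tag pair.
    return $open . $contents . "</$name>";
}

echo sketchElement('br', null), "\n";        // prints "<br>"
echo sketchElement('p', 'Hello'), "\n";      // prints "<p>Hello</p>"
```

Because $contents arrives already encoded, the function only concatenates; escaping is assumed to have happened when the child nodes were formatted.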

◆ formatDOMElement()

string Wikimedia\RemexHtml\Serializer\HtmlFormatter::formatDOMElement ( $node,
$contents )
Parameters
\DOMElement  $node
string  $contents
Returns
string

◆ startDocument()

Wikimedia\RemexHtml\Serializer\HtmlFormatter::startDocument ( $fragmentNamespace,
$fragmentName )

Get a string which starts the document.

Parameters
string | null  $fragmentNamespace
string | null  $fragmentName
Returns
string

Implements Wikimedia\RemexHtml\Serializer\Formatter.

Member Data Documentation

◆ $attributeEscapes

array<string,string> Wikimedia\RemexHtml\Serializer\HtmlFormatter::$attributeEscapes
protected
Initial value:
= array(
'&' => '&amp;',
"\xc2\xa0" => '&nbsp;',
'"' => '&quot;',
)

The escape table for attribute values.
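
As an illustration of what this table implies, the sketch below applies it with `strtr()` to build one quoted attribute. `sketchAttribute()` is a hypothetical helper, and using `strtr()` is an assumption about how the table is applied; the table itself is copied from the initial value above.

```php
<?php
// Hypothetical sketch; not the library's implementation.

// Copied from the documented initial value of $attributeEscapes.
const ATTRIBUTE_ESCAPES = [
    '&' => '&amp;',
    "\xc2\xa0" => '&nbsp;',
    '"' => '&quot;',
];

function sketchAttribute(string $name, string $value): string
{
    // The value is always double-quoted, so only '"', '&' and U+00A0 need escaping.
    return ' ' . $name . '="' . strtr($value, ATTRIBUTE_ESCAPES) . '"';
}

echo sketchAttribute('title', 'Tom & "Jerry" <3'), "\n";
// prints:  title="Tom &amp; &quot;Jerry&quot; <3"
```

Note that `<` and `>` pass through unchanged, since the table has no entries for them; they are escaped only in text nodes.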

◆ $prefixLfElements

array<string,bool> Wikimedia\RemexHtml\Serializer\HtmlFormatter::$prefixLfElements
protected
Initial value:
= array(
'pre' => true,
'textarea' => true,
'listing' => true
)

The elements which need a leading newline in their contents to be duplicated, since the parser strips a leading newline.
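
The newline duplication can be sketched as below. `sketchPrefixLf()` is a hypothetical helper operating on an element name and its encoded contents; the table is copied from the initial value above.

```php
<?php
// Hypothetical sketch; not the library's implementation.

// Copied from the documented initial value of $prefixLfElements.
const PREFIX_LF_ELEMENTS = ['pre' => true, 'textarea' => true, 'listing' => true];

function sketchPrefixLf(string $name, string $contents): string
{
    // A parser drops one newline directly after <pre>, <textarea> or <listing>,
    // so a real leading newline has to be written twice to survive a round trip.
    if (isset(PREFIX_LF_ELEMENTS[$name]) && $contents !== '' && $contents[0] === "\n") {
        return "\n" . $contents;
    }
    return $contents;
}

var_dump(sketchPrefixLf('pre', "\nline") === "\n\nline");   // bool(true)
var_dump(sketchPrefixLf('div', "\nline") === "\nline");     // bool(true)
```

Contents that do not start with a newline are returned unchanged, so the extra byte is only added where it is needed.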

◆ $rawTextElements

array<string,bool> Wikimedia\RemexHtml\Serializer\HtmlFormatter::$rawTextElements
protected
Initial value:
= array(
'style' => true,
'script' => true,
'xmp' => true,
'iframe' => true,
'noembed' => true,
'noframes' => true,
'plaintext' => true,
)

The elements which have unescaped contents.

◆ $textEscapes

array<string,string> Wikimedia\RemexHtml\Serializer\HtmlFormatter::$textEscapes
protected
Initial value:
= array(
'&' => '&amp;',
"\xc2\xa0" => '&nbsp;',
'<' => '&lt;',
'>' => '&gt;',
)

The escape table for text nodes.

◆ $unqualifiedNamespaces

array<string,bool> Wikimedia\RemexHtml\Serializer\HtmlFormatter::$unqualifiedNamespaces
protected
Initial value:
= array(
HTMLData::NS_HTML => true,
HTMLData::NS_MATHML => true,
HTMLData::NS_SVG => true,
)

Attribute namespaces which have unqualified local names.

◆ $voidElements

array<string,bool> Wikimedia\RemexHtml\Serializer\HtmlFormatter::$voidElements
protected
Initial value:
= array(
'area' => true,
'base' => true,
'basefont' => true,
'bgsound' => true,
'br' => true,
'col' => true,
'embed' => true,
'frame' => true,
'hr' => true,
'img' => true,
'input' => true,
'keygen' => true,
'link' => true,
'menuitem' => true,
'meta' => true,
'param' => true,
'source' => true,
'track' => true,
'wbr' => true,
)

The elements for which a closing tag is omitted.


The documentation for this class was generated from the following file: