MediaWiki 1.23.2
HtmlFormatter Class Reference

Public Member Functions

 __construct ( $html)
 Constructor.
 
 filterContent ()
 Removes content we've chosen to remove.
 
 flatten ( $elements)
 Adds one or more element names to the list to flatten (remove the tag, but not its content). Can accept undelimited regexes.
 
 flattenAllTags ()
 Instructs the formatter to flatten all tags.
 
 getDoc ()
 
 getText ( $element=null)
 Performs final transformations and returns resulting HTML.
 
 remove ( $selectors)
 Adds one or more selectors of content to remove.
 
 setRemoveMedia ( $flag=true)
 Sets whether images/videos/sounds should be removed from output.
 

Static Public Member Functions

static wrapHTML ( $html)
 Turns a chunk of HTML into a proper document.
 

Protected Member Functions

 onHtmlReady ( $html)
 Override this in descendant class to modify HTML after it has been converted from DOM tree.
 
 parseItemsToRemove ()
 Transforms CSS selectors into an internal representation suitable for processing.
 
 parseSelector ( $selector, &$type, &$rawName)
 

Protected Attributes

 $removeMedia = false
 

Private Member Functions

 fixLibXML ( $html)
 libxml, in its usual pointlessness, converts many chars to entities; this function performs a reverse conversion.
 
 removeElements ( $elements)
 Removes a list of elements from DOMDocument.
 

Private Attributes

DOMDocument $doc
 
 $elementsToFlatten = array()
 
 $html
 
 $itemsToRemove = array()
 

Detailed Description

Definition at line 23 of file HtmlFormatter.php.
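
A typical call sequence, pieced together from the member list above, might look like the following sketch. The helper name sketchCleanHtml and the selectors are illustrative assumptions, and wrapHTML() is applied on the assumption that the input is a fragment rather than a full document. The block only defines the function, because HtmlFormatter is available only inside a MediaWiki installation.

```php
<?php
/**
 * Hypothetical helper (not part of MediaWiki): strips unwanted content
 * from an HTML fragment using only the public methods documented here.
 *
 * @param string $html HTML fragment to clean
 * @return string Processed HTML
 */
function sketchCleanHtml( $html ) {
	// Assumption: the input is a fragment, so turn it into a document first.
	$formatter = new HtmlFormatter( HtmlFormatter::wrapHTML( $html ) );

	// Illustrative selectors; each uses one of the forms remove() supports.
	$formatter->remove( array( '.noprint', '#toc', 'table.infobox' ) );

	// Drop <span> tags but keep their content.
	$formatter->flatten( 'span' );

	// Also drop images, videos and sounds.
	$formatter->setRemoveMedia( true );

	// Apply the removals to the DOM, then serialize.
	$formatter->filterContent();
	return $formatter->getText();
}
```

Inside MediaWiki, calling sketchCleanHtml( $fragment ) would return the processed HTML as a string.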

Constructor & Destructor Documentation

◆ __construct()

HtmlFormatter::__construct (   $html)

Constructor.

Parameters
string $html  Text to process

Definition at line 38 of file HtmlFormatter.php.

References $html.

Member Function Documentation

◆ filterContent()

HtmlFormatter::filterContent ( )

Removes content we've chosen to remove.

Definition at line 132 of file HtmlFormatter.php.

References $doc, array(), as, getDoc(), parseItemsToRemove(), removeElements(), wfProfileIn(), and wfProfileOut().

◆ fixLibXML()

HtmlFormatter::fixLibXML (   $html)
private

libxml, in its usual pointlessness, converts many chars to entities; this function performs a reverse conversion.

Parameters
string $html
Returns
string

Definition at line 227 of file HtmlFormatter.php.

References $html, array(), wfProfileIn(), and wfProfileOut().
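
How the reverse conversion is carried out is not stated on this page. One possible approach, shown here purely as an assumption, is to protect the markup-significant entities and decode everything else. The function name sketchDecodeEntities is hypothetical.

```php
<?php
/**
 * Hypothetical sketch, not the MediaWiki implementation: turns character
 * entities back into UTF-8 characters while leaving &amp;, &lt;, &gt;
 * and &quot; escaped so the markup stays valid.
 *
 * @param string $html
 * @return string
 */
function sketchDecodeEntities( $html ) {
	// Double-escape the entities that must survive the decoding pass.
	$protected = strtr( $html, array(
		'&amp;' => '&amp;amp;',
		'&lt;' => '&amp;lt;',
		'&gt;' => '&amp;gt;',
		'&quot;' => '&amp;quot;',
	) );

	// A single decoding pass restores the protected entities to their
	// escaped form and converts the remaining ones to real characters.
	return html_entity_decode( $protected, ENT_QUOTES, 'UTF-8' );
}

$sketchDecoded = sketchDecodeEntities( 'caf&eacute; &amp; &lt;b&gt;' );
```

Here $sketchDecoded keeps &amp;, &lt; and &gt; escaped while &eacute; becomes a literal "é".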

◆ flatten()

HtmlFormatter::flatten (   $elements)

Adds one or more element names to the list to flatten (remove the tag, but not its content). Can accept undelimited regexes.

Note that this interface may fail in surprising ways because it relies on regexes, so it should not be relied on for HTML markup security measures.

Parameters
array | string $elements  Name(s) of tag(s) to flatten

Definition at line 118 of file HtmlFormatter.php.

References array().

Referenced by flattenAllTags().
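
To illustrate what regex-based flattening means and why the warning above matters, here is a minimal standalone sketch. The function sketchFlatten and its pattern are assumptions made for illustration, not the code in HtmlFormatter.php.

```php
<?php
/**
 * Hypothetical sketch: removes the opening and closing tags of the named
 * elements from an HTML string while keeping their content.
 *
 * @param string $html
 * @param array $elements Tag names, used as undelimited regex fragments
 * @return string
 */
function sketchFlatten( $html, array $elements ) {
	if ( !$elements ) {
		return $html;
	}
	// The names are joined unescaped, so each one may itself be a regex.
	$alternation = implode( '|', $elements );
	return preg_replace( "#</?(?:$alternation)\\b[^>]*>#is", '', $html );
}

$sketchFlat = sketchFlatten(
	'<p>Hello <b>bold</b> <span class="x">world</span></p>',
	array( 'b', 'span' )
);

// A ">" inside an attribute value ends the match early and leaves debris,
// which is the kind of surprise the note above warns about.
$sketchBroken = sketchFlatten( '<span title="a>b">x</span>', array( 'span' ) );
```

The first call yields '<p>Hello bold world</p>', while the second leaves part of the attribute behind in the output.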

◆ flattenAllTags()

HtmlFormatter::flattenAllTags ( )

Instructs the formatter to flatten all tags.

Definition at line 125 of file HtmlFormatter.php.

References flatten().

◆ getDoc()

HtmlFormatter::getDoc ( )
Returns
DOMDocument DOM to manipulate

Definition at line 63 of file HtmlFormatter.php.

References $doc, and $html.

Referenced by filterContent().

◆ getText()

HtmlFormatter::getText (   $element = null)

Performs final transformations and returns resulting HTML.

Parameters
DOMElement | string | null $element  ID of element to get HTML from, or null to get it from the whole tree
Returns
string Processed HTML

Definition at line 252 of file HtmlFormatter.php.

References $html, array(), as, onHtmlReady(), wfIsWindows(), wfProfileIn(), and wfProfileOut().

◆ onHtmlReady()

HtmlFormatter::onHtmlReady (   $html)
protected

Override this in descendant class to modify HTML after it has been converted from DOM tree.

Parameters
string $html  HTML to process
Returns
string Processed HTML

Definition at line 56 of file HtmlFormatter.php.

References $html.

Referenced by getText().
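
A descendant class could use this hook as in the sketch below. The class SketchWhitespaceFormatter and the helper sketchApplyHook are hypothetical, and the stand-in HtmlFormatter stub exists only so that the example runs outside MediaWiki; it is not the real class.

```php
<?php
if ( !class_exists( 'HtmlFormatter' ) ) {
	// Stand-in stub for running this sketch outside MediaWiki.
	// It is NOT the real HtmlFormatter, only the hook's default shape.
	class HtmlFormatter {
		protected function onHtmlReady( $html ) {
			return $html;
		}
	}
}

/**
 * Hypothetical subclass that collapses runs of whitespace once the DOM
 * tree has been converted back to HTML.
 */
class SketchWhitespaceFormatter extends HtmlFormatter {
	protected function onHtmlReady( $html ) {
		return trim( preg_replace( '/\s+/', ' ', $html ) );
	}

	// Hypothetical helper so the protected hook can be exercised directly.
	public function sketchApplyHook( $html ) {
		return $this->onHtmlReady( $html );
	}
}

$sketchFormatter = new SketchWhitespaceFormatter( '' );
$sketchHooked = $sketchFormatter->sketchApplyHook( "  <p>a</p> \n <p>b</p>  " );
```

In normal use the hook is not called directly; per the reference list above, getText() invokes it.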

◆ parseItemsToRemove()

HtmlFormatter::parseItemsToRemove ( )
protected

Transforms CSS selectors into an internal representation suitable for processing.

Returns
array

Definition at line 325 of file HtmlFormatter.php.

References $type, array(), as, parseSelector(), wfProfileIn(), and wfProfileOut().

Referenced by filterContent().

◆ parseSelector()

HtmlFormatter::parseSelector (   $selector, &$type, &$rawName)
protected
Parameters
string $selector  CSS selector to parse
string $type
string $rawName
Returns
bool Whether the selector was successfully recognised

Definition at line 301 of file HtmlFormatter.php.

References $selector, and $type.

Referenced by parseItemsToRemove().
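
The sketch below shows one way a selector could be classified into the four forms that remove() accepts, with the results passed back through by-reference parameters. The function name, the type labels ('TAG', 'TAG_CLASS', 'CLASS', 'ID') and the choice to return false for unsupported selectors are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical sketch of selector classification.
 *
 * @param string $selector CSS selector to parse
 * @param string &$type Receives the selector kind (assumed labels)
 * @param string &$rawName Receives the name to look up
 * @return bool Whether the selector was recognised
 */
function sketchParseSelector( $selector, &$type, &$rawName ) {
	$type = null;
	$rawName = null;

	// Reject empty input and anything beyond the simple supported forms.
	if ( $selector === '' || strpbrk( $selector, " \t[]>+~:," ) !== false ) {
		return false;
	}

	if ( $selector[0] === '.' ) {
		$type = 'CLASS'; // .<class>
		$rawName = (string)substr( $selector, 1 );
	} elseif ( $selector[0] === '#' ) {
		$type = 'ID'; // #<id>
		$rawName = (string)substr( $selector, 1 );
	} elseif ( strpos( $selector, '.' ) !== false ) {
		$type = 'TAG_CLASS'; // <tag>.class
		$rawName = $selector;
	} else {
		$type = 'TAG'; // <tag>
		$rawName = $selector;
	}

	return $rawName !== '';
}

$sketchType = $sketchName = null;
$sketchOk = sketchParseSelector( 'table.infobox', $sketchType, $sketchName );
```

After the call, $sketchOk is true, $sketchType is 'TAG_CLASS' and $sketchName is 'table.infobox'.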

◆ remove()

HtmlFormatter::remove (   $selectors)

Adds one or more selectors of content to remove.

A subset of CSS selector syntax is supported:

<tag>
<tag>.class
.<class>
#<id>

Parameters
array | string $selectors  Selector(s) of stuff to remove

Definition at line 105 of file HtmlFormatter.php.

References array().

◆ removeElements()

HtmlFormatter::removeElements (   $elements)
private

Removes a list of elements from DOMDocument.

Parameters
array | DOMNodeList $elements

Definition at line 205 of file HtmlFormatter.php.

References array(), and as.

Referenced by filterContent().
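
Removing nodes from a DOMNodeList needs some care, because the list is live and shrinks as nodes are detached. A minimal standalone sketch of the usual workaround follows: copy the nodes into a plain array first. The function name sketchRemoveElements is hypothetical, and this is not necessarily how the private method is written.

```php
<?php
/**
 * Hypothetical sketch: detaches every node in the given list from its
 * parent.
 *
 * @param array|DOMNodeList $elements
 * @return int Number of nodes processed
 */
function sketchRemoveElements( $elements ) {
	// Snapshot the nodes so that removals do not disturb the iteration.
	$list = array();
	foreach ( $elements as $element ) {
		$list[] = $element;
	}

	foreach ( $list as $element ) {
		if ( $element->parentNode ) {
			$element->parentNode->removeChild( $element );
		}
	}
	return count( $list );
}

$sketchDoc = new DOMDocument();
$sketchDoc->loadHTML(
	'<html><body><p>keep</p><img src="a.png"><p>also</p><img src="b.png"></body></html>'
);
$sketchRemoved = sketchRemoveElements( $sketchDoc->getElementsByTagName( 'img' ) );
```

After the call, both <img> elements are gone from $sketchDoc and the two paragraphs remain.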

◆ setRemoveMedia()

HtmlFormatter::setRemoveMedia (   $flag = true)

Sets whether images/videos/sounds should be removed from output.

Parameters
bool $flag

Definition at line 90 of file HtmlFormatter.php.

◆ wrapHTML()

static HtmlFormatter::wrapHTML (   $html)
static

Turns a chunk of HTML into a proper document.

Parameters
string $html
Returns
string

Definition at line 47 of file HtmlFormatter.php.

References $html.

Referenced by HtmlFormatterTest\testTransform().
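
As an illustration of why a fragment is wrapped before parsing, the sketch below places a fragment inside a minimal document skeleton and loads it with PHP's DOMDocument. The function sketchWrapHtml and the exact skeleton are assumptions; the markup that wrapHTML() really emits is not shown on this page.

```php
<?php
/**
 * Hypothetical sketch: wraps an HTML fragment in a minimal document.
 *
 * @param string $fragment
 * @return string
 */
function sketchWrapHtml( $fragment ) {
	return '<!doctype html><html><head></head><body>'
		. $fragment
		. '</body></html>';
}

// Collect parser warnings instead of printing them.
libxml_use_internal_errors( true );

$sketchWrapped = new DOMDocument();
$sketchWrapped->loadHTML( sketchWrapHtml( '<p>Hello</p>' ) );

// The fragment now sits at a predictable place in the tree.
$sketchBody = $sketchWrapped->getElementsByTagName( 'body' )->item( 0 );
$sketchFirst = $sketchBody->firstChild;
```

With the wrapper in place, the fragment's first element is the first child of <body>.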

Member Data Documentation

◆ $doc

DOMDocument HtmlFormatter::$doc
private

Definition at line 26 of file HtmlFormatter.php.

Referenced by filterContent(), and getDoc().

◆ $elementsToFlatten

HtmlFormatter::$elementsToFlatten = array()
private

Definition at line 30 of file HtmlFormatter.php.

◆ $html

HtmlFormatter::$html
private

Definition at line 28 of file HtmlFormatter.php.

Referenced by __construct(), fixLibXML(), getDoc(), getText(), onHtmlReady(), and wrapHTML().

◆ $itemsToRemove

HtmlFormatter::$itemsToRemove = array()
private

Definition at line 29 of file HtmlFormatter.php.

◆ $removeMedia

HtmlFormatter::$removeMedia = false
protected

Definition at line 31 of file HtmlFormatter.php.


The documentation for this class was generated from the following file: