Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Core\Sanitizer Class Reference

Static Public Member Functions

static attributesAllowedInternal (string $element)
 Fetch the list of acceptable attributes for a given element name.
 
static normalizeCharReferences ( $text)
 Ensure that any entities and character references are legal for XML and XHTML specifically.
 
static cleanUrl (SiteConfig $siteConfig, string $href, string $mode)
 
static decodeCharReferences (string $text)
 Decode any character references, numeric or named entities, in the text and return a UTF-8 string.
 
static normalizeCss (string $value)
 Normalize CSS into a format we can easily search for hostile input.
 
static isReservedDataAttribute (string $attr)
 Given an attribute name, checks whether it is a reserved data attribute (such as data-mw-foo) which is unavailable to user-generated HTML so MediaWiki core and extension code can safely use it to communicate with frontend code.
 
static sanitizeTagAttrs (SiteConfig $siteConfig, ?string $tagName, ?Token $token, array $attrs)
 
static applySanitizedArgs (SiteConfig $siteConfig, Element $wrapper, array $attrs)
 Sanitize and apply attributes to a wrapper element.
 
static checkCss (string $text)
 
static cssDecodeCallback ( $matches)
 
static sanitizeTitleURI (string $title, bool $isInterwiki=false)
 Sanitize a title for use in a URI.
 

Public Attributes

const ID_FALLBACK = 1
 Tells escapeUrlForHtml() to encode the ID using the fallback encoding, or return false if no fallback is configured.
 

Member Function Documentation

◆ applySanitizedArgs()

static Wikimedia\Parsoid\Core\Sanitizer::applySanitizedArgs ( SiteConfig $siteConfig,
Element $wrapper,
array $attrs )
static

Sanitize and apply attributes to a wrapper element.

Used primarily when tokenized attributes are applied directly to DOM elements, which would not have had a chance to be sanitized before tree building.

Parameters
SiteConfig $siteConfig
Element $wrapper wrapper
array $attrs attributes

◆ attributesAllowedInternal()

static Wikimedia\Parsoid\Core\Sanitizer::attributesAllowedInternal ( string $element)
static

Fetch the list of acceptable attributes for a given element name.

Parameters
string $element
Returns
array

◆ checkCss()

static Wikimedia\Parsoid\Core\Sanitizer::checkCss ( string $text)
static
Parameters
string $text
Returns
string

◆ cleanUrl()

static Wikimedia\Parsoid\Core\Sanitizer::cleanUrl ( SiteConfig $siteConfig,
string $href,
string $mode )
static
Parameters
SiteConfig $siteConfig
string $href
string $mode
Returns
string|null

◆ cssDecodeCallback()

static Wikimedia\Parsoid\Core\Sanitizer::cssDecodeCallback ( $matches)
static
Parameters
array $matches
Returns
string

◆ decodeCharReferences()

static Wikimedia\Parsoid\Core\Sanitizer::decodeCharReferences ( string $text)
static

Decode any character references, numeric or named entities, in the text and return a UTF-8 string.

Parameters
string $text
Returns
string
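
The decoding described above can be approximated with PHP's built-in `html_entity_decode()`. This is a minimal sketch, not Parsoid's implementation: the function name `decodeCharReferencesSketch` is hypothetical, and the built-in may treat edge cases (references without a trailing semicolon, invalid code points) differently from `Sanitizer::decodeCharReferences()`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in that decodes named and numeric character
 * references into a UTF-8 string using PHP's built-in decoder.
 */
function decodeCharReferencesSketch(string $text): string {
    // ENT_QUOTES also decodes &#39; and &quot;; ENT_HTML5 selects the
    // HTML5 named-entity table.
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Named, decimal, and hexadecimal references all decode to the same character.
echo decodeCharReferencesSketch('caf&eacute; caf&#233; caf&#xE9;'), "\n";
```

All three spellings in the usage line decode to the same UTF-8 character, so the line prints the same word three times.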

◆ isReservedDataAttribute()

static Wikimedia\Parsoid\Core\Sanitizer::isReservedDataAttribute ( string $attr)
static

Given an attribute name, checks whether it is a reserved data attribute (such as data-mw-foo) which is unavailable to user-generated HTML so MediaWiki core and extension code can safely use it to communicate with frontend code.

Parameters
string $attr Attribute name.
Returns
bool
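
A check of this kind can be sketched as a prefix match on the attribute name. The function name and the exact set of reserved prefixes below (`data-mw`, `data-parsoid`) are assumptions made for illustration; only `data-mw-foo` is named in the description above, and the real method may reserve a different set.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: report whether an attribute name falls in a
 * reserved data-* namespace. The prefix list is an assumption.
 */
function isReservedDataAttributeSketch(string $attr): bool {
    // Case-insensitive prefix match, anchored at the start of the name.
    return (bool)preg_match('/^data-(mw|parsoid)/i', $attr);
}

var_dump(isReservedDataAttributeSketch('data-mw-foo')); // bool(true)
var_dump(isReservedDataAttributeSketch('data-user'));   // bool(false)
```

A sanitizer would typically drop user-supplied attributes for which such a check returns true, so that only trusted code can set them.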

◆ normalizeCharReferences()

static Wikimedia\Parsoid\Core\Sanitizer::normalizeCharReferences ( $text)
static

Ensure that any entities and character references are legal for XML and XHTML specifically.

Any stray bits will be &-escaped to result in a valid text fragment.

  a. named char refs can only be &lt; &gt; &amp; &quot;; others are numericized (this way we're well-formed even without a DTD)
  b. any numeric char refs must be legal chars, not invalid or forbidden
  c. use lowercase "&#x", not "&#X"
  d. fix or reject non-valid attributes

Parameters
string $text
Returns
string
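
Rules (a) to (c) above can be illustrated with a single regular-expression pass. This is a simplified sketch, not the Parsoid implementation: the function names are hypothetical, the legality check assumes the XML 1.0 `Char` production (the real method may forbid further ranges), named references are resolved with PHP's `html_entity_decode()`, and rule (d) is not covered. It needs the `mbstring` extension and PHP 7.4 or later.

```php
<?php
declare(strict_types=1);

/** Assumption: "legal" means the XML 1.0 Char production. */
function isLegalCodePointSketch(int $cp): bool {
    return $cp === 0x09 || $cp === 0x0A || $cp === 0x0D
        || ($cp >= 0x20 && $cp <= 0xD7FF)
        || ($cp >= 0xE000 && $cp <= 0xFFFD)
        || ($cp >= 0x10000 && $cp <= 0x10FFFF);
}

/** Hypothetical sketch of rules (a)-(c); stray "&" is escaped as "&amp;". */
function normalizeCharReferencesSketch(string $text): string {
    // Alternatives: named ref | decimal ref | hex ref | any other "&".
    $pattern = '/&([A-Za-z][A-Za-z0-9]*);|&#([0-9]{1,7});'
        . '|&#[xX]([0-9A-Fa-f]{1,6});|(&)/';
    $callback = static function (array $m): string {
        $name = $m[1] ?? '';
        $dec = $m[2] ?? '';
        $hex = $m[3] ?? '';
        if ($name !== '') {
            // (a) keep the four XML-safe names, numericize the rest.
            if (in_array($name, ['lt', 'gt', 'amp', 'quot'], true)) {
                return "&$name;";
            }
            $decoded = html_entity_decode("&$name;", ENT_QUOTES | ENT_HTML5, 'UTF-8');
            if ($decoded === "&$name;") {
                return "&amp;$name;"; // unknown name: escape the ampersand
            }
            $out = '';
            foreach (mb_str_split($decoded, 1, 'UTF-8') as $ch) {
                $out .= '&#' . mb_ord($ch, 'UTF-8') . ';';
            }
            return $out;
        }
        if ($dec !== '') {
            // (b) reject illegal code points by escaping the reference.
            return isLegalCodePointSketch((int)$dec)
                ? '&#' . (int)$dec . ';' : "&amp;#$dec;";
        }
        if ($hex !== '') {
            // (b) and (c): validate, then emit a lowercase "&#x" prefix.
            return isLegalCodePointSketch((int)hexdec($hex))
                ? '&#x' . strtolower($hex) . ';' : "&amp;#x$hex;";
        }
        return '&amp;'; // stray ampersand
    };
    return preg_replace_callback($pattern, $callback, $text) ?? $text;
}

echo normalizeCharReferencesSketch('&eacute; &#X41; &lt; AT&T &#0;'), "\n";
// prints: &#233; &#x41; &lt; AT&amp;T &amp;#0;
```

Escaping an invalid reference instead of deleting it keeps the output a valid text fragment while preserving what the author typed.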

◆ normalizeCss()

static Wikimedia\Parsoid\Core\Sanitizer::normalizeCss ( string $value)
static

Normalize CSS into a format we can easily search for hostile input.

  • decode character references
  • decode escape sequences
  • convert characters that IE6 interprets into ASCII
  • remove comments, unless the entire value is one single comment
Parameters
string $value the css string
Returns
string normalized css
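
The normalization steps listed above can be sketched as a short pipeline. This is an illustrative approximation under stated assumptions, not Parsoid's code: the function name is hypothetical, character references are decoded with PHP's `html_entity_decode()`, the IE6 character-conversion step is omitted, and comment handling uses simple regular expressions. It needs the `mbstring` extension.

```php
<?php
declare(strict_types=1);

/** Hypothetical sketch of a CSS normalization pipeline. */
function normalizeCssSketch(string $value): string {
    // Step 1: decode character references such as &#92; or &amp;.
    $value = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Step 2: decode CSS escapes. "\65 " (hex digits plus an optional
    // whitespace terminator) becomes "e"; "\(" becomes "(".
    $decoded = preg_replace_callback(
        '/\\\\(?:([0-9A-Fa-f]{1,6})\s?|(.))/s',
        static function (array $m): string {
            if ($m[1] !== '') {
                $cp = (int)hexdec($m[1]);
                $isSurrogate = $cp >= 0xD800 && $cp <= 0xDFFF;
                // Drop escapes that do not name a usable code point.
                return ($cp > 0 && $cp <= 0x10FFFF && !$isSurrogate)
                    ? mb_chr($cp, 'UTF-8') : '';
            }
            return $m[2];
        },
        $value
    );
    $value = $decoded ?? $value;

    // Step 3 (IE6-specific character conversion) is omitted in this sketch.

    // Step 4: replace comments with a space, unless the whole value is
    // exactly one comment.
    if (!preg_match('!^\s*/\*[^*/]*\*/\s*$!', $value)) {
        $value = preg_replace('!/\*.*?\*/!s', ' ', $value) ?? $value;
    }
    return $value;
}

echo normalizeCssSketch('\\65 xpression(1)'), "\n"; // prints: expression(1)
```

Decoding before searching matters because hostile input can hide a keyword such as `expression` behind escapes or comments; a later check only has to scan the normalized string.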

◆ sanitizeTagAttrs()

static Wikimedia\Parsoid\Core\Sanitizer::sanitizeTagAttrs ( SiteConfig $siteConfig,
?string $tagName,
?Token $token,
array $attrs )
static
Parameters
SiteConfig $siteConfig
?string $tagName
?Token $token
array $attrs
Returns
array

◆ sanitizeTitleURI()

static Wikimedia\Parsoid\Core\Sanitizer::sanitizeTitleURI ( string $title,
bool $isInterwiki = false )
static

Sanitize a title for use in a URI.

Parameters
string $title
bool $isInterwiki
Returns
string

Member Data Documentation

◆ ID_FALLBACK

const Wikimedia\Parsoid\Core\Sanitizer::ID_FALLBACK = 1

Tells escapeUrlForHtml() to encode the ID using the fallback encoding, or return false if no fallback is configured.

Since
1.30

The documentation for this class was generated from the following file: