Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Core\Sanitizer Class Reference

HTML sanitizer for MediaWiki.

Static Public Member Functions

static escapeLiteralHTMLTag (XMLTagTk $token)
 Token-based version of core \MediaWiki\Parser\Sanitizer::validateTag.
 
static isReservedDataAttribute (string $attr)
 Checks whether the given attribute name is a reserved data attribute (such as data-mw-foo). Reserved attributes are unavailable to user-generated HTML, so MediaWiki core and extension code can safely use them to communicate with frontend code.
 
static normalizeCss (string $value)
 Normalize CSS into a format we can easily search for hostile input.
 
static checkCss ($value)
 Pick apart some CSS and check it for forbidden or unsafe structures.
 

Public Attributes

const ID_PRIMARY = 0
 Tells escapeUrlForHtml() to encode the ID using the wiki's primary encoding.
 
const ID_FALLBACK = 1
 Tells escapeUrlForHtml() to encode the ID using the fallback encoding, or return false if no fallback is configured.
 

Detailed Description

HTML sanitizer for MediaWiki.

Member Function Documentation

◆ checkCss()

static Wikimedia\Parsoid\Core\Sanitizer::checkCss ( $value)

Pick apart some CSS and check it for forbidden or unsafe structures.

Returns a sanitized string in which character references and escape sequences have been decoded and comments stripped. The exception is a value that is itself one valid comment, which is passed through unchanged. If the input is too hostile to sanitize, only a comment reporting the rejection is returned.

Currently, URL references, 'expression', and 'tps' are forbidden.

NOTE: Despite the fact that character references are decoded, the returned string may contain character references given certain clever input strings. These character references must be escaped before the return value is embedded in HTML.

Warning
This method is intended to sanitize style attributes on HTML tags only. It is not safe to use on full CSS files.
Parameters
string $value
Returns
string
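
The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not Parsoid's implementation: `checkCssSketch()` and `styleAttributeSketch()` are hypothetical names, the forbidden patterns are assumptions based only on the list above, and the rejection comment is a placeholder. The second function shows the point made in the NOTE, namely that the returned value still needs HTML escaping before it is embedded in an attribute.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration of the checks described above; not Parsoid's code.
 * Assumes $value has already been normalized (references and escapes decoded).
 */
function checkCssSketch(string $value): string {
    // Forbidden structures named in the text: URL references, 'expression', 'tps'.
    // The exact patterns are assumptions chosen for this example.
    $forbidden = '/url\s*\(|expression|tps/i';
    if (preg_match($forbidden, $value) === 1) {
        // Placeholder text; the real method's comment may differ.
        return '/* rejected */';
    }
    return $value;
}

/**
 * Hypothetical helper: escape a checked style value before embedding it
 * in an HTML attribute, since the value may still contain '&' or quotes.
 */
function styleAttributeSketch(string $css): string {
    $safe = checkCssSketch($css);
    return 'style="' . htmlspecialchars($safe, ENT_QUOTES | ENT_HTML5, 'UTF-8') . '"';
}
```

In this sketch a rejected value is replaced wholesale rather than partially cleaned, which mirrors the "only a comment is returned" behaviour described above.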

◆ escapeLiteralHTMLTag()

static Wikimedia\Parsoid\Core\Sanitizer::escapeLiteralHTMLTag ( XMLTagTk $token)

Token-based version of core \MediaWiki\Parser\Sanitizer::validateTag.

Parameters
XMLTagTk $token
Returns
bool
See also
\MediaWiki\Parser\Sanitizer::validateTag

◆ isReservedDataAttribute()

static Wikimedia\Parsoid\Core\Sanitizer::isReservedDataAttribute ( string $attr)

Checks whether the given attribute name is a reserved data attribute (such as data-mw-foo). Reserved attributes are unavailable to user-generated HTML, so MediaWiki core and extension code can safely use them to communicate with frontend code.

Parameters
string $attr Attribute name.
Returns
bool
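
As a rough illustration of how such a check can be used to filter user-supplied attributes, the following sketch defines a hypothetical stand-in, `isReservedDataAttributeSketch()`. The list of reserved prefixes (`data-mw`, `data-ooui`, `data-parsoid`) is an assumption made for the example and may not match what the real method tests.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a reserved-attribute check; not Parsoid's code.
 * The prefix list below is an assumption made for this example.
 */
function isReservedDataAttributeSketch(string $attr): bool {
    // Case-insensitive match on "data-" followed by a reserved namespace.
    return preg_match('/^data-(mw|ooui|parsoid)/i', $attr) === 1;
}

// Filter user-supplied attributes, dropping any reserved names.
$userAttrs = ['data-mw-foo' => '1', 'data-color' => 'red', 'title' => 'x'];
$allowed = array_filter(
    $userAttrs,
    static fn (string $name): bool => !isReservedDataAttributeSketch($name),
    ARRAY_FILTER_USE_KEY
);
// $allowed now holds only the 'data-color' and 'title' entries.
```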

◆ normalizeCss()

static Wikimedia\Parsoid\Core\Sanitizer::normalizeCss ( string $value)

Normalize CSS into a format we can easily search for hostile input.

  • decode character references
  • decode escape sequences
  • convert characters that IE6 interprets into ASCII
  • remove comments, unless the entire value is one single comment

Parameters
string $value The CSS string.
Returns
string Normalized CSS.
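
The listed steps can be sketched in a simplified form as follows. This is an illustration under stated assumptions, not Parsoid's code: `normalizeCssSketch()` is a hypothetical name, only hexadecimal CSS escapes are handled, the IE6 character conversion is omitted, and the regular expressions are choices made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical, simplified version of the normalization steps listed above.
 * Not Parsoid's implementation; the IE6 conversion step is left out.
 */
function normalizeCssSketch(string $value): string {
    // 1. Decode HTML character references such as &#99; or &amp;.
    $value = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // 2. Decode CSS hex escapes such as "\6f " (one to six hex digits,
    //    optionally followed by a single whitespace character).
    $value = (string) preg_replace_callback(
        '/\\\\([0-9A-Fa-f]{1,6})\s?/',
        static function (array $m): string {
            // Reuse html_entity_decode() to turn the code point into UTF-8.
            $ref = '&#x' . $m[1] . ';';
            $decoded = html_entity_decode($ref, ENT_QUOTES | ENT_HTML5, 'UTF-8');
            // Code points PHP refuses to decode come back unchanged;
            // substitute U+FFFD for those.
            return $decoded === $ref ? "\u{FFFD}" : $decoded;
        },
        $value
    );

    // 3. Remove comments, unless the entire value is one single comment.
    $isSingleComment = preg_match('!^\s*/\*[^*/]*\*/\s*$!', $value) === 1;
    if (!$isSingleComment) {
        $value = (string) preg_replace('!/\*.*?\*/!s', ' ', $value);
    }

    return $value;
}
```

Decoding before searching matters because an input such as `c\6f lor` or `&#99;olor` would otherwise slip past a plain string match for `color`.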

Member Data Documentation

◆ ID_FALLBACK

const Wikimedia\Parsoid\Core\Sanitizer::ID_FALLBACK = 1

Tells escapeUrlForHtml() to encode the ID using the fallback encoding, or return false if no fallback is configured.

Since
1.30

◆ ID_PRIMARY

const Wikimedia\Parsoid\Core\Sanitizer::ID_PRIMARY = 0

Tells escapeUrlForHtml() to encode the ID using the wiki's primary encoding.

Since
1.30

The documentation for this class was generated from the following file: