Parsoid
A bidirectional parser between wikitext and HTML5
HTML to wikitext serializer.
Public Member Functions
__construct (Env $env, $options)
linkHandler (Element $node)
Main link handler.
languageVariantHandler (Node $node)
escapeWikitext (SerializerState $state, string $text, array $opts)
Escape wikitext-like strings in '$text' so that $text renders as a plain string when rendered as HTML.
domToWikitext (array $opts, DocumentFragment $node)
htmlToWikitext (array $opts, string $html)
getAttributeKey (Element $node, string $key)
getAttributeValue (Element $node, string $key)
getAttributeValueAsShadowInfo (Element $node, string $key)
serializedImageAttrVal (Element $dataMWnode, Element $htmlAttrNode, string $key)
serializedAttrVal (Element $node, string $name)
tagNeedsEscaping (string $name)
Check if token needs escaping.
wrapAngleBracket (Token $token, string $inner)
serializeHTMLTag (Element $node, bool $wrapperUnmodified)
serializeHTMLEndTag (Element $node, $wrapperUnmodified)
serializeAttributes (Element $node, Token $token, bool $isWt=false)
handleLIHackIfApplicable (Element $node)
FIXME: Get rid of this function after content version 2.2.0 has expired from caches.
serializeFromParts (SerializerState $state, Element $node, array $srcParts)
Serialize a template from its parts.
serializeExtensionStartTag (Element $node, SerializerState $state)
defaultExtensionHandler (Element $node, SerializerState $state)
emitWikitext (string $res, Node $node)
Emit non-separator wikitext that does not need to be escaped.
serializeDOM (Node $node, bool $selserMode=false)
Serialize an HTML DOM.
trace (... $args)
Public Attributes
$wteHandlers
$env
HTML to wikitext serializer.
Serializes a chunk of tokens or an HTML DOM to MediaWiki's wikitext flavor.
This serializer is designed to eventually
Not much effort has been invested so far on supporting non-Parsoid/VE-generated HTML. Some of this involves adaptively switching between wikitext and HTML representations based on the values of attributes and DOM context. A few special cases are already handled adaptively (multi-paragraph list item contents are serialized as HTML tags for example, generic A elements are serialized to HTML A tags), but in general support for this is mostly missing.
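The adaptive switching described above can be illustrated with a toy sketch. It is not Parsoid code: the function name toySerializeListItem and its rules are hypothetical, and it uses PHP's built-in DOM extension rather than Parsoid's own DOM classes. It covers only the multi-paragraph list item case: a simple item is emitted as plain wikitext, while an item holding several paragraphs keeps them as HTML tags.

```php
<?php
declare(strict_types=1);

/**
 * Toy serializer for a single <li>. Hypothetical helper, not Parsoid code.
 * An item with at most one <p> child is emitted as plain wikitext; an item
 * with several <p> children keeps them as HTML tags so that the whole item
 * still fits on one wikitext list line.
 */
function toySerializeListItem(DOMElement $li): string {
    // Collect the text of direct <p> children only.
    $paragraphs = [];
    foreach ($li->childNodes as $child) {
        if ($child instanceof DOMElement && $child->tagName === 'p') {
            $paragraphs[] = $child->textContent;
        }
    }

    if (count($paragraphs) <= 1) {
        // The wikitext representation is sufficient here.
        return '* ' . trim($li->textContent);
    }

    // Switch to an HTML representation for the item contents.
    $html = '';
    foreach ($paragraphs as $p) {
        $html .= '<p>' . htmlspecialchars($p, ENT_NOQUOTES) . '</p>';
    }
    return '* ' . $html;
}

$doc = new DOMDocument();
$doc->loadHTML('<ul><li><p>one</p><p>two</p></li></ul>');
$li = $doc->getElementsByTagName('li')->item(0);
echo toySerializeListItem($li), "\n";
```

The decision is made per node and from DOM context alone, which is the kind of check the paragraph above says is still mostly missing in the general case.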
Example issue:
What to do about this?
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::__construct | ( | Env | $env, |
$options ) |
Env | $env | |
array | $options | List of options for serialization:
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::defaultExtensionHandler | ( | Element | $node, |
SerializerState | $state ) |
Element | $node | |
SerializerState | $state |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::domToWikitext | ( | array | $opts, |
DocumentFragment | $node ) |
array | $opts | |
DocumentFragment | $node |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::emitWikitext | ( | string | $res, |
Node | $node ) |
Emit non-separator wikitext that does not need to be escaped.
string | $res | |
Node | $node |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::escapeWikitext | ( | SerializerState | $state, |
string | $text, | ||
array | $opts ) |
Escape wikitext-like strings in '$text' so that $text renders as a plain string when rendered as HTML.
The escaping is done based on the context in which $text is present (ex: start-of-line, in a link, etc.)
SerializerState | $state | |
string | $text | |
array | $opts |
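To illustrate what context-dependent escaping means, here is a minimal sketch. It does not reproduce the logic of escapeWikitext: the function toyEscapeWikitext, the 'atStartOfLine' option, and the short list of markers are all assumptions made for this example. It only shows that the same string may or may not need escaping depending on where it appears.

```php
<?php
declare(strict_types=1);

/**
 * Toy context-sensitive escaper. Hypothetical, not Parsoid's algorithm.
 * Wraps $text in <nowiki> when it would otherwise be parsed as wikitext.
 * Does not handle text that itself contains "</nowiki>".
 */
function toyEscapeWikitext(string $text, array $opts = []): string {
    // Assumed option: true when $text is emitted at the start of a line.
    $atStartOfLine = $opts['atStartOfLine'] ?? false;

    // Markers that are significant anywhere in a line:
    // links, templates, and bold/italic quotes.
    $needsEscape = false;
    foreach (['[[', '{{', "''"] as $marker) {
        if (strpos($text, $marker) !== false) {
            $needsEscape = true;
            break;
        }
    }

    // Markers that are significant only at the start of a line:
    // list items, definition lists, and table starts.
    if (!$needsEscape && $atStartOfLine) {
        $needsEscape = preg_match('/^(?:[*#:;]|\{\|)/', $text) === 1;
    }

    return $needsEscape ? "<nowiki>$text</nowiki>" : $text;
}

echo toyEscapeWikitext('* not a list', ['atStartOfLine' => true]), "\n";
echo toyEscapeWikitext('* not a list'), "\n";
```

The first call wraps the string because a leading asterisk would start a list item; the second leaves it alone because the same asterisk is harmless in the middle of a line.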
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::getAttributeKey | ( | Element | $node, |
string | $key ) |
Element | $node | |
string | $key |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::getAttributeValue | ( | Element | $node, |
string | $key ) |
Element | $node | |
string | $key | Attribute name. |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::getAttributeValueAsShadowInfo | ( | Element | $node, |
string | $key ) |
Element | $node | |
string | $key |
Returns a value in WTSUtils::getShadowInfo() format, with an extra 'fromDataMW' flag.
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::handleLIHackIfApplicable | ( | Element | $node | )
FIXME: Get rid of this function after content version 2.2.0 has expired from caches.
Element | $node |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::htmlToWikitext | ( | array | $opts, |
string | $html ) |
array | $opts | |
string | $html |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::languageVariantHandler | ( | Node | $node | ) |
Element | $node |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::linkHandler | ( | Element | $node | ) |
Main link handler.
Element | $node | Used in multiple tag handlers (<a> and <link>), and hence added as a top-level method |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializeAttributes | ( | Element | $node, |
Token | $token, | ||
bool | $isWt = false ) |
Element | $node | |
Token | $token | |
bool | $isWt |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializedAttrVal | ( | Element | $node, |
string | $name ) |
Element | $node | |
string | $name |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializedImageAttrVal | ( | Element | $dataMWnode, |
Element | $htmlAttrNode, | ||
string | $key ) |
Element | $dataMWnode | |
Element | $htmlAttrNode | |
string | $key |
Returns a value in WTSUtils::getShadowInfo() format, possibly with an extra 'fromDataMW' flag.
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializeDOM | ( | Node | $node, |
bool | $selserMode = false ) |
Serialize an HTML DOM.
WARNING: You probably want to use WikitextContentModelHandler::fromDOM instead.
Document | DocumentFragment | $node | |
bool | $selserMode |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializeExtensionStartTag | ( | Element | $node, |
SerializerState | $state ) |
Element | $node | |
SerializerState | $state |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializeFromParts | ( | SerializerState | $state, |
Element | $node, | ||
array | $srcParts ) |
Serialize a template from its parts.
SerializerState | $state | |
Element | $node | |
stdClass[] | $srcParts | Template parts from TemplateInfo::getDataMw() |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializeHTMLEndTag | ( | Element | $node, |
$wrapperUnmodified ) |
Element | $node | |
bool | $wrapperUnmodified |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::serializeHTMLTag | ( | Element | $node, |
bool | $wrapperUnmodified ) |
Element | $node | |
bool | $wrapperUnmodified |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::tagNeedsEscaping | ( | string | $name | ) |
Check if token needs escaping.
string | $name |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::trace | ( | $args | )
mixed | ...$args |
Wikimedia\Parsoid\Html2Wt\WikitextSerializer::wrapAngleBracket | ( | Token | $token, |
string | $inner ) |
Token | $token | |
string | $inner |