Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler Class Reference

HTML -> Wikitext serialization relies on walking the DOM and delegating the serialization requests to different DOM nodes.


Public Member Functions

 __construct (bool $forceSOL=false)
 
 handle (Element $node, SerializerState $state, bool $wrapperUnmodified=false)
 Serialize a DOM node to wikitext.
 
 before (Element $node, Node $otherNode, SerializerState $state)
 How many newlines should be emitted before this node?
 
 after (Element $node, Node $otherNode, SerializerState $state)
 How many newlines should be emitted after this node?
 
 firstChild (Node $node, Node $otherNode, SerializerState $state)
 How many newlines should be emitted before the first child?
 
 lastChild (Node $node, Node $otherNode, SerializerState $state)
 How many newlines should be emitted after the last child?
 
 forceSOL ()
 Put the serializer in start-of-line mode before it is handled.
 

Protected Member Functions

 getListBullets (SerializerState $state, Element $node)
 List helper: DOM-based list bullet construction.
 
 maxNLsInTable (Node $node, Node $origNode)
 Helper: Newline constraint helper for table nodes.
 
 serializeTableTag (string $symbol, ?string $endSymbol, SerializerState $state, Element $node, bool $wrapperUnmodified)
 Helper: Handles content serialization for table nodes.
 
 stxInfoValidForTableCell (SerializerState $state, Element $node)
 Helper: Checks whether syntax information in data-parsoid is valid in the presence of table edits.
 
 getLeadingSpace (SerializerState $state, Element $node, string $newEltDefault)
 Helper for several DOM handlers: Returns whitespace that needs to be emitted between the markup for the node and its content (ex: table cells, list items) based on node state (whether the node is original or new content) and other state (HTML version, whether selective serialization is enabled or not).
 
 getTrailingSpace (SerializerState $state, Element $node, string $newEltDefault)
 Helper for several DOM handlers: Returns whitespace that needs to be emitted between the markup for the node and its next sibling based on node state (whether the node is original or new content) and other state (HTML version, whether selective serialization is enabled or not).
 
 emitPlaceholderSrc (Element $node, SerializerState $state)
 Uneditable forms wrapped with mw:Placeholder tags OR unedited nowikis.
 

Detailed Description

HTML -> Wikitext serialization relies on walking the DOM and delegating the serialization requests to different DOM nodes.

This class represents the interface that various DOM handlers are expected to implement.

There is the core 'handle' method that deals with converting the content of the node into wikitext markup.

Then there are 4 newline-constraint methods that specify the constraints that need to be satisfied for the markup to be valid. For example, list items should always start on a newline, but can only have a single newline separator. Paragraphs always start on a newline and need at least 2 newlines in wikitext for them to be recognized as paragraphs.

Each of the 4 newline-constraint methods (before, after, firstChild, lastChild) returns an array with a 'min' and a 'max' property. If a property is missing, it means that the DOM node doesn't have any newline constraints. Some DOM handlers might therefore choose to implement none, some, or all of these methods.
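As an illustration of the return shape described above, here is a minimal sketch. The function names and the concrete numbers are assumptions derived from the list-item and paragraph examples in the prose, not code taken from Parsoid, and reading a missing 'max' as "no upper bound" is likewise an assumption.

```php
<?php
// Hypothetical stand-ins (not Parsoid code) showing the shape of the
// arrays a newline-constraint method might return.

/** List item: starts on a new line, with a single newline separator. */
function listItemBefore(): array {
    return [ 'min' => 1, 'max' => 1 ];
}

/** Paragraph: needs at least 2 newlines to be recognized as a paragraph. */
function paragraphBefore(): array {
    // No 'max' key: this sketch treats that as "no upper bound".
    return [ 'min' => 2 ];
}

/** A handler that imposes no newline constraints at all. */
function noConstraints(): array {
    return [];
}
```

A caller would read the properties defensively, for example `$c['min'] ?? 0`, since either key may be absent.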

The return values of each of these methods are treated as constraints, and the caller has to resolve potentially conflicting constraints between a pair of nodes (siblings, parent-child). For example, the after handler of a node may want 1 newline while the before handler of its sibling wants none.
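One way a caller could reconcile a pair of constraints is sketched below. The function `resolveNewlines`, its defaults, and its conflict policy are all hypothetical and chosen for illustration; Parsoid's actual resolution lives in its separator-handling code and may differ.

```php
<?php
/**
 * Hypothetical resolver (not Parsoid's implementation): combine the
 * constraints requested by two neighbouring nodes.
 * A missing 'min' defaults to 0; a missing 'max' is treated as unbounded.
 */
function resolveNewlines( array $a, array $b ): array {
    $min = max( $a['min'] ?? 0, $b['min'] ?? 0 );
    $max = min( $a['max'] ?? PHP_INT_MAX, $b['max'] ?? PHP_INT_MAX );
    $conflict = $min > $max;
    if ( $conflict ) {
        // One possible policy: honour the larger minimum.
        $max = $min;
    }
    return [ 'min' => $min, 'max' => $max, 'conflict' => $conflict ];
}

// The conflict from the text: 'after' wants 1 newline, the sibling's
// 'before' wants none.
$resolved = resolveNewlines(
    [ 'min' => 1, 'max' => 1 ],
    [ 'min' => 0, 'max' => 0 ]
);
// $resolved is [ 'min' => 1, 'max' => 1, 'conflict' => true ]
```

Flagging the conflict instead of silently picking a side keeps the incompatible case visible, which matters because the compatibility of handler constraints has not been verified.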

Ideally, there should not be any incompatible constraints, but we haven't actually verified that this is the case. All constraint-handling code is in the separators-handling methods.

Member Function Documentation

◆ after()

◆ before()

◆ emitPlaceholderSrc()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::emitPlaceholderSrc ( Element $node,
SerializerState $state )
protected

Uneditable forms wrapped with mw:Placeholder tags OR unedited nowikis.

N.B. We no longer emit self-closed nowikis as placeholders, so remove this once all our stored content is updated.

Parameters
Element $node
SerializerState $state

◆ firstChild()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::firstChild ( Node $node,
Node $otherNode,
SerializerState $state )

◆ forceSOL()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::forceSOL ( )

Put the serializer in start-of-line mode before it is handled.

All non-newline whitespace found between HTML nodes is stripped to ensure SOL state is guaranteed.
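The stripping step can be pictured with a small sketch. The helper name `stripNonNewlineWhitespace` is hypothetical, and the regular expression is just one way to express "whitespace other than newlines"; it is not taken from Parsoid.

```php
<?php
/**
 * Hypothetical helper (not Parsoid code): remove every whitespace
 * character except "\n" from an inter-node separator string.
 */
function stripNonNewlineWhitespace( string $sep ): string {
    // [^\S\n] matches any whitespace character that is not a newline.
    return preg_replace( '/[^\S\n]+/', '', $sep ) ?? $sep;
}

$cleaned = stripNonNewlineWhitespace( " \n\t \n " );
// $cleaned is "\n\n"
```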

Returns
bool

◆ getLeadingSpace()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::getLeadingSpace ( SerializerState $state,
Element $node,
string $newEltDefault )
protected

Helper for several DOM handlers: Returns whitespace that needs to be emitted between the markup for the node and its content (ex: table cells, list items) based on node state (whether the node is original or new content) and other state (HTML version, whether selective serialization is enabled or not).

Parameters
SerializerState $state
Element $node
string $newEltDefault
Returns
string

◆ getListBullets()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::getListBullets ( SerializerState $state,
Element $node )
protected

List helper: DOM-based list bullet construction.

Parameters
SerializerState $state
Element $node
Returns
string

◆ getTrailingSpace()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::getTrailingSpace ( SerializerState $state,
Element $node,
string $newEltDefault )
protected

Helper for several DOM handlers: Returns whitespace that needs to be emitted between the markup for the node and its next sibling based on node state (whether the node is original or new content) and other state (HTML version, whether selective serialization is enabled or not).

Parameters
SerializerState $state
Element $node
string $newEltDefault
Returns
string

◆ handle()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::handle ( Element $node,
SerializerState $state,
bool $wrapperUnmodified = false )

Serialize a DOM node to wikitext.

Serialized wikitext should be returned via $state::emitChunk().

Parameters
Element $node
SerializerState $state
bool $wrapperUnmodified
Returns
Node|null The node to continue with (need not be an element always)

Reimplemented in Wikimedia\Parsoid\Html2Wt\DOMHandlers\AHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\BodyHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\BRHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\CaptionHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\DDHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\DTHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\EncapsulatedContentHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\FallbackHTMLHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\FigureHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\HeadingHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\HRHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\HTMLPreHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\ImgHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\JustChildrenHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\LIHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\LinkHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\ListHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\MediaHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\MetaHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\PHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\PreHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\QuoteHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\SpanHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\TableHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\TDHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\THHandler, and Wikimedia\Parsoid\Html2Wt\DOMHandlers\TRHandler.

◆ lastChild()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::lastChild ( Node $node,
Node $otherNode,
SerializerState $state )

How many newlines should be emitted after the last child?

Parameters
Element|DocumentFragment $node
Node $otherNode
SerializerState $state
Returns
array

Reimplemented in Wikimedia\Parsoid\Html2Wt\DOMHandlers\BodyHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\HTMLPreHandler, Wikimedia\Parsoid\Html2Wt\DOMHandlers\PreHandler, and Wikimedia\Parsoid\Html2Wt\DOMHandlers\TableHandler.

◆ maxNLsInTable()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::maxNLsInTable ( Node $node,
Node $origNode )
protected

Helper: Newline constraint helper for table nodes.

Parameters
Node $node
Node $origNode
Returns
int

◆ serializeTableTag()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::serializeTableTag ( string $symbol,
?string $endSymbol,
SerializerState $state,
Element $node,
bool $wrapperUnmodified )
protected

Helper: Handles content serialization for table nodes.

Parameters
string $symbol
?string $endSymbol
SerializerState $state
Element $node
bool $wrapperUnmodified
Returns
string

◆ stxInfoValidForTableCell()

Wikimedia\Parsoid\Html2Wt\DOMHandlers\DOMHandler::stxInfoValidForTableCell ( SerializerState $state,
Element $node )
protected

Helper: Checks whether syntax information in data-parsoid is valid in the presence of table edits.

For example "|" is no longer valid table-cell markup if a table cell is added before this cell.

Parameters
SerializerState $state
Element $node
Returns
bool

The documentation for this class was generated from the following file: