Parsoid
A bidirectional parser between wikitext and HTML5
PRE-handling relies on the following 6-state FSM.
Public Member Functions
__construct (TokenTransformManager $manager, array $options)
resetState (array $opts)
Resets any internal state for this token handler.
onNewline (NlTk $token)
This handler is called for newline tokens only.
onEnd (EOFTk $token)
This handler is called for EOF tokens only.
onAny ($token)
This handler is called for all tokens in the token stream, except when (a) the more specific handlers above modified the token, (b) the more specific handlers (onTag, onEnd, onNewline) set the skip flag in their return values, or (c) this handler's 'active' flag is set to false (any of the handlers can set it).
Public Member Functions inherited from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler
setPipelineId (int $id)
isDisabled ()
Is this transformer disabled?
onTag (Token $token)
This handler is called for tokens that are not EOFTk or NlTk tokens.
process ($tokens)
Push an input array of tokens through the transformer and return the transformed tokens.
Static Public Member Functions
static newIndentPreWS ()
Create a token to represent the indent-pre whitespace character.
static isIndentPreWS ($tokenOrNode)
Does this token or node represent an indent-pre whitespace character?
Additional Inherited Members
Protected Attributes inherited from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler
$env
$manager
$pipelineId
$options
bool $disabled = false
This is set if the token handler is disabled for the entire pipeline.
bool $onAnyEnabled = true
This is set/reset by the token handlers at various points in the token stream based on what is encountered.
$atTopLevel = false
PRE-handling relies on the following 6-state FSM.
genPre             : return merge("<pre>$TOKS</pre>" while skipping sol-tr toks, sol-tr toks)
processCurrLine    : $TOKS += $PRE_TOKS; $PRE_TOKS = [];
purgeBuffers       : convert meta token to ' '; processCurrLine; RET = $TOKS; $TOKS = []; return RET
discardCurrLinePre : return merge(genPre, purgeBuffers)
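The helper operations above can be sketched in PHP as follows. This is a minimal illustration, not the handler's actual code: the class name `PreBufferSketch`, the `INDENT_PRE_META` placeholder, the use of plain strings as tokens, and the separate `$solTrToks` buffer are all assumptions made for the example. It also assumes that genPre empties `$TOKS` once it has wrapped them, which the pseudocode does not state.

```php
<?php
// Hypothetical stand-in for the handler's buffers; tokens are plain strings here.
final class PreBufferSketch {
    // Placeholder for the indent-pre whitespace meta token (assumed representation).
    public const INDENT_PRE_META = '<meta:indent-pre-ws>';

    /** @var string[] tokens collected for the pre block so far ($TOKS) */
    public array $toks = [];
    /** @var string[] tokens buffered on the current line ($PRE_TOKS) */
    public array $preToks = [];
    /** @var string[] sol-transparent tokens kept outside the <pre> (assumed buffer) */
    public array $solTrToks = [];

    // processCurrLine: $TOKS += $PRE_TOKS; $PRE_TOKS = [];
    public function processCurrLine(): void {
        // array_merge appends; PHP's array "+" would be a key-based union instead.
        $this->toks = array_merge($this->toks, $this->preToks);
        $this->preToks = [];
    }

    // genPre: wrap $TOKS in <pre>, then emit the sol-transparent tokens after it.
    public function genPre(): array {
        $ret = array_merge(['<pre>'], $this->toks, ['</pre>'], $this->solTrToks);
        // Assumption: the wrapped tokens are consumed so they are not emitted twice.
        $this->toks = [];
        $this->solTrToks = [];
        return $ret;
    }

    // purgeBuffers: convert meta token to ' '; processCurrLine; return and clear $TOKS.
    public function purgeBuffers(): array {
        $this->preToks = array_map(
            static fn (string $t): string => $t === self::INDENT_PRE_META ? ' ' : $t,
            $this->preToks
        );
        $this->processCurrLine();
        $ret = $this->toks;
        $this->toks = [];
        return $ret;
    }

    // discardCurrLinePre: return merge(genPre, purgeBuffers)
    public function discardCurrLinePre(): array {
        $pre = $this->genPre();
        $rest = $this->purgeBuffers();
        return array_merge($pre, $rest);
    }
}

$sketch = new PreBufferSketch();
$sketch->toks = ['a', "\n"];
$sketch->solTrToks = ['<!--c-->'];
$sketch->preToks = [PreBufferSketch::INDENT_PRE_META, 'b'];
$out = $sketch->discardCurrLinePre();
```

Note that the pseudocode's `+=` is meant as "append", which is why the sketch uses `array_merge` rather than PHP's array union operator.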
Wikimedia\Parsoid\Wt2Html\TT\PreHandler::__construct (TokenTransformManager $manager, array $options)
TokenTransformManager $manager: manager environment
array $options: various configuration options
Reimplemented from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler.
static Wikimedia\Parsoid\Wt2Html\TT\PreHandler::isIndentPreWS ($tokenOrNode)
Does this token or node represent an indent-pre whitespace character?
Token|Node|string $tokenOrNode
static Wikimedia\Parsoid\Wt2Html\TT\PreHandler::newIndentPreWS ()
Create a token to represent the indent-pre whitespace character.
This token will not make it to the final output and is only present to ensure DSR computation can account for this whitespace character. This meta tag will be removed in CleanUp::stripMarkerMetas().
Given that this token is purely an internal bookkeeping placeholder, it really does not matter how we represent it as long as (a) it doesn't impede code comprehension (b) it is more or less consistent with how other instances of this token behave (c) it doesn't introduce a lot of special-case handling and checks to deal with it.
Based on that consideration, we settle for a meta tag because meta tags are transparent to most token and DOM handlers.
Once we are done with all DOM processing, we expect indent-pre <pre> tags to have DSR that looks like [ _, _, 1, 0 ], i.e. an opening tag width of 1 char and a closing tag width of 0 chars. But, since we are now explicitly representing the ws char as a meta-tag, the <pre> tag will not get a 1-char width during DSR computation, because this meta-tag will consume that width. Accordingly, once we strip this meta-tag in the cleanup pass, we reassign its width to the opening tag width of the <pre> tag.
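The width reassignment described above can be illustrated with a small sketch. It assumes DSR is modelled as a plain four-element array [start, end, openWidth, closeWidth]; the function name `reassignIndentPreWidth` and the sample offsets are hypothetical and do not come from the library.

```php
<?php
/**
 * Hypothetical helper: once the indent-pre marker meta tag has been stripped,
 * move the width it consumed onto the opening-tag width of the <pre>.
 *
 * @param array $preDsr [start, end, openWidth, closeWidth] of the <pre>
 * @param int $markerWidth width consumed by the marker (one whitespace char)
 * @return array updated DSR for the <pre>
 */
function reassignIndentPreWidth(array $preDsr, int $markerWidth = 1): array {
    [$start, $end, $openWidth, $closeWidth] = $preDsr;
    return [$start, $end, $openWidth + $markerWidth, $closeWidth];
}

// During DSR computation the marker consumed the 1-char width,
// so the <pre> itself was left with an opening tag width of 0.
$beforeCleanup = [10, 25, 0, 0];
$afterCleanup = reassignIndentPreWidth($beforeCleanup);
```

With these sample values the last two entries of `$afterCleanup` are 1 and 0, matching the [ _, _, 1, 0 ] shape expected for indent-pre.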
Wikimedia\Parsoid\Wt2Html\TT\PreHandler::onAny ($token)
This handler is called for all tokens in the token stream, except when (a) the more specific handlers above modified the token, (b) the more specific handlers (onTag, onEnd, onNewline) set the skip flag in their return values, or (c) this handler's 'active' flag is set to false (any of the handlers can set it).
Token|string $token: Token to be processed
Reimplemented from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler.
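The three exceptions listed for onAny amount to a simple gate, sketched below. The function `shouldCallOnAny` and its boolean parameters are hypothetical names for this illustration; only the `$onAnyEnabled` flag name is taken from the inherited attributes listed on this page, and the real dispatch logic in the base class may be organised differently.

```php
<?php
/**
 * Hypothetical gate: decide whether the generic onAny handler should see a token.
 *
 * @param bool $tokenModified a more specific handler already modified the token
 * @param bool $skipOnAny a more specific handler set the skip flag in its return value
 * @param bool $onAnyEnabled the handler's flag that enables onAny
 */
function shouldCallOnAny(bool $tokenModified, bool $skipOnAny, bool $onAnyEnabled): bool {
    return !$tokenModified && !$skipOnAny && $onAnyEnabled;
}

$untouchedToken = shouldCallOnAny(false, false, true);
$skippedToken = shouldCallOnAny(false, true, true);
```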
Wikimedia\Parsoid\Wt2Html\TT\PreHandler::onEnd (EOFTk $token)
This handler is called for EOF tokens only.
EOFTk $token: EOF token to be processed
Reimplemented from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler.
Wikimedia\Parsoid\Wt2Html\TT\PreHandler::onNewline (NlTk $token)
This handler is called for newline tokens only.
NlTk $token: Newline token to be processed
Reimplemented from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler.
Wikimedia\Parsoid\Wt2Html\TT\PreHandler::resetState (array $options)
Resets any internal state for this token handler.
array $options
Reimplemented from Wikimedia\Parsoid\Wt2Html\TT\TokenHandler.