Parsoid
A bidirectional parser between wikitext and HTML5
Parsoid\Config\Env Class Reference

Environment/Envelope class for Parsoid.

Public Member Functions

 __construct (SiteConfig $siteConfig, PageConfig $pageConfig, DataAccess $dataAccess, array $options=null)
 
 getSiteConfig ()
 Get the site config.
 
 getPageConfig ()
 Get the page config.
 
 getDataAccess ()
 Get the data access object.
 
 noDataAccess ()
 
 nativeTemplateExpansionEnabled ()
 
 getUID ()
 Get the current uid counter value.
 
 getFID ()
 Get the current fragment id counter value.
 
 getWrapSections ()
 Whether <section> wrappers should be added.
 
 getPipelineFactory ()
 
 getRequestOffsetType ()
 Return the external format of character offsets in source ranges.
 
 getCurrentOffsetType ()
 Return the current format of character offsets in source ranges.
 
 setCurrentOffsetType (string $offsetType)
 Update the current offset type.
 
 resolveTitle (string $str, bool $resolveOnly=false)
 Resolve strings that are page-fragments or subpage references with respect to the current page name.
 
 normalizedTitleKey (string $str, bool $noExceptions=false, bool $ignoreFragment=false)
 Get normalized title key for a title string.
 
 normalizeAndResolvePageTitle ()
 Normalize and resolve the page title.
 
 makeTitleFromText (string $str, $defaultNs=0, bool $noExceptions=false)
 Create a Title object.
 
 makeTitleFromURLDecodedStr (string $str, $defaultNs=0, bool $noExceptions=false)
 Create a Title object.
 
 makeLink (Title $title)
 Make a link to a Title.
 
 isValidLinkTarget ( $href)
 Test if an href attribute value could be a valid link target.
 
 generateUID ()
 Generate a new uid.
 
 newObjectId ()
 Generate a new object id.
 
 newAboutId ()
 Generate a new about id.
 
 setOrigDOM (DOMElement $domBody)
 Store reference to original DOM (body).
 
 getOrigDOM ()
 Return reference to original DOM (body).
 
 setDOMDiff ( $doc)
 Store reference to DOM diff document.
 
 getDOMDiff ()
 Return reference to DOM diff document.
 
 newFragmentId ()
 Generate a new fragment id.
 
 referenceDataObject (DOMDocument $doc, ?DataBag $bag=null)
 FIXME: This function could be given a better name to reflect what it does.
 
 createDocument (string $html='')
 
 setVariable (string $variable, $state)
 BehaviorSwitchHandler support function that adds a property named by $variable and sets it to $state.
 
 setBehaviorSwitch (string $switch, $state)
 Record a behavior switch.
 
 getBehaviorSwitch (string $switch, $default=null)
 Fetch the state of a previously-recorded behavior switch.
 
 getPageMainContent ()
 FIXME: Once we remove the hardcoded slot name here, the name of this method could be updated, if necessary.
 
 getFragmentMap ()
 
 getFragment (string $id)
 
 setFragment (string $id, array $forest)
 
 recordLint (string $type, array $lintData)
 Record a lint.
 
 getLints ()
 Retrieve recorded lints.
 
 log (... $args)
 
 bumpTimeUse (string $resource, $time, $cat)
 Update a profile timer.
 
 bumpCount (string $resource, int $n=1)
 Update a profile counter.
 
 bumpWt2HtmlResourceUse (string $resource, int $count=1)
 Bump usage of some limited parser resource (ex: tokens, # transclusions, # list items, etc.)
 
 bumpHtml2WtResourceUse (string $resource, int $count=1)
 Bump usage of some limited serializer resource (ex: html size)
 
 getContentHandler (?string $forceContentModel=null)
 Get an appropriate content handler, given a contentmodel.
 
 langConverterEnabled ()
 Is the language converter enabled on this page?
 
 shouldScrubWikitext ()
 Whether to emit "clean" wikitext, compared to what we would emit if we didn't normalize HTML.
 
 getInputContentVersion ()
 The HTML content version of the input document (for html2wt and html2html conversions).
 
 setInputContentVersion (string $version)
 
 getOutputContentVersion ()
 The HTML content version of the output document (for wt2html and html2html conversions).
 
 setOutputContentVersion (string $version)
 
 resolveContentVersion ( $version)
 See if any content version Parsoid knows how to produce satisfies the supplied version, when interpreted with semver caret semantics.
 
 getHtmlVariantLanguage ()
 If non-null, the language variant used for Parsoid HTML; we convert to this in wt2html mode, or from this in html2wt mode.
 
 getWtVariantLanguage ()
 If non-null, the language variant to be used for wikitext.
 
 addOutputProperty (string $key, array $value)
 Update K=[V1,V2,...] that might need to be output as part of the generated HTML.
 
 getOutputProperties ()
 
 htmlVary ()
 Determine appropriate vary headers for the HTML form of this page.
 
 htmlContentLanguage ()
 Determine an appropriate content-language for the HTML form of this page.
 

Public Attributes

const AVAILABLE_VERSIONS = [ '2.1.0', '999.0.0' ]
 Available HTML content versions.
 
 $topFrame
 
 $traceFlags
 
 $dumpFlags
 
 $debugFlags
 
 $startTime
 
 $styleTagKeys = []
 
 $pageBundle = false
 
 $discardDataParsoid = false
 
 $pageCache = []
 
 $transclusionCache = []
 
 $mediaCache = []
 
 $extensionCache = []
 

Detailed Description

Environment/Envelope class for Parsoid.

Carries around the SiteConfig and PageConfig during an operation and provides certain other services.

Constructor & Destructor Documentation

◆ __construct()

Parsoid\Config\Env::__construct ( SiteConfig  $siteConfig,
PageConfig  $pageConfig,
DataAccess  $dataAccess,
array  $options = null 
)
Parameters
SiteConfig $siteConfig
PageConfig $pageConfig
DataAccess $dataAccess
array|null $options
  • wrapSections: (bool) Whether <section> wrappers should be added.
  • pageBundle: (bool) Sets ids on nodes and stores data-* attributes in a JSON blob.
  • scrubWikitext: (bool) Whether to emit "clean" wikitext.
  • traceFlags: (array) Flags indicating which components need to be traced
  • dumpFlags: (bool[]) Dump flags
  • debugFlags: (bool[]) Debug flags
  • noDataAccess: boolean
  • nativeTemplateExpansion: boolean
  • discardDataParsoid: boolean
  • offsetType: 'byte' (default), 'ucs2', or 'char'. See Parsoid\Wt2Html\PP\Processors\ConvertOffsets.
  • titleShouldExist: (bool) Are we expecting page content to exist?
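
As a rough illustration, the options array could be assembled as follows. The keys are the ones listed above; every value is an arbitrary example rather than a recommended setting, and the three config objects in the commented-out call are assumed to be provided by the host application.

```php
<?php
// Example options for Parsoid\Config\Env::__construct(). The keys come from
// the parameter list above; each value is only an illustrative choice.
$options = [
    'wrapSections' => true,
    'pageBundle' => false,
    'scrubWikitext' => false,
    'traceFlags' => [],
    'dumpFlags' => [],
    'debugFlags' => [],
    'noDataAccess' => false,
    'nativeTemplateExpansion' => false,
    'discardDataParsoid' => false,
    'offsetType' => 'byte', // the documented default
    'titleShouldExist' => true,
];

// Assumes $siteConfig, $pageConfig and $dataAccess already exist in the caller:
// $env = new \Parsoid\Config\Env( $siteConfig, $pageConfig, $dataAccess, $options );
```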

Member Function Documentation

◆ addOutputProperty()

Parsoid\Config\Env::addOutputProperty ( string  $key,
array  $value 
)

Update K=[V1,V2,...] that might need to be output as part of the generated HTML.

Ex: module styles, modules scripts, ...

Parameters
string $key
array $value

◆ bumpCount()

Parsoid\Config\Env::bumpCount ( string  $resource,
int  $n = 1 
)

Update a profile counter.

Parameters
string $resource
int $n: The amount to increment the counter; defaults to 1.

◆ bumpHtml2WtResourceUse()

Parsoid\Config\Env::bumpHtml2WtResourceUse ( string  $resource,
int  $count = 1 
)

Bump usage of some limited serializer resource (ex: html size)

Parameters
string $resource
int $count: How much of the resource is used? (defaults to 1)
Exceptions
ResourceLimitExceededException

◆ bumpTimeUse()

Parsoid\Config\Env::bumpTimeUse ( string  $resource,
  $time,
  $cat 
)

Update a profile timer.

Parameters
string $resource
mixed $time
mixed $cat

◆ bumpWt2HtmlResourceUse()

Parsoid\Config\Env::bumpWt2HtmlResourceUse ( string  $resource,
int  $count = 1 
)

Bump usage of some limited parser resource (ex: tokens, # transclusions, # list items, etc.)

Parameters
string $resource
int $count: How much of the resource is used?
Exceptions
ResourceLimitExceededException
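
The bump-and-throw behavior of the two resource methods can be pictured with a small stand-alone sketch. `ResourceTracker` and `ResourceLimitExceeded` are hypothetical names invented for this example, and the resource name and limit are made up; how Parsoid actually stores its limits is not described here.

```php
<?php
// Hypothetical stand-in for Parsoid's ResourceLimitExceededException.
class ResourceLimitExceeded extends RuntimeException {
}

// Hypothetical tracker: counts usage per resource and throws once a
// configured limit is exceeded. Resources without a limit are only counted.
class ResourceTracker {
    /** @var array<string,int> */
    private $limits;
    /** @var array<string,int> */
    private $used = [];

    public function __construct(array $limits) {
        $this->limits = $limits;
    }

    public function bump(string $resource, int $count = 1): void {
        $this->used[$resource] = ($this->used[$resource] ?? 0) + $count;
        if (
            isset($this->limits[$resource]) &&
            $this->used[$resource] > $this->limits[$resource]
        ) {
            throw new ResourceLimitExceeded("Limit exceeded for $resource");
        }
    }

    public function used(string $resource): int {
        return $this->used[$resource] ?? 0;
    }
}

// Usage: two bumps fit within the made-up limit, the third one throws.
$tracker = new ResourceTracker(['token' => 2]);
$tracker->bump('token');
$tracker->bump('token');
try {
    $tracker->bump('token');
    $limitHit = false;
} catch (ResourceLimitExceeded $e) {
    $limitHit = true;
}
```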

◆ createDocument()

Parsoid\Config\Env::createDocument ( string  $html = '')
Parameters
string $html
Returns
DOMDocument

◆ generateUID()

Parsoid\Config\Env::generateUID ( )

Generate a new uid.

Returns
int

◆ getBehaviorSwitch()

Parsoid\Config\Env::getBehaviorSwitch ( string  $switch,
  $default = null 
)

Fetch the state of a previously-recorded behavior switch.

Todo:
Does this belong here, or on some equivalent to MediaWiki's ParserOutput?
Parameters
string $switch: Switch name
mixed|null $default: Default value if the switch was never set
Returns
mixed State data that was previously passed to setBehaviorSwitch(), or $default
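
The record-then-fetch contract of setBehaviorSwitch() and getBehaviorSwitch() amounts to a keyed store with a fallback value. The sketch below uses a hypothetical `SwitchStore` class and made-up switch names to show that contract; it is not the Env implementation.

```php
<?php
// Hypothetical holder illustrating "return the recorded state, or $default".
class SwitchStore {
    /** @var array<string,mixed> */
    private $switches = [];

    public function set(string $switch, $state): void {
        $this->switches[$switch] = $state;
    }

    public function get(string $switch, $default = null) {
        // array_key_exists() distinguishes "never set" from "set to null".
        return array_key_exists($switch, $this->switches)
            ? $this->switches[$switch]
            : $default;
    }
}

$store = new SwitchStore();
$store->set('notoc', true);
$recorded = $store->get('notoc');            // state passed to set()
$fallback = $store->get('forcetoc', false);  // never set, so the default
```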

◆ getContentHandler()

Parsoid\Config\Env::getContentHandler ( ?string  $forceContentModel = null)

Get an appropriate content handler, given a contentmodel.

Parameters
string|null $forceContentModel: An optional content model which will override whatever the source specifies.
Returns
ContentModelHandler An appropriate content handler

◆ getCurrentOffsetType()

Parsoid\Config\Env::getCurrentOffsetType ( )

Return the current format of character offsets in source ranges.

This allows us to track whether the internal byte offsets have been converted to the external format (as returned by getRequestOffsetType) yet.

See also
Parsoid
Returns
string 'byte', 'ucs2', or 'char'

◆ getDataAccess()

Parsoid\Config\Env::getDataAccess ( )

Get the data access object.

Returns
DataAccess

◆ getDOMDiff()

Parsoid\Config\Env::getDOMDiff ( )

Return reference to DOM diff document.

Returns
DOMDocument|null

◆ getFID()

Parsoid\Config\Env::getFID ( )

Get the current fragment id counter value.

Returns
int

◆ getFragment()

Parsoid\Config\Env::getFragment ( string  $id)
Parameters
string $id: Fragment id
Returns
DOMNode[]

◆ getFragmentMap()

Parsoid\Config\Env::getFragmentMap ( )
Returns
array<string,DOMNode[]>

◆ getHtmlVariantLanguage()

Parsoid\Config\Env::getHtmlVariantLanguage ( )

If non-null, the language variant used for Parsoid HTML; we convert to this in wt2html mode, or from this in html2wt mode.

Returns
string|null

◆ getInputContentVersion()

Parsoid\Config\Env::getInputContentVersion ( )

The HTML content version of the input document (for html2wt and html2html conversions).

See also
https://www.mediawiki.org/wiki/Parsoid/API#Content_Negotiation
https://www.mediawiki.org/wiki/Specs/HTML/2.1.0#Versioning
Returns
string A semver version number

◆ getLints()

Parsoid\Config\Env::getLints ( )

Retrieve recorded lints.

Returns
array[]

◆ getOrigDOM()

Parsoid\Config\Env::getOrigDOM ( )

Return reference to original DOM (body)

Returns
DOMElement

◆ getOutputContentVersion()

Parsoid\Config\Env::getOutputContentVersion ( )

The HTML content version of the output document (for wt2html and html2html conversions).

See also
https://www.mediawiki.org/wiki/Parsoid/API#Content_Negotiation
https://www.mediawiki.org/wiki/Specs/HTML/2.1.0#Versioning
Returns
string A semver version number

◆ getOutputProperties()

Parsoid\Config\Env::getOutputProperties ( )
Returns
array

◆ getPageConfig()

Parsoid\Config\Env::getPageConfig ( )

Get the page config.

Returns
PageConfig

◆ getPageMainContent()

Parsoid\Config\Env::getPageMainContent ( )

FIXME: Once we remove the hardcoded slot name here, the name of this method could be updated, if necessary.

Shortcut method to get page source

Deprecated:
Use $this->topFrame->getSrcText()
Returns
string

◆ getRequestOffsetType()

Parsoid\Config\Env::getRequestOffsetType ( )

Return the external format of character offsets in source ranges.

Internally we always keep DomSourceRange and SourceRange information as UTF-8 byte offsets for efficiency (this matches the native string representation), but for external use we can convert these to other formats when we emit wt2html output or read html2wt input.

See also
Parsoid
Returns
string 'byte', 'ucs2', or 'char'
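
To make the difference between the three offset types concrete, here is a minimal sketch that converts a UTF-8 byte offset into the other two formats. It reads 'ucs2' as UTF-16 code units and 'char' as Unicode code points, which is an assumption about the terminology, and `convertByteOffset` is a hypothetical helper rather than part of Parsoid.

```php
<?php
// Hypothetical helper: convert a byte offset into $utf8 to another offset type.
// Assumes $utf8 is valid UTF-8 and $byteOffset falls on a character boundary.
function convertByteOffset(string $utf8, int $byteOffset, string $to): int {
    if ($byteOffset < 0 || $byteOffset > strlen($utf8)) {
        throw new InvalidArgumentException('Offset out of range');
    }
    if ($to === 'byte') {
        return $byteOffset;
    }
    if ($to !== 'ucs2' && $to !== 'char') {
        throw new InvalidArgumentException("Unknown offset type: $to");
    }
    $units = 0;
    for ($i = 0; $i < $byteOffset; $i++) {
        $b = ord($utf8[$i]);
        if (($b & 0xC0) === 0x80) {
            continue; // continuation byte, already counted with its lead byte
        }
        // A 4-byte sequence (lead byte >= 0xF0) needs a surrogate pair in UTF-16.
        $units += ($to === 'ucs2' && $b >= 0xF0) ? 2 : 1;
    }
    return $units;
}

// "é" is 2 bytes, the emoji is 4 bytes, so "x" starts at byte offset 6.
$text = "\u{e9}\u{1F600}x";
$asChar = convertByteOffset($text, 6, 'char');
$asUcs2 = convertByteOffset($text, 6, 'ucs2');
```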

◆ getSiteConfig()

Parsoid\Config\Env::getSiteConfig ( )

Get the site config.

Returns
SiteConfig

◆ getUID()

Parsoid\Config\Env::getUID ( )

Get the current uid counter value.

Returns
int

◆ getWrapSections()

Parsoid\Config\Env::getWrapSections ( )

Whether <section> wrappers should be added.

Todo:
Does this actually belong here? Should it be a behavior switch?
Returns
bool

◆ getWtVariantLanguage()

Parsoid\Config\Env::getWtVariantLanguage ( )

If non-null, the language variant to be used for wikitext.

If null, heuristics will be used to identify the original wikitext variant in wt2html mode, and in html2wt mode new or edited HTML will be left unconverted.

Returns
string|null

◆ htmlContentLanguage()

Parsoid\Config\Env::htmlContentLanguage ( )

Determine an appropriate content-language for the HTML form of this page.

Returns
string

◆ htmlVary()

Parsoid\Config\Env::htmlVary ( )

Determine appropriate vary headers for the HTML form of this page.

Returns
string

◆ isValidLinkTarget()

Parsoid\Config\Env::isValidLinkTarget (   $href)

Test if an href attribute value could be a valid link target.

Parameters
string|(Token|string)[] $href
Returns
bool

◆ langConverterEnabled()

Parsoid\Config\Env::langConverterEnabled ( )

Is the language converter enabled on this page?

Returns
bool

◆ log()

Parsoid\Config\Env::log (   $args)
Parameters
mixed ...$args

◆ makeLink()

Parsoid\Config\Env::makeLink ( Title  $title)

Make a link to a Title.

Parameters
Title $title
Returns
string

◆ makeTitleFromText()

Parsoid\Config\Env::makeTitleFromText ( string  $str,
  $defaultNs = 0,
bool  $noExceptions = false 
)

Create a Title object.

See also
Title::newFromURL in MediaWiki
Parameters
string $str: URL-encoded text
int|TitleNamespace $defaultNs
bool $noExceptions
Returns
Title|null

◆ makeTitleFromURLDecodedStr()

Parsoid\Config\Env::makeTitleFromURLDecodedStr ( string  $str,
  $defaultNs = 0,
bool  $noExceptions = false 
)

Create a Title object.

See also
Title::newFromText in MediaWiki
Parameters
string $str: URL-decoded text
int|TitleNamespace $defaultNs
bool $noExceptions
Returns
Title|null

◆ newAboutId()

Parsoid\Config\Env::newAboutId ( )

Generate a new about id.

Returns
string

◆ newFragmentId()

Parsoid\Config\Env::newFragmentId ( )

Generate a new fragment id.

Returns
string

◆ newObjectId()

Parsoid\Config\Env::newObjectId ( )

Generate a new object id.

Returns
string

◆ normalizeAndResolvePageTitle()

Parsoid\Config\Env::normalizeAndResolvePageTitle ( )

Normalize and resolve the page title.

Deprecated:
Just use $this->getPageConfig()->getTitle() directly
Returns
string

◆ normalizedTitleKey()

Parsoid\Config\Env::normalizedTitleKey ( string  $str,
bool  $noExceptions = false,
bool  $ignoreFragment = false 
)

Get normalized title key for a title string.

Parameters
string $str: Should be in url-decoded format.
bool $noExceptions: Return null instead of throwing exceptions.
bool $ignoreFragment: Ignore the fragment, if any.
Returns
string|null Normalized title key for a title string (or null for invalid titles).

◆ recordLint()

Parsoid\Config\Env::recordLint ( string  $type,
array  $lintData 
)

Record a lint.

Parameters
string $type: Lint type key
array $lintData: Data for the lint.

◆ referenceDataObject()

Parsoid\Config\Env::referenceDataObject ( DOMDocument  $doc,
?DataBag  $bag = null 
)

FIXME: This function could be given a better name to reflect what it does.

Parameters
DOMDocument $doc
DataBag|null $bag

◆ resolveContentVersion()

Parsoid\Config\Env::resolveContentVersion (   $version)

See if any content version Parsoid knows how to produce satisfies the supplied version, when interpreted with semver caret semantics.

This will allow us to make backwards compatible changes, without the need for clients to bump the version in their headers all the time.

Parameters
string $version
Returns
string|null
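
A minimal sketch of caret matching against a list of available versions, assuming plain "X.Y.Z" version strings. The helper names are hypothetical and the check is hand-rolled for illustration; it is not Parsoid's implementation, which may apply further rules. The version list is the AVAILABLE_VERSIONS value shown on this page.

```php
<?php
// Sketch only: a hand-rolled caret check, not Parsoid's actual implementation.

/** Parse "X.Y.Z" into [X, Y, Z], or return null if the string is malformed. */
function parseVersion(string $v): ?array {
    if (!preg_match('/^(\d+)\.(\d+)\.(\d+)$/', $v, $m)) {
        return null;
    }
    return [(int)$m[1], (int)$m[2], (int)$m[3]];
}

/** True if $candidate falls in the range described by "^$requested". */
function caretSatisfies(string $candidate, string $requested): bool {
    $c = parseVersion($candidate);
    $r = parseVersion($requested);
    if ($c === null || $r === null) {
        return false;
    }
    // Lower bound: candidate must be >= requested, compared left to right.
    for ($i = 0; $i < 3; $i++) {
        if ($c[$i] < $r[$i]) {
            return false;
        }
        if ($c[$i] > $r[$i]) {
            break;
        }
    }
    // Upper bound: the leftmost non-zero component of the request is pinned.
    if ($r[0] > 0) {
        return $c[0] === $r[0];
    }
    if ($r[1] > 0) {
        return $c[0] === 0 && $c[1] === $r[1];
    }
    return $c === $r;
}

/** Return the first available version satisfying "^$requested", or null. */
function resolveVersion(array $available, string $requested): ?string {
    foreach ($available as $version) {
        if (caretSatisfies($version, $requested)) {
            return $version;
        }
    }
    return null;
}

$available = ['2.1.0', '999.0.0'];
$older = resolveVersion($available, '2.0.0'); // 2.1.0 is within ^2.0.0
$newer = resolveVersion($available, '2.2.0'); // nothing in ^2.2.0 is available
```

Under this reading, a client that asks for 2.0.0 is served 2.1.0 without changing its request, which is the backwards-compatibility benefit described above.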

◆ resolveTitle()

Parsoid\Config\Env::resolveTitle ( string  $str,
bool  $resolveOnly = false 
)

Resolve strings that are page-fragments or subpage references with respect to the current page name.

TODO: Handle namespace-relative links like [[User:../../]] correctly; they shouldn't be treated as links at all.

Parameters
string $str: Page fragment or subpage reference. Not URL encoded.
bool $resolveOnly: If true, only trim and add the current title to lone fragments. TODO: This parameter seems poorly named.
Returns
string Resolved title
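
The kind of resolution described here can be sketched as a stand-alone function. This is a simplified illustration under several assumptions: `resolveRelativeTitle` is a hypothetical name, the current page name is passed in explicitly, subpages are assumed to be enabled for the page's namespace, and no title normalization is performed.

```php
<?php
// Simplified sketch; $pageTitle stands in for the current page name.
function resolveRelativeTitle(string $str, string $pageTitle): string {
    $orig = $str;
    $str = trim($str);
    if ($str === '') {
        return $str;
    }
    // Lone fragment: attach it to the current page.
    if ($str[0] === '#') {
        return $pageTitle . $str;
    }
    // One or more leading "../" segments climb up the subpage hierarchy.
    if (preg_match('!^(?:\.\./)+!', $str, $m)) {
        $levels = intdiv(strlen($m[0]), 3);
        $bits = explode('/', $pageTitle);
        if (count($bits) <= $levels) {
            return $orig; // climbs past the root page: leave the input alone
        }
        $bits = array_slice($bits, 0, -$levels);
        $rest = (string)substr($str, $levels * 3);
        if ($rest !== '') {
            $bits[] = $rest;
        }
        return rtrim(implode('/', $bits), '/');
    }
    // A leading "/" names a subpage of the current page.
    if ($str[0] === '/') {
        return rtrim($pageTitle . $str, '/');
    }
    return $str;
}

$fromFragment = resolveRelativeTitle('#Intro', 'Help:A/B');
$fromSubpage = resolveRelativeTitle('/Sub', 'Help:A/B');
$fromParent = resolveRelativeTitle('../C', 'Help:A/B');
```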

◆ setBehaviorSwitch()

Parsoid\Config\Env::setBehaviorSwitch ( string  $switch,
  $state 
)

Record a behavior switch.

Todo:
Does this belong here, or on some equivalent to MediaWiki's ParserOutput?
Parameters
string $switch: Switch name
mixed $state: Relevant state data to record

◆ setCurrentOffsetType()

Parsoid\Config\Env::setCurrentOffsetType ( string  $offsetType)

Update the current offset type.

Only Parsoid should be doing this.

Parameters
string $offsetType: 'byte', 'ucs2', or 'char'

◆ setDOMDiff()

Parsoid\Config\Env::setDOMDiff (   $doc)

Store reference to DOM diff document.

Parameters
DOMDocument $doc

◆ setFragment()

Parsoid\Config\Env::setFragment ( string  $id,
array  $forest 
)
Parameters
string $id: Fragment id
DOMNode[] $forest: DOM forest (contiguous array of DOM trees) to store against the fragment id

◆ setOrigDOM()

Parsoid\Config\Env::setOrigDOM ( DOMElement  $domBody)

Store reference to original DOM (body)

Parameters
DOMElement $domBody

◆ setVariable()

Parsoid\Config\Env::setVariable ( string  $variable,
  $state 
)

BehaviorSwitchHandler support function that adds a property named by $variable and sets it to $state.

Deprecated:
Use setBehaviorSwitch() instead.
Parameters
string $variable
mixed $state

◆ shouldScrubWikitext()

Parsoid\Config\Env::shouldScrubWikitext ( )

Whether to emit "clean" wikitext, compared to what we would emit if we didn't normalize HTML.

Returns
bool

Member Data Documentation

◆ AVAILABLE_VERSIONS

const Parsoid\Config\Env::AVAILABLE_VERSIONS = [ '2.1.0', '999.0.0' ]

The documentation for this class was generated from the following file: