Parsoid
A bidirectional parser between wikitext and HTML5
SiteConfig via MediaWiki's Action API.
Public Member Functions
__construct (ApiHelper $api, array $opts)
hasVideoInfo ()
galleryOptions ()
    Default gallery options for this wiki.
allowedExternalImagePrefixes ()
    Allowed external image URL prefixes.
baseURI ()
    Site base URI.
relativeLinkPrefix ()
    Prefix for relative links.
canonicalNamespaceId (string $name)
    Map a canonical namespace name to its index.
namespaceId (string $name)
    Map a namespace name to its index.
namespaceName (int $ns)
    Map a namespace index to its preferred name.
namespaceHasSubpages (int $ns)
    Test if a namespace has subpages.
namespaceCase (int $ns)
    Return namespace case setting.
specialPageLocalName (string $alias)
    Get the default local name for a special page.
interwikiMagic ()
    Treat language links as magic connectors, not inline links.
interwikiMap ()
    Interwiki link data.
iwp ()
    Wiki identifier, for cache keys.
legalTitleChars ()
    Legal title characters.
linkPrefixRegex ()
    Link prefix regular expression.
lang ()
    Wiki language code.
mainpage ()
    Main page title.
responsiveReferences ()
    Responsive references configuration.
rtl ()
    Whether the wiki language is right-to-left.
langConverterEnabled (string $lang)
    Whether language converter is enabled for the specified language.
script ()
    The URL path to index.php.
scriptpath ()
    FIXME: This is only used to compute the modules path below and maybe shouldn't be exposed.
server ()
    The base URL of the server.
exportMetadataToHead (Document $document, ContentMetadataCollector $metadata, string $defaultTitle, string $lang)
    Export content metadata via meta tags (and via a stylesheet for now to aid some clients).
redirectRegexp ()
    A regexp matching the localized 'REDIRECT' marker for this wiki.
categoryRegexp ()
    A regexp matching the localized 'Category' prefix for this wiki.
bswRegexp ()
    A regexp matching localized behavior switches for this wiki.
timezoneOffset ()
    The wiki's time zone offset.
variants ()
    Language variant information.
widthOption ()
    Default thumbnail width.
getMagicWordMatcher (string $id)
    Get a regexp matching a localized magic word, given its id. FIXME: misleading function name
getParameterizedAliasMatcher (array $words)
    Get a matcher function for fetching values out of interpolated magic words, i.e. those with $1 in their aliases. The matcher takes a string and returns null if it doesn't match any of the words, or an associative array if it did match:
ensureExtensionTag (string $tag)
    This function is public so it can be used to synchronize env for hybrid parserTests.
getMaxTemplateDepth ()
    Get the maximum template depth.
metrics ()
    Statistics aggregator, for counting and timing.
getNoFollowConfig ()
getExternalLinkTarget ()
Public Member Functions inherited from Wikimedia\Parsoid\Config\SiteConfig
registerExtensionModule ($configOrSpec)
    Register a Parsoid extension module.
unregisterExtensionModule (int $extId)
    Unregister a Parsoid extension module.
getExtensionModules ()
    Return the set of Parsoid extension modules associated with this SiteConfig.
__construct ()
    Base constructor.
fakeTimestamp ()
    Fake timestamp, for unit tests.
getObjectFactory ()
    Return an object factory to use when instantiating extensions.
tagNeedsNowikiStrippedInTagPF (string $lowerTagName)
getContentModelHandler (string $contentmodel)
    Return a ContentModelHandler for the specified $contentmodel, if one is registered.
isExtensionTag (string $name)
    Determine whether a given name, which must have already been converted to lower case, is a valid extension tag name.
isAnnotationTag (string $tagName)
getAnnotationTags ()
    Get an array of defined annotation tags in lower case.
getExtensionTagNameMap ()
    Get an array of defined extension tags, with the lower case name in the key, and the value being arbitrary.
getExtTagConfig (string $tagName)
getExtTagImpl (string $tagName)
getExtDOMProcessors ()
    Return an array mapping extension name to an array of object factory specs for Ext\DOMProcessor objects.
getWt2HtmlLimits ()
getHtml2WtLimits ()
getLogger ()
    General log channel.
setLogger (?LoggerInterface $logger)
    Set the log channel, for debugging.
nativeGalleryEnabled ()
    "Native gallery" serialization.
addHTMLTemplateParameters ()
    When processing template parameters, parse them to HTML and add it to the template parameters data.
linting ()
    Whether to enable the linter backend.
tidyWhitespaceBugMaxLength ()
    Maximum run length for Tidy whitespace bug.
scrubBidiChars ()
    If enabled, bidi chars adjacent to category links will be stripped in the html -> wt serialization pass.
bswPagePropRegexp ()
    Regex matching all double-underscore magic words.
namespaceIsTalk (int $ns)
    Test if a namespace is a talk namespace.
ucfirst (string $str)
    Uppercasing method for titles.
interwikiMapNoNamespaces ()
    Interwiki link data, after removing items that conflict with namespace names.
interwikiMatcher (string $href)
    Match interwiki URLs.
linkTrailRegex ()
    Link trail regular expression.
langConverterEnabledForLanguage (string $lang)
    Is the language converter enabled for this language?
solTransparentWikitextRegexp ()
    A regex matching a line containing just whitespace, comments, and sol transparent links and behavior switches.
solTransparentWikitextNoWsRegexp (bool $addIncludes=false)
    A regex matching a line containing just comments and sol transparent links and behavior switches.
magicWords ()
    List all magic words by alias.
mwAliases ()
    List all magic words by canonical name.
getMagicWordForFunctionHook (string $str)
    Return canonical magic word for a function hook.
getMagicWordForVariable (string $str)
    Return canonical magic word for a variable.
magicWordCanonicalName (string $word)
    Get canonical magic word name for the input word.
isMagicWord (string $word)
    Check if a string is a recognized magic word.
getMagicWordWT (string $word, string $suggest)
    Convert the internal canonical magic word name to the wikitext alias.
getMediaPrefixParameterizedAliasMatcher ()
    Get a matcher function for fetching values out of interpolated magic words which are media prefix options.
getExtResourceURLPatternMatcher ()
    Matcher for ISBN/RFC/PMID URL patterns, returning the type and number.
makeExtResourceURL (array $match, string $href, string $content)
    Serialize ISBN/RFC/PMID URL patterns.
getProtocolsRegex ($excludeProtRel=false)
    Get a regex fragment matching URL protocols, quoted for an exclamation mark delimiter.
hasValidProtocol (string $potentialLink)
    Matcher for valid protocols, must be anchored at start of string.
findValidProtocol (string $potentialLink)
    Matcher for valid protocols, may occur at any point within string.
Static Public Member Functions
static fromSettings (array $parsoidSettings)

Static Public Member Functions inherited from Wikimedia\Parsoid\Config\SiteConfig
static createLogger (?string $filePath=null)

Public Attributes
const SITE_CONFIG_QUERY_PARAMS
Protected Member Functions
reset ()
addNamespace (array $ns)
    Add a new namespace to the config.
linkTrail ()
    Return raw link trail regexp from config.
getVariableIDs ()
haveComputedFunctionSynonyms ()
    Does the SiteConfig provide precomputed function synonyms? If not, the SiteConfig is expected to provide an implementation for updateFunctionSynonym.
updateFunctionSynonym (string $func, string $magicword, bool $caseSensitive)
getMagicWords ()
getNonNativeExtensionTags ()
    Get an array of defined extension tags, with the lower case name in the key, the value arbitrary. This is the set of extension tags that are configured in M/W core. $coreExtModules may already be part of it, but eventually this distinction will disappear since all extension tags have to be defined against Parsoid's extension API.
getSpecialNSAliases ()
    Return namespace aliases for the NS_SPECIAL namespace.
getSpecialPageAliases (string $specialPage)
    Return Special Page aliases for a special page name.
getProtocols ()
    Get the list of valid protocols.

Protected Member Functions inherited from Wikimedia\Parsoid\Config\SiteConfig
processExtensionModule (ExtensionModule $ext)
    Register a Parsoid-compatible extension.
getExtConfig ()
exportMetadataHelper (Document $document, string $modulesLoadURI, array $modules, array $moduleStyles, array $jsConfigVars, string $htmlTitle, string $lang)
    Helper function to create <head> elements from metadata.
getFunctionSynonyms ()
    Get a list of precomputed function synonyms.
Protected Attributes
$nsNames = []
    @phan-var array<int,string>
$nsCase = []
    @phan-var array<int,string>
$nsIds = []
    @phan-var array<string,int>
$nsCanon = []
    @phan-var array<string,int>
$nsWithSubpages = []
    @phan-var array<int,bool>

Protected Attributes inherited from Wikimedia\Parsoid\Config\SiteConfig
$magicWordMap
$functionSynonyms
$interwikiMapNoNamespaces
$linkTrailRegex = false
$logger = null
$iwMatcherBatchSize = 4096
$addHTMLTemplateParameters = false
$scrubBidiChars = false
$linterEnabled = false
$extConfig = null
$wt2htmlLimits
    @phan-var array<string,int>
$html2wtLimits
    @phan-var array<string,int>

Additional Inherited Members
Static Protected Member Functions inherited from Wikimedia\Parsoid\Config\SiteConfig
static quoteTitleRe (string $s, string $delimiter='/')
    Quote a title regex.

SiteConfig via MediaWiki's Action API.
Note this is intended for testing, not performance.

Wikimedia\Parsoid\Config\Api\SiteConfig::__construct (ApiHelper $api, array $opts)
Parameters:
    ApiHelper $api
    array $opts
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::addNamespace (array $ns) [protected]
Add a new namespace to the config.
Protected access lets mock and parser-test versions add new namespaces as required.
Parameters:
    array $ns: Namespace info

Wikimedia\Parsoid\Config\Api\SiteConfig::allowedExternalImagePrefixes ()
Allowed external image URL prefixes.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::baseURI ()
Site base URI.
This would be the URI found in <base href="..." />.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::bswRegexp ()
A regexp matching localized behavior switches for this wiki.
The regexp should be delimited, but should not have boundary anchors or capture groups.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
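
One plausible reason for returning the regexp delimited but without anchors or capture groups is that a caller can then strip the delimiters and embed the pattern body in a larger expression with its own anchors and groups. The sketch below is not Parsoid code: the sample pattern, the `@` delimiter, and the `stripDelimiters()` helper are assumptions made for illustration only.

```php
<?php
// Hypothetical stand-in for a value returned by bswRegexp(); the real
// pattern and delimiter depend on the implementation and the wiki.
$delimited = '@(?i:NOTOC|FORCETOC)@';

// Hypothetical helper: drop the first and last character (the delimiters).
// Assumes a single-character delimiter and no trailing pattern modifiers.
function stripDelimiters( string $re ): string {
	return substr( $re, 1, -1 );
}

$body = stripDelimiters( $delimited );

// The caller supplies its own anchors and capture group, so the group
// numbering stays under the caller's control.
$full = '@^__(' . $body . ')__$@';

$matched = preg_match( $full, '__notoc__', $m );
$switch = $matched === 1 ? $m[1] : null;
```

Because the embedded body contributes no capture groups of its own, `$m[1]` always refers to the group added by the caller.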

Wikimedia\Parsoid\Config\Api\SiteConfig::canonicalNamespaceId (string $name)
Map a canonical namespace name to its index.
Parameters:
    string $name: all-lowercase and with underscores rather than spaces.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
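
Since `$name` is expected to be all-lowercase with underscores rather than spaces, a caller holding a display-form name would need to normalize it first. A minimal sketch of that normalization follows; the `normalizeNamespaceName()` helper and the lookup table are hypothetical and stand in for whatever the real configuration provides.

```php
<?php
// Hypothetical lookup table; a real one would be built from the wiki's
// namespace data. 2 and 3 are MediaWiki's standard User / User talk indexes.
$canonicalIds = [ 'user' => 2, 'user_talk' => 3 ];

// Hypothetical helper: replace spaces with underscores, then lowercase.
// strtolower() is only relied on for ASCII names in this sketch.
function normalizeNamespaceName( string $name ): string {
	return strtolower( str_replace( ' ', '_', $name ) );
}

$key = normalizeNamespaceName( 'User talk' );
// Unknown names yield null rather than an index.
$id = $canonicalIds[$key] ?? null;
```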

Wikimedia\Parsoid\Config\Api\SiteConfig::categoryRegexp ()
A regexp matching the localized 'Category' prefix for this wiki.
The regexp should be delimited, but should not have boundary anchors or capture groups.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::ensureExtensionTag (string $tag)
This function is public so it can be used to synchronize env for hybrid parserTests.
The parserTests setup includes the definition of a number of non-standard extension tags, whose names are passed over from the JS side in hybrid testing.
Parameters:
    string $tag: Name of an extension tag assumed to be present

Wikimedia\Parsoid\Config\Api\SiteConfig::exportMetadataToHead (Document $document, ContentMetadataCollector $metadata, string $defaultTitle, string $lang)
Export content metadata via meta tags (and via a stylesheet for now to aid some clients).
Parameters:
    Document $document
    ContentMetadataCollector $metadata
    string $defaultTitle: The default title to display, as an unescaped string
    string $lang
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::fromSettings (array $parsoidSettings) [static]
Parameters:
    array $parsoidSettings

Wikimedia\Parsoid\Config\Api\SiteConfig::galleryOptions ()
Default gallery options for this wiki.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getExternalLinkTarget ()
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getMagicWordMatcher (string $id)
Get a regexp matching a localized magic word, given its id. FIXME: misleading function name
Parameters:
    string $id
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getMagicWords () [protected]
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getMaxTemplateDepth ()
Get the maximum template depth.

Wikimedia\Parsoid\Config\Api\SiteConfig::getNoFollowConfig ()
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getNonNativeExtensionTags () [protected]
Get an array of defined extension tags, with the lower case name in the key, the value arbitrary. This is the set of extension tags that are configured in M/W core. $coreExtModules may already be part of it, but eventually this distinction will disappear since all extension tags have to be defined against Parsoid's extension API.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getParameterizedAliasMatcher (array $words)
Get a matcher function for fetching values out of interpolated magic words, i.e. those with $1 in their aliases. The matcher takes a string and returns null if it doesn't match any of the words, or an associative array if it did match:
Parameters:
    string[] $words: Magic words to match
$name is the canonical magic word name; $re has patterns for matching aliases.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
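
The matcher contract described above (a callable that returns null on no match, or an associative array on a match) can be sketched as follows. This is an illustration only, not the Parsoid implementation: the `makeAliasMatcher()` helper, the sample aliases, and the result keys `k` and `v` are assumptions chosen for the example.

```php
<?php
// Hypothetical helper: $aliases maps an alias containing "$1" to the
// canonical magic word name it belongs to.
function makeAliasMatcher( array $aliases ): callable {
	$patterns = [];
	foreach ( $aliases as $alias => $name ) {
		// Quote the alias for use in a regex, then turn the quoted "$1"
		// placeholder into a capture group for the interpolated value.
		$quoted = preg_quote( $alias, '@' );
		$patterns[$name] = '@^' . str_replace( '\$1', '(.*)', $quoted ) . '$@';
	}
	return static function ( string $text ) use ( $patterns ): ?array {
		foreach ( $patterns as $name => $re ) {
			if ( preg_match( $re, $text, $m ) ) {
				// Key names 'k' and 'v' are assumptions for this sketch.
				return [ 'k' => $name, 'v' => $m[1] ];
			}
		}
		return null;
	};
}

// Sample aliases are illustrative, not taken from a real wiki.
$matcher = makeAliasMatcher( [
	'$1px' => 'img_width',
	'upright=$1' => 'img_upright',
] );

$width = $matcher( '220px' );
$upright = $matcher( 'upright=1.5' );
$none = $matcher( 'left' );
```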

Wikimedia\Parsoid\Config\Api\SiteConfig::getProtocols () [protected]
Get the list of valid protocols.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getSpecialNSAliases () [protected]
Return namespace aliases for the NS_SPECIAL namespace.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getSpecialPageAliases (string $specialPage) [protected]
Return Special Page aliases for a special page name.
Parameters:
    string $specialPage
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::getVariableIDs () [protected]
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::haveComputedFunctionSynonyms () [protected]
Does the SiteConfig provide precomputed function synonyms? If not, the SiteConfig is expected to provide an implementation for updateFunctionSynonym.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::interwikiMagic ()
Treat language links as magic connectors, not inline links.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::interwikiMap ()
Interwiki link data.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::iwp ()
Wiki identifier, for cache keys.
Should match a key in mwApiMap()?
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::lang ()
Wiki language code.

Wikimedia\Parsoid\Config\Api\SiteConfig::langConverterEnabled (string $lang)
Whether language converter is enabled for the specified language.
Parameters:
    string $lang: Language code
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::legalTitleChars ()
Legal title characters.
Regex is intended to match bytes, not Unicode characters.
Returns a regex character class (i.e. the contents of "[]").
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
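
To illustrate what matching bytes rather than Unicode characters means in practice, the sketch below wraps a character class in brackets and applies it without the `u` modifier. The class contents here are a simplified, hypothetical stand-in; a real wiki's value will differ.

```php
<?php
// Hypothetical, simplified character class contents. The \x80-\xFF range
// admits every byte of a multi-byte UTF-8 sequence.
$legalChars = 'A-Za-z0-9 _\x80-\xFF';

// No "u" modifier: the pattern is applied to bytes, not code points.
$titleRe = '/^[' . $legalChars . ']+$/';

// "Café" written out as UTF-8 bytes; both bytes of "é" fall in \x80-\xFF.
$ok = preg_match( $titleRe, "Caf\xC3\xA9" ) === 1;

// The pipe character is not in the class, so this title is rejected.
$bad = preg_match( $titleRe, 'Foo|Bar' ) === 1;
```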

Wikimedia\Parsoid\Config\Api\SiteConfig::linkPrefixRegex ()
Link prefix regular expression.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::linkTrail () [protected]
Return raw link trail regexp from config.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::mainpage ()
Main page title.

Wikimedia\Parsoid\Config\Api\SiteConfig::metrics ()
Statistics aggregator, for counting and timing.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::namespaceCase (int $ns)
Return namespace case setting.
Parameters:
    int $ns
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::namespaceHasSubpages (int $ns)
Test if a namespace has subpages.
Parameters:
    int $ns
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::namespaceId (string $name)
Map a namespace name to its index.
Parameters:
    string $name
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::namespaceName (int $ns)
Map a namespace index to its preferred name.
Parameters:
    int $ns
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::redirectRegexp ()
A regexp matching the localized 'REDIRECT' marker for this wiki.
The regexp should be delimited, but should not have boundary anchors or capture groups.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::relativeLinkPrefix ()
Prefix for relative links.
Prefix to prepend to a page title to link to that page. Intended to be relative to the URI returned by baseURI().
If possible, keep the default "./" so clients need not know this value to extract titles from link hrefs.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
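
The remark about clients extracting titles from link hrefs can be sketched as below. The `titleFromHref()` helper is hypothetical and not part of Parsoid; it assumes hrefs are percent-encoded, use underscores for spaces, and may carry a query string or fragment.

```php
<?php
// Hypothetical helper, not part of Parsoid: recover a page title from a
// link href that uses the given relative link prefix.
function titleFromHref( string $href, string $prefix = './' ): ?string {
	if ( strncmp( $href, $prefix, strlen( $prefix ) ) !== 0 ) {
		return null; // not a link relative to the base URI
	}
	$rest = substr( $href, strlen( $prefix ) );
	// Drop any query string or fragment.
	$rest = preg_replace( '/[?#].*$/s', '', $rest );
	// Undo percent-encoding and show underscores as spaces.
	return str_replace( '_', ' ', rawurldecode( $rest ) );
}

$title = titleFromHref( './Main_Page#History' );
$external = titleFromHref( 'https://example.org/wiki/Main_Page' );
```

With the default "./" prefix a client can hard-code the prefix as done here; a wiki configured with a different prefix would require passing it in explicitly.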

Wikimedia\Parsoid\Config\Api\SiteConfig::responsiveReferences ()
Responsive references configuration.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::rtl ()
Whether the wiki language is right-to-left.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::script ()
The URL path to index.php.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::scriptpath ()
FIXME: This is only used to compute the modules path below and maybe shouldn't be exposed.
The base wiki path.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::server ()
The base URL of the server.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::specialPageLocalName (string $alias)
Get the default local name for a special page.
Parameters:
    string $alias: Special page alias
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::timezoneOffset ()
The wiki's time zone offset.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::updateFunctionSynonym (string $func, string $magicword, bool $caseSensitive) [protected]
Parameters:
    string $func
    string $magicword
    bool $caseSensitive
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::variants ()
Language variant information.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.

Wikimedia\Parsoid\Config\Api\SiteConfig::widthOption ()
Default thumbnail width.
Reimplemented from Wikimedia\Parsoid\Config\SiteConfig.
Reimplemented in Wikimedia\Parsoid\ParserTests\SiteConfig.

const Wikimedia\Parsoid\Config\Api\SiteConfig::SITE_CONFIG_QUERY_PARAMS