MediaWiki  master
WikiTextStructure Class Reference

Class for exploring the structure of parsed wikitext.

Public Member Functions

 __construct (ParserOutput $parserOutput)
 
 getAuxiliaryText ()
 
 getDefaultSort ()
 Get the defaultsort property.
 
 getMainText ()
 
 getOpeningText ()
 
 headings ()
 Get headings on the page.
 

Static Public Member Functions

static parseSettingsInMessage ( $message)
 Parse the content of a message into an array.
 

Private Member Functions

 extractHeadingBeforeFirstHeading ( $text)
 Get the text before the first heading.
 
 extractWikitextParts ()
 Extract the parts of the text: opening, main, and auxiliary.
 
 getIgnoredHeadings ()
 Get the list of headings to ignore.
 

Private Attributes

string $allText
 
string[] $auxiliaryElementSelectors
 Selectors for elements that are considered auxiliary to the article text for search.
 
string[] $auxText = []
 
string[] $excludedElementSelectors
 Selectors for elements that are excluded entirely from search.
 
string $openingText
 
ParserOutput $parserOutput
 

Detailed Description

Class for exploring the structure of parsed wikitext.

Definition at line 8 of file WikiTextStructure.php.

Constructor & Destructor Documentation

◆ __construct()

WikiTextStructure::__construct ( ParserOutput  $parserOutput)
Parameters
ParserOutput $parserOutput

Definition at line 68 of file WikiTextStructure.php.

References $parserOutput.

Member Function Documentation

◆ extractHeadingBeforeFirstHeading()

WikiTextStructure::extractHeadingBeforeFirstHeading (   $text)
private

Get the text before the first heading.

Parameters
string $text
Returns
string|null

Definition at line 189 of file WikiTextStructure.php.

References $matches, and Sanitizer\stripAllTags().

Referenced by extractWikitextParts().
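
As a rough illustration of the idea, the standalone sketch below cuts the input at the first heading tag and returns the plain text in front of it. It is not the MediaWiki implementation: the function name textBeforeFirstHeading is hypothetical, strip_tags() stands in for Sanitizer::stripAllTags(), and returning null both when there is no heading and when no text precedes it is an assumption based only on the documented string|null return type.

```php
<?php
/**
 * Hypothetical sketch: return the plain text that precedes the first
 * <h1>..<h6> tag, or null when there is no heading or no text before it.
 */
function textBeforeFirstHeading( string $html ): ?string {
    $matches = [];
    // PREG_OFFSET_CAPTURE stores the byte offset of the match in $matches[0][1].
    if ( !preg_match( '/<h[1-6]\b/i', $html, $matches, PREG_OFFSET_CAPTURE ) ) {
        // Assumption: no heading at all means there is no "opening" section.
        return null;
    }
    $before = substr( $html, 0, $matches[0][1] );
    // strip_tags() is a stand-in for Sanitizer::stripAllTags().
    $text = trim( strip_tags( $before ) );
    return $text === '' ? null : $text;
}
```

Called on '<p>Intro.</p><h2>History</h2>', the sketch returns the text of the first paragraph; called on input that starts with a heading, it returns null.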

◆ extractWikitextParts()

WikiTextStructure::extractWikitextParts ( )
private

Extract the parts of the text: opening, main, and auxiliary.

Definition at line 149 of file WikiTextStructure.php.

References extractHeadingBeforeFirstHeading(), and Sanitizer\stripAllTags().

Referenced by getAuxiliaryText(), getMainText(), and getOpeningText().

◆ getAuxiliaryText()

WikiTextStructure::getAuxiliaryText ( )
Returns
string[]

Definition at line 237 of file WikiTextStructure.php.

References $auxText, and extractWikitextParts().

◆ getDefaultSort()

WikiTextStructure::getDefaultSort ( )

Get the defaultsort property.

Returns
string|null

Definition at line 246 of file WikiTextStructure.php.

◆ getIgnoredHeadings()

WikiTextStructure::getIgnoredHeadings ( )
private

Get the list of headings to ignore.

Returns
string[]

Definition at line 129 of file WikiTextStructure.php.

References $lines, $source, parseSettingsInMessage(), and wfMessage().

Referenced by headings().

◆ getMainText()

WikiTextStructure::getMainText ( )
Returns
string

Definition at line 229 of file WikiTextStructure.php.

References $allText, and extractWikitextParts().

◆ getOpeningText()

WikiTextStructure::getOpeningText ( )
Returns
string

Definition at line 221 of file WikiTextStructure.php.

References $openingText, and extractWikitextParts().

◆ headings()

WikiTextStructure::headings ( )

Get headings on the page.

Returns
string[]

First strip out things that look like references. We can't use HTML filtering because the references come back as <sup> tags without a class. To keep from breaking headings like ==Applicability of the strict mass–energy equivalence formula, ''E'' = ''mc''<sup>2</sup>== we don't remove the whole <sup> tag. We also don't want to strip the <sup> tag and remove everything that looks like [2], because a heading could legitimately contain such text (for example, a band named Word [2] Foo). So we only strip things that look like <sup> tags wrapping a reference. And since the data looks like: Reference in heading <sup>[1]</sup><sup>[2]</sup> we cannot use HtmlFormatter, as there is no suitable selector.

Definition at line 85 of file WikiTextStructure.php.

References getIgnoredHeadings(), and Sanitizer\stripAllTags().
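
The reference-stripping rule described above can be sketched in isolation as follows. This is a minimal illustration, not the class's actual code: the function name cleanHeading is hypothetical, it assumes reference markers arrive as <sup> elements wrapping a bracketed number (possibly with the brackets HTML-encoded), and strip_tags() stands in for Sanitizer::stripAllTags().

```php
<?php
/**
 * Hypothetical sketch: remove reference markers such as <sup>[1]</sup>
 * from a heading, then strip the remaining tags but keep their text.
 */
function cleanHeading( string $heading ): string {
    // Normalize HTML-encoded brackets so one pattern covers both forms.
    $heading = str_replace( [ '&#91;', '&#93;' ], [ '[', ']' ], $heading );
    // Remove only <sup> elements whose entire content is a bracketed number.
    $heading = preg_replace( '/<sup>\s*\[\s*\d+\s*\]\s*<\/sup>/i', '', $heading );
    // strip_tags() is a stand-in for Sanitizer::stripAllTags().
    return trim( strip_tags( $heading ) );
}
```

With this rule, a superscript exponent such as mc<sup>2</sup> keeps its digit, and bare text like "Word [2] Foo" is left untouched, because neither is a <sup> element wrapping a bracketed number.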

◆ parseSettingsInMessage()

static WikiTextStructure::parseSettingsInMessage (   $message)
static

Parse the content of a message into an array.

This function is generally used to parse settings stored as i18n messages (see search-ignored-headings).

Parameters
string $message
Returns
string[]

Definition at line 117 of file WikiTextStructure.php.

References $lines.

Referenced by getIgnoredHeadings().
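
To make the idea of settings stored in a message concrete, here is a minimal standalone sketch. The exact message format is an assumption, not something this page states: the sketch treats the message as one entry per line, treats everything after # as a comment, and drops blank lines. The function name parseSettingsSketch is hypothetical.

```php
<?php
/**
 * Hypothetical sketch: turn a multi-line settings message into a list of
 * entries. Assumed format: one entry per line, "#" starts a comment.
 */
function parseSettingsSketch( string $message ): array {
    $lines = explode( "\n", $message );
    // Remove comments; preg_replace() accepts an array subject and returns an array.
    $lines = preg_replace( '/#.*$/', '', $lines );
    // Remove surrounding whitespace.
    $lines = array_map( 'trim', $lines );
    // Drop empty lines and reindex from 0.
    return array_values( array_filter( $lines, static fn ( $line ) => $line !== '' ) );
}
```

A message containing a comment line, "References", a blank line, and "See also" would yield the two-element list of heading names.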

Member Data Documentation

◆ $allText

string WikiTextStructure::$allText
private

Definition at line 16 of file WikiTextStructure.php.

Referenced by getMainText().

◆ $auxiliaryElementSelectors

string[] WikiTextStructure::$auxiliaryElementSelectors
private
Initial value:
= [
'.thumbcaption',
'table',
'.rellink',
'.dablink',
'.searchaux',
]

Selectors for elements that are considered auxiliary to the article text for search.

Definition at line 52 of file WikiTextStructure.php.

◆ $auxText

string[] WikiTextStructure::$auxText = []
private

Definition at line 20 of file WikiTextStructure.php.

Referenced by getAuxiliaryText().

◆ $excludedElementSelectors

string[] WikiTextStructure::$excludedElementSelectors
private
Initial value:
= [
'audio', 'video',
'style',
'sup.reference',
'.mw-cite-backlink',
'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
'.autocollapse',
'.navigation-not-searchable',
'.wbmi-entityview-emptyCaption',
]

Selectors for elements that are excluded entirely from search.

Definition at line 29 of file WikiTextStructure.php.
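
The sketch below illustrates what excluding elements by selector could look like on a small HTML fragment. It is not how this class applies the list: it uses PHP's DOMDocument and DOMXPath, the helper names simpleSelectorToXPath and stripSelectors are hypothetical, and only the selector shapes that appear in the list (tag, .class, tag.class) are handled. For simplicity it assumes ASCII input.

```php
<?php
/** Hypothetical helper: translate `tag`, `.class`, or `tag.class` into XPath. */
function simpleSelectorToXPath( string $selector ): string {
    [ $tag, $class ] = array_pad( explode( '.', $selector, 2 ), 2, null );
    $tag = $tag === '' ? '*' : $tag;
    if ( $class === null ) {
        return "//$tag";
    }
    // Match the class as a whole word within the class attribute.
    return "//{$tag}[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
}

/** Hypothetical sketch: remove matching elements and return the remaining text. */
function stripSelectors( string $html, array $selectors ): string {
    $doc = new DOMDocument();
    // Silence libxml warnings about HTML5 tags such as <audio>.
    $previous = libxml_use_internal_errors( true );
    $doc->loadHTML( '<div id="sketch-root">' . $html . '</div>' );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $xpath = new DOMXPath( $doc );
    foreach ( $selectors as $selector ) {
        // Copy the node list first so removals cannot disturb iteration.
        $nodes = iterator_to_array( $xpath->query( simpleSelectorToXPath( $selector ) ) );
        foreach ( $nodes as $node ) {
            if ( $node->parentNode !== null ) {
                $node->parentNode->removeChild( $node );
            }
        }
    }
    $root = $xpath->query( '//div[@id="sketch-root"]' )->item( 0 );
    return trim( $root->textContent );
}
```

Passing [ 'sup.reference', 'h2' ] as the selector list removes citation markers and second-level headings from the fragment before its text is collected.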

◆ $openingText

string WikiTextStructure::$openingText
private

Definition at line 12 of file WikiTextStructure.php.

Referenced by getOpeningText().

◆ $parserOutput

ParserOutput WikiTextStructure::$parserOutput
private

Definition at line 24 of file WikiTextStructure.php.

Referenced by __construct().


The documentation for this class was generated from the following file: