Extract searchable properties from the MediaWiki ParserOutput.
|
| __construct (SearchConfig $config) |
|
| initialize (Document $doc, WikiPage $page, RevisionRecord $revision) |
| Perform initial building of a page document. Called once per page when starting an update; the resulting document is shared between all clusters written to. It may be written to the job queue multiple times, so it should not contain any large values (measured in bytes).
- Parameters
-
Document | $doc | The document to be populated |
WikiPage | $page | The page to scope operation to |
RevisionRecord | $revision | The page revision to use |
|
|
| finishInitializeBatch () |
| Called after a batch of pages has been passed to self::initialize. Allows implementations to batch calls to external services necessary for collecting page properties. Implementations must update the Document instances previously provided. The builder will be disposed of after finishing a batch.
|
|
| finalize (Document $doc, Title $title, RevisionRecord $revision) |
| Finalize document building before sending to cluster. Called on every write attempt for every cluster to perform any final document building. Intended for bulk loading of content from wiki databases that would only serve to bloat the job queue.
- Parameters
-
Document | $doc | |
Title | $title | |
RevisionRecord | $revision | |
- Exceptions
-
|
|
| finalizeReal (Document $doc, WikiPage $page, CirrusSearch $engine, RevisionRecord $revision) |
| Visible for testing.
|
|
◆ __construct()
CirrusSearch\BuildDocument\ParserOutputPageProperties::__construct (SearchConfig $config)
◆ finalize()
CirrusSearch\BuildDocument\ParserOutputPageProperties::finalize (Document $doc, Title $title, RevisionRecord $revision)
Finalize document building before sending to cluster. Called on every write attempt for every cluster to perform any final document building. Intended for bulk loading of content from wiki databases that would only serve to bloat the job queue.
- Parameters
-
Document | $doc | |
Title | $title | |
RevisionRecord | $revision | |
- Exceptions
-
Implements CirrusSearch\BuildDocument\PagePropertyBuilder.
◆ finalizeReal()
CirrusSearch\BuildDocument\ParserOutputPageProperties::finalizeReal (Document $doc, WikiPage $page, CirrusSearch $engine, RevisionRecord $revision)
Visible for testing.
Much simpler to test with all objects resolved.
- Parameters
-
Document | $doc | Document to finalize |
WikiPage | $page | WikiPage to scope operation to |
CirrusSearch | $engine | SearchEngine implementation |
RevisionRecord | $revision | The page revision to use |
- Exceptions
-
◆ finishInitializeBatch()
CirrusSearch\BuildDocument\ParserOutputPageProperties::finishInitializeBatch ()
Called after a batch of pages has been passed to self::initialize. Allows implementations to batch calls to external services necessary for collecting page properties. Implementations must update the Document instances previously provided. The builder will be disposed of after finishing a batch.
Implements CirrusSearch\BuildDocument\PagePropertyBuilder.
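The batching contract described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `SketchDocument`, `SketchBatchBuilder`, the `external` field, and the injected `$fetchMany` callable are stand-ins invented for illustration, not CirrusSearch or Elastica types. It only shows the general shape: `initialize` does cheap per-page work and remembers each document, and `finishInitializeBatch` makes one batched lookup and updates the documents it was handed earlier.

```php
<?php
// Hypothetical stand-ins: none of these are the real CirrusSearch or Elastica classes.
final class SketchDocument {
    /** @var array<string, mixed> */
    public array $fields = [];
}

final class SketchBatchBuilder {
    /** @var array<int, SketchDocument> documents awaiting the batched lookup, keyed by page id */
    private array $pending = [];

    /** @var callable takes a list of page ids, returns a map of page id => value */
    private $fetchMany;

    public function __construct( callable $fetchMany ) {
        $this->fetchMany = $fetchMany;
    }

    /** Cheap per-page work: set a small field and remember the document for later. */
    public function initialize( SketchDocument $doc, int $pageId ): void {
        $doc->fields['page_id'] = $pageId;
        $this->pending[$pageId] = $doc;
    }

    /** One batched lookup for every page seen so far, then update the stored documents. */
    public function finishInitializeBatch(): void {
        if ( $this->pending === [] ) {
            return;
        }
        $results = ( $this->fetchMany )( array_keys( $this->pending ) );
        foreach ( $results as $pageId => $value ) {
            if ( isset( $this->pending[$pageId] ) ) {
                $this->pending[$pageId]->fields['external'] = $value;
            }
        }
        $this->pending = [];
    }
}

// Usage: two pages are initialized, but the (fake) external service is called once.
$calls = 0;
$fetchMany = static function ( array $pageIds ) use ( &$calls ): array {
    $calls++;
    $out = [];
    foreach ( $pageIds as $id ) {
        $out[$id] = "value-for-$id";
    }
    return $out;
};

$builder = new SketchBatchBuilder( $fetchMany );
$docA = new SketchDocument();
$docB = new SketchDocument();
$builder->initialize( $docA, 1 );
$builder->initialize( $docB, 2 );
$builder->finishInitializeBatch();
```

Holding references to the documents is what lets the batch step fill them in after the fact, which matches the requirement that implementations update the Document instances previously provided.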
◆ fixAndFlagInvalidUTF8InSource()
static CirrusSearch\BuildDocument\ParserOutputPageProperties::fixAndFlagInvalidUTF8InSource (array $fieldDefinitions, int $pageId)
Find invalid UTF-8 sequences in the source text, fix them, and flag the document with the CirrusSearchInvalidUTF8 template.
Temporary solution to help investigate and fix T225200.
Visible for testing only.
- Parameters
-
array | $fieldDefinitions | |
int | $pageId | |
- Returns
- array
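As a rough illustration of the "find, fix, and flag" idea, here is a minimal sketch using PHP's mbstring functions. It is not the CirrusSearch implementation: the function name, the `source_text` and `template` array keys, and the `Template:` prefix are assumptions made for the example, and the `$pageId` parameter (presumably used for reporting) is left out.

```php
<?php
/**
 * Hypothetical sketch, not the CirrusSearch implementation.
 * Assumed for illustration: the 'source_text' and 'template' keys and the 'Template:' prefix.
 */
function sketchFixAndFlagInvalidUtf8( array $fields ): array {
    if ( !isset( $fields['source_text'] ) || !is_string( $fields['source_text'] ) ) {
        return $fields;
    }
    if ( mb_check_encoding( $fields['source_text'], 'UTF-8' ) ) {
        // Already valid UTF-8: nothing to fix, nothing to flag.
        return $fields;
    }
    // mb_scrub() (PHP >= 7.2) replaces ill-formed byte sequences with a substitute character.
    $fields['source_text'] = mb_scrub( $fields['source_text'], 'UTF-8' );
    // Flag the document so affected pages can be found later.
    $fields['template'][] = 'Template:CirrusSearchInvalidUTF8';
    return $fields;
}

// Valid input passes through untouched; a stray 0xC3 lead byte is repaired and flagged.
$clean = sketchFixAndFlagInvalidUtf8( [ 'source_text' => "caf\xC3\xA9" ] );
$fixed = sketchFixAndFlagInvalidUtf8( [ 'source_text' => "bad \xC3 byte" ] );
```

Flagging through a template entry, rather than only logging, would make the affected pages discoverable through search itself, which fits the stated goal of helping the investigation.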
◆ initialize()
CirrusSearch\BuildDocument\ParserOutputPageProperties::initialize (Document $doc, WikiPage $page, RevisionRecord $revision)
Perform initial building of a page document. Called once per page when starting an update; the resulting document is shared between all clusters written to. It may be written to the job queue multiple times, so it should not contain any large values (measured in bytes).
- Parameters
-
Document | $doc | The document to be populated |
WikiPage | $page | The page to scope operation to |
RevisionRecord | $revision | The page revision to use |
Implements CirrusSearch\BuildDocument\PagePropertyBuilder.
◆ truncateFileTextContent()
static CirrusSearch\BuildDocument\ParserOutputPageProperties::truncateFileTextContent (int $maxLen, array $fieldContent)
Visible for testing only.
- Parameters
-
int | $maxLen | |
array | $fieldContent | |
- Returns
- array
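The page does not describe how truncation is done, so the following is only a sketch of one plausible approach under stated assumptions: that the file text lives under a `file_text` key, that `$maxLen` is a byte limit, and that a negative limit means "no limit". The function name is hypothetical. It uses `mb_strcut()`, which cuts by bytes without splitting a multibyte character.

```php
<?php
/**
 * Hypothetical sketch, not the CirrusSearch implementation.
 * Assumed for illustration: the 'file_text' key and a byte-based $maxLen.
 */
function sketchTruncateFileText( int $maxLen, array $fieldContent ): array {
    if ( $maxLen < 0 || !isset( $fieldContent['file_text'] ) ) {
        // In this sketch a negative limit is treated as "no limit".
        return $fieldContent;
    }
    if ( strlen( $fieldContent['file_text'] ) > $maxLen ) {
        // mb_strcut() measures in bytes but backs off to a character boundary,
        // so the result stays valid UTF-8.
        $fieldContent['file_text'] = mb_strcut( $fieldContent['file_text'], 0, $maxLen, 'UTF-8' );
    }
    return $fieldContent;
}

// "ab" followed by two 2-byte characters is 6 bytes long.
$input = [ 'file_text' => "ab\xC3\xA9\xC3\xA9", 'title' => 'Example.pdf' ];
$cutAt4 = sketchTruncateFileText( 4, $input );
$cutAt3 = sketchTruncateFileText( 3, $input );
$noLimit = sketchTruncateFileText( -1, $input );
```

Cutting at 3 bytes would land in the middle of a character, so the sketch keeps only the two ASCII bytes in that case; other fields in the array are returned unchanged.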
The documentation for this class was generated from the following file:
- includes/BuildDocument/ParserOutputPageProperties.php