CirrusSearch
Elasticsearch-powered search for MediaWiki
Performs updates and deletes on the Elasticsearch index.
Public Member Functions | |
__construct (Connection $readConnection, $writeToClusterName=null) | |
updateFromTitle ( $title, ?string $updateKind, ?int $rootEventTime) | |
Update a single page. | |
traceRedirects ( $title) | |
Trace redirects from the title to the destination. | |
updatePages ( $pages, $flags, ?string $updateKind=null, ?int $rootEventTime=null) | |
This updates pages in elasticsearch. | |
updateWeightedTags (ProperPageIdentity $page, string $tagPrefix, ?array $tagWeights=null, ?string $trigger=null) | |
Implements the method declared in CirrusSearch\WeightedTagsUpdater. | |
resetWeightedTags (ProperPageIdentity $page, array $tagPrefixes, ?string $trigger=null) | |
Implements the method declared in CirrusSearch\WeightedTagsUpdater. | |
deletePages ( $titles, $docIds, $indexSuffix=null, array $writeJobParams=[]) | |
Delete pages from the elasticsearch index. | |
archivePages ( $archived) | |
Add documents to archive index. | |
updateLinkedArticles ( $titles) | |
Update the search index for newly linked or unlinked articles. | |
Public Member Functions inherited from CirrusSearch\ElasticsearchIntermediary | |
start (RequestLog $log) | |
Mark the start of a request to Elasticsearch. | |
success ( $result=null, ?Connection $connection=null) | |
Log a successful request and return the provided result in a good Status. | |
successViaCache (RequestLog $log) | |
Log a successful request when the response comes from a cache outside elasticsearch. | |
failure (?ExceptionInterface $exception=null, ?Connection $connection=null) | |
Log a failure and return an appropriate status. | |
getSearchMetrics () | |
Get the search metrics we have. | |
Static Public Member Functions | |
static | build (SearchConfig $config, $cluster) |
Static Public Member Functions inherited from CirrusSearch\ElasticsearchIntermediary | |
static | setResultPages (array $matches) |
This is set externally because we don't have complete control, from the SearchEngine interface, of what is actually sent to the user. | |
static | getQueryTypesUsed () |
Report the types of queries that were issued within the current request. | |
static | hasQueryLogs () |
static | appendLastLogPayload ( $key, $value) |
static | isMSearchResultSetOK (MultiResultSet $multiResultSet) |
check validity of the multisearch response | |
Protected Member Functions | |
pushElasticaWriteJobs (string $updateGroup, array $items, $factory, int $batchSize=10) | |
newLog ( $description, $queryType, array $extra=[]) | |
Protected Member Functions inherited from CirrusSearch\ElasticsearchIntermediary | |
__construct (Connection $connection, ?UserIdentity $user=null, $slowSeconds=null, $extraBackendLatency=0) | |
startNewLog ( $description, $queryType, array $extra=[]) | |
getTimeout ( $searchType='default') | |
getClientTimeout ( $searchType='default') | |
appendMetrics (SearchMetricsProvider $provider) | |
runMSearch (Search $search, RequestLog $log, ?Connection $connection=null, ?callable $resultsTransformer=null) | |
Protected Attributes | |
$writeToClusterName | |
Protected Attributes inherited from CirrusSearch\ElasticsearchIntermediary | |
$connection | |
$user | |
$currentRequestLog = null | |
Additional Inherited Members | |
Public Attributes inherited from CirrusSearch\WeightedTagsUpdater | |
const | SERVICE = self::class |
Static Protected Attributes inherited from CirrusSearch\ElasticsearchIntermediary | |
static | $requestLogger |
Performs updates and deletes on the Elasticsearch index.
Called by CirrusSearch.php (our SearchEngine implementation), forceSearchIndex (for bulk updates), and CirrusSearch's jobs.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. http://www.gnu.org/copyleft/gpl.html
CirrusSearch\Updater::__construct | ( | Connection | $readConnection, |
$writeToClusterName = null ) |
Connection | $readConnection | connection used to pull data out of elasticsearch |
string | null | $writeToClusterName |
CirrusSearch\Updater::archivePages | ( | $archived | ) |
Add documents to archive index.
array | $archived |
static CirrusSearch\Updater::build | ( | SearchConfig | $config, |
$cluster ) |
SearchConfig | $config | |
string | null | $cluster | cluster to read from and write to, null to read from the default cluster and write to all |
CirrusSearch\Updater::deletePages | ( | $titles, | |
$docIds, | |||
$indexSuffix = null, | |||
array | $writeJobParams = [] ) |
Delete pages from the elasticsearch index.
$titles and $docIds must point to the same pages and should point to them in the same order.
Title[] | $titles | List of titles to delete. If empty, other index maintenance is skipped. |
int[] | string[] | $docIds | List of elasticsearch document ids to delete |
string | null | $indexSuffix | index from which to delete. null means all. |
array | $writeJobParams | Parameters passed on to ElasticaWriteJob |
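The ordering requirement on $titles and $docIds can be illustrated with a minimal sketch. It builds both lists in a single pass so that position i in each list refers to the same page. The record shape (the 'title' and 'docId' keys) and the helper name alignedDeleteArgs are hypothetical, and plain strings stand in for Title objects.

```php
<?php
/**
 * Build two index-aligned lists from page records, so that position i in
 * both lists refers to the same page.
 * Hypothetical helper: the record keys 'title' and 'docId' are assumptions
 * made for this example, not part of CirrusSearch.
 */
function alignedDeleteArgs( array $records ): array {
	$titles = [];
	$docIds = [];
	foreach ( $records as $record ) {
		// Append to both lists in the same iteration to keep them aligned.
		$titles[] = $record['title'];
		$docIds[] = $record['docId'];
	}
	return [ $titles, $docIds ];
}

// Strings stand in for Title objects in this sketch.
[ $titles, $docIds ] = alignedDeleteArgs( [
	[ 'title' => 'Foo', 'docId' => 12 ],
	[ 'title' => 'Bar', 'docId' => 34 ],
] );
```

Filling both lists inside one loop avoids the mismatch that can appear when the two arrays are filtered or sorted separately.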
protected CirrusSearch\Updater::newLog | ( | $description, | |
$queryType, | |||
array | $extra = [] ) |
string | $description | |
string | $queryType | |
string[] | $extra |
Reimplemented from CirrusSearch\ElasticsearchIntermediary.
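protected CirrusSearch\Updater::pushElasticaWriteJobs | ( | string | $updateGroup, |
array | $items, | ||
$factory, | |||
int | $batchSize = 10 ) |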
string | $updateGroup | UpdateGroup::* constant |
mixed[] | $items | |
callable | $factory | |
int | $batchSize |
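The signature suggests that $items are split into groups of $batchSize and that $factory is called once per group to produce a job. This page does not describe that behavior, so the following is only a sketch under that assumption. The function name buildBatchedJobs and the single-argument factory are hypothetical.

```php
<?php
/**
 * Split items into batches and let a factory turn each batch into a job.
 * Assumption: this mirrors what the $items, $factory and $batchSize
 * parameters imply; it is not the CirrusSearch implementation.
 */
function buildBatchedJobs( array $items, callable $factory, int $batchSize = 10 ): array {
	$jobs = [];
	// array_chunk() yields consecutive slices of at most $batchSize items.
	foreach ( array_chunk( $items, $batchSize ) as $chunk ) {
		$jobs[] = $factory( $chunk );
	}
	return $jobs;
}

// Hypothetical factory: each "job" just records how many items it carries.
$jobs = buildBatchedJobs(
	range( 1, 25 ),
	static fn ( array $chunk ) => [ 'count' => count( $chunk ) ]
);
```

With 25 items and the default batch size of 10, the sketch produces three jobs, the last one carrying the remaining 5 items.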
CirrusSearch\Updater::resetWeightedTags | ( | ProperPageIdentity | $page, |
array | $tagPrefixes, | ||
?string | $trigger = null ) |
Implements CirrusSearch\WeightedTagsUpdater.
CirrusSearch\Updater::traceRedirects | ( | $title | ) |
Trace redirects from the title to the destination.
Also registers the title in the memory of titles updated and detects special pages.
Title | $title | title to trace |
CirrusSearch\Updater::updateFromTitle | ( | $title, | |
?string | $updateKind, | ||
?int | $rootEventTime ) |
Update a single page.
Title | $title | |
string | null | $updateKind | kind of update to perform (used for monitoring) |
int | null | $rootEventTime | the time of MW event that caused this update (used for monitoring) |
CirrusSearch\Updater::updateLinkedArticles | ( | $titles | ) |
Update the search index for newly linked or unlinked articles.
Title[] | $titles | titles to update |
CirrusSearch\Updater::updatePages | ( | $pages, | |
$flags, | |||
?string | $updateKind = null, | ||
?int | $rootEventTime = null ) |
This updates pages in elasticsearch.
$flags includes:
INDEX_EVERYTHING: Cirrus will parse the page, count the links, and send the document to Elasticsearch as an index operation, so the document is created if it doesn't exist.
SKIP_PARSE: Cirrus will skip parsing the page when building the document. This makes sense when you know the page hasn't changed, for example when it is newly linked from another page.
SKIP_LINKS: Cirrus will skip collecting link information. This makes sense when you know the link counts aren't yet available, for example during the first phase of the two-phase index build.
INDEX_ON_SKIP: Without this flag, Cirrus sends an update rather than an index when SKIP_PARSE or SKIP_LINKS is set; with it, Cirrus sends an index. Indexing with any portion of the document skipped is dangerous because it can put half-created pages in the index. This is only a good idea during the first half of the two-phase index build.
WikiPage[] | $pages | pages to update |
int | $flags | Bit field containing instructions about how the document should be built and sent to Elasticsearch. |
string | null | $updateKind | kind of update to perform (used for monitoring) |
int | null | $rootEventTime | the time of MW event that caused this update (used for monitoring) |
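Because $flags is a bit field, the flags described above are combined with bitwise OR and tested with bitwise AND. The sketch below shows one reading of the flag descriptions. The constant values are stand-ins chosen for this example (the real constants are defined in CirrusSearch and may differ), and describeFlags is a hypothetical helper.

```php
<?php
// Stand-in values for the flags named above; assumptions for this sketch only.
const INDEX_EVERYTHING = 0;
const INDEX_ON_SKIP = 1;
const SKIP_PARSE = 2;
const SKIP_LINKS = 4;

/**
 * Describe how a document would be built and sent for a given bit field,
 * following one reading of the flag descriptions.
 */
function describeFlags( int $flags ): array {
	$skipParse = ( $flags & SKIP_PARSE ) !== 0;
	$skipLinks = ( $flags & SKIP_LINKS ) !== 0;
	$indexOnSkip = ( $flags & INDEX_ON_SKIP ) !== 0;
	// A document is partial when any portion of it was skipped.
	$partial = $skipParse || $skipLinks;
	return [
		'parse' => !$skipParse,
		'links' => !$skipLinks,
		// Partial documents go out as updates unless INDEX_ON_SKIP is set.
		'operation' => ( $partial && !$indexOnSkip ) ? 'update' : 'index',
	];
}

$full = describeFlags( INDEX_EVERYTHING );
$linksSkipped = describeFlags( SKIP_LINKS );
$firstPhase = describeFlags( SKIP_LINKS | INDEX_ON_SKIP );
```

Here $firstPhase corresponds to the case the description mentions for the first half of the two-phase index build: links are skipped, yet the document is still sent as an index operation.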
CirrusSearch\Updater::updateWeightedTags | ( | ProperPageIdentity | $page, |
string | $tagPrefix, | ||
?array | $tagWeights = null, | ||
?string | $trigger = null ) |
Implements CirrusSearch\WeightedTagsUpdater.