CirrusSearch
Elasticsearch-powered search for MediaWiki
CirrusSearch\Maintenance\AnalyzerBuilder Class Reference

Builds one Elasticsearch analyzer to add to an analysis config array.

Public Member Functions

 __construct (string $langName, string $analyzerName='text')
 
 withCharFilters (array $charFilters)
 
 withFilters (array $filters)
 
 withCharMap (array $mappings)
 
 withNumberCharFilter (int $langZero)
 
 withElision (array $articles, bool $articleCase=true)
 
 withLangLowercase ()
 
 withStop ( $stop)
 
 withStemmerOverride (array $rules)
 
 withUnpackedAnalyzer ()
 
 insertFiltersBefore ( $beforeFilter, array $filterList)
 
 omitDottedI ()
 
 withWordBreakHelper ()
 
 withAggressiveSplitting ()
 
 withLightStemmer ()
 
 withAsciifoldingPreserve ()
 
 omitAsciifolding ()
 
 withRemoveEmpty ()
 
 build (array $config)
 Create a basic analyzer with support for various common options.
 

Static Public Member Functions

static mappingCharFilter (array $mappings)
 Create a mapping character filter with the mappings provided.
 
static numberCharFilter (int $langZero)
 Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).
 
static elisionFilter (array $articles, bool $case=true)
 Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.
 
static stopFilter ( $stopwords, bool $ignoreCase=null)
 Create a stop word filter with the provided config.
 
static stemmerFilter (string $stemmer)
 Create a stemmer filter with the provided config.
 

Public Attributes

const APPEND = 1
 Indicate that filters should be automatically appended or prepended, rather than inserted before a given filter.
 
const PREPEND = 2
 

Detailed Description

Builds one Elasticsearch analyzer to add to an analysis config array.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. http://www.gnu.org/copyleft/gpl.html

Constructor & Destructor Documentation

◆ __construct()

CirrusSearch\Maintenance\AnalyzerBuilder::__construct ( string $langName,
string $analyzerName = 'text' )
Parameters
string $langName
string $analyzerName (defaults to 'text')

Member Function Documentation

◆ build()

CirrusSearch\Maintenance\AnalyzerBuilder::build ( array $config)

Create a basic analyzer with support for various common options.

Can create various filters and character filters as specified. None are automatically added to the char_filter or filter list because the best order for these basic analyzers depends on the details of various third-party plugins.

type: custom
tokenizer: standard
char_filter: as per $this->charFilters
filter: as per $this->filters

Parameters
mixed[] $config to be updated
Returns
mixed[] updated config
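
As a rough illustration of the shape listed above, the following sketch adds a custom analyzer entry to an analysis config array. The function name sketchBuildAnalyzer and the use of a top-level 'analyzer' key are assumptions made for this example; it is not the actual body of build().

```php
<?php
// Hypothetical stand-in for build(); all names here are illustrative only.
function sketchBuildAnalyzer(
    array $config,
    string $analyzerName,
    array $charFilters,
    array $filters
): array {
    // Assumption: analyzers live under the 'analyzer' key of the config array,
    // as they do in Elasticsearch analysis settings.
    $config['analyzer'][$analyzerName] = [
        'type' => 'custom',
        'tokenizer' => 'standard',
        'char_filter' => $charFilters,
        'filter' => $filters,
    ];
    return $config;
}

// 'lowercase' and 'asciifolding' are built-in Elasticsearch token filters.
$config = sketchBuildAnalyzer([], 'text', [], ['lowercase', 'asciifolding']);
```

The filter order in the resulting array is exactly the order supplied by the caller, which matches the note that nothing is added automatically.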

◆ elisionFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::elisionFilter ( array $articles,
bool $case = true )

Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.

Parameters
string[] $articles
bool $case
Returns
mixed[] token filter
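
A minimal sketch of what such a token filter definition might look like. Elasticsearch's elision filter accepts 'articles' and 'articles_case' settings; that the helper returns exactly this array is an assumption, and sketchElisionFilter is a hypothetical name.

```php
<?php
// Hypothetical equivalent of elisionFilter(); not the library's actual code.
function sketchElisionFilter(array $articles, bool $case = true): array {
    return [
        'type' => 'elision',
        // In Elasticsearch, articles_case = true makes matching case insensitive.
        'articles_case' => $case,
        'articles' => $articles,
    ];
}

// French-style elided articles, e.g. "l'avion" is indexed as "avion".
$frenchElision = sketchElisionFilter(['l', 'd', 'qu']);
```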

◆ insertFiltersBefore()

CirrusSearch\Maintenance\AnalyzerBuilder::insertFiltersBefore ( $beforeFilter,
array $filterList )
Parameters
mixed $beforeFilter specific filter to insert $filterList before; use APPEND or PREPEND to always add to the end or beginning of the list, respectively
string[] $filterList list of additional filters to insert
Returns
self
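
The insertion behaviour can be sketched on a plain array. The constants and function below are hypothetical stand-ins, and the handling of a missing target filter (leave the list unchanged) is an assumption rather than documented behaviour.

```php
<?php
// Hypothetical stand-ins for the APPEND and PREPEND class constants.
const SKETCH_APPEND = 1;
const SKETCH_PREPEND = 2;

function sketchInsertBefore(array $filters, $beforeFilter, array $filterList): array {
    if ($beforeFilter === SKETCH_APPEND) {
        // Add the new filters at the end of the chain.
        return array_merge($filters, $filterList);
    }
    if ($beforeFilter === SKETCH_PREPEND) {
        // Add the new filters at the beginning of the chain.
        return array_merge($filterList, $filters);
    }
    $pos = array_search($beforeFilter, $filters, true);
    if ($pos === false) {
        // Assumption: if the target filter is absent, leave the list unchanged.
        return $filters;
    }
    // Insert without removing anything (length 0) at the target's position.
    array_splice($filters, $pos, 0, $filterList);
    return $filters;
}

$chain = ['lowercase', 'asciifolding'];
$inserted = sketchInsertBefore($chain, 'asciifolding', ['my_stop']);
$appended = sketchInsertBefore($chain, SKETCH_APPEND, ['my_stop']);
$prepended = sketchInsertBefore($chain, SKETCH_PREPEND, ['my_stop']);
```

Here 'my_stop' is a made-up filter name; in the three calls it lands before 'asciifolding', at the end, and at the beginning, respectively.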

◆ mappingCharFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::mappingCharFilter ( array $mappings)

Create a mapping character filter with the mappings provided.

Parameters
string[] $mappings
Returns
mixed[] character filter
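
For orientation, an Elasticsearch mapping character filter is a definition with type 'mapping' and a list of "key=>value" strings. The sketch below assumes the helper wraps the mappings in exactly that form; sketchMappingCharFilter is a hypothetical name.

```php
<?php
// Hypothetical equivalent of mappingCharFilter(); not the library's actual code.
function sketchMappingCharFilter(array $mappings): array {
    return [
        'type' => 'mapping',
        'mappings' => $mappings,
    ];
}

// Example: map the typographic apostrophe (U+2019) to a straight apostrophe.
$apostropheFilter = sketchMappingCharFilter(["\u{2019}=>'"]);
```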

◆ numberCharFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::numberCharFilter ( int $langZero)

Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).

Since the ten digits of a script are usually contiguous, only the code point of the starting digit (the one equal to 0) is needed.

Parameters
int $langZero
Returns
mixed[] character filter
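
A sketch of how ten mappings could be derived from a single starting code point. It assumes $langZero is the Unicode code point of the script's zero digit and that the result is an Elasticsearch mapping character filter; sketchNumberCharFilter is a hypothetical name, and mb_chr() requires the mbstring extension.

```php
<?php
// Hypothetical equivalent of numberCharFilter(); not the library's actual code.
function sketchNumberCharFilter(int $langZero): array {
    $mappings = [];
    for ($i = 0; $i <= 9; $i++) {
        // The i-th digit of the script sits i code points after its zero.
        $mappings[] = mb_chr($langZero + $i, 'UTF-8') . '=>' . $i;
    }
    return [
        'type' => 'mapping',
        'mappings' => $mappings,
    ];
}

// Khmer digits occupy U+17E0 (zero) through U+17E9 (nine).
$khmerDigits = sketchNumberCharFilter(0x17E0);
```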

◆ omitAsciifolding()

CirrusSearch\Maintenance\AnalyzerBuilder::omitAsciifolding ( )
Returns
self

◆ omitDottedI()

CirrusSearch\Maintenance\AnalyzerBuilder::omitDottedI ( )
Returns
self

◆ stemmerFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::stemmerFilter ( string $stemmer)

Create a stemmer filter with the provided config.

Parameters
string $stemmer
Returns
mixed[] token filter

◆ stopFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::stopFilter ( $stopwords,
bool $ignoreCase = null )

Create a stop word filter with the provided config.

The config can be an array of stop words, or a string like _french_ that refers to a pre-defined list.

Parameters
mixed $stopwords
bool|null $ignoreCase
Returns
mixed[] token filter
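
A sketch of the two accepted input forms. Elasticsearch's stop filter takes 'stopwords' (an array or a pre-defined list name such as _french_) and an optional 'ignore_case' setting; omitting 'ignore_case' when $ignoreCase is null is an assumption, and sketchStopFilter is a hypothetical name.

```php
<?php
// Hypothetical equivalent of stopFilter(); not the library's actual code.
function sketchStopFilter($stopwords, ?bool $ignoreCase = null): array {
    $filter = [
        'type' => 'stop',
        'stopwords' => $stopwords,
    ];
    if ($ignoreCase !== null) {
        // Assumption: the setting is only emitted when explicitly requested.
        $filter['ignore_case'] = $ignoreCase;
    }
    return $filter;
}

// A pre-defined list, referenced by name.
$frenchStop = sketchStopFilter('_french_');
// An explicit word list, matched case-insensitively.
$customStop = sketchStopFilter(['the', 'of'], true);
```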

◆ withAggressiveSplitting()

CirrusSearch\Maintenance\AnalyzerBuilder::withAggressiveSplitting ( )
Returns
self

◆ withAsciifoldingPreserve()

CirrusSearch\Maintenance\AnalyzerBuilder::withAsciifoldingPreserve ( )
Returns
self

◆ withCharFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::withCharFilters ( array $charFilters)
Parameters
string[] $charFilters
Returns
self

◆ withCharMap()

CirrusSearch\Maintenance\AnalyzerBuilder::withCharMap ( array $mappings)
Parameters
string[] $mappings
Returns
self

◆ withElision()

CirrusSearch\Maintenance\AnalyzerBuilder::withElision ( array $articles,
bool $articleCase = true )
Parameters
string[] $articles "articles" to be elided
bool $articleCase whether elision is case insensitive
Returns
self

◆ withFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::withFilters ( array $filters)
Parameters
string[] $filters
Returns
self

◆ withLangLowercase()

CirrusSearch\Maintenance\AnalyzerBuilder::withLangLowercase ( )
Returns
self

◆ withLightStemmer()

CirrusSearch\Maintenance\AnalyzerBuilder::withLightStemmer ( )
Returns
self

◆ withNumberCharFilter()

CirrusSearch\Maintenance\AnalyzerBuilder::withNumberCharFilter ( int $langZero)
Parameters
int $langZero
Returns
self

◆ withRemoveEmpty()

CirrusSearch\Maintenance\AnalyzerBuilder::withRemoveEmpty ( )
Returns
self

◆ withStemmerOverride()

CirrusSearch\Maintenance\AnalyzerBuilder::withStemmerOverride ( array $rules)
Parameters
string[] $rules stemmer override rules
Returns
self

◆ withStop()

CirrusSearch\Maintenance\AnalyzerBuilder::withStop ( $stop)
Parameters
mixed $stop pre-defined list like _french_ or an array of stopwords
Returns
self

◆ withUnpackedAnalyzer()

CirrusSearch\Maintenance\AnalyzerBuilder::withUnpackedAnalyzer ( )
Returns
self

◆ withWordBreakHelper()

CirrusSearch\Maintenance\AnalyzerBuilder::withWordBreakHelper ( )
Returns
self

The documentation for this class was generated from the following file: