Builds one Elasticsearch analyzer to add to an analysis config array.

Public Member Functions
	__construct (string $langName, string $analyzerName='text')

	withCharFilters (array $charFilters)

	withTokenizer (string $tokenizer)

	withFilters (array $filters)

	withCharMap (array $mappings, string $name=null, bool $limited=false)

	withLimitedCharMap (array $mappings, string $name=null)

	withReversedNumberCharFilter (int $langZero, string $name=null)

	withNumberCharFilter (int $langZero, string $name=null, bool $reversed=false)

	withElision (array $articles, bool $articleCase=true)

	withLangLowercase (string $name=null)

	withStop ( $stop, string $name=null)

	withExtraStop ( $stop, string $name, $beforeFilter=self::APPEND, bool $ignoreCase=null)

	withExtraStemmer (string $lang, string $name=null)

	withStemmerOverride ( $rules, string $name=null)
	Rules can be a single rule string, or an array of rules.

	withUnpackedAnalyzer ()

	insertFiltersBefore ( $beforeFilter, array $filterList)

	appendFilters (array $filterList)

	prependFilters (array $filterList)

	withLightStemmer ()

	omitStemmer ()

	withAsciifoldingPreserve ()

	omitAsciifolding ()

	withRemoveEmpty ()

	withDecimalDigit ()

	build (array $config)
	Create a basic analyzer with support for various common options.

Static Public Member Functions
static	patternFilter (string $pat, string $repl='')
	Create a pattern_replace filter/char_filter with the pattern and replacement provided.

static	mappingCharFilter (array $mappings, bool $limited)
	Create a mapping or limited_mapping character filter with the mappings provided.

static	numberCharFilter (int $langZero, bool $reversed=false)
	Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).

static	elisionFilter (array $articles, bool $case=true)
	Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.

static	stopFilterFromList ( $stopwords, bool $ignoreCase=null)
	Create a stop word filter with the provided config.

static	stemmerFilter (string $stemmer)
	Create a stemmer filter with the provided config.

Public Attributes
const	APPEND = 1
	Indicate that filters should be automatically appended or prepended, rather than inserted before a given filter.

const	PREPEND = 2

Detailed Description

Builds one Elasticsearch analyzer to add to an analysis config array.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. http://www.gnu.org/copyleft/gpl.html

Constructor & Destructor Documentation

◆ __construct()

CirrusSearch\Maintenance\AnalyzerBuilder::__construct ( string $langName, string $analyzerName = 'text' )

Parameters

string	$langName
string	$analyzerName	(defaults to 'text')

Member Function Documentation

◆ appendFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::appendFilters ( array $filterList )

Parameters

string[] $filterList list of additional filters to append

Returns: self

◆ build()

CirrusSearch\Maintenance\AnalyzerBuilder::build ( array $config )

Create a basic analyzer with support for various common options.

Can create various filters and character filters as specified. None are automatically added to the char_filter or filter list because the best order for these basic analyzers depends on the details of various third-party plugins.

type: custom
tokenizer: standard
char_filter: as per $this->charFilters
filter: as per $this->filters

Parameters

mixed[] $config to be updated

Returns: mixed[] updated config
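A minimal sketch of the analyzer shape described above, not the CirrusSearch implementation. The helper name sketchBuildAnalyzer and its parameters are hypothetical, and storing the analyzer under the 'analyzer' key of the config is an assumption based on the usual Elasticsearch analysis settings layout.

```php
<?php
// Hypothetical stand-in for build(); names and storage key are assumptions.
function sketchBuildAnalyzer(
    array $config,
    string $analyzerName,
    array $charFilters,
    array $filters,
    string $tokenizer = 'standard'
): array {
    $analyzer = [
        'type' => 'custom',
        'tokenizer' => $tokenizer,
    ];
    // Only emit the lists that actually contain something.
    if ( $charFilters ) {
        $analyzer['char_filter'] = $charFilters;
    }
    if ( $filters ) {
        $analyzer['filter'] = $filters;
    }
    // Assumption: analyzers live under the 'analyzer' key of the analysis config.
    $config['analyzer'][$analyzerName] = $analyzer;
    return $config;
}

$config = sketchBuildAnalyzer( [], 'text', [], [ 'lowercase', 'asciifolding' ] );
```

The function returns the updated array rather than modifying it in place, which matches the "mixed[] updated config" return type documented above.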

◆ elisionFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::elisionFilter ( array $articles, bool $case = true )

Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.

Parameters

string[]	$articles
bool	$case

Returns: mixed[] token filter
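For orientation, an Elasticsearch elision token filter is configured with an "articles" list and an "articles_case" flag. The sketch below shows what such a definition could look like; sketchElisionFilter is a hypothetical helper, and its mapping of $case onto "articles_case" is an assumption rather than the confirmed behaviour of elisionFilter().

```php
<?php
// Hypothetical helper, not the CirrusSearch implementation.
function sketchElisionFilter( array $articles, bool $case = true ): array {
    return [
        'type' => 'elision',
        // Assumption: $case is passed straight through as articles_case.
        'articles_case' => $case,
        'articles' => $articles,
    ];
}

// French-style elided articles, as in l'avion or d'accord.
$elision = sketchElisionFilter( [ 'l', 'd', 'qu' ] );
```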

◆ insertFiltersBefore()

CirrusSearch\Maintenance\AnalyzerBuilder::insertFiltersBefore ( $beforeFilter, array $filterList )

Parameters

mixed	$beforeFilter	specific filter to insert $filterList before; use APPEND or PREPEND to always add to the end or the beginning of the list
string[]	$filterList	list of additional filters to insert

Returns: self
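The insertion rule can be illustrated on a plain list of filter names. This is a sketch under stated assumptions: the function and constants are hypothetical stand-ins for the class members, and appending when the named filter is absent is a guess, since the documentation does not say what happens in that case.

```php
<?php
// Hypothetical stand-ins for AnalyzerBuilder::APPEND and ::PREPEND.
const SKETCH_APPEND = 1;
const SKETCH_PREPEND = 2;

function sketchInsertFiltersBefore( array $filters, $beforeFilter, array $filterList ): array {
    if ( $beforeFilter === SKETCH_APPEND ) {
        return array_merge( $filters, $filterList );
    }
    if ( $beforeFilter === SKETCH_PREPEND ) {
        return array_merge( $filterList, $filters );
    }
    $pos = array_search( $beforeFilter, $filters, true );
    if ( $pos === false ) {
        // Assumption: fall back to appending when the named filter is absent.
        return array_merge( $filters, $filterList );
    }
    // Splice the new filters in immediately before the named filter.
    array_splice( $filters, $pos, 0, $filterList );
    return $filters;
}

$chain = [ 'lowercase', 'stop', 'stemmer' ];
$inserted = sketchInsertFiltersBefore( $chain, 'stemmer', [ 'extra_stop' ] );
$appended = sketchInsertFiltersBefore( $chain, SKETCH_APPEND, [ 'asciifolding' ] );
$prepended = sketchInsertFiltersBefore( $chain, SKETCH_PREPEND, [ 'icu_normalizer' ] );
```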

◆ mappingCharFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::mappingCharFilter ( array $mappings, bool $limited )

Create a mapping or limited_mapping character filter with the mappings provided.

Parameters

string[]	$mappings
bool	$limited

Returns: mixed[] character filter
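A rough sketch of the two variants follows. "mapping" is a built-in Elasticsearch character filter type; treating "limited_mapping" as a plugin-provided type with the same "mappings" key is an assumption, and sketchMappingCharFilter is a hypothetical name.

```php
<?php
// Hypothetical helper, not the CirrusSearch implementation.
function sketchMappingCharFilter( array $mappings, bool $limited ): array {
    return [
        // Assumption: both types accept the same 'mappings' list.
        'type' => $limited ? 'limited_mapping' : 'mapping',
        'mappings' => $mappings,
    ];
}

// Each mapping uses the "from=>to" syntax of the mapping character filter.
$full = sketchMappingCharFilter( [ 'ß=>ss' ], false );
$limited = sketchMappingCharFilter( [ 'ı=>i' ], true );
```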

◆ numberCharFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::numberCharFilter ( int $langZero, bool $reversed = false )

Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).

Since the digits are usually contiguous in Unicode, only the code point of the starting digit (the one equal to 0) is needed.

Optionally reverse the mapping so that it goes from Arabic to non-Arabic digits. For example, the ICU tokenizer handles Thai digits in Thai text better than it handles Arabic digits.

Parameters

int	$langZero
bool	$reversed	reverse the mapping from Arabic to non-Arabic

Returns: mixed[] character filter
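The digit mapping can be generated from the zero digit alone, as the sketch below shows. It assumes $langZero is the Unicode code point of the script's zero digit and writes each non-Arabic digit as a \uXXXX escape; sketchNumberMappings is a hypothetical helper, and the exact strings the real method emits may differ.

```php
<?php
// Hypothetical helper: builds "from=>to" mapping strings for ten digits.
function sketchNumberMappings( int $langZero, bool $reversed = false ): array {
    $mappings = [];
    for ( $i = 0; $i <= 9; $i++ ) {
        if ( $reversed ) {
            // Arabic digit to non-Arabic digit.
            $mappings[] = sprintf( '%d=>\\u%04x', $i, $langZero + $i );
        } else {
            // Non-Arabic digit to Arabic digit.
            $mappings[] = sprintf( '\\u%04x=>%d', $langZero + $i, $i );
        }
    }
    return $mappings;
}

// U+17E0 is KHMER DIGIT ZERO (០).
$khmer = sketchNumberMappings( 0x17E0 );
$khmerReversed = sketchNumberMappings( 0x17E0, true );
```

The resulting list is the kind of input a mapping character filter takes in its "mappings" setting.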

◆ omitAsciifolding()

CirrusSearch\Maintenance\AnalyzerBuilder::omitAsciifolding ( )

Returns: self

◆ omitStemmer()

CirrusSearch\Maintenance\AnalyzerBuilder::omitStemmer ( )

Returns: self

◆ patternFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::patternFilter ( string $pat, string $repl = '' )

Create a pattern_replace filter/char_filter with the pattern and replacement provided.

Parameters

string	$pat
string	$repl

Returns: mixed[] filter
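In Elasticsearch, pattern_replace takes a "pattern" and a "replacement" in both its token filter and character filter forms, so one array can serve either role. The following is a sketch of such a definition; sketchPatternFilter is a hypothetical name and the example pattern is illustrative only.

```php
<?php
// Hypothetical helper, not the CirrusSearch implementation.
function sketchPatternFilter( string $pat, string $repl = '' ): array {
    return [
        'type' => 'pattern_replace',
        'pattern' => $pat,
        'replacement' => $repl,
    ];
}

// With the default empty replacement, matches are deleted.
$stripHyphens = sketchPatternFilter( '-' );
```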

◆ prependFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::prependFilters ( array $filterList )

Parameters

string[] $filterList list of additional filters to prepend

Returns: self

◆ stemmerFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::stemmerFilter ( string $stemmer )

Create a stemmer filter with the provided config.

Parameters

string $stemmer

Returns: mixed[] token filter
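An Elasticsearch stemmer token filter is selected with a "language" setting. A minimal sketch, assuming $stemmer is passed through as that setting; sketchStemmerFilter is a hypothetical name.

```php
<?php
// Hypothetical helper, not the CirrusSearch implementation.
function sketchStemmerFilter( string $stemmer ): array {
    return [
        'type' => 'stemmer',
        // Assumption: $stemmer names an Elasticsearch stemmer language.
        'language' => $stemmer,
    ];
}

$lightFrench = sketchStemmerFilter( 'light_french' );
```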

◆ stopFilterFromList()

static CirrusSearch\Maintenance\AnalyzerBuilder::stopFilterFromList ( $stopwords, bool $ignoreCase = null )

Create a stop word filter with the provided config.

The config can be an array of stop words, or a string like _french_ that refers to a pre-defined list.

Parameters

mixed	$stopwords
bool|null	$ignoreCase

Returns: mixed[] token filter
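The two accepted forms of the stop word config can be sketched as follows. Elasticsearch's stop filter takes "stopwords" (an array, or the name of a predefined list written with surrounding underscores) and an optional "ignore_case" flag. Omitting "ignore_case" when $ignoreCase is null is an assumption, and sketchStopFilter is a hypothetical name.

```php
<?php
// Hypothetical helper, not the CirrusSearch implementation.
function sketchStopFilter( $stopwords, ?bool $ignoreCase = null ): array {
    $filter = [
        'type' => 'stop',
        'stopwords' => $stopwords,
    ];
    // Assumption: null means "leave the Elasticsearch default in place".
    if ( $ignoreCase !== null ) {
        $filter['ignore_case'] = $ignoreCase;
    }
    return $filter;
}

$predefined = sketchStopFilter( '_french_' );
$custom = sketchStopFilter( [ 'le', 'la', 'les' ], true );
```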

◆ withAsciifoldingPreserve()

CirrusSearch\Maintenance\AnalyzerBuilder::withAsciifoldingPreserve ( )

Returns: self

◆ withCharFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::withCharFilters ( array $charFilters )

Parameters

string[] $charFilters

Returns: self

◆ withCharMap()

CirrusSearch\Maintenance\AnalyzerBuilder::withCharMap ( array $mappings, string $name = null, bool $limited = false )

Parameters

string[]	$mappings
string|null	$name
bool	$limited

Returns: self

◆ withDecimalDigit()

CirrusSearch\Maintenance\AnalyzerBuilder::withDecimalDigit ( )

Returns: self

◆ withElision()

CirrusSearch\Maintenance\AnalyzerBuilder::withElision ( array $articles, bool $articleCase = true )

Parameters

string[]	$articles	"articles" to be elided
bool	$articleCase	whether elision is case insensitive

Returns: self

◆ withExtraStemmer()

CirrusSearch\Maintenance\AnalyzerBuilder::withExtraStemmer ( string $lang, string $name = null )

Parameters

string	$lang
string|null	$name

Returns: self

◆ withExtraStop()

CirrusSearch\Maintenance\AnalyzerBuilder::withExtraStop ( $stop, string $name, $beforeFilter = self::APPEND, bool $ignoreCase = null )

Parameters

mixed	$stop	pre-defined list like _french_ or an array of stopwords
string	$name
mixed	$beforeFilter	filter to insert extra stop before
bool|null	$ignoreCase

Returns: self

◆ withFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::withFilters ( array $filters )

Parameters

string[] $filters

Returns: self

◆ withLangLowercase()

CirrusSearch\Maintenance\AnalyzerBuilder::withLangLowercase ( string $name = null )

Parameters

string|null $name

Returns: self

◆ withLightStemmer()

CirrusSearch\Maintenance\AnalyzerBuilder::withLightStemmer ( )

Returns: self

◆ withLimitedCharMap()

CirrusSearch\Maintenance\AnalyzerBuilder::withLimitedCharMap ( array $mappings, string $name = null )

Parameters

string[]	$mappings
string|null	$name

Returns: self

◆ withNumberCharFilter()

CirrusSearch\Maintenance\AnalyzerBuilder::withNumberCharFilter ( int $langZero, string $name = null, bool $reversed = false )

Parameters

int	$langZero
string|null	$name
bool	$reversed	reverse the mapping from Arabic to non-Arabic

Returns: self

◆ withRemoveEmpty()

CirrusSearch\Maintenance\AnalyzerBuilder::withRemoveEmpty ( )

Returns: self

◆ withReversedNumberCharFilter()

CirrusSearch\Maintenance\AnalyzerBuilder::withReversedNumberCharFilter ( int $langZero, string $name = null )

Parameters

int	$langZero
string|null	$name

Returns: self

◆ withStemmerOverride()

CirrusSearch\Maintenance\AnalyzerBuilder::withStemmerOverride ( $rules, string $name = null )

Rules can be a single rule string, or an array of rules.

Parameters

mixed	$rules	stemmer override rules
string|null	$name

Returns: self

◆ withStop()

CirrusSearch\Maintenance\AnalyzerBuilder::withStop ( $stop, string $name = null )

Parameters

mixed	$stop	pre-defined list like _french_ or an array of stopwords
string|null	$name

Returns: self

◆ withTokenizer()

CirrusSearch\Maintenance\AnalyzerBuilder::withTokenizer ( string $tokenizer )

Parameters

string $tokenizer

Returns: self

◆ withUnpackedAnalyzer()

CirrusSearch\Maintenance\AnalyzerBuilder::withUnpackedAnalyzer ( )

Returns: self

The documentation for this class was generated from the following file:

includes/Maintenance/AnalyzerBuilder.php
