Builds one Elasticsearch analyzer to add to an analysis config array.

Public Member Functions
	__construct (string $langName, string $analyzerName='text')

	withCharFilters (array $charFilters)

	withFilters (array $filters)

	withCharMap (array $mappings)

	withNumberCharFilter (int $langZero)

	withElision (array $articles, bool $articleCase=true)

	withLangLowercase ()

	withStop ( $stop)

	withStemmerOverride (array $rules)

	withUnpackedAnalyzer ()

	insertFiltersBefore ( $beforeFilter, array $filterList)

	omitDottedI ()

	withWordBreakHelper ()

	withAggressiveSplitting ()

	withLightStemmer ()

	withAsciifoldingPreserve ()

	omitAsciifolding ()

	withRemoveEmpty ()

	build (array $config)
	Create a basic analyzer with support for various common options.

Static Public Member Functions
static	mappingCharFilter (array $mappings)
	Create a mapping character filter with the mappings provided.

static	numberCharFilter (int $langZero)
	Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).

static	elisionFilter (array $articles, bool $case=true)
	Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.

static	stopFilter ( $stopwords, bool $ignoreCase=null)
	Create a stop word filter with the provided config.

static	stemmerFilter (string $stemmer)
	Create a stemmer filter with the provided config.

Public Attributes
const	APPEND = 1
	Indicate that filters should be automatically appended or prepended, rather than inserted before a given filter.

const	PREPEND = 2

Detailed Description

Builds one Elasticsearch analyzer to add to an analysis config array.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. http://www.gnu.org/copyleft/gpl.html

Constructor & Destructor Documentation

◆ __construct()

CirrusSearch\Maintenance\AnalyzerBuilder::__construct	(	string	$langName,
		string	$analyzerName = 'text' )

Parameters

string	$langName
string	$analyzerName	(defaults to 'text')

Member Function Documentation

◆ build()

CirrusSearch\Maintenance\AnalyzerBuilder::build ( array $config )

Create a basic analyzer with support for various common options.

Can create various filters and character filters as specified. None are automatically added to the char_filter or filter list because the best order for these basic analyzers depends on the details of various third-party plugins.

type: custom
tokenizer: standard
char_filter: as per $this->charFilters
filter: as per $this->filters

Parameters

mixed[] $config to be updated

Returns: mixed[] updated config
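
The four settings listed above can be pictured as a plain array. The following is a minimal sketch, not the class's actual code: the function name sketchBuildAnalyzer, its extra parameters, and the placement under an 'analyzer' key are assumptions made for illustration; only the type, tokenizer, char_filter, and filter settings come from the description.

```php
<?php
// Hypothetical stand-in for build(); names and array layout are illustrative assumptions.
function sketchBuildAnalyzer(
	array $config,
	string $analyzerName,
	array $charFilters,
	array $filters
): array {
	$config['analyzer'][$analyzerName] = [
		'type' => 'custom',
		'tokenizer' => 'standard',
		// Order matters: the caller decides where each filter goes.
		'char_filter' => $charFilters,
		'filter' => $filters,
	];
	return $config;
}

$config = sketchBuildAnalyzer( [], 'text', [ 'my_char_map' ], [ 'lowercase', 'my_stemmer' ] );
```

The filter names passed in the usage line are placeholders; the description leaves the choice and order of filters to the caller.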

◆ elisionFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::elisionFilter	(	array	$articles,
		bool	$case = true )

Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.

Parameters

string[]	$articles
bool	$case

Returns: mixed[] token filter
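
As a rough illustration of what such a token filter array might look like, here is a minimal sketch. It assumes that $case is passed through to Elasticsearch's articles_case option (where true means case-insensitive matching); the function name sketchElisionFilter is hypothetical and this is not the class's actual code.

```php
<?php
// Hypothetical helper mirroring the description; not the class's actual code.
function sketchElisionFilter( array $articles, bool $case = true ): array {
	return [
		'type' => 'elision',
		// Assumption: $case maps to Elasticsearch's articles_case option,
		// where true makes article matching case insensitive.
		'articles_case' => $case,
		'articles' => $articles,
	];
}

$frenchElision = sketchElisionFilter( [ 'l', 'd', 'qu' ] );
```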

◆ insertFiltersBefore()

CirrusSearch\Maintenance\AnalyzerBuilder::insertFiltersBefore	(		$beforeFilter,
		array	$filterList )

Parameters

mixed	$beforeFilter	specific filter to insert $filterList before; use APPEND or PREPEND to always add to the end or the beginning of the list
string[]	$filterList	list of additional filters to insert

Returns: self
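
The insertion behavior described by the parameters can be sketched as a free function over a filter list. This is an assumption-laden illustration, not the method's implementation: the names sketchInsertFiltersBefore, SKETCH_APPEND, and SKETCH_PREPEND are hypothetical, and leaving the list unchanged when the target filter is missing is a guess, since the documentation does not say what happens in that case.

```php
<?php
// Sentinel values copied from the APPEND and PREPEND constants listed for the class.
const SKETCH_APPEND = 1;
const SKETCH_PREPEND = 2;

// Hypothetical free function; the real method works on the builder's own state.
function sketchInsertFiltersBefore( array $filters, $beforeFilter, array $filterList ): array {
	if ( $beforeFilter === SKETCH_APPEND ) {
		return array_merge( $filters, $filterList );
	}
	if ( $beforeFilter === SKETCH_PREPEND ) {
		return array_merge( $filterList, $filters );
	}
	$pos = array_search( $beforeFilter, $filters, true );
	if ( $pos === false ) {
		// Assumption: leave the list unchanged when the target filter is missing.
		return $filters;
	}
	// Length 0 means nothing is removed; $filterList is inserted at $pos.
	array_splice( $filters, $pos, 0, $filterList );
	return $filters;
}

$filters = [ 'lowercase', 'asciifolding', 'stemmer' ];
$result = sketchInsertFiltersBefore( $filters, 'asciifolding', [ 'stop', 'elision' ] );
// $result is [ 'lowercase', 'stop', 'elision', 'asciifolding', 'stemmer' ]
```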

◆ mappingCharFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::mappingCharFilter ( array $mappings )

Create a mapping character filter with the mappings provided.

Parameters

string[] $mappings

Returns: mixed[] character filter

◆ numberCharFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::numberCharFilter ( int $langZero )

Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).

Since the ten digits of a script usually occupy consecutive code points, only the starting digit (the one equal to 0) is needed.

Parameters

int $langZero

Returns: mixed[] character filter
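
To make the idea concrete, the sketch below builds the ten mappings from a single starting digit. It assumes that $langZero is the Unicode code point of the script's zero and that the result is an Elasticsearch mapping character filter; the function name sketchNumberCharFilter is hypothetical, and the code relies on PHP's mbstring extension for mb_chr().

```php
<?php
// Hypothetical helper: builds "digit=>N" mappings from the code point of the script's zero.
function sketchNumberCharFilter( int $langZero ): array {
	$mappings = [];
	for ( $i = 0; $i <= 9; $i++ ) {
		// Assumes the ten digits occupy consecutive code points.
		$mappings[] = mb_chr( $langZero + $i, 'UTF-8' ) . '=>' . $i;
	}
	return [ 'type' => 'mapping', 'mappings' => $mappings ];
}

// 0x17E0 is KHMER DIGIT ZERO (០).
$khmerDigits = sketchNumberCharFilter( 0x17E0 );
// $khmerDigits['mappings'][0] is "០=>0" and $khmerDigits['mappings'][9] is "៩=>9"
```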

◆ omitAsciifolding()

CirrusSearch\Maintenance\AnalyzerBuilder::omitAsciifolding ( )

Returns: self

◆ omitDottedI()

CirrusSearch\Maintenance\AnalyzerBuilder::omitDottedI ( )

Returns: self

◆ stemmerFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::stemmerFilter ( string $stemmer )

Create a stemmer filter with the provided config.

Parameters

string $stemmer

Returns: mixed[] token filter

◆ stopFilter()

static CirrusSearch\Maintenance\AnalyzerBuilder::stopFilter	(		$stopwords,
		bool	$ignoreCase = null )

Create a stop word filter with the provided config.

The config can be an array of stop words, or a string like _french_ that refers to a pre-defined list.

Parameters

mixed	$stopwords
bool|null	$ignoreCase

Returns: mixed[] token filter
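
The two accepted forms of $stopwords can be illustrated with a minimal sketch. The function name sketchStopFilter is hypothetical, and emitting ignore_case only when $ignoreCase is non-null is an assumption about how the nullable parameter is handled. Elasticsearch names its pre-defined stop word lists with surrounding underscores, as in _french_.

```php
<?php
// Hypothetical helper; assumes ignore_case is only set when explicitly provided.
function sketchStopFilter( $stopwords, ?bool $ignoreCase = null ): array {
	$filter = [ 'type' => 'stop', 'stopwords' => $stopwords ];
	if ( $ignoreCase !== null ) {
		$filter['ignore_case'] = $ignoreCase;
	}
	return $filter;
}

// A pre-defined Elasticsearch list, referenced by name.
$predefined = sketchStopFilter( '_french_' );
// An explicit list of stop words, matched case-insensitively.
$custom = sketchStopFilter( [ 'le', 'la', 'les' ], true );
```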

◆ withAggressiveSplitting()

CirrusSearch\Maintenance\AnalyzerBuilder::withAggressiveSplitting ( )

Returns: self

◆ withAsciifoldingPreserve()

CirrusSearch\Maintenance\AnalyzerBuilder::withAsciifoldingPreserve ( )

Returns: self

◆ withCharFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::withCharFilters ( array $charFilters )

Parameters

string[] $charFilters

Returns: self

◆ withCharMap()

CirrusSearch\Maintenance\AnalyzerBuilder::withCharMap ( array $mappings )

Parameters

string[] $mappings

Returns: self

◆ withElision()

CirrusSearch\Maintenance\AnalyzerBuilder::withElision	(	array	$articles,
		bool	$articleCase = true )

Parameters

string[]	$articles	"articles" to be elided
bool	$articleCase	whether elision is case insensitive

Returns: self

◆ withFilters()

CirrusSearch\Maintenance\AnalyzerBuilder::withFilters ( array $filters )

Parameters

string[] $filters

Returns: self

◆ withLangLowercase()

CirrusSearch\Maintenance\AnalyzerBuilder::withLangLowercase ( )

Returns: self

◆ withLightStemmer()

CirrusSearch\Maintenance\AnalyzerBuilder::withLightStemmer ( )

Returns: self

◆ withNumberCharFilter()

CirrusSearch\Maintenance\AnalyzerBuilder::withNumberCharFilter ( int $langZero )

Parameters

int $langZero

Returns: self

◆ withRemoveEmpty()

CirrusSearch\Maintenance\AnalyzerBuilder::withRemoveEmpty ( )

Returns: self

◆ withStemmerOverride()

CirrusSearch\Maintenance\AnalyzerBuilder::withStemmerOverride ( array $rules )

Parameters

string[] $rules stemmer override rules

Returns: self

◆ withStop()

CirrusSearch\Maintenance\AnalyzerBuilder::withStop ( $stop )

Parameters

mixed $stop pre-defined list like _french_ or an array of stopwords

Returns: self

◆ withUnpackedAnalyzer()

CirrusSearch\Maintenance\AnalyzerBuilder::withUnpackedAnalyzer ( )

Returns: self

◆ withWordBreakHelper()

CirrusSearch\Maintenance\AnalyzerBuilder::withWordBreakHelper ( )

Returns: self

The documentation for this class was generated from the following file:

includes/Maintenance/AnalyzerBuilder.php
