CirrusSearch
Elasticsearch-powered search for MediaWiki
Builds one elasticsearch analyzer to add to an analysis config array.
Public Member Functions | |
__construct (string $langName, string $analyzerName='text') | |
withCharFilters (array $charFilters) | |
withFilters (array $filters) | |
withCharMap (array $mappings) | |
withNumberCharFilter (int $langZero) | |
withElision (array $articles, bool $articleCase=true) | |
withLangLowercase () | |
withStop ( $stop) | |
withStemmerOverride (array $rules) | |
withUnpackedAnalyzer () | |
insertFiltersBefore ( $beforeFilter, array $filterList) | |
omitDottedI () | |
withWordBreakHelper () | |
withAggressiveSplitting () | |
withLightStemmer () | |
withAsciifoldingPreserve () | |
omitAsciifolding () | |
withRemoveEmpty () | |
build (array $config) | |
Create a basic analyzer with support for various common options. | |
Static Public Member Functions | |
static | mappingCharFilter (array $mappings) |
Create a mapping character filter with the mappings provided. | |
static | numberCharFilter (int $langZero) |
Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9). | |
static | elisionFilter (array $articles, bool $case=true) |
Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not. | |
static | stopFilter ( $stopwords, bool $ignoreCase=null) |
Create a stop word filter with the provided config. | |
static | stemmerFilter (string $stemmer) |
Create a stemmer filter with the provided config. | |
Public Attributes | |
const | APPEND = 1 |
Indicate that filters should be automatically appended or prepended, rather than inserted before a given filter. | |
const | PREPEND = 2 |
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. http://www.gnu.org/copyleft/gpl.html
CirrusSearch\Maintenance\AnalyzerBuilder::__construct | ( | string | $langName, |
string | $analyzerName = 'text' ) |
string | $langName | |
string | $analyzerName | (defaults to 'text') |
CirrusSearch\Maintenance\AnalyzerBuilder::build | ( | array | $config | ) |
Create a basic analyzer with support for various common options.
Can create various filters and character filters as specified. None are automatically added to the char_filter or filter list because the best order for these basic analyzers depends on the details of various third-party plugins.
type: custom
tokenizer: standard
char_filter: as per $this->charFilters
filter: as per $this->filters
mixed[] | $config | to be updated |
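The shape described above can be sketched as a plain array. This is a minimal illustration rather than the CirrusSearch implementation: the function name `sketchBuildAnalyzer` is hypothetical, and storing the analyzer under an `analyzer` key is an assumption based on the usual layout of Elasticsearch analysis settings.

```php
<?php
// Hypothetical stand-in for build(); not the CirrusSearch code.
// Assumption: analyzers live under the 'analyzer' key of the analysis config.
function sketchBuildAnalyzer(
    array $config,
    string $analyzerName,
    array $charFilters,
    array $filters
): array {
    $config['analyzer'][$analyzerName] = [
        'type' => 'custom',
        'tokenizer' => 'standard',
        // Order matters: filters run in the order listed.
        'char_filter' => $charFilters,
        'filter' => $filters,
    ];
    return $config;
}

$config = sketchBuildAnalyzer( [], 'text', [ 'my_char_map' ], [ 'lowercase', 'my_stop' ] );
```

The caller chooses the order of `char_filter` and `filter` entries, which matches the note above that nothing is added to those lists automatically.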
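static | CirrusSearch\Maintenance\AnalyzerBuilder::elisionFilter | ( | array | $articles, |
bool | $case = true ) |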
Create an elision filter with the "articles" provided; $case determines whether stripping is case sensitive or not.
string[] | $articles | |
bool | $case |
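As a rough illustration, an elision filter definition of this kind might look like the array below. The function name `sketchElisionFilter` is hypothetical; `articles` and `articles_case` are options of the Elasticsearch elision token filter, but whether the class emits exactly these keys is an assumption.

```php
<?php
// Hypothetical sketch of an elision filter config; not the CirrusSearch code.
function sketchElisionFilter( array $articles, bool $case = true ): array {
    return [
        'type' => 'elision',
        // In Elasticsearch, articles_case = true makes matching case insensitive.
        'articles_case' => $case,
        'articles' => $articles,
    ];
}

// French-style elided articles, as in "l'avion" or "d'accord".
$elision = sketchElisionFilter( [ 'l', 'd', 'qu' ] );
```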
CirrusSearch\Maintenance\AnalyzerBuilder::insertFiltersBefore | ( | $beforeFilter, | |
array | $filterList ) |
mixed | $beforeFilter | specific filter to insert $filterList before; use APPEND or PREPEND to always add to the end or the beginning of the list, respectively |
string[] | $filterList | list of additional filters to insert |
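The insertion semantics can be sketched with a standalone function operating on a filter list. The names `sketchInsertFiltersBefore`, `SKETCH_APPEND` and `SKETCH_PREPEND` are hypothetical, and leaving the list unchanged when the target filter is absent is an assumption, not documented behavior.

```php
<?php
// Hypothetical constants mirroring APPEND = 1 and PREPEND = 2.
const SKETCH_APPEND = 1;
const SKETCH_PREPEND = 2;

// Hypothetical sketch; not the CirrusSearch code.
function sketchInsertFiltersBefore( array $filters, $beforeFilter, array $filterList ): array {
    if ( $beforeFilter === SKETCH_APPEND ) {
        return array_merge( $filters, $filterList );
    }
    if ( $beforeFilter === SKETCH_PREPEND ) {
        return array_merge( $filterList, $filters );
    }
    $pos = array_search( $beforeFilter, $filters, true );
    if ( $pos === false ) {
        // Assumption: a missing target filter leaves the list unchanged.
        return $filters;
    }
    // Insert without removing anything (length 0) at the target's position.
    array_splice( $filters, $pos, 0, $filterList );
    return $filters;
}

$base = [ 'lowercase', 'stemmer' ];
$inserted = sketchInsertFiltersBefore( $base, 'stemmer', [ 'my_stop' ] );
$appended = sketchInsertFiltersBefore( $base, SKETCH_APPEND, [ 'my_stop' ] );
$prepended = sketchInsertFiltersBefore( $base, SKETCH_PREPEND, [ 'my_stop' ] );
```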
static | CirrusSearch\Maintenance\AnalyzerBuilder::mappingCharFilter | ( | array | $mappings | )
Create a mapping character filter with the mappings provided.
string[] | $mappings |
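A mapping character filter definition can be sketched as follows. The function name `sketchMappingCharFilter` is hypothetical; `mapping` and `mappings` are the type and option names of the Elasticsearch mapping character filter, while the exact array the class returns is an assumption.

```php
<?php
// Hypothetical sketch of a mapping char filter config; not the CirrusSearch code.
function sketchMappingCharFilter( array $mappings ): array {
    return [
        'type' => 'mapping',
        // Each entry uses the Elasticsearch "key=>value" mapping syntax.
        'mappings' => $mappings,
    ];
}

$charMap = sketchMappingCharFilter( [ 'ß=>ss' ] );
```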
static | CirrusSearch\Maintenance\AnalyzerBuilder::numberCharFilter | ( | int | $langZero | )
Create a character filter that maps non-Arabic digits (e.g., ០-៩ or ０-９) to Arabic digits (0-9).
Since the digits of a script are usually contiguous in Unicode, only the code point of the starting digit (the one equal to 0) is needed.
int | $langZero |
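The digit mapping can be derived from the starting code point alone, as in the sketch below. The function name `sketchNumberCharFilter` is hypothetical, and the `\uXXXX=>N` entry format is an assumption about how the mappings are written; the example uses U+17E0, the Khmer digit zero.

```php
<?php
// Hypothetical sketch; not the CirrusSearch code.
function sketchNumberCharFilter( int $langZero ): array {
    $mappings = [];
    for ( $i = 0; $i <= 9; $i++ ) {
        // Map the i-th digit of the script to the ASCII digit i,
        // e.g. "\u17e0=>0" (a literal backslash-u escape, resolved by Elasticsearch).
        $mappings[] = sprintf( '\\u%04x=>%d', $langZero + $i, $i );
    }
    return [ 'type' => 'mapping', 'mappings' => $mappings ];
}

$khmerDigits = sketchNumberCharFilter( 0x17e0 );
```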
CirrusSearch\Maintenance\AnalyzerBuilder::omitAsciifolding | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::omitDottedI | ( | ) |
static | CirrusSearch\Maintenance\AnalyzerBuilder::stemmerFilter | ( | string | $stemmer | )
Create a stemmer filter with the provided config.
string | $stemmer |
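A stemmer filter definition might look like the sketch below. The function name `sketchStemmerFilter` is hypothetical; `language` is an option of the Elasticsearch stemmer token filter, but that the class passes `$stemmer` through under that key is an assumption.

```php
<?php
// Hypothetical sketch of a stemmer filter config; not the CirrusSearch code.
function sketchStemmerFilter( string $stemmer ): array {
    return [
        'type' => 'stemmer',
        // Assumption: $stemmer is an Elasticsearch stemmer name such as 'light_french'.
        'language' => $stemmer,
    ];
}

$stemmerConfig = sketchStemmerFilter( 'light_french' );
```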
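static | CirrusSearch\Maintenance\AnalyzerBuilder::stopFilter | ( | $stopwords, | |
bool | $ignoreCase = null ) |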
Create a stop word filter with the provided config.
The config can be an array of stop words, or a string like _french_ that refers to a pre-defined list.
mixed | $stopwords | |
bool | null | $ignoreCase |
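The two accepted forms of `$stopwords` can be sketched as below. The function name `sketchStopFilter` is hypothetical, and omitting `ignore_case` when `$ignoreCase` is null is an assumption; `stopwords` and `ignore_case` are options of the Elasticsearch stop token filter, whose predefined lists are named with surrounding underscores, such as `_french_`.

```php
<?php
// Hypothetical sketch of a stop filter config; not the CirrusSearch code.
function sketchStopFilter( $stopwords, ?bool $ignoreCase = null ): array {
    $filter = [
        'type' => 'stop',
        // Either an array of words or the name of a predefined list.
        'stopwords' => $stopwords,
    ];
    if ( $ignoreCase !== null ) {
        // Assumption: the option is only set when the caller specifies it.
        $filter['ignore_case'] = $ignoreCase;
    }
    return $filter;
}

$predefined = sketchStopFilter( '_french_' );
$custom = sketchStopFilter( [ 'a', 'the' ], true );
```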
CirrusSearch\Maintenance\AnalyzerBuilder::withAggressiveSplitting | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::withAsciifoldingPreserve | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::withCharFilters | ( | array | $charFilters | ) |
string[] | $charFilters |
CirrusSearch\Maintenance\AnalyzerBuilder::withCharMap | ( | array | $mappings | ) |
string[] | $mappings |
CirrusSearch\Maintenance\AnalyzerBuilder::withElision | ( | array | $articles, |
bool | $articleCase = true ) |
string[] | $articles | "articles" to be elided |
bool | $articleCase | whether elision is case insensitive |
CirrusSearch\Maintenance\AnalyzerBuilder::withFilters | ( | array | $filters | ) |
string[] | $filters |
CirrusSearch\Maintenance\AnalyzerBuilder::withLangLowercase | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::withLightStemmer | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::withNumberCharFilter | ( | int | $langZero | ) |
int | $langZero |
CirrusSearch\Maintenance\AnalyzerBuilder::withRemoveEmpty | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::withStemmerOverride | ( | array | $rules | ) |
string[] | $rules | stemmer override rules |
CirrusSearch\Maintenance\AnalyzerBuilder::withStop | ( | $stop | ) |
mixed | $stop | pre-defined list like _french_ or an array of stopwords |
CirrusSearch\Maintenance\AnalyzerBuilder::withUnpackedAnalyzer | ( | ) |
CirrusSearch\Maintenance\AnalyzerBuilder::withWordBreakHelper | ( | ) |