MediaWiki 1.30.2
Workaround for incorrect collation of Estonian language ('et') in ICU (T56168).
Public Member Functions
__construct ()
getFirstLetter ( $string )
    Given a string, return the logical "first letter" to be used for grouping on category pages and so on.
getSortKey ( $string )
    Given a string, convert it to a (hopefully short) key that can be used for efficient sorting.
Public Member Functions inherited from IcuCollation
__construct ( $locale )
getFirstLetterCount ()
getFirstLetterData ()
getLetterByIndex ( $index )
getPrimarySortKey ( $string )
getSortKeyByLetterIndex ( $index )
Static Private Member Functions
static mangle ( $string )
static unmangle ( $string )
Additional Inherited Members
Static Public Member Functions inherited from IcuCollation
static getICUVersion ()
    Return the version of the ICU library used by PHP's intl extension, or false when the extension is not installed or the version can't be determined.
static getUnicodeVersionForICU ()
    Return the version of Unicode appropriate for the version of the ICU library currently in use, or false when it can't be determined.
static isCjk ( $codepoint )
    Test if a code point is a CJK (Chinese, Japanese, Korean) character.
Static Public Member Functions inherited from Collation
static factory ( $collationName )
static singleton ()
Public Attributes inherited from IcuCollation
const FIRST_LETTER_VERSION = 3
const RECORD_LENGTH = 14
Protected Attributes inherited from IcuCollation
Language $digitTransformLanguage
Workaround for incorrect collation of Estonian language ('et') in ICU (T56168).
'W' and 'V' should not be considered the same letter for the purposes of collation in modern Estonian. We work around this by replacing 'W' and 'w' with 'ᴡ' U+1D21 'LATIN LETTER SMALL CAPITAL W' for sortkey generation, which is collated like 'W' and is not tailored to have the same primary weight as 'V' in Estonian.
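The substitution described above can be sketched as follows. This is an illustrative Python rendering, not the PHP code in CollationEt.php. The function names mirror the private mangle() and unmangle() members listed on this page, but their bodies are assumptions inferred from the description, including the choice to map U+1D21 back to an uppercase 'W'.

```python
# Illustrative sketch only: names mirror mangle()/unmangle(), bodies are assumed.
SMALL_CAPITAL_W = "\u1d21"  # 'ᴡ' LATIN LETTER SMALL CAPITAL W


def mangle(text: str) -> str:
    """Replace 'W' and 'w' with U+1D21 before a sort key is generated."""
    return text.replace("W", SMALL_CAPITAL_W).replace("w", SMALL_CAPITAL_W)


def unmangle(text: str) -> str:
    """Map U+1D21 back to 'W' (assumed); the original casing is not recoverable."""
    return text.replace(SMALL_CAPITAL_W, "W")


if __name__ == "__main__":
    print(mangle("Wagner"))
    print(unmangle(mangle("wagner")))
```

Because both 'W' and 'w' collapse to the same replacement character, a round trip through mangle() and unmangle() in this sketch does not preserve the original case.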
Definition at line 31 of file CollationEt.php.
CollationEt::__construct ( )
Definition at line 32 of file CollationEt.php.
CollationEt::getFirstLetter ( $string )
Given a string, return the logical "first letter" to be used for grouping on category pages and so on.
This has to be coordinated carefully with getSortKey(), or else the sorted list might jump back and forth between the same "initial letters" or show other pathological behavior. For instance, if you just return the first character, but "a" sorts the same as "A" based on getSortKey(), then you might get a list like:
== A ==
== a ==
== A ==
etc., assuming for the sake of argument that $wgCapitalLinks is false.
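The heading problem described above can be reproduced with a small Python sketch. Everything here is hypothetical: sort_key() is a toy case-insensitive stand-in for getSortKey(), and the two first-letter functions are illustrative, not MediaWiki code.

```python
from itertools import groupby

# Hypothetical page titles, chosen so that case-insensitive order interleaves cases.
TITLES = ["Apple", "avocado", "Azure"]


def sort_key(title: str) -> str:
    """Toy stand-in for getSortKey(): 'a' and 'A' sort the same."""
    return title.casefold()


def naive_first_letter(title: str) -> str:
    """Uncoordinated: returns the raw first character."""
    return title[0]


def coordinated_first_letter(title: str) -> str:
    """Coordinated with sort_key(): case is folded here as well."""
    return title[0].upper()


def headings(titles, first_letter):
    """Return the section headings produced while walking the sorted list."""
    ordered = sorted(titles, key=sort_key)
    return [letter for letter, _ in groupby(ordered, key=first_letter)]


if __name__ == "__main__":
    print(headings(TITLES, naive_first_letter))
    print(headings(TITLES, coordinated_first_letter))
```

With the naive function the heading sequence alternates between "A" and "a"; once the first-letter logic folds case the same way the sort key does, a single "A" section remains.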
Parameters
    string $string: UTF-8 string
Reimplemented from IcuCollation.
Definition at line 57 of file CollationEt.php.
References unmangle().
CollationEt::getSortKey ( $string )
Given a string, convert it to a (hopefully short) key that can be used for efficient sorting.
A binary sort according to the sortkeys corresponds to a logical sort of the corresponding strings. Current code expects that a line feed character should sort before all others, but has no other particular expectations (and that one can be changed if necessary).
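To illustrate the sort key idea, here is a minimal Python sketch. The toy_sort_key() function is hypothetical and far simpler than an ICU sort key: it only folds case and encodes to UTF-8. It shows the general property that comparing the keys byte by byte yields the intended logical order of the strings.

```python
def toy_sort_key(text: str) -> bytes:
    """Hypothetical sort key: case-folded UTF-8 bytes (not an ICU key)."""
    return text.casefold().encode("utf-8")


def logical_sort(titles):
    """Sort strings by binary comparison of their sort keys."""
    return sorted(titles, key=toy_sort_key)


if __name__ == "__main__":
    titles = ["cherry", "Banana", "apple"]
    print(sorted(titles))        # binary sort of the raw strings
    print(logical_sort(titles))  # binary sort of the sort keys
```

In this toy encoding a line feed (byte 0x0A) also compares lower than any printable character, which loosely matches the expectation stated above.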
Parameters
    string $string: UTF-8 string
Reimplemented from IcuCollation.
Definition at line 53 of file CollationEt.php.
static CollationEt::mangle ( $string ) [static, private]
Definition at line 36 of file CollationEt.php.
static CollationEt::unmangle ( $string ) [static, private]
Definition at line 44 of file CollationEt.php.
Referenced by getFirstLetter().