Namespace for all UnicodeJS classes, static methods and static properties.
Namespaces
Classes
Methods
charRangeArrayRegexp(ranges) → {string} static
Make a regexp string for an array of Unicode character ranges.
If either character in a range is above 0xFFFF, then the range will be encoded as multiple surrogate pair ranges. It is an error for a range to overlap with the surrogate range 0xD800-0xDFFF (as this would only match ill-formed strings).
Parameters:
Name | Type | Description |
---|---|---|
ranges | Array | Array of ranges, each of which is a character or an interval |
Returns:
Regexp string for the disjunction of the ranges.
- Type
- string
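The behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: `charRangeArrayRegexpSketch` and its helpers are hypothetical names, and the input shape (a code point number, or a two-element `[min, max]` array) is an assumption, as is the lowercase hex output.

```javascript
// Hypothetical helpers, not part of UnicodeJS.
const esc = (u) => '\\u' + u.toString(16).padStart(4, '0');
const unitRange = (min, max) => (min === max ? esc(min) : esc(min) + '-' + esc(max));
// Split a code point above 0xFFFF into its [high, low] surrogate code units.
const split = (ch) => {
  const off = ch - 0x10000;
  return [0xD800 + (off >> 10), 0xDC00 + (off & 0x3FF)];
};

// Surrogate pair alternatives covering the range ch1..ch2 (both above 0xFFFF).
function astralAlternatives(ch1, ch2) {
  const [hi1, lo1] = split(ch1);
  const [hi2, lo2] = split(ch2);
  const box = (h1, h2, l1, l2) => '[' + unitRange(h1, h2) + '][' + unitRange(l1, l2) + ']';
  if (hi1 === hi2) return [box(hi1, hi1, lo1, lo2)];
  const out = [box(hi1, hi1, lo1, 0xDFFF)];
  if (hi1 + 1 <= hi2 - 1) out.push(box(hi1 + 1, hi2 - 1, 0xDC00, 0xDFFF));
  out.push(box(hi2, hi2, 0xDC00, lo2));
  return out;
}

// Assumed input: each range is a code point or a [min, max] pair of code points.
function charRangeArrayRegexpSketch(ranges) {
  const bmp = [];
  const alternatives = [];
  for (const range of ranges) {
    const [min, max] = Array.isArray(range) ? range : [range, range];
    if (min <= 0xDFFF && max >= 0xD800) {
      throw new RangeError('Range overlaps the surrogate range 0xD800-0xDFFF');
    }
    // Part of the range at or below 0xFFFF goes into one character class.
    if (min <= 0xFFFF) bmp.push(unitRange(min, Math.min(max, 0xFFFF)));
    // Part above 0xFFFF becomes surrogate pair alternatives.
    if (max > 0xFFFF) alternatives.push(...astralAlternatives(Math.max(min, 0x10000), max));
  }
  if (bmp.length) alternatives.unshift('[' + bmp.join('') + ']');
  return alternatives.join('|');
}

// Usage: wrap the disjunction in a non-capturing group before anchoring it.
const source = charRangeArrayRegexpSketch([0x41, [0x61, 0x7A], [0x1F600, 0x1F64F]]);
const re = new RegExp('^(?:' + source + ')$');
```

Because the result is a disjunction, callers in this sketch wrap it in `(?:...)` before combining it with other pattern fragments.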
codeUnitRange(min, max, [bracket]) → {string} private static
Return a regexp string for the code unit range min-max.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
min | number | | The minimum code unit in the range. |
max | number | | The maximum code unit in the range. |
bracket | boolean | optional | If true, then wrap the range in [ ... ]. |
Returns:
Regexp string which matches the range
- Type
- string
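As a rough illustration of the contract above, the following sketch builds such a range string. The names `esc` and `codeUnitRangeSketch` are hypothetical, and the lowercase `\u` escape format is an assumption; the actual private helper may differ in details.

```javascript
// Hypothetical stand-in: write one code unit as a \uXXXX escape.
function esc(codeUnit) {
  return '\\u' + codeUnit.toString(16).padStart(4, '0');
}

// Sketch of a min-max code unit range, optionally wrapped in [ ... ].
function codeUnitRangeSketch(min, max, bracket = false) {
  const body = esc(min) + '-' + esc(max);
  return bracket ? '[' + body + ']' : body;
}

// Usage: the bracketed form is a complete character class.
const upper = new RegExp('^' + codeUnitRangeSketch(0x41, 0x5A, true) + '$');
```

The unbracketed form is only meaningful when the caller places it inside a character class together with other ranges.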
getCodeUnitBoxes(ch1, ch2) → {Array.<Object>} private static
Get a list of boxes in hi-lo surrogate space, corresponding to the given character range.
A box {hi: [x, y], lo: [z, w]} represents a regex [x-y][z-w] to match a surrogate pair.
Suppose ch1 and ch2 have surrogate pairs (hi1, lo1) and (hi2, lo2). Then the range of chars from ch1 to ch2 can be represented as the disjunction of three code unit ranges:
[hi1 - hi1][lo1 - 0xDFFF]
|
[hi1+1 - hi2-1][0xDC00 - 0xDFFF]
|
[hi2 - hi2][0xDC00 - lo2]
Often the notation can be optimised (e.g. when hi1 == hi2).
Parameters:
Name | Type | Description |
---|---|---|
ch1 | number | The min character of the range; must be over 0xFFFF |
ch2 | number | The max character of the range; must be at least ch1 |
Returns:
A list of boxes where each box is an object with two properties: 'hi' and 'lo'. 'hi' is an array of two numbers representing the range of the high surrogate. 'lo' is an array of two numbers representing the range of the low surrogate.
- Type
- Array.<Object>
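The decomposition into boxes can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `toSurrogates` and `codeUnitBoxesSketch` are hypothetical names, and the only optimisations shown are collapsing to one box when both characters share a high surrogate and dropping an empty middle box.

```javascript
// Hypothetical helper: split a code point above 0xFFFF into surrogate code units.
function toSurrogates(ch) {
  const offset = ch - 0x10000;
  return { hi: 0xD800 + (offset >> 10), lo: 0xDC00 + (offset & 0x3FF) };
}

// Sketch of the box decomposition for the character range ch1..ch2.
function codeUnitBoxesSketch(ch1, ch2) {
  const a = toSurrogates(ch1);
  const b = toSurrogates(ch2);
  if (a.hi === b.hi) {
    // Same high surrogate: one box covers the whole range.
    return [{ hi: [a.hi, a.hi], lo: [a.lo, b.lo] }];
  }
  // First high surrogate: from lo1 to the end of the low surrogate block.
  const boxes = [{ hi: [a.hi, a.hi], lo: [a.lo, 0xDFFF] }];
  if (a.hi + 1 <= b.hi - 1) {
    // Intermediate high surrogates: every low surrogate is allowed.
    boxes.push({ hi: [a.hi + 1, b.hi - 1], lo: [0xDC00, 0xDFFF] });
  }
  // Last high surrogate: from the start of the low surrogate block to lo2.
  boxes.push({ hi: [b.hi, b.hi], lo: [0xDC00, b.lo] });
  return boxes;
}

// Usage: U+1F600..U+1F64F share the high surrogate 0xD83D, so one box results.
const emoticonBoxes = codeUnitBoxesSketch(0x1F600, 0x1F64F);
// The full supplementary range needs all three boxes.
const allBoxes = codeUnitBoxesSketch(0x10000, 0x10FFFF);
```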
isLeadingSurrogate(unit) → {boolean} static
Check if a code unit is the leading half of a surrogate pair.
Parameters:
Name | Type | Description |
---|---|---|
unit | string | Code unit |
Returns:
- Type
- boolean
isTrailingSurrogate(unit) → {boolean} static
Check if a code unit is the trailing half of a surrogate pair.
Parameters:
Name | Type | Description |
---|---|---|
unit | string | Code unit |
Returns:
- Type
- boolean
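The two surrogate checks amount to range tests on a single UTF-16 code unit: leading (high) surrogates lie in 0xD800-0xDBFF and trailing (low) surrogates in 0xDC00-0xDFFF. A minimal sketch, assuming the code unit is passed as a one-character string as documented; the function names here are hypothetical stand-ins, not the library's own.

```javascript
// Hypothetical stand-in: true if the code unit is a leading (high) surrogate.
function isLeadingSurrogateSketch(unit) {
  const code = unit.charCodeAt(0);
  return code >= 0xD800 && code <= 0xDBFF;
}

// Hypothetical stand-in: true if the code unit is a trailing (low) surrogate.
function isTrailingSurrogateSketch(unit) {
  const code = unit.charCodeAt(0);
  return code >= 0xDC00 && code <= 0xDFFF;
}

// Usage: U+1F600 is stored as the two code units 0xD83D and 0xDE00.
const emoji = '\uD83D\uDE00';
const firstIsLeading = isLeadingSurrogateSketch(emoji[0]);
const secondIsTrailing = isTrailingSurrogateSketch(emoji[1]);
```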
uEsc(codeUnit) → {string} private static
Write a UTF-16 code unit as a JavaScript string literal.
Parameters:
Name | Type | Description |
---|---|---|
codeUnit | number | Integer between 0x0000 and 0xFFFF |
Returns:
String literal ('\u' followed by 4 hex digits)
- Type
- string
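One possible way to produce such an escape is sketched below. `uEscSketch` is a hypothetical name, and both the lowercase hex digits and the range check are assumptions made for illustration; the private helper itself may behave differently.

```javascript
// Hypothetical stand-in for the private helper, not the library's code.
function uEscSketch(codeUnit) {
  if (!Number.isInteger(codeUnit) || codeUnit < 0x0000 || codeUnit > 0xFFFF) {
    throw new RangeError('Expected an integer between 0x0000 and 0xFFFF');
  }
  // Pad to exactly four hex digits, e.g. 0x41 becomes "0041".
  return '\\u' + codeUnit.toString(16).padStart(4, '0');
}

// Usage: the result is six characters, a backslash, "u" and four hex digits.
const escapedA = uEscSketch(0x41);
```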