Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Utils\PHPUtils Class Reference

Static Public Member Functions

static counterToBase64 (int $n)
 Convert a counter to a Base64 encoded string.
 
static jsonEncode ( $o)
 FIXME: Core has FormatJson::encode that does a more comprehensive job.
 
static jsonDecode (string $str, bool $assoc=true)
 json_decode wrapper function. FIXME: Core has FormatJson::parse, which does a more comprehensive job.
 
static makeSet (array $a)
 Convert array to associative array usable as a read-only Set.
 
static lastItem (array $a)
 Helper to get last item of the array.
 
static pushArray (array &$dest, array $source)
 Append an array to an accumulator using the most efficient method available.
 
static safeSubstr (string $s, int $start, ?int $length=null, bool $checkEntireString=false)
 Return a substring, asserting that it is valid UTF-8.
 
static assertValidUTF8 (string $s)
 Helper for verifying a valid UTF-8 encoding.
 
static reStrip (string $re, ?string $newDelimiter=null)
 Helper for joining pieces of regular expressions together.
 
static encodeURIComponent (string $str)
 JS-compatible encodeURIComponent function. FIXME: See T221147 (for a post-port update).
 
static sortArray (&$array)
 Sort keys in an array, recursively, for better reproducibility.
 
static iterable_to_array (iterable $iterable)
 Convert an iterable to an array.
 
static unreachable (string $reason="should never happen")
 Indicate that the code which calls this function is intended to be unreachable.
 
static stripPrefix ( $subject, $prefix)
 If a string starts with a given prefix, remove the prefix.
 
static stripSuffix ( $subject, $suffix)
 If a string ends with a given suffix, remove the suffix.
 

Member Function Documentation

◆ assertValidUTF8()

static Wikimedia\Parsoid\Utils\PHPUtils::assertValidUTF8 ( string $s)
static

Helper for verifying a valid UTF-8 encoding.

Using safeSubstr() is a more efficient way of doing this check in most places, where you can assume that the original string was valid UTF-8. This function does a complete traversal of the string, in time proportional to the length of the string.

Parameters
string $s: The string to check.
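As a rough illustration of what a full-traversal check looks like, the sketch below validates a string with PCRE's `u` modifier, which rejects malformed UTF-8. This is a swapped-in technique and not necessarily how Parsoid implements the check; the function name `isValidUtf8` is hypothetical.

```php
<?php
// Hypothetical helper, not part of PHPUtils: returns true when $s is
// well-formed UTF-8. With the /u modifier, preg_match() validates the whole
// subject and returns false (not 0) when it is malformed.
function isValidUtf8(string $s): bool {
    return preg_match('//u', $s) === 1;
}

$valid = isValidUtf8("caf\u{00E9}"); // complete two-byte sequence for "é"
$broken = isValidUtf8("caf\xC3");    // lead byte with no continuation byte
```

Because the whole string is scanned, the cost grows with its length, which is why the edge-only check in safeSubstr() is usually preferable.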

◆ counterToBase64()

static Wikimedia\Parsoid\Utils\PHPUtils::counterToBase64 ( int $n)
static

Convert a counter to a Base64 encoded string.

Padding is stripped. The characters / and + are replaced with _ and -, respectively. Warning: the maximum integer is 2^31 - 1 for bitwise operations.

Parameters
int $n
Returns
string
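One way to picture the encoding described above is the sketch below. It assumes the counter is packed into big-endian bytes before Base64 encoding, which the text does not state; the name `counterToBase64Sketch` is hypothetical.

```php
<?php
// Hypothetical sketch: pack the counter into bytes (assumed big-endian),
// Base64-encode them, swap "+" and "/" for "-" and "_", and drop "=" padding.
function counterToBase64Sketch(int $n): string {
    $bytes = '';
    do {
        $bytes = chr($n & 0xff) . $bytes; // prepend so the lowest byte ends up last
        $n >>= 8;
    } while ($n > 0);
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

$zero = counterToBase64Sketch(0);        // single zero byte
$slashCase = counterToBase64Sketch(255); // plain Base64 would contain "/"
$plusCase = counterToBase64Sketch(251);  // plain Base64 would contain "+"
```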

◆ encodeURIComponent()

static Wikimedia\Parsoid\Utils\PHPUtils::encodeURIComponent ( string $str)
static

JS-compatible encodeURIComponent function. FIXME: See T221147 (for a post-port update).

Parameters
string $str
Returns
string
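A minimal sketch of one way to get JavaScript-compatible output in PHP: rawurlencode() escapes five characters that JavaScript's encodeURIComponent leaves alone (`!`, `*`, `'`, `(`, `)`), so they are mapped back afterwards. The name `encodeUriComponentSketch` is hypothetical, and this is not claimed to be the exact Parsoid implementation.

```php
<?php
// Hypothetical sketch: percent-encode with rawurlencode(), then restore the
// characters that JavaScript's encodeURIComponent does not escape.
function encodeUriComponentSketch(string $str): string {
    $keep = [
        '%21' => '!',
        '%2A' => '*',
        '%27' => "'",
        '%28' => '(',
        '%29' => ')',
    ];
    return strtr(rawurlencode($str), $keep);
}

$query = encodeUriComponentSketch('a b&c');
$punct = encodeUriComponentSketch("it's (ok)!*");
```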

◆ iterable_to_array()

static Wikimedia\Parsoid\Utils\PHPUtils::iterable_to_array ( iterable $iterable)
static

Convert an iterable to an array.

This function is similar to but not the same as the built-in iterator_to_array, because arrays are iterable but not Traversable!

This function is also present in the wmde/iterable-functions library, but it's short enough that we don't need to pull in an entire new dependency here.

See also
https://stackoverflow.com/questions/44587973/php-iterable-to-array-or-traversable
https://github.com/wmde/iterable-functions/blob/master/src/functions.php

@phan-template T

Parameters
iterable<T> $iterable
Returns
array<T>
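The distinction between arrays and Traversable objects can be sketched as follows. The names `iterableToArraySketch` and `countTo` are hypothetical and used only for illustration.

```php
<?php
// Hypothetical sketch: arrays pass through untouched, anything else that is
// iterable must be a Traversable and can go through iterator_to_array().
function iterableToArraySketch(iterable $iterable): array {
    if (is_array($iterable)) {
        return $iterable;
    }
    return iterator_to_array($iterable);
}

// A small generator to act as a non-array iterable.
function countTo(int $max): Generator {
    for ($i = 1; $i <= $max; $i++) {
        yield $i;
    }
}

$fromArray = iterableToArraySketch(['a' => 1]);
$fromGenerator = iterableToArraySketch(countTo(3));
```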

◆ jsonDecode()

static Wikimedia\Parsoid\Utils\PHPUtils::jsonDecode ( string $str,
bool $assoc = true )
static

json_decode wrapper function. FIXME: Core has FormatJson::parse, which does a more comprehensive job.

Parameters
string $str: The JSON string to decode.
bool $assoc: Controls whether JSON objects are parsed as associative arrays. Defaults to true.
Returns
mixed
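To show what the `$assoc` flag changes, the sketch below calls PHP's built-in json_decode() directly rather than the wrapper; the sample JSON is made up for illustration.

```php
<?php
// Illustrative input only.
$json = '{"page":{"id":7,"tags":["a","b"]}}';

// With the second argument true, JSON objects become associative arrays.
$asArray = json_decode($json, true);
$idFromArray = $asArray['page']['id'];

// With false, JSON objects become stdClass instances.
$asObject = json_decode($json, false);
$idFromObject = $asObject->page->id;
```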

◆ jsonEncode()

static Wikimedia\Parsoid\Utils\PHPUtils::jsonEncode ( $o)
static

FIXME: Core has FormatJson::encode that does a more comprehensive job.

json_encode wrapper function

  • leaves slashes and Unicode characters unescaped
Parameters
mixed $o
Returns
string
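The effect of leaving slashes and Unicode unescaped can be seen with PHP's built-in json_encode() flags. This sketch assumes the wrapper behaves like the two flags shown; any other flags it may pass are not covered here.

```php
<?php
// Illustrative data only.
$data = ['url' => 'a/b', 'name' => "\u{00E9}"];

// Default behaviour: "/" becomes "\/" and non-ASCII becomes \uXXXX escapes.
$escaped = json_encode($data);

// Assumed wrapper behaviour: keep slashes and Unicode characters as-is.
$readable = json_encode($data, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);
```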

◆ lastItem()

static Wikimedia\Parsoid\Utils\PHPUtils::lastItem ( array $a)
static

Helper to get last item of the array.

Parameters
mixed[] $a
Returns
mixed
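A minimal sketch of such a helper, assuming the input is a list (consecutive integer keys starting at 0) and that an empty array should yield null; neither assumption is stated above. The name `lastItemSketch` is hypothetical.

```php
<?php
// Hypothetical sketch: index the final slot directly instead of moving the
// array's internal pointer with end(). Only valid for list-like arrays.
function lastItemSketch(array $a) {
    return $a[count($a) - 1] ?? null;
}

$last = lastItemSketch(['x', 'y', 'z']);
$none = lastItemSketch([]);
```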

◆ makeSet()

static Wikimedia\Parsoid\Utils\PHPUtils::makeSet ( array $a)
static

Convert array to associative array usable as a read-only Set.

Parameters
array $a
Returns
array
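A set of this kind is typically an array whose keys are the members, so membership becomes an isset() lookup. The sketch below assumes `true` is stored as each value; the names `makeSetSketch` and `$voidTags` are hypothetical.

```php
<?php
// Hypothetical sketch: turn a list of scalar values into key => true pairs.
function makeSetSketch(array $a): array {
    return array_fill_keys($a, true);
}

$voidTags = makeSetSketch(['br', 'hr', 'img']);
$hasBr = isset($voidTags['br']);   // key lookup instead of a linear search
$hasDiv = isset($voidTags['div']);
```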

◆ pushArray()

static Wikimedia\Parsoid\Utils\PHPUtils::pushArray ( array & $dest,
array $source )
static

Append an array to an accumulator using the most efficient method available.

Makes sure that accumulation is O(n).

See https://w.wiki/3zvE

Parameters
array &$dest: Destination array
array $source: Array to merge
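To illustrate the accumulation pattern, the sketch below appends in place, so each call does work proportional to the size of `$source` only. Rebuilding the accumulator on every step, as in `$dest = array_merge( $dest, $source )`, recopies everything gathered so far and can become quadratic over many calls. The name `pushArraySketch` is hypothetical, and the real method may pick between strategies.

```php
<?php
// Hypothetical sketch: append each item to the accumulator by reference.
function pushArraySketch(array &$dest, array $source): void {
    foreach ($source as $item) {
        $dest[] = $item;
    }
}

$acc = [];
foreach ([[1, 2], [3], [4, 5]] as $chunk) {
    pushArraySketch($acc, $chunk);
}
```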

◆ reStrip()

static Wikimedia\Parsoid\Utils\PHPUtils::reStrip ( string $re,
?string $newDelimiter = null )
static

Helper for joining pieces of regular expressions together.

This safely strips delimiters from regular expression strings, while ensuring that the result is safely escaped for the new delimiter you plan to use (see the $delimiter argument to preg_quote). Note that using a meta-character for the new delimiter can lead to unexpected results; for example, if you use ! then escaping (?!foo) will break the regular expression.

Parameters
string $re: The regular expression to strip
?string $newDelimiter: Optional delimiter which will be used when recomposing this stripped regular expression into a new regular expression.
Returns
string The regular expression without delimiters or flags
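The stripping step alone can be sketched as below. This simplified version only removes the delimiters and trailing flags; it does not re-escape the body for a new delimiter, so it is not a substitute for reStrip(). The name `stripRegexDelimiters` and the sample patterns are hypothetical.

```php
<?php
// Hypothetical, simplified sketch: drop the opening delimiter, the closing
// delimiter, and any flags after it. No escaping for a new delimiter is done.
function stripRegexDelimiters(string $re): string {
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $open = $re[0];
    $close = $pairs[$open] ?? $open; // bracket-style delimiters close differently
    $end = strrpos($re, $close);
    return substr($re, 1, $end - 1);
}

$digits = stripRegexDelimiters('/\d+/');
$letters = stripRegexDelimiters('#[a-z]+#i');

// Recompose the pieces into one expression with "/" as the delimiter.
$combined = '/^(?:' . $digits . '|' . $letters . ')$/';
$matchesNumber = preg_match($combined, '123') === 1;
$matchesMixed = preg_match($combined, 'ab1') === 1;
```

Neither piece contains a `/`, so the recomposed expression is safe here; when a piece can contain the new delimiter, the escaping that reStrip() performs is what keeps the result valid.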

◆ safeSubstr()

static Wikimedia\Parsoid\Utils\PHPUtils::safeSubstr ( string $s,
int $start,
?int $length = null,
bool $checkEntireString = false )
static

Return a substring, asserting that it is valid UTF-8.

By default we assume the full string was valid UTF-8, which allows us to look at the first and last bytes to make this check. You can check the entire string if you are feeling paranoid; it will take O(N) time (where N is the length of the substring) but so does the substring operation.

If the substring would start beyond the end of the string or end before the start of the string, then this function will return the empty string (as would JavaScript); note that the native substr would return false in this case.

Using this helper instead of native substr is useful during the PHP port to verify that we don't break up Unicode codepoints by the switch from JavaScript UCS-2 offsets to PHP UTF-8 byte offsets.

Parameters
string $s: The (sub)string to check
int $start: The starting offset (in bytes). If negative, the offset is counted from the end of the string.
?int $length: (optional) The maximum length of the returned string. If negative, the end position is counted from the end of the string.
bool $checkEntireString: Whether to do a slower verification of the entire string, not just the edges. Defaults to false.
Returns
string The checked substring
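The edge check relies on a property of UTF-8: continuation bytes always have the bit pattern 10xxxxxx, so a byte offset that lands on one is inside a code point. The sketch below shows only that test, not the full method; the name `isCodepointBoundary` is hypothetical.

```php
<?php
// Hypothetical helper: true when byte $offset in $s does not land on a UTF-8
// continuation byte (binary 10xxxxxx), i.e. it is not inside a code point.
function isCodepointBoundary(string $s, int $offset): bool {
    if ($offset <= 0 || $offset >= strlen($s)) {
        return true; // the two ends of the string are always boundaries
    }
    return (ord($s[$offset]) & 0xC0) !== 0x80;
}

$s = "a\u{00E9}b";                      // bytes: 61 C3 A9 62
$whole = substr($s, 1, 2);              // both bytes of the accented letter
$okStart = isCodepointBoundary($s, 1);  // offset 1 is the lead byte C3
$badStart = isCodepointBoundary($s, 2); // offset 2 is the continuation byte A9
```

In JavaScript the same string has length 3, so an offset carried over unchanged from JavaScript code would point at a different place in the PHP byte string.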

◆ sortArray()

static Wikimedia\Parsoid\Utils\PHPUtils::sortArray ( & $array)
static

Sort keys in an array, recursively, for better reproducibility.

(This is especially useful before serializing as JSON.)

Parameters
mixed &$array
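A recursive key sort of this kind might look like the sketch below, which also shows why it helps before JSON serialization: two arrays with the same content but different insertion order encode identically once sorted. The name `sortArraySketch` is hypothetical.

```php
<?php
// Hypothetical sketch: sort keys at this level, then recurse into each value.
// Non-array values are left untouched.
function sortArraySketch(&$array): void {
    if (!is_array($array)) {
        return;
    }
    ksort($array);
    foreach ($array as $key => $unused) {
        sortArraySketch($array[$key]);
    }
}

$first = ['b' => 1, 'a' => ['d' => 2, 'c' => 3]];
$second = ['a' => ['c' => 3, 'd' => 2], 'b' => 1];
sortArraySketch($first);
sortArraySketch($second);
$firstJson = json_encode($first);
$secondJson = json_encode($second);
```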

◆ stripPrefix()

static Wikimedia\Parsoid\Utils\PHPUtils::stripPrefix ( $subject,
$prefix )
static

If a string starts with a given prefix, remove the prefix.

Otherwise, return the original string. Like preg_replace( "/^$prefix/", '', $subject ) except about 1.14x faster in the replacement case and 2x faster in the no-op case.

Note: adding type declarations to the parameters adds an overhead of 3%. The benchmark above was without type declarations.

Parameters
string $subject
string $prefix
Returns
string

◆ stripSuffix()

static Wikimedia\Parsoid\Utils\PHPUtils::stripSuffix ( $subject,
$suffix )
static

If a string ends with a given suffix, remove the suffix.

Otherwise, return the original string. Like preg_replace( "/$suffix$/", '', $subject ) except faster.

Parameters
string $subject
string $suffix
Returns
string
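Both stripPrefix() and stripSuffix() can be pictured as plain string comparisons followed by substr(), as in the sketch below. The names `stripPrefixSketch` and `stripSuffixSketch` are hypothetical, and the comparison functions used here are one choice among several.

```php
<?php
// Hypothetical sketch: compare the leading bytes, then cut them off.
function stripPrefixSketch(string $subject, string $prefix): string {
    if (strncmp($subject, $prefix, strlen($prefix)) === 0) {
        return substr($subject, strlen($prefix));
    }
    return $subject;
}

// Hypothetical sketch: compare the trailing bytes, then keep what precedes them.
function stripSuffixSketch(string $subject, string $suffix): string {
    $keep = strlen($subject) - strlen($suffix);
    if ($keep >= 0 && substr($subject, $keep) === $suffix) {
        return substr($subject, 0, $keep);
    }
    return $subject;
}

$noPrefix = stripPrefixSketch('mw:Transclusion', 'mw:');
$unchanged = stripPrefixSketch('Transclusion', 'mw:');
$noSuffix = stripSuffixSketch('Main_Page.html', '.html');
$literalDot = stripSuffixSketch('a.b', '.b');
```

Unlike the preg_replace() forms quoted in the descriptions, a plain comparison treats characters such as `.` literally, so no escaping of the prefix or suffix is needed.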

◆ unreachable()

static Wikimedia\Parsoid\Utils\PHPUtils::unreachable ( string $reason = "should never happen")
static

Indicate that the code which calls this function is intended to be unreachable.

This is a workaround for T247093; this has been moved upstream into wikimedia/assert.

Parameters
string $reason
Returns
never
Deprecated
Just throw an UnreachableException instead.
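The pattern recommended by the deprecation note can be sketched as below. To keep the example self-contained, PHP's built-in LogicException stands in for the UnreachableException from wikimedia/assert; the function `headingLevel` is hypothetical.

```php
<?php
// Hypothetical example: the default branch should never run if callers only
// pass "h1" or "h2", so reaching it signals a programming error.
function headingLevel(string $tag): int {
    switch ($tag) {
        case 'h1':
            return 1;
        case 'h2':
            return 2;
        default:
            // Stand-in for throwing an UnreachableException.
            throw new LogicException("should never happen: $tag");
    }
}

$level = headingLevel('h2');

$caught = false;
try {
    headingLevel('p');
} catch (LogicException $e) {
    $caught = true;
}
```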

The documentation for this class was generated from the following file: