Parsoid
A bidirectional parser between wikitext and HTML5
Wikimedia\Parsoid\Utils\PHPUtils Class Reference

This file contains Parsoid-independent PHP helper functions.

Static Public Member Functions

static jsonEncode ( $o)
 FIXME: Core has FormatJson::encode that does a more comprehensive job.
 
static jsonDecode (string $str, bool $assoc=true)
 json_decode wrapper function. FIXME: Core has FormatJson::parse, which does a more comprehensive job.
 
static makeSet (array $a)
 Convert array to associative array usable as a read-only Set.
 
static lastItem (array $a)
 Helper to get last item of the array.
 
static pushArray (array &$dest, array ... $sources)
 Append an array to an accumulator using the most efficient method available.
 
static safeSubstr (string $s, int $start, ?int $length=null, bool $checkEntireString=false)
 Return a substring, asserting that it is valid UTF-8.
 
static assertValidUTF8 (string $s)
 Helper for verifying a valid UTF-8 encoding.
 
static reStrip (string $re, ?string $newDelimiter=null)
 Helper for joining pieces of regular expressions together.
 
static encodeURIComponent (string $str)
 JS-compatible encodeURIComponent function. FIXME: See T221147 (for a post-port update).
 
static sortArray (&$array)
 Sort keys in an array, recursively, for better reproducibility.
 
static arrayEquals (?array $arrA, ?array $arrB, callable $elementEquals)
 Compare the contents of two arrays for equality, given an equality comparison function for elements of the array.
 
static iterable_to_array (iterable $iterable)
 Convert an iterable to an array.
 
static stripPrefix ( $subject, $prefix)
 If a string starts with a given prefix, remove the prefix.
 
static stripSuffix ( $subject, $suffix)
 If a string ends with a given suffix, remove the suffix.
 
static deprecated (string $function, string $version, int $callerOffset=2)
 Logs a warning that a deprecated feature was used.
 
static filterDeprecationForTest (string $regex)
 Deprecation messages matching the supplied regex will be suppressed.
 
static clearDeprecationFilters ()
 Clear all deprecation filters.
 

Detailed Description

This file contains Parsoid-independent PHP helper functions.

Over time, more functions can be migrated out of various other files here.

Member Function Documentation

◆ arrayEquals()

static Wikimedia\Parsoid\Utils\PHPUtils::arrayEquals ( ?array $arrA,
?array $arrB,
callable $elementEquals )
static

Compare the contents of two arrays for equality, given an equality comparison function for elements of the array.

Arrays are equal if they have the same size and element values for equal keys are equal.

For convenience, null can be passed in as well, and the function only returns true if the other argument is also null. This avoids null checks in the caller in many cases.

Parameters
?array $arrA
?array $arrB
callable(mixed,mixed):bool $elementEquals A function to compare non-null elements of $arrA and $arrB for equality.
Returns
bool True if $arrA and $arrB are equal.
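
The rules above (same size, equal values under equal keys, null only equal to null) can be sketched as follows. This is a minimal illustration, not Parsoid's source: the function name sketchArrayEquals is hypothetical, and comparing null elements without calling the callback is an assumption drawn from the "non-null elements" wording of the parameter description.

```php
<?php
// Hypothetical stand-in illustrating the documented contract of arrayEquals().
function sketchArrayEquals(?array $arrA, ?array $arrB, callable $elementEquals): bool {
    // A null argument is only equal to another null argument.
    if ($arrA === null || $arrB === null) {
        return $arrA === $arrB;
    }
    if (count($arrA) !== count($arrB)) {
        return false;
    }
    foreach ($arrA as $key => $valueA) {
        if (!array_key_exists($key, $arrB)) {
            return false;
        }
        $valueB = $arrB[$key];
        // Assumption: null elements are compared directly, so the callback
        // only ever sees non-null values.
        if ($valueA === null || $valueB === null) {
            if ($valueA !== $valueB) {
                return false;
            }
            continue;
        }
        if (!$elementEquals($valueA, $valueB)) {
            return false;
        }
    }
    return true;
}
```

Because the element comparison is supplied by the caller, the same helper can serve strict, loose, or domain-specific equality, for example a case-insensitive string comparison.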

◆ assertValidUTF8()

static Wikimedia\Parsoid\Utils\PHPUtils::assertValidUTF8 ( string $s)
static

Helper for verifying a valid UTF-8 encoding.

Using safeSubstr() is a more efficient way of doing this check in most places, where you can assume that the original string was valid UTF-8. This function does a complete traversal of the string, in time proportional to the length of the string.

Parameters
string $s The string to check.
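
A full-traversal validity check of the kind described above could look like the sketch below. It is an illustration under stated assumptions: the name sketchAssertValidUtf8 and the choice of InvalidArgumentException are hypothetical, and the check relies on PCRE's /u modifier rejecting invalid UTF-8 subjects rather than on whatever mechanism Parsoid actually uses.

```php
<?php
// Hypothetical sketch: verify that an entire string is valid UTF-8.
function sketchAssertValidUtf8(string $s): void {
    // With the /u modifier, preg_match() returns false for a subject that
    // is not valid UTF-8; the empty pattern otherwise matches at offset 0.
    if (preg_match('//u', $s) !== 1) {
        throw new InvalidArgumentException('String is not valid UTF-8');
    }
}
```

The whole string is scanned, which is why the check takes time proportional to the string length.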

◆ deprecated()

static Wikimedia\Parsoid\Utils\PHPUtils::deprecated ( string $function,
string $version,
int $callerOffset = 2 )
static

Logs a warning that a deprecated feature was used.

Where possible, SiteConfig::deprecated() should be used instead, which will use similar capabilities in the host environment.

Parameters
string $function Feature that is deprecated.
string $version Version of Parsoid that the feature was deprecated in.
int $callerOffset How far up the call stack is the original caller. 2 = function that called the function that called PHPUtils::deprecated()

◆ encodeURIComponent()

static Wikimedia\Parsoid\Utils\PHPUtils::encodeURIComponent ( string $str)
static

JS-compatible encodeURIComponent function. FIXME: See T221147 (for a post-port update).

Parameters
string $str
Returns
string
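
To show what "JS-compatible" involves: PHP's rawurlencode() escapes five characters (! ' ( ) *) that JavaScript's encodeURIComponent leaves untouched. One way to reconcile the two is sketched below; the name sketchEncodeUriComponent is hypothetical and this is not presented as Parsoid's implementation.

```php
<?php
// Hypothetical sketch of a JS-compatible encodeURIComponent.
function sketchEncodeUriComponent(string $str): string {
    // rawurlencode() leaves only A-Z a-z 0-9 - _ . ~ unescaped.
    // JavaScript additionally leaves ! ' ( ) * unescaped, so undo those.
    return strtr(rawurlencode($str), [
        '%21' => '!',
        '%27' => "'",
        '%28' => '(',
        '%29' => ')',
        '%2A' => '*',
    ]);
}
```

Reserved characters such as /, ?, = and & are still percent-encoded, and non-ASCII input is encoded byte by byte from its UTF-8 form.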

◆ filterDeprecationForTest()

static Wikimedia\Parsoid\Utils\PHPUtils::filterDeprecationForTest ( string $regex)
static

Deprecation messages matching the supplied regex will be suppressed.

Use this to filter deprecation warnings when testing deprecated code.

Parameters
string $regex

◆ iterable_to_array()

static Wikimedia\Parsoid\Utils\PHPUtils::iterable_to_array ( iterable $iterable)
static

Convert an iterable to an array.

This function is similar to but not the same as the built-in iterator_to_array, because arrays are iterable but not Traversable!

This function is also present in the wmde/iterable-functions library, but it's short enough that we don't need to pull in an entire new dependency here.

See also
https://stackoverflow.com/questions/44587973/php-iterable-to-array-or-traversable
https://github.com/wmde/iterable-functions/blob/master/src/functions.php

@template T

Parameters
iterable<T> $iterable
Returns
array<T>
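
The distinction drawn above (arrays are iterable but not Traversable) suggests a two-branch helper. The sketch below uses the hypothetical name sketchIterableToArray and is meant only to illustrate the idea.

```php
<?php
// Hypothetical sketch: accept any iterable and return a plain array.
function sketchIterableToArray(iterable $iterable): array {
    // Arrays are already the desired result; anything else that satisfies
    // the iterable type is a Traversable and can go to iterator_to_array().
    return is_array($iterable) ? $iterable : iterator_to_array($iterable);
}
```

Keys are preserved in both branches, so a generator that yields string keys produces an associative array.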

◆ jsonDecode()

static Wikimedia\Parsoid\Utils\PHPUtils::jsonDecode ( string $str,
bool $assoc = true )
static

json_decode wrapper function. FIXME: Core has FormatJson::parse, which does a more comprehensive job.

Parameters
string $str String to decode into the JSON object.
bool $assoc Controls whether to parse as an associative array; defaults to true.
Returns
mixed

◆ jsonEncode()

static Wikimedia\Parsoid\Utils\PHPUtils::jsonEncode ( $o)
static

FIXME: Core has FormatJson::encode that does a more comprehensive job.

json_encode wrapper function

  • leaves slashes and Unicode characters unescaped
Parameters
mixed $o
Returns
string
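
As an illustration of the two JSON wrappers, the sketch below passes JSON_UNESCAPED_SLASHES and JSON_UNESCAPED_UNICODE to json_encode(). The flag choice is an assumption inferred from the description above, and the names sketchJsonEncode and sketchJsonDecode, as well as the exception thrown on failure, are hypothetical.

```php
<?php
// Hypothetical sketch of a json_encode wrapper that keeps slashes and
// non-ASCII characters readable in the output.
function sketchJsonEncode($o): string {
    $str = json_encode($o, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);
    if ($str === false) {
        // Assumption: failures are surfaced as exceptions in this sketch.
        throw new RuntimeException('JSON encoding failed: ' . json_last_error_msg());
    }
    return $str;
}

// Hypothetical sketch of a json_decode wrapper that defaults to
// associative arrays instead of stdClass objects.
function sketchJsonDecode(string $str, bool $assoc = true) {
    return json_decode($str, $assoc);
}
```

Without the two flags, json_encode() would write "/" as "\/" and non-ASCII characters as \uXXXX escapes, which is valid JSON but harder to read and compare.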

◆ lastItem()

static Wikimedia\Parsoid\Utils\PHPUtils::lastItem ( array $a)
static

Helper to get last item of the array.

Parameters
mixed[] $a
Returns
mixed

◆ makeSet()

static Wikimedia\Parsoid\Utils\PHPUtils::makeSet ( array $a)
static

Convert array to associative array usable as a read-only Set.

Parameters
list<string> $a
Returns
array<string,true>
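
Given the documented return type array<string,true>, a set of this shape can be built by using the list values as keys. The sketch below uses the hypothetical name sketchMakeSet and is an illustration only.

```php
<?php
// Hypothetical sketch: turn a list of strings into a lookup table whose
// keys are the list entries and whose values are all true.
function sketchMakeSet(array $a): array {
    return array_fill_keys($a, true);
}

// Membership tests become hash lookups instead of linear scans.
$voidElements = sketchMakeSet(['br', 'hr', 'img']);
$isVoid = isset($voidElements['br']);
```

Checking membership with isset() avoids the linear scan that in_array() performs on a plain list.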

◆ pushArray()

static Wikimedia\Parsoid\Utils\PHPUtils::pushArray ( array & $dest,
array ... $sources )
static

Append an array to an accumulator using the most efficient method available.

Pushing N elements onto $dest is guaranteed to be O(N).

See https://w.wiki/3zvE

Parameters
array &$dest Destination array
array ...$sources Arrays to merge
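
The O(N) guarantee can be met by appending in place rather than rebuilding the accumulator. The sketch below shows one such approach with the hypothetical name sketchPushArray; it is not claimed to be the method Parsoid uses.

```php
<?php
// Hypothetical sketch: append the values of each source array to $dest.
function sketchPushArray(array &$dest, array ...$sources): void {
    foreach ($sources as $source) {
        foreach ($source as $item) {
            // In-place append; the existing contents of $dest are not copied.
            $dest[] = $item;
        }
    }
}
```

By contrast, writing `$dest = array_merge( $dest, $source )` inside a loop builds a new array on every call, so the cost grows with the size of the accumulator rather than with the number of elements pushed.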

◆ reStrip()

static Wikimedia\Parsoid\Utils\PHPUtils::reStrip ( string $re,
?string $newDelimiter = null )
static

Helper for joining pieces of regular expressions together.

This safely strips delimiters from regular expression strings, while ensuring that the result is safely escaped for the new delimiter you plan to use (see the $delimiter argument to preg_quote). Note that using a meta-character for the new delimiter can lead to unexpected results; for example, if you use ! then escaping (?!foo) will break the regular expression.

Parameters
string $re The regular expression to strip
?string $newDelimiter Optional delimiter which will be used when recomposing this stripped regular expression into a new regular expression.
Returns
string The regular expression without delimiters or flags

◆ safeSubstr()

static Wikimedia\Parsoid\Utils\PHPUtils::safeSubstr ( string $s,
int $start,
?int $length = null,
bool $checkEntireString = false )
static

Return a substring, asserting that it is valid UTF-8.

By default we assume the full string was valid UTF-8, which allows us to look at the first and last bytes to make this check. You can check the entire string if you are feeling paranoid; it will take O(N) time (where N is the length of the substring) but so does the substring operation.

If the substring would start beyond the end of the string or end before the start of the string, then this function will return the empty string (as would JavaScript); note that the native substr would return false in this case.

Using this helper instead of native substr is useful during the PHP port to verify that we don't break up Unicode codepoints by the switch from JavaScript UCS-2 offsets to PHP UTF-8 byte offsets.

Parameters
string $s The (sub)string to check
int $start The starting offset (in bytes). If negative, the offset is counted from the end of the string.
?int $length (optional) The maximum length of the returned string. If negative, the end position is counted from the end of the string.
bool $checkEntireString Whether to do a slower verification of the entire string, not just the edges. Defaults to false.
Returns
string The checked substring
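
The edge check described above can be sketched as follows: the substring must not begin with a UTF-8 continuation byte, and its final character must be complete. The name sketchSafeSubstr, the exception type, and the exact way the trailing bytes are inspected are assumptions made for illustration; Parsoid's own checks may differ.

```php
<?php
// Hypothetical sketch of a substring helper that checks UTF-8 boundaries.
function sketchSafeSubstr(string $s, int $start, ?int $length = null, bool $checkEntireString = false): string {
    $ss = $length === null ? substr($s, $start) : substr($s, $start, $length);
    if ($ss === false || $ss === '') {
        // Out-of-range requests yield the empty string, as in JavaScript.
        return '';
    }
    if ($checkEntireString) {
        // Slow path: validate every byte of the substring.
        if (preg_match('//u', $ss) !== 1) {
            throw new UnexpectedValueException('Bad UTF-8 in substring');
        }
        return $ss;
    }
    // Fast path, start: a continuation byte (10xxxxxx) means a split character.
    if ((ord($ss[0]) & 0xC0) === 0x80) {
        throw new UnexpectedValueException('Bad UTF-8 at start of substring');
    }
    // Fast path, end: count trailing continuation bytes (at most 3) ...
    $n = strlen($ss);
    $trail = 0;
    while ($trail < $n - 1 && $trail < 3 && (ord($ss[$n - 1 - $trail]) & 0xC0) === 0x80) {
        $trail++;
    }
    // ... and confirm the lead byte before them announces exactly that many.
    $lead = ord($ss[$n - 1 - $trail]);
    if ($lead < 0x80) {
        $expected = 0;
    } elseif (($lead & 0xE0) === 0xC0) {
        $expected = 1;
    } elseif (($lead & 0xF0) === 0xE0) {
        $expected = 2;
    } elseif (($lead & 0xF8) === 0xF0) {
        $expected = 3;
    } else {
        $expected = -1; // stray continuation byte or invalid lead byte
    }
    if ($expected !== $trail) {
        throw new UnexpectedValueException('Bad UTF-8 at end of substring');
    }
    return $ss;
}
```

The fast path inspects only a handful of bytes at each edge, which is why it depends on the original string already being valid UTF-8; an invalid byte in the middle of the substring is only caught when $checkEntireString is true.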

◆ sortArray()

static Wikimedia\Parsoid\Utils\PHPUtils::sortArray ( & $array)
static

Sort keys in an array, recursively, for better reproducibility.

(This is especially useful before serializing as JSON.)

Parameters
mixed &$array
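
A recursive key sort of this kind might be written as below. The name sketchSortArray is hypothetical, and treating non-array values as a no-op is an assumption based on the mixed parameter type.

```php
<?php
// Hypothetical sketch: sort array keys in place, descending into nested arrays.
function sketchSortArray(&$array): void {
    if (!is_array($array)) {
        return; // Assumption: scalars and objects are left untouched.
    }
    ksort($array);
    foreach ($array as $key => $unused) {
        // Index back into $array so the nested array is sorted in place.
        sketchSortArray($array[$key]);
    }
}
```

After sorting, two arrays holding the same data in different insertion orders serialize to identical JSON, which makes diffs and test expectations stable.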

◆ stripPrefix()

static Wikimedia\Parsoid\Utils\PHPUtils::stripPrefix ( $subject,
$prefix )
static

If a string starts with a given prefix, remove the prefix.

Otherwise, return the original string. Like preg_replace( "/^$prefix/", '', $subject ) except about 1.14x faster in the replacement case and 2x faster in the no-op case.

Note: adding type declarations to the parameters adds an overhead of 3%. The benchmark above was without type declarations.

Parameters
string $subject
string $prefix
Returns
string
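
A regex-free prefix strip can be sketched with plain string functions. The name sketchStripPrefix is hypothetical; parameter types are omitted only to mirror the documented signature, and the timings quoted above are not reproduced here.

```php
<?php
// Hypothetical sketch: remove $prefix from the start of $subject if present.
function sketchStripPrefix($subject, $prefix) {
    $len = strlen($prefix);
    // strncmp() compares only the first $len bytes of both strings.
    if (strncmp($subject, $prefix, $len) === 0) {
        return substr($subject, $len);
    }
    return $subject;
}
```

Unlike the preg_replace form, this sketch compares the prefix literally, so regular-expression metacharacters in $prefix carry no special meaning.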

◆ stripSuffix()

static Wikimedia\Parsoid\Utils\PHPUtils::stripSuffix ( $subject,
$suffix )
static

If a string ends with a given suffix, remove the suffix.

Otherwise, return the original string. Like preg_replace( "/$suffix$/", '', $subject ) except faster.

Parameters
string $subject
string $suffix
Returns
string
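
The suffix variant can be sketched the same way, using a negative offset to look at the end of the string. The name sketchStripSuffix is hypothetical, and returning the subject unchanged for an empty suffix is an assumption of this sketch.

```php
<?php
// Hypothetical sketch: remove $suffix from the end of $subject if present.
function sketchStripSuffix($subject, $suffix) {
    $len = strlen($suffix);
    // Guard against an empty suffix: substr( $subject, -0 ) would return
    // the whole string rather than an empty tail.
    if ($len > 0 && substr($subject, -$len) === $suffix) {
        return substr($subject, 0, -$len);
    }
    return $subject;
}
```

As with the prefix sketch, the comparison is literal, so a suffix such as ".txt" matches only a real dot and not an arbitrary character.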

The documentation for this class was generated from the following file: