MediaWiki master
MediaWiki\Utils\UrlUtils Class Reference

A service to expand, parse, and otherwise manipulate URLs.

Public Member Functions

 __construct (array $options=[])
 
 expand (string $url, $defaultProto=PROTO_FALLBACK)
 Expand a potentially local URL to a fully-qualified URL using $wgServer (or one of its alternatives).
 
 expandIRI (string $url)
 Take a URL, make sure it's expanded to fully qualified, and replace any encoded non-ASCII Unicode characters with their UTF-8 original forms for more compact display and legibility for local audiences.
 
 getCanonicalServer ()
 Get the canonical server, i.e. the canonical protocol and host part of the wiki's URL.
 
 getServer ( $proto)
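 Get the wiki's "server", i.e. the protocol and host part of the URL, with a protocol specified using a PROTO_* constant as in expand().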
 
 matchesDomainList (string $url, array $domains)
 Check whether a given URL has a domain that occurs in a given set of domains.
 
 parse (string $url)
 Advanced and configurable version of parse_url().
 
 validAbsoluteProtocols ()
 Like validProtocols(), but excludes '//' from the protocol list.
 
 validProtocols ()
 Returns a partial regular expression of recognized URL protocols, e.g. "http:\\/\\/|https:\\/\\/".
 

Static Public Member Functions

static assemble (array $urlParts)
 This function will reassemble a URL parsed with parse().
 
static removeDotSegments (string $urlPath)
 Remove all dot-segments in the provided URL path.
 

Public Attributes

const CANONICAL_SERVER = 'canonicalServer'
 
const FALLBACK_PROTOCOL = 'fallbackProtocol'
 
const HTTPS_PORT = 'httpsPort'
 
const INTERNAL_SERVER = 'internalServer'
 
const SERVER = 'server'
 
const VALID_PROTOCOLS = 'validProtocols'
 

Detailed Description

A service to expand, parse, and otherwise manipulate URLs.

Since
1.39
Stability: newable

Definition at line 16 of file UrlUtils.php.

Constructor & Destructor Documentation

◆ __construct()

MediaWiki\Utils\UrlUtils::__construct ( array $options = [])
Stability: stable to call
Parameters
array $options: All keys are optional, but if you omit SERVER then calling expand() (and getServer(), expandIRI(), and matchesDomainList()) will throw. Recognized keys:
  • self::SERVER: The protocol and server portion of the URLs to expand, with no other parts (port, path, etc.). Example: 'https://example.com'. Protocol-relative URLs are allowed.
  • self::CANONICAL_SERVER: If SERVER is protocol-relative, this can be set to a fully-qualified version for use when PROTO_CANONICAL is passed to expand(). Defaults to SERVER, with 'http:' prepended if SERVER is protocol-relative.
  • self::INTERNAL_SERVER: An alternative to SERVER that's used when PROTO_INTERNAL is passed to expand(). It's intended for sites that have a different server name exposed to CDNs. Defaults to SERVER.
  • self::FALLBACK_PROTOCOL: Used by expand() when no $defaultProto parameter is provided. Defaults to 'http'. The instance created by ServiceWiring sets this to 'https' if the current request is detected to be via HTTPS, and 'http' otherwise.
  • self::HTTPS_PORT: Defaults to 443. Used when a protocol-relative URL is expanded to https.
  • self::VALID_PROTOCOLS: An array of recognized URL protocols. The default can be found in MainConfigSchema::UrlProtocols['default'].
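
The defaulting rules listed above can be illustrated with a small standalone sketch. The function name resolveOptionsSketch() is hypothetical and is not part of UrlUtils; it only mirrors the documented defaults (the VALID_PROTOCOLS default is left out because it lives in MainConfigSchema), and the real constructor may differ in detail.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: apply the documented defaults to an options array.
 * Keys use the string values of the UrlUtils option constants.
 */
function resolveOptionsSketch(array $options): array {
    $server = $options['server'] ?? null;

    // CANONICAL_SERVER defaults to SERVER, with 'http:' prepended
    // when SERVER is protocol-relative (starts with '//').
    $canonical = $options['canonicalServer'] ?? null;
    if ($canonical === null && $server !== null) {
        $canonical = strncmp($server, '//', 2) === 0 ? 'http:' . $server : $server;
    }

    return [
        'server' => $server,
        'canonicalServer' => $canonical,
        // INTERNAL_SERVER defaults to SERVER.
        'internalServer' => $options['internalServer'] ?? $server,
        // FALLBACK_PROTOCOL defaults to 'http', HTTPS_PORT to 443.
        'fallbackProtocol' => $options['fallbackProtocol'] ?? 'http',
        'httpsPort' => $options['httpsPort'] ?? 443,
    ];
}
```

With only 'server' => '//example.org' supplied, this sketch yields 'http://example.org' as the canonical server and '//example.org' as the internal server.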

Definition at line 68 of file UrlUtils.php.

References MediaWiki\Utils\UrlUtils\CANONICAL_SERVER, MediaWiki\Utils\UrlUtils\expand(), MediaWiki\Utils\UrlUtils\FALLBACK_PROTOCOL, MediaWiki\Utils\UrlUtils\HTTPS_PORT, MediaWiki\Utils\UrlUtils\INTERNAL_SERVER, PROTO_HTTP, MediaWiki\Utils\UrlUtils\SERVER, and MediaWiki\Utils\UrlUtils\VALID_PROTOCOLS.

Member Function Documentation

◆ assemble()

static MediaWiki\Utils\UrlUtils::assemble ( array $urlParts)

This function will reassemble a URL parsed with parse().

This is useful if you need to edit part of a URL and put it back together.

This is the basic structure used (brackets contain keys for $urlParts):

[scheme][delimiter][user]:[pass]@[host]:[port][path]?[query]#[fragment]

Since
1.41
Parameters
array $urlParts: URL parts, as output from parse()
Returns
string URL assembled from its component parts

Definition at line 233 of file UrlUtils.php.
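
To make the structure described for assemble() concrete, here is a minimal standalone sketch that joins the same keys in the same order. assembleSketch() is a hypothetical name; this is not the MediaWiki implementation, and edge-case handling (for example an empty query string) is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: rebuild a URL string from parts keyed like the
 * output of parse(). Optional parts are skipped when absent.
 */
function assembleSketch(array $parts): string {
    $url = ($parts['scheme'] ?? '') . ($parts['delimiter'] ?? '');

    // [user]:[pass]@ is emitted only when a user name is present.
    if (isset($parts['user'])) {
        $url .= $parts['user'];
        if (isset($parts['pass'])) {
            $url .= ':' . $parts['pass'];
        }
        $url .= '@';
    }

    $url .= $parts['host'] ?? '';
    if (isset($parts['port'])) {
        $url .= ':' . $parts['port'];
    }
    $url .= $parts['path'] ?? '';

    // Assumption: an empty query string is treated like a missing one.
    if (isset($parts['query']) && $parts['query'] !== '') {
        $url .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $url .= '#' . $parts['fragment'];
    }
    return $url;
}
```

Because the delimiter is stored separately, the same code handles both "https://..." and "mailto:..." style URLs without special cases.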

◆ expand()

MediaWiki\Utils\UrlUtils::expand ( string $url,
$defaultProto = PROTO_FALLBACK )

Expand a potentially local URL to a fully-qualified URL using $wgServer (or one of its alternatives).

The meaning of the PROTO_* constants is as follows:
  • PROTO_HTTP: Output a URL starting with http://
  • PROTO_HTTPS: Output a URL starting with https://
  • PROTO_RELATIVE: Output a URL starting with // (protocol-relative URL)
  • PROTO_FALLBACK: Output a URL starting with the FALLBACK_PROTOCOL option
  • PROTO_CURRENT: Legacy alias for PROTO_FALLBACK
  • PROTO_CANONICAL: For URLs without a domain, like /w/index.php, use CANONICAL_SERVER. For protocol-relative URLs, use the protocol of CANONICAL_SERVER
  • PROTO_INTERNAL: Like PROTO_CANONICAL, but uses INTERNAL_SERVER instead of CANONICAL_SERVER

If $url specifies a protocol, or $url is domain-relative and $wgServer specifies a protocol, PROTO_HTTP, PROTO_HTTPS, PROTO_RELATIVE and PROTO_CURRENT do not change that.

Parent references (/../) in the path are resolved (as in removeDotSegments()).

Todo
this won't work with current-path-relative URLs like "subdir/foo.html", etc.
Exceptions
BadMethodCallException if no server was passed to the constructor
Parameters
string $url: A URL; can be absolute (e.g. http://example.com/foo/bar), protocol-relative (//example.com/foo/bar) or domain-relative (/foo/bar).
string | int | null $defaultProto: One of the PROTO_* constants, as described above.
Returns
?string Fully-qualified URL, current-path-relative URL or null if no valid URL can be constructed

Definition at line 124 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct(), and wfExpandUrl().
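
The three input forms that expand() accepts can be illustrated with a simplified standalone sketch. expandSketch() and its $protoPrefix parameter are hypothetical: the prefix stands in for the protocol that a PROTO_* constant would select, and the sketch deliberately omits URL validation, HTTPS_PORT handling, and dot-segment removal.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of UrlUtils.
 * $protoPrefix is 'http:', 'https:' or '' (keep the result protocol-relative).
 */
function expandSketch(string $url, string $server, string $protoPrefix): string {
    // Protocol-relative URL: only the protocol is missing.
    if (strncmp($url, '//', 2) === 0) {
        return $protoPrefix . $url;
    }

    // Domain-relative URL: prepend the server. A protocol already present
    // in $server is kept; the prefix applies only when $server itself is
    // protocol-relative.
    if ($url !== '' && $url[0] === '/') {
        $base = strncmp($server, '//', 2) === 0 ? $protoPrefix . $server : $server;
        return $base . $url;
    }

    // Absolute and current-path-relative URLs are returned unchanged here.
    return $url;
}
```

Note how a server of 'http://example.org' keeps its own protocol even when 'https:' is requested, which matches the rule stated above for domain-relative URLs.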

◆ expandIRI()

MediaWiki\Utils\UrlUtils::expandIRI ( string $url)

Take a URL, make sure it's expanded to fully qualified, and replace any encoded non-ASCII Unicode characters with their UTF-8 original forms for more compact display and legibility for local audiences.

Todo
handle punycode domains too
Exceptions
BadMethodCallException if no server was passed to the constructor
Parameters
string $url
Returns
?string

Definition at line 472 of file UrlUtils.php.
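
The decoding step described for expandIRI() can be sketched as follows. decodeNonAsciiEscapesSketch() is a hypothetical name; the sketch covers only the percent-decoding of non-ASCII bytes (values 0x80 and above), skips the expansion step, and does not check that the decoded bytes form valid UTF-8.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decode runs of percent-escapes whose byte value is
 * 0x80 or higher, leaving ASCII escapes such as %20 untouched.
 */
function decodeNonAsciiEscapesSketch(string $url): string {
    $decoded = preg_replace_callback(
        // First hex digit 8-F means the escaped byte is non-ASCII.
        '/(?:%[89A-F][0-9A-F])+/i',
        static function (array $m): string {
            return rawurldecode($m[0]);
        },
        $url
    );
    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $decoded ?? $url;
}
```

For example, 'https://example.org/wiki/%C3%89cole%20X' becomes 'https://example.org/wiki/École%20X': the two-byte UTF-8 sequence is decoded while the encoded space is preserved.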

◆ getCanonicalServer()

MediaWiki\Utils\UrlUtils::getCanonicalServer ( )

Get the canonical server, i.e. the canonical protocol and host part of the wiki's URL.

Returns
string

Definition at line 217 of file UrlUtils.php.

◆ getServer()

MediaWiki\Utils\UrlUtils::getServer ( $proto)

Get the wiki's "server", i.e. the protocol and host part of the URL, with a protocol specified using a PROTO_* constant as in expand().

Exceptions
BadMethodCallException if no server was passed to the constructor
Parameters
string | int | null $proto: One of the PROTO_* constants.
Returns
?string The URL, or null on failure

Definition at line 204 of file UrlUtils.php.

◆ matchesDomainList()

MediaWiki\Utils\UrlUtils::matchesDomainList ( string $url,
array $domains )

Check whether a given URL has a domain that occurs in a given set of domains.

Exceptions
BadMethodCallException if no server was passed to the constructor
Parameters
string $url
array $domains: Array of domains (strings)
Returns
bool True if the host part of $url ends in one of the strings in $domains

Definition at line 492 of file UrlUtils.php.
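
The "host ends in one of the strings" check for matchesDomainList() can be sketched with PHP's built-in parse_url(). hostMatchesDomainListSketch() is a hypothetical name, and comparing with a leading dot so that only whole domain labels match is an assumption of this sketch, not something stated above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true if the host of $url equals, or is a subdomain
 * of, one of the entries in $domains.
 */
function hostMatchesDomainListSketch(string $url, array $domains): bool {
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // No host, e.g. a relative URL.
    }

    // Assumption: prefix both sides with '.' so 'notexample.org'
    // does not match 'example.org'.
    $host = '.' . $host;
    foreach ($domains as $domain) {
        $suffix = '.' . $domain;
        if (substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}
```

A plain suffix test without the dot would also accept unrelated hosts that merely share trailing characters, which is why the sketch compares at a label boundary.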

◆ parse()

MediaWiki\Utils\UrlUtils::parse ( string $url)

Advanced and configurable version of parse_url().

1) Add a "delimiter" element to the array, which makes it possible to blindly reassemble any URL regardless of protocol, including those that don't use ://, such as "mailto:" and "news:".
2) Reject URLs with protocols not in $wgUrlProtocols.
3) Reject relative or incomplete URLs that parse_url() would return a partial array for.

If all you need is to extract parts of an HTTP or HTTPS URL (i.e. not specific to site-configurable extra protocols, or user input) then parse_url() can be used directly instead.

Parameters
string $url: A URL to parse
Returns
?string[] Bits of the URL in an associative array, or null on failure. Possible fields:
  • scheme: URI scheme (protocol), e.g. 'http', 'mailto'. Lowercase, always present, but can be an empty string for protocol-relative URLs.
  • delimiter: either '://', ':' or '//'. Always present.
  • host: domain name / IP. Always present, but could be an empty string, e.g. for file: URLs.
  • port: port number. Will be missing when port is not explicitly specified.
  • user: user name, e.g. for HTTP Basic auth URLs such as http://user:pass@example.com/ Missing when there is no username.
  • pass: password, same as above.
  • path: path including the leading /. Will be missing when empty (e.g. 'http://example.com')
  • query: query string (as a string; see wfCgiToArray() for parsing it), can be missing.
  • fragment: the part after #, can be missing.

Definition at line 421 of file UrlUtils.php.
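
The role of the "delimiter" field returned by parse() can be illustrated with a small standalone sketch that splits off only the scheme and delimiter. splitSchemeSketch() and its 'rest' key are hypothetical; the sketch does not check the scheme against the configured protocol list and does not parse the remainder of the URL.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split a URL into scheme, delimiter and the rest.
 * Returns null when the URL has neither a scheme nor a leading '//'.
 */
function splitSchemeSketch(string $url): ?array {
    // Protocol-relative URL: empty scheme, '//' delimiter.
    if (strncmp($url, '//', 2) === 0) {
        return ['scheme' => '', 'delimiter' => '//', 'rest' => substr($url, 2)];
    }

    // A scheme followed by ':' and, optionally, '//'.
    if (!preg_match('/^([a-z][a-z0-9+.-]*):(\/\/)?/i', $url, $m)) {
        return null; // Relative or incomplete URL.
    }

    return [
        'scheme' => strtolower($m[1]),
        'delimiter' => ($m[2] ?? '') !== '' ? '://' : ':',
        'rest' => substr($url, strlen($m[0])),
    ];
}
```

Concatenating scheme, delimiter, and rest reproduces the input (apart from scheme case), which is what makes blind reassembly possible for schemes such as "mailto:".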

◆ removeDotSegments()

static MediaWiki\Utils\UrlUtils::removeDotSegments ( string $urlPath)

Remove all dot-segments in the provided URL path.

For example, '/a/./b/../c/' becomes '/a/c/'. For details on the algorithm, see RFC 3986, section 5.2.4.

Since
1.41
Parameters
string $urlPath: URL path, potentially containing dot-segments
Returns
string URL path with all dot-segments removed

Definition at line 283 of file UrlUtils.php.
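
For readers who want to see the algorithm that removeDotSegments() refers to, the following is a direct standalone transcription of the steps in RFC 3986, section 5.2.4. removeDotSegmentsSketch() is a hypothetical name, and the MediaWiki implementation may be structured differently.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: remove '.' and '..' segments per RFC 3986, section 5.2.4. */
function removeDotSegmentsSketch(string $path): string {
    $input = $path;
    $output = '';

    // Drop the last segment (and its leading '/') from the output buffer.
    $popSegment = static function (string $out): string {
        $pos = strrpos($out, '/');
        return $pos === false ? '' : substr($out, 0, $pos);
    };

    while ($input !== '') {
        if (strncmp($input, '../', 3) === 0) {            // Step A
            $input = substr($input, 3);
        } elseif (strncmp($input, './', 2) === 0) {       // Step A
            $input = substr($input, 2);
        } elseif (strncmp($input, '/./', 3) === 0) {      // Step B
            $input = substr($input, 2);
        } elseif ($input === '/.') {                      // Step B
            $input = '/';
        } elseif (strncmp($input, '/../', 4) === 0) {     // Step C
            $input = substr($input, 3);
            $output = $popSegment($output);
        } elseif ($input === '/..') {                     // Step C
            $input = '/';
            $output = $popSegment($output);
        } elseif ($input === '.' || $input === '..') {    // Step D
            $input = '';
        } else {                                          // Step E
            // Move the first path segment (with any leading '/') to the output.
            $next = strpos($input, '/', $input[0] === '/' ? 1 : 0);
            if ($next === false) {
                $output .= $input;
                $input = '';
            } else {
                $output .= substr($input, 0, $next);
                $input = substr($input, $next);
            }
        }
    }
    return $output;
}
```

Applied to '/a/./b/../c/', the sketch returns '/a/c/', matching the example above; extra '..' segments that would climb above the root are discarded.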

◆ validAbsoluteProtocols()

MediaWiki\Utils\UrlUtils::validAbsoluteProtocols ( )

Like validProtocols(), but excludes '//' from the protocol list.

Use this if you need a regex that matches all URL protocols but does not match protocol-relative URLs.

Returns
string

Definition at line 365 of file UrlUtils.php.

◆ validProtocols()

MediaWiki\Utils\UrlUtils::validProtocols ( )

Returns a partial regular expression of recognized URL protocols, e.g. "http:\\/\\/|https:\\/\\/".

Returns
string

Definition at line 354 of file UrlUtils.php.

Referenced by MediaWiki\Parser\Parser\__construct().
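
The difference between validProtocols() and validAbsoluteProtocols() can be illustrated with a standalone sketch that builds a partial regular expression from a protocol list. protocolRegexSketch() and the sample list are hypothetical; the exact escaping in the sketch's output comes from preg_quote() and may not match the string shown above character for character.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build an alternation of quoted protocols, suitable
 * for embedding in a '/'-delimited pattern. Pass false as the second
 * argument to leave out the protocol-relative '//' entry.
 */
function protocolRegexSketch(array $protocols, bool $includeProtocolRelative = true): string {
    $parts = [];
    foreach ($protocols as $protocol) {
        if ($includeProtocolRelative || $protocol !== '//') {
            // Escape regex metacharacters and the '/' delimiter.
            $parts[] = preg_quote($protocol, '/');
        }
    }
    return implode('|', $parts);
}

// Sample list for illustration only; the real default lives in configuration.
$sample = ['http://', 'https://', 'mailto:', '//'];
$all = protocolRegexSketch($sample);
$absolute = protocolRegexSketch($sample, false);
```

The result is only a fragment, so callers wrap it in a group before use, for example '/^(?:' . $absolute . ')/i'.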

Member Data Documentation

◆ CANONICAL_SERVER

const MediaWiki\Utils\UrlUtils::CANONICAL_SERVER = 'canonicalServer'

Definition at line 18 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ FALLBACK_PROTOCOL

const MediaWiki\Utils\UrlUtils::FALLBACK_PROTOCOL = 'fallbackProtocol'

Definition at line 20 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ HTTPS_PORT

const MediaWiki\Utils\UrlUtils::HTTPS_PORT = 'httpsPort'

Definition at line 21 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ INTERNAL_SERVER

const MediaWiki\Utils\UrlUtils::INTERNAL_SERVER = 'internalServer'

Definition at line 19 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ SERVER

const MediaWiki\Utils\UrlUtils::SERVER = 'server'

Definition at line 17 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ VALID_PROTOCOLS

const MediaWiki\Utils\UrlUtils::VALID_PROTOCOLS = 'validProtocols'

Definition at line 22 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().


The documentation for this class was generated from the following file: