MediaWiki  master
MediaWiki\Utils\UrlUtils Class Reference

A service to expand, parse, and otherwise manipulate URLs.

Public Member Functions

 __construct (array $options=[])
 
 assemble (array $urlParts)
 This function will reassemble a URL parsed with parse().
 
 expand (string $url, $defaultProto=PROTO_FALLBACK)
 Expand a potentially local URL to a fully-qualified URL.
 
 expandIRI (string $url)
 Take a URL, make sure it's expanded to fully qualified, and replace any encoded non-ASCII Unicode characters with their UTF-8 original forms for more compact display and legibility for local audiences.
 
 getServer ( $proto)
 Get the wiki's "server", i.e. the protocol and host part of the URL.
 
 matchesDomainList (string $url, array $domains)
 Check whether a given URL has a domain that occurs in a given set of domains.
 
 parse (string $url)
 parse_url() work-alike, but non-broken.
 
 removeDotSegments (string $urlPath)
 Remove all dot-segments in the provided URL path.
 
 validAbsoluteProtocols ()
 Like validProtocols(), but excludes '//' from the protocol list.
 
 validProtocols ()
 Returns a regular expression of recognized URL protocols.
 

Public Attributes

const CANONICAL_SERVER = 'canonicalServer'
 
const FALLBACK_PROTOCOL = 'fallbackProtocol'
 
const HTTPS_PORT = 'httpsPort'
 
const INTERNAL_SERVER = 'internalServer'
 
const SERVER = 'server'
 
const VALID_PROTOCOLS = 'validProtocols'
 

Detailed Description

A service to expand, parse, and otherwise manipulate URLs.

Since
1.39
Stability: newable

Definition at line 17 of file UrlUtils.php.

Constructor & Destructor Documentation

◆ __construct()

MediaWiki\Utils\UrlUtils::__construct ( array  $options = [])
Stability: stable to call
Parameters
array $options: All keys are optional, but if you omit SERVER then calling expand() (and getServer(), expandIRI(), and matchesDomainList()) will throw. Recognized keys:
  • self::SERVER: The protocol and server portion of the URLs to expand, with no other parts (port, path, etc.). Example: 'https://example.com'. Protocol-relative URLs are allowed.
  • self::CANONICAL_SERVER: If SERVER is protocol-relative, this can be set to a fully-qualified version for use when PROTO_CANONICAL is passed to expand(). Defaults to SERVER, with 'http:' prepended if SERVER is protocol-relative.
  • self::INTERNAL_SERVER: An alternative to SERVER that's used when PROTO_INTERNAL is passed to expand(). It's intended for sites that have a different server name exposed to CDNs. Defaults to SERVER.
  • self::FALLBACK_PROTOCOL: Used by expand() when no $defaultProto parameter is provided. Defaults to 'http'. The instance created by ServiceWiring sets this to 'https' if the current request is detected to be via HTTPS, and 'http' otherwise.
  • self::HTTPS_PORT: Defaults to 443. Used when a protocol-relative URL is expanded to https.
  • self::VALID_PROTOCOLS: An array of recognized URL protocols. The default can be found in MainConfigSchema::UrlProtocols['default'].

Definition at line 69 of file UrlUtils.php.

References MediaWiki\Utils\UrlUtils\CANONICAL_SERVER, MediaWiki\Utils\UrlUtils\expand(), MediaWiki\Utils\UrlUtils\FALLBACK_PROTOCOL, MediaWiki\Utils\UrlUtils\HTTPS_PORT, MediaWiki\Utils\UrlUtils\INTERNAL_SERVER, PROTO_HTTP, MediaWiki\Utils\UrlUtils\SERVER, and MediaWiki\Utils\UrlUtils\VALID_PROTOCOLS.

Member Function Documentation

◆ assemble()

MediaWiki\Utils\UrlUtils::assemble ( array  $urlParts)

This function will reassemble a URL parsed with parse().

This is useful if you need to edit part of a URL and put it back together.

This is the basic structure used (brackets contain keys for $urlParts): [scheme][delimiter][user]:[pass]@[host]:[port][path]?[query]#[fragment]

Todo:
Need to integrate this into expand() (see T34168)
Parameters
array $urlParts: URL parts, as output from parse()
Returns
string URL assembled from its component parts
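
The bracketed structure above can be sketched in standalone PHP. This is an illustrative approximation, not MediaWiki's implementation: the function name assembleUrlSketch() is hypothetical, and it assumes that each separator (':', '@', '?', '#') is emitted only when the part next to it is present.

```php
<?php
/**
 * Hypothetical sketch of reassembling a URL from parse()-style parts.
 * Not the MediaWiki implementation; the separator handling is an assumption.
 */
function assembleUrlSketch( array $urlParts ): string {
	// [scheme][delimiter]
	$url = ( $urlParts['scheme'] ?? '' ) . ( $urlParts['delimiter'] ?? '' );

	// [user]:[pass]@ (only when a user name is present)
	if ( isset( $urlParts['user'] ) ) {
		$url .= $urlParts['user'];
		if ( isset( $urlParts['pass'] ) ) {
			$url .= ':' . $urlParts['pass'];
		}
		$url .= '@';
	}

	// [host]:[port]
	$url .= $urlParts['host'] ?? '';
	if ( isset( $urlParts['port'] ) ) {
		$url .= ':' . $urlParts['port'];
	}

	// [path]?[query]#[fragment]
	$url .= $urlParts['path'] ?? '';
	if ( isset( $urlParts['query'] ) && $urlParts['query'] !== '' ) {
		$url .= '?' . $urlParts['query'];
	}
	if ( isset( $urlParts['fragment'] ) ) {
		$url .= '#' . $urlParts['fragment'];
	}

	return $url;
}

// Edit one part of an already-parsed URL and put it back together.
$parts = [
	'scheme' => 'https',
	'delimiter' => '://',
	'host' => 'example.com',
	'path' => '/w/index.php',
	'query' => 'title=Foo',
];
$parts['fragment'] = 'History';
echo assembleUrlSketch( $parts ), "\n";
// prints "https://example.com/w/index.php?title=Foo#History"
```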

Definition at line 218 of file UrlUtils.php.

◆ expand()

MediaWiki\Utils\UrlUtils::expand ( string  $url,
  $defaultProto = PROTO_FALLBACK 
)

Expand a potentially local URL to a fully-qualified URL.

The meaning of the PROTO_* constants is as follows:
  • PROTO_HTTP: Output a URL starting with http://
  • PROTO_HTTPS: Output a URL starting with https://
  • PROTO_RELATIVE: Output a URL starting with // (protocol-relative URL)
  • PROTO_FALLBACK: Output a URL starting with the FALLBACK_PROTOCOL option
  • PROTO_CURRENT: Legacy alias for PROTO_FALLBACK
  • PROTO_CANONICAL: For URLs without a domain, like /w/index.php, use CANONICAL_SERVER. For protocol-relative URLs, use the protocol of CANONICAL_SERVER
  • PROTO_INTERNAL: Like PROTO_CANONICAL, but uses INTERNAL_SERVER instead of CANONICAL_SERVER

Todo:
this won't work with current-path-relative URLs like "subdir/foo.html", etc.
Exceptions
BadMethodCallException: if no server was passed to the constructor
Parameters
string $url: Either fully-qualified or a local path + query
string | int | null $defaultProto: One of the PROTO_* constants. Determines the protocol to use if $url or SERVER is protocol-relative
Returns
?string Fully-qualified URL, current-path-relative URL or null if no valid URL can be constructed

Definition at line 118 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ expandIRI()

MediaWiki\Utils\UrlUtils::expandIRI ( string  $url)

Take a URL, make sure it's expanded to fully qualified, and replace any encoded non-ASCII Unicode characters with their UTF-8 original forms for more compact display and legibility for local audiences.

Todo:
handle punycode domains too
Exceptions
BadMethodCallException: if no server was passed to the constructor
Parameters
string $url
Returns
?string
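
The decoding half of this behaviour can be illustrated with a standalone PHP sketch. It is an approximation under stated assumptions rather than MediaWiki's code: decodeNonAsciiEscapesSketch() is a hypothetical name, the expansion step is skipped entirely, and the sketch does not check that the decoded bytes form valid UTF-8.

```php
<?php
/**
 * Hypothetical sketch: decode only percent-escapes whose byte value is
 * 0x80 or above (first hex digit 8-F), leaving ASCII escapes such as %20
 * untouched. Not the MediaWiki implementation.
 */
function decodeNonAsciiEscapesSketch( string $url ): string {
	$decoded = preg_replace_callback(
		'/(?:%[89A-F][0-9A-F])+/i',
		static function ( array $m ): string {
			// Turn a run like "%C3%89" back into its raw bytes.
			return rawurldecode( $m[0] );
		},
		$url
	);
	// preg_replace_callback() returns null on a regex error; fall back to the input.
	return $decoded ?? $url;
}

echo decodeNonAsciiEscapesSketch( 'https://example.com/wiki/%C3%89cole%20primaire' ), "\n";
// prints "https://example.com/wiki/École%20primaire"
```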

Definition at line 476 of file UrlUtils.php.

◆ getServer()

MediaWiki\Utils\UrlUtils::getServer (   $proto)

Get the wiki's "server", i.e. the protocol and host part of the URL, with a protocol specified using a PROTO_* constant as in expand().

Exceptions
BadMethodCallException: if no server was passed to the constructor
Parameters
string | int | null $proto: One of the PROTO_* constants.
Returns
?string The URL, or null on failure

Definition at line 198 of file UrlUtils.php.

◆ matchesDomainList()

MediaWiki\Utils\UrlUtils::matchesDomainList ( string  $url,
array  $domains 
)

Check whether a given URL has a domain that occurs in a given set of domains.

Exceptions
BadMethodCallException: if no server was passed to the constructor
Parameters
string $url
array $domains: Array of domains (strings)
Returns
bool True if the host part of $url ends in one of the strings in $domains
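
A host-suffix check of this kind might look like the following standalone PHP sketch. It is not MediaWiki's implementation: matchesDomainListSketch() is a hypothetical name, it uses PHP's built-in parse_url() rather than UrlUtils::parse(), and it assumes that a match must fall on a dot boundary, which the description above does not state.

```php
<?php
/**
 * Hypothetical sketch of a domain-list check based on host suffixes.
 * Assumption: "example.org" should match "www.example.org" but not
 * "notexample.org", so both sides are prefixed with a dot before comparing.
 */
function matchesDomainListSketch( string $url, array $domains ): bool {
	$host = parse_url( $url, PHP_URL_HOST );
	if ( !is_string( $host ) || $host === '' ) {
		return false;
	}
	$host = '.' . $host;
	foreach ( $domains as $domain ) {
		$suffix = '.' . $domain;
		if ( substr( $host, -strlen( $suffix ) ) === $suffix ) {
			return true;
		}
	}
	return false;
}

var_dump( matchesDomainListSketch( 'https://en.wikipedia.org/wiki/Foo', [ 'wikipedia.org' ] ) );
// prints "bool(true)"
var_dump( matchesDomainListSketch( 'https://notwikipedia.org/', [ 'wikipedia.org' ] ) );
// prints "bool(false)"
```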

Definition at line 498 of file UrlUtils.php.

◆ parse()

MediaWiki\Utils\UrlUtils::parse ( string  $url)

parse_url() work-alike, but non-broken.

Differences:

  1) Handles protocols that don't use :// (e.g., mailto: and news:, as well as protocol-relative URLs) correctly.
  2) Adds a "delimiter" element to the array (see the field list below).
  3) Verifies that the protocol is on the UrlProtocols allowed list.
  4) Rejects some invalid URLs that parse_url() doesn't, e.g. the empty string or URLs starting with a line feed character.

Parameters
string $url: A URL to parse
Returns
?string[] Bits of the URL in an associative array, or null on failure. Possible fields:
  • scheme: URI scheme (protocol), e.g. 'http', 'mailto'. Lowercase, always present, but can be an empty string for protocol-relative URLs.
  • delimiter: either '://', ':' or '//'. Always present.
  • host: domain name / IP. Always present, but could be an empty string, e.g. for file: URLs.
  • port: port number. Will be missing when port is not explicitly specified.
  • user: user name, e.g. for HTTP Basic auth URLs such as http://user:pass@example.com/ Missing when there is no username.
  • pass: password, same as above.
  • path: path including the leading /. Will be missing when empty (e.g. 'http://example.com')
  • query: query string (as a string; see wfCgiToArray() for parsing it), can be missing.
  • fragment: the part after #, can be missing.
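
To illustrate the scheme and delimiter fields, here is a small standalone PHP sketch that splits a URL into its scheme, its delimiter, and everything after. It is a simplification, not MediaWiki's parser: splitSchemeSketch() and the 'rest' key are hypothetical, and the sketch looks only at the shape of the string, whereas parse() also checks the protocol against the allowed list.

```php
<?php
/**
 * Hypothetical sketch: split a URL into scheme, delimiter, and remainder.
 * Returns null when the string has no recognizable scheme or "//" prefix,
 * which covers the empty string and URLs starting with a line feed.
 */
function splitSchemeSketch( string $url ): ?array {
	// Protocol-relative URL: empty scheme, "//" delimiter.
	if ( strncmp( $url, '//', 2 ) === 0 ) {
		return [ 'scheme' => '', 'delimiter' => '//', 'rest' => substr( $url, 2 ) ];
	}
	// Try "://" before ":" so that "https://" is not split as "https" + ":".
	if ( preg_match( '!^([a-z][a-z0-9+.-]*)(://|:)(.*)$!is', $url, $m ) ) {
		return [ 'scheme' => strtolower( $m[1] ), 'delimiter' => $m[2], 'rest' => $m[3] ];
	}
	return null;
}

print_r( splitSchemeSketch( 'mailto:user@example.com' ) );
print_r( splitSchemeSketch( '//example.com/w/index.php' ) );
```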

Definition at line 410 of file UrlUtils.php.

◆ removeDotSegments()

MediaWiki\Utils\UrlUtils::removeDotSegments ( string  $urlPath)

Remove all dot-segments in the provided URL path.

For example, '/a/./b/../c/' becomes '/a/c/'. For details on the algorithm, please see RFC3986 section 5.2.4.

Todo:
Need to integrate this into expand() (see T34168)
Parameters
string $urlPath: URL path, potentially containing dot-segments
Returns
string URL path with all dot-segments removed
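
The algorithm in RFC 3986 section 5.2.4 can be sketched in standalone PHP as follows. This is a direct transcription of the RFC's steps for illustration, not MediaWiki's implementation, and the name removeDotSegmentsSketch() is hypothetical.

```php
<?php
/**
 * Hypothetical standalone sketch of RFC 3986 section 5.2.4
 * ("Remove Dot Segments"). Not the MediaWiki implementation.
 */
function removeDotSegmentsSketch( string $path ): string {
	$output = '';
	// Drops the last segment, and the "/" before it, from the output buffer.
	$dropLast = static function ( string $buffer ): string {
		$slash = strrpos( $buffer, '/' );
		return $slash === false ? '' : substr( $buffer, 0, $slash );
	};

	while ( $path !== '' ) {
		if ( strncmp( $path, '../', 3 ) === 0 ) {
			// Step A: strip a leading "../"
			$path = substr( $path, 3 );
		} elseif ( strncmp( $path, './', 2 ) === 0 ) {
			// Step A: strip a leading "./"
			$path = substr( $path, 2 );
		} elseif ( strncmp( $path, '/./', 3 ) === 0 ) {
			// Step B: "/./" becomes "/"
			$path = substr( $path, 2 );
		} elseif ( $path === '/.' ) {
			// Step B: a trailing "/." becomes "/"
			$path = '/';
		} elseif ( strncmp( $path, '/../', 4 ) === 0 ) {
			// Step C: "/../" becomes "/" and the previous segment is dropped
			$path = substr( $path, 3 );
			$output = $dropLast( $output );
		} elseif ( $path === '/..' ) {
			// Step C: a trailing "/.." becomes "/" and the previous segment is dropped
			$path = '/';
			$output = $dropLast( $output );
		} elseif ( $path === '.' || $path === '..' ) {
			// Step D: a lone "." or ".." is discarded
			$path = '';
		} else {
			// Step E: move the first segment (with its leading "/") to the output
			$next = strpos( $path, '/', 1 );
			if ( $next === false ) {
				$output .= $path;
				$path = '';
			} else {
				$output .= substr( $path, 0, $next );
				$path = substr( $path, $next );
			}
		}
	}
	return $output;
}

echo removeDotSegmentsSketch( '/a/./b/../c/' ), "\n";
// prints "/a/c/"
```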

Definition at line 269 of file UrlUtils.php.

◆ validAbsoluteProtocols()

MediaWiki\Utils\UrlUtils::validAbsoluteProtocols ( )

Like validProtocols(), but excludes '//' from the protocol list.

Use this if you need a regex that matches all URL protocols but does not match protocol-relative URLs.

Returns
string

Definition at line 354 of file UrlUtils.php.

◆ validProtocols()

MediaWiki\Utils\UrlUtils::validProtocols ( )

Returns a regular expression of recognized URL protocols.

Returns
string

Definition at line 340 of file UrlUtils.php.

Referenced by Parser\__construct().

Member Data Documentation

◆ CANONICAL_SERVER

const MediaWiki\Utils\UrlUtils::CANONICAL_SERVER = 'canonicalServer'

Definition at line 19 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ FALLBACK_PROTOCOL

const MediaWiki\Utils\UrlUtils::FALLBACK_PROTOCOL = 'fallbackProtocol'

Definition at line 21 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ HTTPS_PORT

const MediaWiki\Utils\UrlUtils::HTTPS_PORT = 'httpsPort'

Definition at line 22 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ INTERNAL_SERVER

const MediaWiki\Utils\UrlUtils::INTERNAL_SERVER = 'internalServer'

Definition at line 20 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ SERVER

const MediaWiki\Utils\UrlUtils::SERVER = 'server'

Definition at line 18 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().

◆ VALID_PROTOCOLS

const MediaWiki\Utils\UrlUtils::VALID_PROTOCOLS = 'validProtocols'

Definition at line 23 of file UrlUtils.php.

Referenced by MediaWiki\Utils\UrlUtils\__construct().


The documentation for this class was generated from the following file: