MediaWiki 1.23.14
LinkFilter.php
<?php
/**
 * Some functions to help implement an external link filter for spam control.
 */
class LinkFilter {

	/**
	 * Check whether $content contains a link to $filterEntry.
	 */
	static function matchEntry( Content $content, $filterEntry ) {
		if ( !( $content instanceof TextContent ) ) {
			//TODO: handle other types of content too.
			// Maybe create ContentHandler::matchFilter( LinkFilter ).
			// Think about a common base class for LinkFilter and MagicWord.
			return 0;
		}

		$text = $content->getNativeData();

		$regex = LinkFilter::makeRegex( $filterEntry );
		return preg_match( $regex, $text );
	}

	/**
	 * Builds a regex pattern for $filterEntry.
	 */
	private static function makeRegex( $filterEntry ) {
		$regex = '!http://';
		if ( substr( $filterEntry, 0, 2 ) == '*.' ) {
			$regex .= '(?:[A-Za-z0-9.-]+\.|)';
			$filterEntry = substr( $filterEntry, 2 );
		}
		$regex .= preg_quote( $filterEntry, '!' ) . '!Si';
		return $regex;
	}

	/**
	 * Make an array to be used for calls to DatabaseBase::buildLike().
	 * Returns false if the filter entry cannot be turned into a safe pattern.
	 */
	public static function makeLikeArray( $filterEntry, $protocol = 'http://' ) {
		$db = wfGetDB( DB_MASTER );

		$target = $protocol . $filterEntry;
		$bits = wfParseUrl( $target );

		if ( $bits == false ) {
			// Unknown protocol?
			return false;
		}

		if ( substr( $bits['host'], 0, 2 ) == '*.' ) {
			$subdomains = true;
			$bits['host'] = substr( $bits['host'], 2 );
			if ( $bits['host'] == '' ) {
				// We don't want to make a clause that will match everything,
				// that could be dangerous
				return false;
			}
		} else {
			$subdomains = false;
		}

		// Reverse the labels in the hostname, convert to lower case
		// For emails reverse domainpart only
		if ( $bits['scheme'] === 'mailto' && strpos( $bits['host'], '@' ) ) {
			// complete email address
			$mailparts = explode( '@', $bits['host'] );
			$domainpart = strtolower( implode( '.', array_reverse( explode( '.', $mailparts[1] ) ) ) );
			$bits['host'] = $domainpart . '@' . $mailparts[0];
		} elseif ( $bits['scheme'] === 'mailto' ) {
			// domainpart of email address only, do not add '.'
			$bits['host'] = strtolower( implode( '.', array_reverse( explode( '.', $bits['host'] ) ) ) );
		} else {
			$bits['host'] = strtolower( implode( '.', array_reverse( explode( '.', $bits['host'] ) ) ) );
			if ( substr( $bits['host'], -1, 1 ) !== '.' ) {
				$bits['host'] .= '.';
			}
		}

		$like[] = $bits['scheme'] . $bits['delimiter'] . $bits['host'];

		if ( $subdomains ) {
			$like[] = $db->anyString();
		}

		if ( isset( $bits['port'] ) ) {
			$like[] = ':' . $bits['port'];
		}
		if ( isset( $bits['path'] ) ) {
			$like[] = $bits['path'];
		} elseif ( !$subdomains ) {
			$like[] = '/';
		}
		if ( isset( $bits['query'] ) ) {
			$like[] = '?' . $bits['query'];
		}
		if ( isset( $bits['fragment'] ) ) {
			$like[] = '#' . $bits['fragment'];
		}

		// Check for stray asterisks: asterisk only allowed at the start of the domain
		foreach ( $like as $likepart ) {
			if ( !( $likepart instanceof LikeMatch ) && strpos( $likepart, '*' ) !== false ) {
				return false;
			}
		}

		if ( !( $like[count( $like ) - 1] instanceof LikeMatch ) ) {
			// Add wildcard at the end if there isn't one already
			$like[] = $db->anyString();
		}

		return $like;
	}

	/**
	 * Filters an array returned by makeLikeArray(), removing everything
	 * past first pattern placeholder.
	 */
	public static function keepOneWildcard( $arr ) {
		if ( !is_array( $arr ) ) {
			return $arr;
		}

		foreach ( $arr as $key => $value ) {
			if ( $value instanceof LikeMatch ) {
				return array_slice( $arr, 0, $key + 1 );
			}
		}

		return $arr;
	}
}
DB_MASTER
const DB_MASTER
Definition: Defines.php:56
wfGetDB
& wfGetDB( $db, $groups=array(), $wiki=false)
Get a Database object.
Definition: GlobalFunctions.php:3714
LinkFilter\keepOneWildcard
static keepOneWildcard( $arr)
Filters an array returned by makeLikeArray(), removing everything past first pattern placeholder.
Definition: LinkFilter.php:177
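The trimming described above can be tried outside MediaWiki with a small sketch. `SketchWildcard` and `sketchKeepOneWildcard()` are hypothetical stand-ins for `LikeMatch` and `LinkFilter::keepOneWildcard()`, introduced only for illustration; this approximates the logic and is not the library code itself.

```php
<?php
// Hypothetical stand-in for MediaWiki's LikeMatch class (assumption: only
// its type matters for this illustration, not its contents).
class SketchWildcard {
}

// Keep everything up to and including the first wildcard placeholder.
function sketchKeepOneWildcard( $arr ) {
	if ( !is_array( $arr ) ) {
		// Non-array input (e.g. false) is passed through unchanged.
		return $arr;
	}
	foreach ( $arr as $key => $value ) {
		if ( $value instanceof SketchWildcard ) {
			return array_slice( $arr, 0, $key + 1 );
		}
	}
	return $arr;
}

$any = new SketchWildcard();
$like = array( 'http://com.example.', $any, '/wiki/', $any );
$trimmed = sketchKeepOneWildcard( $like );
echo count( $trimmed ), "\n"; // prints 2
```

Only the reversed-host prefix and the first wildcard survive; the path and the second wildcard are dropped.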
LinkFilter\matchEntry
static matchEntry(Content $content, $filterEntry)
Check whether $content contains a link to $filterEntry.
Definition: LinkFilter.php:42
LinkFilter\makeRegex
static makeRegex( $filterEntry)
Builds a regex pattern for $filterEntry.
Definition: LinkFilter.php:63
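To see what kind of pattern this produces, the construction can be copied into a free function and run on sample text. `sketchMakeRegex()` is a hypothetical name and the sample strings are made up; the sketch assumes nothing beyond PHP's built-in `preg_quote()` and `preg_match()`.

```php
<?php
// Hypothetical free-function copy of the regex construction in the listing.
function sketchMakeRegex( $filterEntry ) {
	$regex = '!http://';
	if ( substr( $filterEntry, 0, 2 ) == '*.' ) {
		// A leading "*." allows any subdomain prefix, or none at all.
		$regex .= '(?:[A-Za-z0-9.-]+\.|)';
		$filterEntry = substr( $filterEntry, 2 );
	}
	// Quote the remainder literally; "!" is the delimiter and the "i"
	// modifier makes the match case-insensitive.
	return $regex . preg_quote( $filterEntry, '!' ) . '!Si';
}

$regex = sketchMakeRegex( '*.example.com' );
var_dump( preg_match( $regex, 'See http://www.example.com/page' ) ); // int(1)
var_dump( preg_match( $regex, 'See http://EXAMPLE.com/' ) ); // int(1)
var_dump( preg_match( $regex, 'See https://example.com/' ) ); // int(0)
```

The last call returns 0 because the pattern begins with a literal `http://`, so an `https://` link is not matched by this regex.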
LikeMatch
Used by DatabaseBase::buildLike() to represent characters that have special meaning in SQL LIKE claus...
Definition: DatabaseUtility.php:296
wfParseUrl
wfParseUrl( $url)
parse_url() work-alike, but non-broken.
Definition: GlobalFunctions.php:802
LinkFilter\makeLikeArray
static makeLikeArray( $filterEntry, $protocol='http://')
Make an array to be used for calls to DatabaseBase::buildLike(), which will match the specified strin...
Definition: LinkFilter.php:94
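The host handling inside makeLikeArray() can be sketched in isolation. Reversing the labels puts the top-level domain first, so a leading `*.` in the filter entry can be expressed as a wildcard placed after the reversed host. `sketchReverseHost()` is a hypothetical helper that mirrors only that step; it does not call `wfParseUrl()` or touch the database.

```php
<?php
// Hypothetical helper mirroring the host handling in the listing.
function sketchReverseHost( $scheme, $host ) {
	if ( $scheme === 'mailto' && strpos( $host, '@' ) ) {
		// Complete email address: reverse and lower-case the domain part only.
		$mailparts = explode( '@', $host );
		$domainpart = strtolower( implode( '.', array_reverse( explode( '.', $mailparts[1] ) ) ) );
		return $domainpart . '@' . $mailparts[0];
	}
	$reversed = strtolower( implode( '.', array_reverse( explode( '.', $host ) ) ) );
	if ( $scheme !== 'mailto' && substr( $reversed, -1, 1 ) !== '.' ) {
		// Non-mail hosts get a trailing dot; mail domains do not.
		$reversed .= '.';
	}
	return $reversed;
}

echo sketchReverseHost( 'http', 'www.Example.com' ), "\n"; // com.example.www.
echo sketchReverseHost( 'mailto', 'User@Example.com' ), "\n"; // com.example@User
echo sketchReverseHost( 'mailto', 'example.com' ), "\n"; // com.example
```

Note that the local part of an email address keeps its original case, since only the domain part is passed through `strtolower()`.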
LinkFilter
Some functions to help implement an external link filter for spam control.
Definition: LinkFilter.php:33
Content\getNativeData
getNativeData()
Returns native representation of the data.
TextContent
Content object implementation for representing flat text.
Definition: TextContent.php:35
Content
Base interface for content objects.
Definition: Content.php:34