MediaWiki  master
ApiQueryDuplicateFiles.php
1 <?php
23 use MediaWiki\MediaWikiServices;
24 
30 class ApiQueryDuplicateFiles extends ApiQueryGeneratorBase {
31 
32  public function __construct( ApiQuery $query, $moduleName ) {
33  parent::__construct( $query, $moduleName, 'df' );
34  }
35 
36  public function execute() {
37  $this->run();
38  }
39 
40  public function getCacheMode( $params ) {
41  return 'public';
42  }
43 
44  public function executeGenerator( $resultPageSet ) {
45  $this->run( $resultPageSet );
46  }
47 
51  private function run( $resultPageSet = null ) {
52  $params = $this->extractRequestParams();
53  $namespaces = $this->getPageSet()->getGoodAndMissingTitlesByNamespace();
54  if ( empty( $namespaces[NS_FILE] ) ) {
55  return;
56  }
57  $images = $namespaces[NS_FILE];
58 
59  if ( $params['dir'] == 'descending' ) {
60  $images = array_reverse( $images );
61  }
62 
63  $skipUntilThisDup = false;
64  if ( isset( $params['continue'] ) ) {
65  $cont = explode( '|', $params['continue'] );
66  $this->dieContinueUsageIf( count( $cont ) != 2 );
67  $fromImage = $cont[0];
68  $skipUntilThisDup = $cont[1];
69  // Filter out any images before $fromImage
70  foreach ( $images as $image => $pageId ) {
71  if ( $image < $fromImage ) {
72  unset( $images[$image] );
73  } else {
74  break;
75  }
76  }
77  }
78 
79  $filesToFind = array_keys( $images );
80  $repoGroup = MediaWikiServices::getInstance()->getRepoGroup();
81  if ( $params['localonly'] ) {
82  $files = $repoGroup->getLocalRepo()->findFiles( $filesToFind );
83  } else {
84  $files = $repoGroup->findFiles( $filesToFind );
85  }
86 
87  $fit = true;
88  $count = 0;
89  $titles = [];
90 
91  $sha1s = [];
92  foreach ( $files as $file ) {
94  $sha1s[$file->getName()] = $file->getSha1();
95  }
96 
97  // find all files with the hashes, result format is:
98  // [ hash => [ dup1, dup2 ], hash1 => ... ]
99  $filesToFindBySha1s = array_unique( array_values( $sha1s ) );
100  if ( $params['localonly'] ) {
101  $filesBySha1s = $repoGroup->getLocalRepo()->findBySha1s( $filesToFindBySha1s );
102  } else {
103  $filesBySha1s = $repoGroup->findBySha1s( $filesToFindBySha1s );
104  }
105 
106  // iterate over $images so the continue param is handled correctly
107  foreach ( $images as $image => $pageId ) {
108  if ( !isset( $sha1s[$image] ) ) {
109  continue; // file does not exist
110  }
111  $sha1 = $sha1s[$image];
112  $dupFiles = $filesBySha1s[$sha1];
113  if ( $params['dir'] == 'descending' ) {
114  $dupFiles = array_reverse( $dupFiles );
115  }
117  foreach ( $dupFiles as $dupFile ) {
118  $dupName = $dupFile->getName();
119  if ( $image == $dupName && $dupFile->isLocal() ) {
120  continue; // ignore the local file itself
121  }
122  if ( $skipUntilThisDup !== false && $dupName < $skipUntilThisDup ) {
123  continue; // skip to pos after the image from continue param
124  }
125  $skipUntilThisDup = false;
126  if ( ++$count > $params['limit'] ) {
127  $fit = false; // break outer loop
128  // We're one over limit which shows that
129  // there are additional images to be had. Stop here...
130  $this->setContinueEnumParameter( 'continue', $image . '|' . $dupName );
131  break;
132  }
133  if ( $resultPageSet !== null ) {
134  $titles[] = $dupFile->getTitle();
135  } else {
136  $r = [
137  'name' => $dupName,
138  'user' => $dupFile->getUser( 'text' ),
139  'timestamp' => wfTimestamp( TS_ISO_8601, $dupFile->getTimestamp() ),
140  'shared' => !$dupFile->isLocal(),
141  ];
142  $fit = $this->addPageSubItem( $pageId, $r );
143  if ( !$fit ) {
144  $this->setContinueEnumParameter( 'continue', $image . '|' . $dupName );
145  break;
146  }
147  }
148  }
149  if ( !$fit ) {
150  break;
151  }
152  }
153  if ( $resultPageSet !== null ) {
154  $resultPageSet->populateFromTitles( $titles );
155  }
156  }
157 
158  public function getAllowedParams() {
159  return [
160  'limit' => [
161  ApiBase::PARAM_DFLT => 10,
162  ApiBase::PARAM_TYPE => 'limit',
163  ApiBase::PARAM_MIN => 1,
164  ApiBase::PARAM_MAX => ApiBase::LIMIT_BIG1,
165  ApiBase::PARAM_MAX2 => ApiBase::LIMIT_BIG2
166  ],
167  'continue' => [
168  ApiBase::PARAM_HELP_MSG => 'api-help-param-continue',
169  ],
170  'dir' => [
171  ApiBase::PARAM_DFLT => 'ascending',
172  ApiBase::PARAM_TYPE => [
173  'ascending',
174  'descending'
175  ]
176  ],
177  'localonly' => false,
178  ];
179  }
180 
181  protected function getExamplesMessages() {
182  return [
183  'action=query&titles=File:Albert_Einstein_Head.jpg&prop=duplicatefiles'
184  => 'apihelp-query+duplicatefiles-example-simple',
185  'action=query&generator=allimages&prop=duplicatefiles'
186  => 'apihelp-query+duplicatefiles-example-generated',
187  ];
188  }
189 
190  public function getHelpUrls() {
191  return 'https://www.mediawiki.org/wiki/Special:MyLanguage/API:Duplicatefiles';
192  }
193 }
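The core of run() can be hard to follow in the listing above, so here is a minimal standalone sketch of the same idea: group names by content hash, skip the file itself, and stop one item past the limit to produce an `image|dupName` continue token. The function `findDuplicates()` and its plain-array inputs are hypothetical stand-ins for the repo lookups (`findFiles()`, `findBySha1s()`); it is not part of MediaWiki and ignores the `dir`, `localonly`, and resume handling.

```php
<?php
/**
 * Hypothetical helper, not part of MediaWiki: lists duplicates of each
 * requested name by grouping all known names under their content hash.
 *
 * @param string[] $requested Names to look up, in ascending order
 * @param array<string,string> $hashByName Map of every known name to its hash
 * @param int $limit Maximum number of duplicates to return
 * @return array Pair of [ results keyed by requested name, continue token or null ]
 */
function findDuplicates( array $requested, array $hashByName, int $limit ): array {
	// Invert the map: hash => sorted list of names sharing that hash.
	$namesByHash = [];
	foreach ( $hashByName as $name => $hash ) {
		$namesByHash[$hash][] = $name;
	}
	foreach ( $namesByHash as &$names ) {
		sort( $names );
	}
	unset( $names );

	$result = [];
	$count = 0;
	foreach ( $requested as $image ) {
		if ( !isset( $hashByName[$image] ) ) {
			continue; // unknown name, nothing to compare
		}
		foreach ( $namesByHash[$hashByName[$image]] as $dupName ) {
			if ( $dupName === $image ) {
				continue; // a file is not its own duplicate
			}
			if ( ++$count > $limit ) {
				// One over the limit: report where the next request should resume.
				return [ $result, $image . '|' . $dupName ];
			}
			$result[$image][] = $dupName;
		}
	}
	return [ $result, null ];
}

// Made-up sample data: three names share hash "h1", one has its own hash.
$hashes = [ 'A.jpg' => 'h1', 'B.jpg' => 'h1', 'C.jpg' => 'h2', 'D.jpg' => 'h1' ];
[ $dups, $continue ] = findDuplicates( [ 'A.jpg', 'C.jpg' ], $hashes, 1 );
[ $allDups, $noContinue ] = findDuplicates( [ 'A.jpg', 'C.jpg' ], $hashes, 10 );
```

Counting one item past the limit before stopping, as the listing does, lets the module tell "exactly limit results" apart from "more results remain" without a second lookup.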
ApiQuery
This is the main query class.
Definition: ApiQuery.php:37
MediaWiki\MediaWikiServices
MediaWikiServices is the service locator for the application scope of MediaWiki.
Definition: MediaWikiServices.php:154
ApiBase\PARAM_HELP_MSG
const PARAM_HELP_MSG
(string|array|Message) Specify an alternative i18n documentation message for this parameter.
Definition: ApiBase.php:107
ApiQueryDuplicateFiles\getExamplesMessages
getExamplesMessages()
Returns usage examples for this module.
Definition: ApiQueryDuplicateFiles.php:181
wfTimestamp
wfTimestamp( $outputtype=TS_UNIX, $ts=0)
Get a timestamp string in one of various formats.
Definition: GlobalFunctions.php:1808
ApiBase\PARAM_TYPE
const PARAM_TYPE
(string|string[]) Either an array of allowed value strings, or the name of a parameter type such as 'limit'.
Definition: ApiBase.php:71
NS_FILE
const NS_FILE
Definition: Defines.php:75
ApiQueryDuplicateFiles\__construct
__construct(ApiQuery $query, $moduleName)
Definition: ApiQueryDuplicateFiles.php:32
ApiQueryGeneratorBase\setContinueEnumParameter
setContinueEnumParameter( $paramName, $paramValue)
Overridden to set the generator param if in generator mode.
Definition: ApiQueryGeneratorBase.php:83
ApiBase\PARAM_MIN
const PARAM_MIN
(integer) Lowest value allowed for the parameter, for PARAM_TYPE 'integer' and 'limit'.
Definition: ApiBase.php:74
File
Implements some public methods and some protected utility functions which are required by multiple ch...
Definition: File.php:63
ApiQueryGeneratorBase\getPageSet
getPageSet()
Get the PageSet object to work on.
Definition: ApiQueryGeneratorBase.php:57
ApiQueryDuplicateFiles\getCacheMode
getCacheMode( $params)
Get the cache mode for the data generated by this module.
Definition: ApiQueryDuplicateFiles.php:40
ApiBase\LIMIT_BIG1
const LIMIT_BIG1
Fast query, standard limit.
Definition: ApiBase.php:165
ApiQueryDuplicateFiles\run
run( $resultPageSet=null)
Definition: ApiQueryDuplicateFiles.php:51
ApiBase\PARAM_MAX
const PARAM_MAX
(integer) Max value allowed for the parameter, for PARAM_TYPE 'integer' and 'limit'.
Definition: ApiBase.php:72
ApiBase\extractRequestParams
extractRequestParams( $options=[])
Using getAllowedParams(), this function makes an array of the values provided by the user,...
Definition: ApiBase.php:717
ApiQueryDuplicateFiles\getAllowedParams
getAllowedParams()
Returns an array of allowed parameters (parameter name) => (default value) or (parameter name) => (ar...
Definition: ApiQueryDuplicateFiles.php:158
ApiBase\dieContinueUsageIf
dieContinueUsageIf( $condition)
Die with the 'badcontinue' error.
Definition: ApiBase.php:1562
ApiQueryDuplicateFiles\executeGenerator
executeGenerator( $resultPageSet)
Execute this module as a generator.
Definition: ApiQueryDuplicateFiles.php:44
ApiQueryGeneratorBase
Stable to extend.
Definition: ApiQueryGeneratorBase.php:28
ApiQueryDuplicateFiles\execute
execute()
Evaluates the parameters, performs the requested query, and sets up the result.
Definition: ApiQueryDuplicateFiles.php:36
ApiBase\LIMIT_BIG2
const LIMIT_BIG2
Fast query, apihighlimits limit.
Definition: ApiBase.php:167
ApiBase\PARAM_DFLT
const PARAM_DFLT
(null|boolean|integer|string) Default value of the parameter.
Definition: ApiBase.php:69
ApiBase\PARAM_MAX2
const PARAM_MAX2
(integer) Max value allowed for the parameter for users with the apihighlimits right, for PARAM_TYPE 'limit'.
Definition: ApiBase.php:73
ApiQueryDuplicateFiles\getHelpUrls
getHelpUrls()
Return links to more detailed help pages about the module.
Definition: ApiQueryDuplicateFiles.php:190
ApiQueryDuplicateFiles
A query module to list duplicates of the given file(s)
Definition: ApiQueryDuplicateFiles.php:30
ApiQueryBase\addPageSubItem
addPageSubItem( $pageId, $item, $elemname=null)
Same as addPageSubItems(), but one element of $data at a time.
Definition: ApiQueryBase.php:498
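
As a rough illustration of how a client might call this module, the sketch below builds a request URL from the parameters declared in getAllowedParams(), using the 'df' prefix passed to the constructor. The endpoint `https://example.org/w/api.php` is a placeholder assumption, and the chosen values (`dflimit` of 5, descending order) are arbitrary. The title is the one from the module's own usage example.

```php
<?php
// Hypothetical endpoint; substitute the api.php URL of the wiki being queried.
$endpoint = 'https://example.org/w/api.php';

$query = [
	'action' => 'query',
	'format' => 'json',
	'prop' => 'duplicatefiles',
	'titles' => 'File:Albert_Einstein_Head.jpg',
	// Module parameters carry the 'df' prefix set in __construct().
	'dflimit' => 5,
	'dfdir' => 'descending',
	// Boolean API parameters count as true whenever they are present.
	'dflocalonly' => 1,
];

// http_build_query() percent-encodes the values, including the ':' in the title.
$url = $endpoint . '?' . http_build_query( $query );
echo $url, "\n";
```

If the response carries a continue value for this module, passing it back as `dfcontinue` on the next request resumes from the reported `image|dupName` position.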