MediaWiki REL1_35
ApiQueryDuplicateFiles.php
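<?php

use MediaWiki\MediaWikiServices;

/**
 * A query module to list duplicates of the given file(s)
 */
class ApiQueryDuplicateFiles extends ApiQueryGeneratorBase {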
    public function __construct( ApiQuery $query, $moduleName ) {
        parent::__construct( $query, $moduleName, 'df' );
    }

    public function execute() {
        $this->run();
    }

    public function getCacheMode( $params ) {
        return 'public';
    }

    public function executeGenerator( $resultPageSet ) {
        $this->run( $resultPageSet );
    }

    private function run( $resultPageSet = null ) {
        $params = $this->extractRequestParams();
        $namespaces = $this->getPageSet()->getGoodAndMissingTitlesByNamespace();
        if ( empty( $namespaces[NS_FILE] ) ) {
            return;
        }
        $images = $namespaces[NS_FILE];

        if ( $params['dir'] == 'descending' ) {
            $images = array_reverse( $images );
        }

        $skipUntilThisDup = false;
        if ( isset( $params['continue'] ) ) {
            $cont = explode( '|', $params['continue'] );
            $this->dieContinueUsageIf( count( $cont ) != 2 );
            $fromImage = $cont[0];
            $skipUntilThisDup = $cont[1];
            // Filter out any images before $fromImage
            foreach ( $images as $image => $pageId ) {
                if ( $image < $fromImage ) {
                    unset( $images[$image] );
                } else {
                    break;
                }
            }
        }

        $filesToFind = array_keys( $images );
        $repoGroup = MediaWikiServices::getInstance()->getRepoGroup();
        if ( $params['localonly'] ) {
            $files = $repoGroup->getLocalRepo()->findFiles( $filesToFind );
        } else {
            $files = $repoGroup->findFiles( $filesToFind );
        }

        $fit = true;
        $count = 0;
        $titles = [];

        // Map each existing file name to its SHA-1 hash
        $sha1s = [];
        foreach ( $files as $file ) {
            /** @var File $file */
            $sha1s[$file->getName()] = $file->getSha1();
        }

        // Find all files with these hashes. Result format:
        // [ hash => [ dup1, dup2 ], hash1 => ... ]
        $filesToFindBySha1s = array_unique( array_values( $sha1s ) );
        if ( $params['localonly'] ) {
            $filesBySha1s = $repoGroup->getLocalRepo()->findBySha1s( $filesToFindBySha1s );
        } else {
            $filesBySha1s = $repoGroup->findBySha1s( $filesToFindBySha1s );
        }

        // Iterate over $images so the continue param is handled correctly
        foreach ( $images as $image => $pageId ) {
            if ( !isset( $sha1s[$image] ) ) {
                continue; // file does not exist
            }
            $sha1 = $sha1s[$image];
            $dupFiles = $filesBySha1s[$sha1];
            if ( $params['dir'] == 'descending' ) {
                $dupFiles = array_reverse( $dupFiles );
            }
            /** @var File $dupFile */
            foreach ( $dupFiles as $dupFile ) {
                $dupName = $dupFile->getName();
                if ( $image == $dupName && $dupFile->isLocal() ) {
                    continue; // ignore the local file itself
                }
                if ( $skipUntilThisDup !== false && $dupName < $skipUntilThisDup ) {
                    continue; // skip ahead to the duplicate named in the continue param
                }
                $skipUntilThisDup = false;
                if ( ++$count > $params['limit'] ) {
                    $fit = false; // break outer loop
                    // We're one over the limit, which shows that
                    // there are additional images to be had. Stop here...
                    $this->setContinueEnumParameter( 'continue', $image . '|' . $dupName );
                    break;
                }
                if ( $resultPageSet !== null ) {
                    $titles[] = $dupFile->getTitle();
                } else {
                    $r = [
                        'name' => $dupName,
                        'user' => $dupFile->getUser( 'text' ),
                        'timestamp' => wfTimestamp( TS_ISO_8601, $dupFile->getTimestamp() ),
                        'shared' => !$dupFile->isLocal(),
                    ];
                    $fit = $this->addPageSubItem( $pageId, $r );
                    if ( !$fit ) {
                        $this->setContinueEnumParameter( 'continue', $image . '|' . $dupName );
                        break;
                    }
                }
            }
            if ( !$fit ) {
                break;
            }
        }
        if ( $resultPageSet !== null ) {
            $resultPageSet->populateFromTitles( $titles );
        }
    }

    public function getAllowedParams() {
        return [
            'limit' => [
                ApiBase::PARAM_DFLT => 10,
                ApiBase::PARAM_TYPE => 'limit',
                ApiBase::PARAM_MIN => 1,
                ApiBase::PARAM_MAX => ApiBase::LIMIT_BIG1,
                ApiBase::PARAM_MAX2 => ApiBase::LIMIT_BIG2
            ],
            'continue' => [
                ApiBase::PARAM_HELP_MSG => 'api-help-param-continue',
            ],
            'dir' => [
                ApiBase::PARAM_DFLT => 'ascending',
                ApiBase::PARAM_TYPE => [
                    'ascending',
                    'descending'
                ]
            ],
            'localonly' => false,
        ];
    }

    protected function getExamplesMessages() {
        return [
            'action=query&titles=File:Albert_Einstein_Head.jpg&prop=duplicatefiles'
                => 'apihelp-query+duplicatefiles-example-simple',
            'action=query&generator=allimages&prop=duplicatefiles'
                => 'apihelp-query+duplicatefiles-example-generated',
        ];
    }

    public function getHelpUrls() {
        return 'https://www.mediawiki.org/wiki/Special:MyLanguage/API:Duplicatefiles';
    }
}
wfTimestamp( $outputtype=TS_UNIX, $ts=0)
Get a timestamp string in one of various formats.
const PARAM_MAX2
Definition ApiBase.php:86
const PARAM_MAX
Definition ApiBase.php:82
dieContinueUsageIf( $condition)
Die with the 'badcontinue' error.
Definition ApiBase.php:1617
const PARAM_TYPE
Definition ApiBase.php:78
const PARAM_DFLT
Definition ApiBase.php:70
const PARAM_MIN
Definition ApiBase.php:90
const LIMIT_BIG1
Fast query, standard limit.
Definition ApiBase.php:220
extractRequestParams( $options=[])
Using getAllowedParams(), this function makes an array of the values provided by the user,...
Definition ApiBase.php:772
const PARAM_HELP_MSG
(string|array|Message) Specify an alternative i18n documentation message for this parameter.
Definition ApiBase.php:162
const LIMIT_BIG2
Fast query, apihighlimits limit.
Definition ApiBase.php:222
addPageSubItem( $pageId, $item, $elemname=null)
Same as addPageSubItems(), but one element of $data at a time.
ApiQueryDuplicateFiles
A query module to list duplicates of the given file(s)
execute()
Evaluates the parameters, performs the requested query, and sets up the result.
executeGenerator( $resultPageSet)
Execute this module as a generator.
__construct(ApiQuery $query, $moduleName)
getAllowedParams()
Returns an array of allowed parameters (parameter name) => (default value) or (parameter name) => (ar...
getCacheMode( $params)
Get the cache mode for the data generated by this module.
getExamplesMessages()
Returns usage examples for this module.
getHelpUrls()
Return links to more detailed help pages about the module.
setContinueEnumParameter( $paramName, $paramValue)
Overridden to set the generator param if in generator mode.
getPageSet()
Get the PageSet object to work on.
ApiQuery
This is the main query class.
Definition ApiQuery.php:37
File
Implements some public methods and some protected utility functions which are required by multiple ch...
Definition File.php:63
getSha1()
Get the SHA-1 base 36 hash of the file.
Definition File.php:2273
MediaWikiServices is the service locator for the application scope of MediaWiki.
const NS_FILE
Definition Defines.php:76