Code Coverage |
      | Lines         | Functions and Methods | CRAP | Classes and Traits |
Total | 0.00% (0 / 5) | n/a (0 / 0)           |      | n/a (0 / 0)        |
1 | <?php |
2 | declare( strict_types = 1 ); |
3 | |
4 | namespace Wikimedia\Parsoid\Core; |
5 | |
6 | /** |
7 | * Interface for collecting the results of a parse. |
8 | * |
9 | * This interface is used by Parsoid to record metainformation about a |
10 | * particular bit of parsed content which is extracted during the |
11 | * parse. This includes (for example) table of contents information, |
12 | * and lists of links/categories/templates/images present in the |
13 | * content. Expected cache lifetime of this parsed content is also |
14 | * recorded here, as it is influenced by certain things which may |
15 | * be encountered during the parse. |
16 | * |
17 | * In core this is implemented by ParserOutput. Core uses |
18 | * ParserOutput to record the rendered HTML (and rendered table of |
19 | * contents HTML), but on the Parsoid side we're going to keep |
20 | * rendered HTML DOM out of this interface (we use PageBundle for |
21 | * this). |
22 | */ |
23 | interface ContentMetadataCollector { |
24 | /* |
25 | * Internal implementation notes: |
26 | * This class was refactored out of ParserOutput in core. |
27 | * |
28 | * == Deliberately omitted == |
29 | * ::get*()/::has*() and other getters |
30 | * This is a builder-only interface. This also avoids ordering |
31 | * issues if/when Parsoid passes this class to sub-parses/extensions. |
32 | * ::setSpeculativeRevIdUsed() |
33 | * ::setRevisionTimestampUsed() |
34 | * ::setRevisionUsedSha1Base36() |
35 | * ::setSpeculativePageIdUsed() |
36 | * T292865: these should be plumbed through direct from ParserOptions |
37 | * or use the ::setOutputFlag() or addOutputData() mechanism. |
38 | * ::setTimestamp() |
39 | * This is used by ParserCache and is a little optimization used to |
40 | * show the correct 'article was last edited on blablablah' box on |
41 | * page views. Parsoid shouldn't need to worry about this; probably |
42 | * part of T292865. |
43 | * ::addCacheMessage() |
44 | * This is marked @internal in core. |
45 | * Not clear yet whether Parsoid needs this. |
46 | * ::getText()/::setText() |
47 | * T293512: rendered HTML doesn't belong in ParserOutput |
48 | * ::addWrapperDivClass()/::clearWrapperDivClass() |
49 | * Has to do with ::getText() implementation, see above |
50 | * ::setTitleText() |
51 | * Omitted because it contains rendered HTML |
52 | * (should become a method which takes a DOM tree instead?) |
53 | * ::setTOCHTML() |
54 | * Omitted because it contains rendered HTML. |
55 | * T293513 will remove this method from ParserOutput |
56 | * ::addOutputHook() |
57 | * T292321 will remove this |
58 | * ::addHeadItem() |
59 | * Not clear this is needed by Parsoid (but maybe some of the stuff |
60 | * Parsoid adds to head could be refactored to use this interface). |
61 | * Should be DOM not string data! |
62 | * ::addOutputPageMetadata() |
63 | * OutputPage isn't a Parsoid interface, so this shouldn't be needed |
64 | * by Parsoid. |
65 | * ::setDisplayTitle() |
66 | * T293514: This desugars to calls to two other methods in |
67 | * ContentMetadataCollector; callers can refactor to invoke those directly. |
68 | * ::unsetPageProperty() |
69 | * If parse fragment A is setting a property |
70 | * and parse fragment B is unsetting the property, we've introduced |
71 | * an ordering dependency. We'd like to avoid that code pattern. |
72 | * ::resetParseStartTime()/::getTimeSinceStart() |
73 | * Not needed by Parsoid? |
74 | * ::finalizeAdaptiveCacheExpiry() |
75 | * Same as above, can probably be invoked by the caller of Parsoid, |
76 | * doesn't need to be in Parsoid library code. |
77 | * ::mergeInternalMetaDataFrom() |
78 | * ::mergeHtmlMetaDataFrom() |
79 | * ::mergeTrackingMetaDataFrom() |
80 | * Rather than explicitly merging ContentMetadataCollectors, we'd |
81 | * prefer to pass a single ContentMetadataCollector around to accumulate |
82 | * results. We're going to wait and see to what extent methods like |
83 | * this are necessary. |
84 | * (ParserOutput will implement a ::mergeTo(ContentMetadataCollector) |
85 | * method, as it has read access to its own contents.) |
86 | * ::setNoGallery()/::setEnableOOUI()/::setNewSection()/::setHideNewSection() |
87 | * ::setPreventClickjacking()/::setIndexPolicy() |
88 | * Available via ::setOutputFlag() (see T292868) |
89 | * ::setCategories() |
90 | * Doesn't seem necessary, we have ::addCategory(). |
91 | * (And adding the ability to overwrite categories would be bad.) |
92 | * ::addTrackingCategory() |
93 | * This was moved to Parser / the TrackingCategories service, but |
94 | * perhaps it would be helpful if we had a version of this available |
95 | * from SiteConfig or something. |
96 | * ::isLinkInternal() |
97 | * T296036: Should be non-public or at least @internal? |
98 | * |
99 | * == Temporarily omitted == |
100 | * ::addLink()/::addInterwikiLink()/::addTrackingCategory() |
101 | * ::addImage() |
102 | * T296023: Takes a LinkTarget as a parameter; need alternative using a |
103 | * Parsoid-available type. (e.g. ::addImage() takes 'Title dbKey'; see |
104 | * T296037 to make it consistent) |
105 | * (Does ::addInterwikiLink() really need the internal test for |
106 | * $link->isExternal(), or should that be hoisted to the caller?) |
107 | * ::addTemplate() |
108 | * T296038: See above re Title-related types. In addition, this |
109 | * interacts with user hooks. The MediaWiki side should probably be |
110 | * responsible for updating the Template dependencies not Parsoid. |
111 | * OTOH, we need to return *something* like a Title back because |
112 | * eventually Parsoid has to fetch the template to expand it. |
113 | * ::setLanguageLinks() / ::addLanguageLink() |
114 | * T296019: This *should* accept an array of LinkTargets; see above re: |
115 | * Title-related types. |
116 | * ::setTitleText() |
117 | * T293514: This contains the title in HTML and is redundant with |
118 | * ::setDisplayTitle() |
119 | * ::setSections() |
120 | * T296025: Should be more structured |
121 | * ::setIndicator() |
122 | * Probably should be 'appendIndicator' for consistency? The `content` |
123 | * parameter is a string, but we'd probably want a DOM? If it's a |
124 | * DOM object we need to be able to JSON serialize and unserialize |
125 | * it for ParserCache. (T300980) |
126 | * ::addExtraCSPDefaultSrc() |
127 | * ::addExtraCSPStyleSrc() |
128 | * ::addExtraCSPScriptSrc() |
129 | * ::updateRuntimeAdaptiveExpiry() |
130 | * T296345: export a uniform interface for accumulator methods |
131 | */ |
132 | |
133 | /** |
134 | * Merge strategy to use for ContentMetadataCollector |
135 | * accumulators: "union" means that values are strings, stored as |
136 | * a set, and exposed as a PHP associative array mapping from |
137 | * values to `true`. |
138 | * |
139 | * This constant should be treated as @internal until we expose |
140 | * alternative merge strategies for external use. |
141 | * @internal |
142 | */ |
143 | public const MERGE_STRATEGY_UNION = 'union'; |
144 | |
145 | /** |
146 | * Add a category, with the given sort key. |
147 | * @note Titles frequently get stored as array keys, and when |
148 | * that happens in PHP, array_keys() will recover strings like '0' as |
149 | * integers (instead of strings). To avoid corner case bugs, we allow |
150 | * both integers and strings as titles (and sort keys). |
151 | * @note In the future, we might consider accepting a LinkTarget (or |
152 | * similar proxy) for $c instead of a string. |
153 | * |
154 | * @param string|int $c Category name |
155 | * @param string|int $sort Sort key (pass the empty string to use the default) |
156 | */ |
157 | public function addCategory( $c, $sort = '' ): void; |
158 | |
159 | /** |
160 | * Add a warning to the output for this page. |
161 | * @param string $msg The localization message key for the warning |
162 | * @param mixed ...$args Optional arguments for the message |
163 | */ |
164 | public function addWarningMsg( string $msg, ...$args ): void; |
165 | |
166 | /** |
167 | * @param string $url External link URL |
168 | */ |
169 | public function addExternalLink( string $url ): void; |
170 | |
171 | /** |
172 | * Provides a uniform interface to various boolean flags stored |
173 | * in the content metadata. Flags internal to MediaWiki core should |
174 | * have names which are constants in ParserOutputFlags. Extensions |
175 | * should use ::setExtensionData() rather than creating new flags |
176 | * with ::setOutputFlag() in order to prevent namespace conflicts. |
177 | * |
178 | * @param string $name A flag name |
179 | * @param bool $val |
180 | */ |
181 | public function setOutputFlag( string $name, bool $val = true ): void; |
182 | |
183 | /** |
184 | * Set a property to be stored in the page_props database table. |
185 | * |
186 | * page_props is a key value store indexed by the page ID. This allows |
187 | * the parser to set a property on a page which can then be quickly |
188 | * retrieved given the page ID or via a DB join when given the page |
189 | * title. |
190 | * |
191 | * page_props is also indexed by numeric value, to allow |
192 | * for efficient "top k" queries of pages wrt a given property. |
193 | * |
194 | * setPageProperty() is thus used to propagate properties from the parsed |
195 | * page to request contexts other than a page view of the currently parsed |
196 | * article. |
197 | * |
198 | * Some example applications: |
199 | * |
200 | * * To implement hidden categories, hiding pages from category listings |
201 | * by storing a property. |
202 | * |
203 | * * Overriding the displayed article title |
204 | * (ParserOutput::setDisplayTitle()). |
205 | * |
206 | * * To implement image tagging, for example displaying an icon on an |
207 | * image thumbnail to indicate that it is listed for deletion on |
208 | * Wikimedia Commons. |
209 | * This is not actually implemented yet, but would be pretty cool. |
210 | * |
211 | * @note Do not use setPageProperty() to set a property which is only used |
212 | * in a context where the content metadata itself is already available, |
213 | * for example a normal page view. There is no need to save such a property |
214 | * in the database since the text is already parsed. You can just hook |
215 | * OutputPageParserOutput and get your data out of the ParserOutput object. |
216 | * |
217 | * If you are writing an extension where you want to set a property in the |
218 | * parser which is used by an OutputPageParserOutput hook, you have to |
219 | * associate the extension data directly with the ParserOutput object. |
220 | * Since MediaWiki 1.21, you can use setExtensionData() to do this: |
221 | * |
222 | * @par Example: |
223 | * @code |
224 | * $parser->getOutput()->setExtensionData( 'my_ext_foo', '...' ); |
225 | * @endcode |
226 | * |
227 | * And then later, in OutputPageParserOutput or similar: |
228 | * |
229 | * @par Example: |
230 | * @code |
231 | * $output->getExtensionData( 'my_ext_foo' ); |
232 | * @endcode |
233 | * |
234 | * @note Only scalar values like numbers and strings are supported |
235 | * as a value. Attempting to use an object or array will |
236 | * not work properly with LinksUpdate. |
237 | * |
238 | * @note As with ::setJsConfigVar(), setting a page property to multiple |
239 | * conflicting values during the parse is not supported. |
240 | * |
241 | * @param string $name |
242 | * @param int|float|string|bool|null $value |
243 | */ |
244 | public function setPageProperty( string $name, $value ): void; |
245 | |
246 | /** |
247 | * Attaches arbitrary data to this content. This can be used to |
248 | * store some information for later use during page output. The |
249 | * data will be cached along with the parsed page, but unlike data |
250 | * set using setPageProperty(), it is not recorded in the |
251 | * database. |
252 | * |
253 | * To use setExtensionData() to pass extension information from a |
254 | * hook inside the parser to a hook in the page output, use this |
255 | * in the parser hook: |
256 | * |
257 | * @par Example: |
258 | * @code |
259 | * $parser->getOutput()->setExtensionData( 'my_ext_foo', '...' ); |
260 | * @endcode |
261 | * |
262 | * And then later, in OutputPageParserOutput or similar: |
263 | * |
264 | * @par Example: |
265 | * @code |
266 | * $output->getExtensionData( 'my_ext_foo' ); |
267 | * @endcode |
268 | * |
269 | * @note Only scalar values (e.g. numbers or strings), arrays, and |
270 | * MediaWiki\Json\JsonUnserializable instances are supported as a |
271 | * value. Attempting to set any other class instance as extension data |
272 | * will break ParserCache for the page. |
273 | * |
274 | * @note As with ::setJsConfigVar(), setting extension data to multiple |
275 | * conflicting values during the parse is not supported. |
276 | * |
277 | * @param string $key The key for accessing the data. Extensions |
278 | * should take care to avoid conflicts in naming keys. It is |
279 | * suggested to use the extension's name as a prefix. Keys |
280 | * beginning with `mw-` are reserved for use by MediaWiki core. |
281 | * |
282 | * @param mixed $value The value to set. |
283 | * Setting a value to null is equivalent to removing the value. |
284 | */ |
285 | public function setExtensionData( string $key, $value ): void; |
286 | |
287 | /** |
288 | * Appends arbitrary data to this ParserOutput. This can be used |
289 | * to store some information in the ParserOutput object for later |
290 | * use during page output. The data will be cached along with the |
291 | * ParserOutput object, but unlike data set using |
292 | * setPageProperty(), it is not recorded in the database. |
293 | * |
294 | * See ::setExtensionData() for more details on rationale and use. |
295 | * |
296 | * In order to provide for out-of-order/asynchronous/incremental |
297 | * parsing, this method appends values to a set. See |
298 | * ::setExtensionData() for the flag-like version of this method. |
299 | * |
300 | * @note Only values which can be array keys are currently supported |
301 | * as values. Be aware that array keys which 'look like' numbers are |
302 | * converted to ints by PHP, and so if you put in `"0"` as a value you |
303 | * will get `[0=>true]` out. |
304 | * |
305 | * @param string $key The key for accessing the data. Extensions should take care to avoid |
306 | * conflicts in naming keys. It is suggested to use the extension's name as a prefix. |
307 | * |
308 | * @param int|string $value The value to append to the list. |
309 | * @param string $strategy Merge strategy: |
310 | * only self::MERGE_STRATEGY_UNION is currently supported and external callers |
311 | * should treat this parameter as @internal at this time and omit it. |
312 | */ |
313 | public function appendExtensionData( |
314 | string $key, |
315 | $value, |
316 | string $strategy = self::MERGE_STRATEGY_UNION |
317 | ): void; |
318 | |
319 | /** |
320 | * Add a variable to be set in mw.config in JavaScript. |
321 | * |
322 | * In order to ensure the result is independent of the parse order, the values |
323 | * set here must be unique -- that is, you can pass the same $key |
324 | * multiple times but ONLY if the $value is identical each time. |
325 | * If you want to collect multiple pieces of data under a single key, |
326 | * use ::appendJsConfigVar(). |
327 | * |
328 | * @param string $key Key to use under mw.config |
329 | * @param mixed|null $value Value of the configuration variable. |
330 | */ |
331 | public function setJsConfigVar( string $key, $value ): void; |
332 | |
333 | /** |
334 | * Append a value to a variable to be set in mw.config in JavaScript. |
335 | * |
336 | * In order to ensure the result is independent of the parse order, |
337 | * the value of this key will be an associative array, mapping all of |
338 | * the values set under that key to true. (The array is implicitly |
339 | * ordered in PHP, but you should treat it as unordered.) |
340 | * If you want a non-array type for the key, and can ensure that only |
341 | * a single value will be set, you should use ::setJsConfigVar() instead. |
342 | * |
343 | * @note Only values which can be array keys are currently supported |
344 | * as values. Be aware that array keys which 'look like' numbers are |
345 | * converted to ints by PHP, and so if you put in `"0"` as a value you |
346 | * will get `[0=>true]` out. |
347 | * |
348 | * @param string $key Key to use under mw.config |
349 | * @param string $value Value to append to the configuration variable. |
350 | * @param string $strategy Merge strategy: |
351 | * only self::MERGE_STRATEGY_UNION is currently supported and external callers |
352 | * should treat this parameter as @internal at this time and omit it. |
353 | */ |
354 | public function appendJsConfigVar( |
355 | string $key, |
356 | string $value, |
357 | string $strategy = self::MERGE_STRATEGY_UNION |
358 | ): void; |
359 | |
360 | /** |
361 | * @see OutputPage::addModules |
362 | * @param string[] $modules |
363 | */ |
364 | public function addModules( array $modules ): void; |
365 | |
366 | /** |
367 | * @see OutputPage::addModuleStyles |
368 | * @param string[] $modules |
369 | */ |
370 | public function addModuleStyles( array $modules ): void; |
371 | |
372 | /** |
373 | * Sets parser limit report data for a key. |
374 | * |
375 | * The key is used as the prefix for various messages used for formatting: |
376 | * - $key: The label for the field in the limit report |
377 | * - $key-value-text: Message used to format the value in the "NewPP limit |
378 | * report" HTML comment. If missing, uses $key-value. |
379 | * - $key-value-html: Message used to format the value in the preview |
380 | * limit report table. If missing, uses $key-value. |
381 | * - $key-value: Message used to format the value. If missing, uses "$1". |
382 | * |
383 | * Note that all values are interpreted as wikitext, and so should be |
384 | * encoded with htmlspecialchars() as necessary, but should avoid complex |
385 | * HTML for sanity of display in the "NewPP limit report" comment. |
386 | * |
387 | * @param string $key Message key |
388 | * @param mixed $value Appropriate for Message::params() |
389 | */ |
390 | public function setLimitReportData( string $key, $value ): void; |
391 | |
392 | /** |
393 | * Sets Table of Contents data for this page. |
394 | * |
395 | * Note that merging of TOCData is not supported; exactly one fragment |
396 | * should set TOCData. |
397 | * |
398 | * @param TOCData $tocData |
399 | */ |
400 | public function setTOCData( TOCData $tocData ): void; |
401 | } |
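
The "union" merge strategy and the set-once rule described for ::setJsConfigVar() and ::appendJsConfigVar() can be illustrated with a small stand-alone sketch. `SketchCollector`, its `getJsConfigVars()` getter, and the `myExt*` keys are hypothetical names introduced only for this illustration; the class does not implement the real interface and is not Parsoid's or core's implementation. Throwing an exception on a conflicting value is an assumption: the documentation above only says conflicting values are not supported.

```php
<?php
declare( strict_types = 1 );

/**
 * Hypothetical stand-in mirroring two method shapes from the interface.
 * Not the real implementation; for illustration only.
 */
final class SketchCollector {
	public const MERGE_STRATEGY_UNION = 'union';

	/** @var array<string,mixed> */
	private array $jsConfigVars = [];

	public function setJsConfigVar( string $key, $value ): void {
		// Assumption: a conflicting value is rejected so the result cannot
		// depend on parse order. Repeating an identical value is accepted.
		if (
			array_key_exists( $key, $this->jsConfigVars ) &&
			$this->jsConfigVars[$key] !== $value
		) {
			throw new InvalidArgumentException( "Conflicting value for $key" );
		}
		$this->jsConfigVars[$key] = $value;
	}

	public function appendJsConfigVar(
		string $key,
		string $value,
		string $strategy = self::MERGE_STRATEGY_UNION
	): void {
		if ( $strategy !== self::MERGE_STRATEGY_UNION ) {
			throw new InvalidArgumentException( "Unknown strategy $strategy" );
		}
		$current = $this->jsConfigVars[$key] ?? [];
		if ( !is_array( $current ) ) {
			// Mixing set and append on one key is out of scope for this sketch.
			throw new InvalidArgumentException( "$key already holds a non-set value" );
		}
		// "union": each value becomes an array key mapped to true, so
		// duplicates collapse and insertion order carries no meaning.
		$current[$value] = true;
		$this->jsConfigVars[$key] = $current;
	}

	/** Hypothetical getter, present only so the sketch can be inspected. */
	public function getJsConfigVars(): array {
		return $this->jsConfigVars;
	}
}

$c = new SketchCollector();
$c->appendJsConfigVar( 'myExtTags', 'b' );
$c->appendJsConfigVar( 'myExtTags', 'a' );
$c->appendJsConfigVar( 'myExtTags', 'b' ); // duplicate, no effect
$c->appendJsConfigVar( 'myExtIds', '0' );  // numeric-looking string
$c->setJsConfigVar( 'myExtMode', 'dark' );
$c->setJsConfigVar( 'myExtMode', 'dark' ); // identical repeat is fine

$conflictRejected = false;
try {
	$c->setJsConfigVar( 'myExtMode', 'light' );
} catch ( InvalidArgumentException $e ) {
	$conflictRejected = true;
}

$vars = $c->getJsConfigVars();
var_dump( array_keys( $vars['myExtIds'] ) );
```

Because PHP casts numeric-looking string keys to integers, the `'0'` appended under `myExtIds` comes back as the integer key `0`, which is the corner case the @note on ::appendJsConfigVar() warns about. Callers that need the original strings back would have to cast the keys themselves.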