1 | <?php |
2 | declare( strict_types = 1 ); |
3 | |
4 | namespace Wikimedia\Parsoid\Core; |
5 | |
6 | /** |
7 | * Interface for collecting the results of a parse. |
8 | * |
9 | * This class is used by Parsoid to record metainformation about a |
10 | * particular bit of parsed content which is extracted during the |
11 | * parse. This includes (for example) table of contents information, |
12 | * and lists of links/categories/templates/images present in the |
13 | * content. Expected cache lifetime of this parsed content is also |
14 | * recorded here, as it is influenced by certain things which may |
15 | * be encountered during the parse. |
16 | * |
17 | * In core this is implemented by ParserOutput. Core uses |
18 | * ParserOutput to record the rendered HTML (and rendered table of |
19 | * contents HTML), but on the Parsoid side we're going to keep |
20 | * rendered HTML DOM out of this interface (we use PageBundle for |
21 | * this). |
22 | */ |
23 | interface ContentMetadataCollector { |
24 | /* |
25 | * Internal implementation notes: |
26 | * This class was refactored out of ParserOutput in core. |
27 | * |
28 | * == Deliberately omitted == |
29 | * ::get*()/::has*() and other getters |
30 | * This is a builder-only interface. This also avoids ordering |
31 | * issues if/when Parsoid passes this class to sub-parses/extensions. |
32 | * ::setSpeculativeRevIdUsed() |
33 | * ::setRevisionTimestampUsed() |
34 | * ::setRevisionUsedSha1Base36() |
35 | * ::setSpeculativePageIdUsed() |
36 | * T292865: these should be plumbed through direct from ParserOptions |
37 | * or use the ::setOutputFlag() or addOutputData() mechanism. |
38 | * ::setTimestamp() |
39 | * This is used by ParserCache and is a little optimization used to |
40 | * show the correct 'article was last edited on blablablah' box on |
41 | * page views. Parsoid shouldn't need to worry about this; probably |
42 | * part of T292865. |
43 | * ::addCacheMessage() |
44 | * This is marked @internal in core. |
45 | * Not clear yet whether Parsoid needs this. |
46 | * ::getText()/::setText() |
47 | * T293512: rendered HTML doesn't belong in ParserOutput |
48 | * ::addWrapperDivClass()/::clearWrapperDivClass() |
49 | * Has to do with ::getText() implementation, see above |
50 | * ::setTitleText() |
51 | * Omitted because it contains rendered HTML |
52 | * (should become a method which takes a DOM tree instead?) |
53 | * ::setTOCHTML() |
54 | * Omitted because it contains rendered HTML. |
55 | * T293513 will remove this method from ParserOutput |
56 | * ::addHeadItem() |
57 | * Not clear this is needed by Parsoid (but maybe some of the stuff |
58 | * Parsoid adds to head could be refactored to use this interface). |
59 | * Should be DOM not string data! |
60 | * ::addOutputPageMetadata() |
61 | * OutputPage isn't a Parsoid interface, so this shouldn't be needed |
62 | * by Parsoid. |
63 | * ::setDisplayTitle() |
64 | * T293514: This desugars to calls to two other methods in |
65 | * ContentOutputBuilder; callers can refactor to invoke those directly. |
66 | * ::unsetPageProperty() |
67 | * If parse fragment A is setting a property |
68 | * and parse fragment B is unsetting the property, we've introduced |
69 | * an ordering dependency. We'd like to avoid that code pattern. |
70 | * ::resetParseStartTime()/::getTimeSinceStart() |
71 | * Not needed by Parsoid? |
72 | * ::finalizeAdaptiveCacheExpiry() |
73 | * Same as above, can probably be invoked by caller of Parsoid, |
74 | * doesn't need to be in Parsoid library code. |
75 | * ::mergeInternalMetaDataFrom() |
76 | * ::mergeHtmlMetaDataFrom() |
77 | * ::mergeTrackingMetaDataFrom() |
78 | * Rather than explicitly merging ContentMetadataCollectors, we'd |
79 | * prefer to pass a single ContentOutputBuilder around to accumulate |
80 | * results. We're going to wait and see to what extent methods like |
81 | * this are necessary. |
82 | * (ParserOutput will implement a ::mergeTo(ContentMetadataCollector) |
83 | * method, as it has read access to its own contents.) |
84 | * ::setNoGallery()/::setEnableOOUI()/::setNewSection()/::setHideNewSection() |
85 | * ::setPreventClickjacking()/::setIndexPolicy()/ |
86 | * Available via ::setOutputFlag() (see T292868) |
87 | * ::setCategories() |
88 | * Doesn't seem necessary, we have ::addCategory(). |
89 | * (And adding the ability to overwrite categories would be bad.) |
90 | * ::addTrackingCategory() |
91 | * This was moved to Parser / the TrackingCategories service, but |
92 | * perhaps it would be helpful if we had a version of this available |
93 | * from SiteConfig or something. |
94 | * ::isLinkInternal() |
95 | * T296036: Should be non-public or at least @internal? |
96 | * ::addInterwikiLink() |
97 | * Invoked from ::addLink() if the link is external, so we don't |
98 | * need a separate entry point. |
99 | * ::setSections() |
100 | * T296025: replaced with ::setTOCData() |
101 | * ::setLanguageLinks() |
102 | * Deprecated; replaced with ::addLanguageLink() |
103 | * ::addExtraCSPDefaultSrc() |
104 | * ::addExtraCSPStyleSrc() |
105 | * ::addExtraCSPScriptSrc() |
106 | * ::updateRuntimeAdaptiveExpiry() |
107 | * T296345: handled through ::appendOutputStrings() |
108 | * |
109 | * == Temporarily omitted == |
110 | * ::addTemplate() |
111 | * T296038: Requires page id and revision id. In addition, this |
112 | * interacts with user hooks. The MediaWiki side should probably be |
113 | * responsible for updating the Template dependencies not Parsoid. |
114 | * OTOH, we need to return *something* like a Title back because |
115 | * eventually Parsoid has to fetch the template to expand it. |
116 | * ::setTitleText() |
117 | * T293514: This contains the title in HTML and is redundant with |
118 | * ::setDisplayTitle() |
119 | */ |
120 | |
121 | /** |
122 | * Merge strategy to use for ContentMetadataCollector |
123 | * accumulators: "union" means that values are strings, stored as |
124 | * a set, and exposed as a PHP associative array mapping from |
125 | * values to `true`. |
126 | * |
127 | * This constant should be treated as @internal until we expose |
128 | * alternative merge strategies for external use. |
129 | * @internal |
130 | */ |
131 | public const MERGE_STRATEGY_UNION = 'union'; |
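The "union" strategy described above can be pictured with a small sketch. The helper names `unionAppend()` and `unionMerge()` are hypothetical and exist only for illustration; they are not part of Parsoid or MediaWiki core, and real implementations may store the set differently.

```php
<?php
declare( strict_types = 1 );

/**
 * Hypothetical helper: add one string to a "union" accumulator,
 * represented as an associative array mapping each value to true.
 */
function unionAppend( array $set, string $value ): array {
	// Re-adding an existing value is a no-op, which is what makes this a set.
	$set[$value] = true;
	return $set;
}

/**
 * Hypothetical helper: merge two accumulators by taking the union of keys.
 */
function unionMerge( array $a, array $b ): array {
	// The + operator keeps every key of $a and adds the keys only found in $b.
	return $a + $b;
}

$a = unionAppend( unionAppend( [], 'foo' ), 'bar' );
$b = unionAppend( unionAppend( [], 'bar' ), 'baz' );
$merged = unionMerge( $a, $b );
// Sort only to make the result easy to inspect; callers should treat it as unordered.
ksort( $merged );
```

Because the merge is a plain set union, the result does not depend on which fragment contributed its values first, which is presumably why this strategy suits out-of-order parsing.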
132 | |
133 | /** |
134 | * Add a category, with the given sort key. |
135 | * |
136 | * @param LinkTarget $c Category name |
137 | * @param string $sort Sort key (pass the empty string to use the default) |
138 | */ |
139 | public function addCategory( $c, $sort = '' ): void; |
140 | |
141 | /** |
142 | * Record a local or interwiki inline link for saving in future link tables. |
143 | * |
144 | * @param LinkTarget $link (used to require Title until 1.38) |
145 | * @param int|null $id Optional known page_id so we can skip the lookup |
146 | * (generally not used by Parsoid) |
147 | */ |
148 | public function addLink( LinkTarget $link, $id = null ): void; |
149 | |
150 | /** |
151 | * Register a file dependency for this output |
152 | * @param LinkTarget $name Title dbKey |
153 | * @param string|false|null $timestamp MW timestamp of file creation (or false if non-existing) |
154 | * @param string|false|null $sha1 Base 36 SHA-1 of file (or false if non-existing) |
155 | */ |
156 | public function addImage( LinkTarget $name, $timestamp = null, $sha1 = null ): void; |
157 | |
158 | /** |
159 | * Add a language link. |
160 | * @param LinkTarget $lt |
161 | */ |
162 | public function addLanguageLink( LinkTarget $lt ): void; |
163 | |
164 | /** |
165 | * Add a warning to the output for this page. |
166 | * @param string $msg The localization message key for the warning |
167 | * @param mixed ...$args Optional arguments for the message |
168 | */ |
169 | public function addWarningMsg( string $msg, ...$args ): void; |
170 | |
171 | /** |
172 | * @param string $url External link URL |
173 | */ |
174 | public function addExternalLink( string $url ): void; |
175 | |
176 | /** |
177 | * Provides a uniform interface to various boolean flags stored |
178 | * in the content metadata. Flags internal to MediaWiki core should |
179 | * have names which are constants in ParserOutputFlags. Extensions |
180 | * should use ::setExtensionData() rather than creating new flags |
181 | * with ::setOutputFlag() in order to prevent namespace conflicts. |
182 | * |
183 | * @param string $name A flag name |
184 | * @param bool $val |
185 | */ |
186 | public function setOutputFlag( string $name, bool $val = true ): void; |
187 | |
188 | /** |
189 | * Provides a uniform interface to various appendable lists of strings |
190 | * stored in the content metadata. Strings internal to MediaWiki core should |
191 | * have names which are constants in ParserOutputStrings. Extensions |
192 | * should use ::setExtensionData() rather than creating new keys here |
193 | * in order to prevent namespace conflicts. |
194 | * |
195 | * @param string $name A string name |
196 | * @param string[] $value |
197 | */ |
198 | public function appendOutputStrings( string $name, array $value ): void; |
199 | |
200 | /** |
201 | * Set a page property to be stored in the page_props database table. |
202 | * |
203 | * page_props is a key-value store indexed by the page ID. This allows |
204 | * the parser to set a property on a page which can then be quickly |
205 | * retrieved given the page ID or via a DB join when given the page |
206 | * title. |
207 | * |
208 | * Since 1.23, page_props are also indexed by numeric value, to allow |
209 | * for efficient "top k" queries of pages wrt a given property. |
210 | * This only works if the value is passed as a int, float, or |
211 | * bool. Since 1.42 you should use ::setNumericPageProperty() |
212 | * if you want your page property value to be indexed, which will ensure |
213 | * that the value is of the proper type. |
214 | * |
215 | * setPageProperty() is thus used to propagate properties from the parsed |
216 | * page to request contexts other than a page view of the currently parsed |
217 | * article. |
218 | * |
219 | * Some example applications: |
220 | * |
221 | * * To implement hidden categories, hiding pages from category listings |
222 | * by storing a page property. |
223 | * |
224 | * * Overriding the displayed article title (ParserOutput::setDisplayTitle()). |
225 | * |
226 | * * To implement image tagging, for example displaying an icon on an |
227 | * image thumbnail to indicate that it is listed for deletion on |
228 | * Wikimedia Commons. |
229 | * This is not actually implemented yet, but would be pretty cool. |
230 | * |
231 | * @note Use of non-scalar values (anything other than |
232 | * `string|int|float|bool`) has been deprecated in 1.42. |
233 | * Although any JSON-serializable value can be stored/fetched in |
234 | * ParserOutput, when the values are stored to the database |
235 | * (in `deferred/LinksUpdate/PagePropsTable.php`) they will be |
236 | * converted: booleans will be converted to '0' and '1', null |
237 | * will become '', and everything else will be cast to string |
238 | * (not JSON-serialized). Page properties obtained from the |
239 | * PageProps service will thus always be strings. |
240 | * |
241 | * @note The sort key stored in the database *will be NULL* unless |
242 | * the value passed here is an `int|float|bool`. If you *do not* |
243 | * want your property *value* indexed and sorted (for example, the |
244 | * value is a title string which can be numeric but only |
245 | * incidentally, like when it gets retrieved from an array key) |
246 | * be sure to cast to string or use |
247 | * `::setUnsortedPageProperty()`. If you *do* want your property |
248 | * *value* indexed and sorted, you should use |
249 | * `::setNumericPageProperty()` instead as this will ensure the |
250 | * value type is correct. Note that either way it is possible to |
251 | * efficiently look up all the pages with a certain property; we |
252 | * are only talking about sorting the *values* assigned to the |
253 | * property, for example for a "top N values of the property" |
254 | * query. |
255 | * |
256 | * @note `::getPageProperty()`/`::setPageProperty()` do |
257 | * not do any conversions themselves; you should therefore be |
258 | * careful to distinguish values returned from the PageProps |
259 | * service (always strings) from values retrieved from a |
260 | * ParserOutput. |
261 | * |
262 | * @note Do not use setPageProperty() to set a property which is only used |
263 | * in a context where the ParserOutput object itself is already available, |
264 | * for example a normal page view. There is no need to save such a property |
265 | * in the database since the text is already parsed; use |
266 | * ::setExtensionData() instead. |
267 | * |
268 | * @par Example: |
269 | * @code |
270 | * $parser->getOutput()->setExtensionData( 'my_ext_foo', '...' ); |
271 | * @endcode |
272 | * |
273 | * And then later, in OutputPageParserOutput or similar: |
274 | * |
275 | * @par Example: |
276 | * @code |
277 | * $output->getExtensionData( 'my_ext_foo' ); |
278 | * @endcode |
279 | * |
280 | * @note The use of `null` as a value is deprecated since 1.42; use |
281 | * the empty string instead if you need a placeholder value, or |
282 | * ::unsetPageProperty() if you mean to remove a page property. |
283 | * |
284 | * @note The use of non-string values is deprecated since 1.42; if you |
285 | * need a page property value with a sort index |
286 | * use ::setNumericPageProperty(). |
287 | * |
288 | * @param string $name |
289 | * @param int|float|string|bool|null $value |
290 | * @since 1.38 |
291 | */ |
292 | public function setPageProperty( string $name, $value ): void; |
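The conversion rules spelled out in the notes above can be summarized in a short sketch. The function `sketchPagePropRow()` and its return shape are hypothetical and are not the actual code in `PagePropsTable.php`; representing the sort key as a float is also an assumption made for this example.

```php
<?php
declare( strict_types = 1 );

/**
 * Hypothetical illustration of how a page property value might be
 * converted for storage, following the rules described in the notes.
 *
 * @param int|float|string|bool|null $value
 * @return array{value:string,sortkey:?float}
 */
function sketchPagePropRow( $value ): array {
	if ( is_bool( $value ) ) {
		// Booleans become '0' or '1'.
		$stored = $value ? '1' : '0';
	} elseif ( $value === null ) {
		// null becomes the empty string.
		$stored = '';
	} else {
		// Everything else is cast to string (not JSON-serialized).
		$stored = (string)$value;
	}
	// Only int, float, and bool values get a sort key; strings never do,
	// even when they happen to look numeric.
	$sortKey = ( is_int( $value ) || is_float( $value ) || is_bool( $value ) )
		? (float)$value
		: null;
	return [ 'value' => $stored, 'sortkey' => $sortKey ];
}

$numeric = sketchPagePropRow( 42 );
$numericLooking = sketchPagePropRow( '42' );
```

Here `$numeric` and `$numericLooking` store the same string value but differ in sort key, which is the distinction `::setNumericPageProperty()` and `::setUnsortedPageProperty()` are meant to make explicit.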
293 | |
294 | /** |
295 | * Set a numeric page property whose *value* is intended to be sorted |
296 | * and indexed. The sort key used for the property will be the value, |
297 | * coerced to a number. |
298 | * |
299 | * See `::setPageProperty()` for details. |
300 | * |
301 | * In the future, we may allow the value to be specified independent |
302 | * of sort key (T357783). |
303 | * |
304 | * @param string $propName The name of the page property |
305 | * @param int|float|string $numericValue the numeric value |
306 | * @since 1.42 |
307 | */ |
308 | public function setNumericPageProperty( string $propName, $numericValue ): void; |
309 | |
310 | /** |
311 | * Set a page property whose *value* is not intended to be sorted and |
312 | * indexed. |
313 | * |
314 | * See `::setPageProperty()` for details. It is recommended to |
315 | * use the empty string if you need a placeholder value (i.e., if |
316 | * it is the *presence* of the property which is important, not |
317 | * the *value* the property is set to). |
318 | * |
319 | * It is still possible to efficiently look up all the pages with |
320 | * a certain property (the "presence" of it *is* indexed; see |
321 | * Special:PagesWithProp, list=pageswithprop). |
322 | * |
323 | * @param string $propName The name of the page property |
324 | * @param string $value Optional value; defaults to the empty string. |
325 | * @since 1.42 |
326 | */ |
327 | public function setUnsortedPageProperty( string $propName, string $value = '' ): void; |
328 | |
329 | /** |
330 | * Attaches arbitrary data to this content. This can be used to |
331 | * store some information for later use during page output. The |
332 | * data will be cached along with the parsed page, but unlike data |
333 | * set using setPageProperty(), it is not recorded in the |
334 | * database. |
335 | * |
336 | * To use setExtensionData() to pass extension information from a |
337 | * hook inside the parser to a hook in the page output, use this |
338 | * in the parser hook: |
339 | * |
340 | * @par Example: |
341 | * @code |
342 | * $parser->getOutput()->setExtensionData( 'my_ext_foo', '...' ); |
343 | * @endcode |
344 | * |
345 | * And then later, in OutputPageParserOutput or similar: |
346 | * |
347 | * @par Example: |
348 | * @code |
349 | * $output->getExtensionData( 'my_ext_foo' ); |
350 | * @endcode |
351 | * |
352 | * @note Only scalar values (e.g. numbers, strings), arrays, or |
353 | * MediaWiki\Json\JsonUnserializable instances are supported as a |
354 | * value. Attempting to set any other class instance as extension data |
355 | * will break ParserCache for the page. |
356 | * |
357 | * @note As with ::setJsConfigVar(), setting a key to multiple |
358 | * conflicting values during the parse is not supported. |
359 | * |
360 | * @param string $key The key for accessing the data. Extensions |
361 | * should take care to avoid conflicts in naming keys. It is |
362 | * suggested to use the extension's name as a prefix. Keys |
363 | * beginning with `mw-` are reserved for use by MediaWiki core. |
364 | * |
365 | * @param mixed $value The value to set. |
366 | * Setting a value to null is equivalent to removing the value. |
367 | */ |
368 | public function setExtensionData( string $key, $value ): void; |
369 | |
370 | /** |
371 | * Appends arbitrary data to this ParserOutput. This can be used |
372 | * to store some information in the ParserOutput object for later |
373 | * use during page output. The data will be cached along with the |
374 | * ParserOutput object, but unlike data set using |
375 | * setPageProperty(), it is not recorded in the database. |
376 | * |
377 | * See ::setExtensionData() for more details on rationale and use. |
378 | * |
379 | * In order to provide for out-of-order/asynchronous/incremental |
380 | * parsing, this method appends values to a set. See |
381 | * ::setExtensionData() for the flag-like version of this method. |
382 | * |
383 | * @note Only values which can be array keys are currently supported |
384 | * as values. Be aware that array keys which 'look like' numbers are |
385 | * converted to ints by PHP, and so if you put in `"0"` as a value you |
386 | * will get `[0=>true]` out. |
387 | * |
388 | * @param string $key The key for accessing the data. Extensions should take care to avoid |
389 | * conflicts in naming keys. It is suggested to use the extension's name as a prefix. |
390 | * |
391 | * @param int|string $value The value to append to the list. |
392 | * @param string $strategy Merge strategy: |
393 | * only self::MERGE_STRATEGY_UNION is currently supported and external callers |
394 | * should treat this parameter as @internal at this time and omit it. |
395 | */ |
396 | public function appendExtensionData( |
397 | string $key, |
398 | $value, |
399 | string $strategy = self::MERGE_STRATEGY_UNION |
400 | ): void; |
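The array-key caveat in the note above is ordinary PHP behavior and can be reproduced without any Parsoid code. The following sketch uses a plain array as a stand-in for the accumulated set; the sample values are arbitrary.

```php
<?php
declare( strict_types = 1 );

// Stand-in for the set built up by repeated append calls.
$data = [];
foreach ( [ 'foo', '0', '42', '007' ] as $value ) {
	$data[$value] = true;
}

// PHP converts string keys that are canonical decimal integers to ints;
// '007' is not canonical, so it remains a string.
$keys = array_keys( $data );

// Callers that need strings back can cast the keys explicitly.
$asStrings = array_map( 'strval', $keys );
```

Code that reads the set back should therefore not assume the keys are strings, and should cast them as shown if a uniform type matters.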
401 | |
402 | /** |
403 | * Add a variable to be set in mw.config in JavaScript. |
404 | * |
405 | * In order to ensure the result is independent of the parse order, the values |
406 | * set here must be unique -- that is, you can pass the same $key |
407 | * multiple times but ONLY if the $value is identical each time. |
408 | * If you want to collect multiple pieces of data under a single key, |
409 | * use ::appendJsConfigVar(). |
410 | * |
411 | * @param string $key Key to use under mw.config |
412 | * @param mixed|null $value Value of the configuration variable. |
413 | */ |
414 | public function setJsConfigVar( string $key, $value ): void; |
415 | |
416 | /** |
417 | * Append a value to a variable to be set in mw.config in JavaScript. |
418 | * |
419 | * In order to ensure the result is independent of the parse order, |
420 | * the value of this key will be an associative array, mapping all of |
421 | * the values set under that key to true. (The array is implicitly |
422 | * ordered in PHP, but you should treat it as unordered.) |
423 | * If you want a non-array type for the key, and can ensure that only |
424 | * a single value will be set, you should use ::setJsConfigVar() instead. |
425 | * |
426 | * @note Only values which can be array keys are currently supported |
427 | * as values. Be aware that array keys which 'look like' numbers are |
428 | * converted to ints by PHP, and so if you put in `"0"` as a value you |
429 | * will get `[0=>true]` out. |
430 | * |
431 | * @param string $key Key to use under mw.config |
432 | * @param string $value Value to append to the configuration variable. |
433 | * @param string $strategy Merge strategy: |
434 | * only self::MERGE_STRATEGY_UNION is currently supported and external callers |
435 | * should treat this parameter as @internal at this time and omit it. |
436 | */ |
437 | public function appendJsConfigVar( |
438 | string $key, |
439 | string $value, |
440 | string $strategy = self::MERGE_STRATEGY_UNION |
441 | ): void; |
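The difference between the "set" and "append" variants can be sketched with a toy collector. `JsConfigSketch` is hypothetical and not part of Parsoid or core; in particular, throwing on a conflicting value is a choice made for this sketch only, since the documentation above merely says that conflicting values are unsupported. The `wgMyExt*` keys are made-up names.

```php
<?php
declare( strict_types = 1 );

/**
 * Hypothetical stand-in for the two mw.config methods.
 */
final class JsConfigSketch {
	/** @var array<string,mixed> */
	private array $vars = [];

	/** Set a single value; repeating the identical value is harmless. */
	public function set( string $key, $value ): void {
		if ( array_key_exists( $key, $this->vars ) && $this->vars[$key] !== $value ) {
			// Assumption: this sketch rejects conflicts outright.
			throw new InvalidArgumentException( "Conflicting values for '$key'" );
		}
		$this->vars[$key] = $value;
	}

	/** Append a value to the set stored under $key. */
	public function append( string $key, string $value ): void {
		if ( !isset( $this->vars[$key] ) ) {
			$this->vars[$key] = [];
		}
		$this->vars[$key][$value] = true;
	}

	/** @return array<string,mixed> */
	public function all(): array {
		return $this->vars;
	}
}

// Two parses that encounter the same fragments in a different order.
$first = new JsConfigSketch();
$first->append( 'wgMyExtItems', 'a' );
$first->append( 'wgMyExtItems', 'b' );
$first->set( 'wgMyExtMode', 'compact' );

$second = new JsConfigSketch();
$second->set( 'wgMyExtMode', 'compact' );
$second->append( 'wgMyExtItems', 'b' );
$second->append( 'wgMyExtItems', 'a' );
```

Both collectors end up with equal contents when compared with `==`, which ignores key order; this is the order-independence property the two methods are designed to preserve.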
442 | |
443 | /** |
444 | * @see OutputPage::addModules |
445 | * @param string[] $modules |
446 | */ |
447 | public function addModules( array $modules ): void; |
448 | |
449 | /** |
450 | * @see OutputPage::addModuleStyles |
451 | * @param string[] $modules |
452 | */ |
453 | public function addModuleStyles( array $modules ): void; |
454 | |
455 | /** |
456 | * Sets parser limit report data for a key |
457 | * |
458 | * The key is used as the prefix for various messages used for formatting: |
459 | * - $key: The label for the field in the limit report |
460 | * - $key-value-text: Message used to format the value in the "NewPP limit |
461 | * report" HTML comment. If missing, uses $key-format. |
462 | * - $key-value-html: Message used to format the value in the preview |
463 | * limit report table. If missing, uses $key-format. |
464 | * - $key-value: Message used to format the value. If missing, uses "$1". |
465 | * |
466 | * Note that all values are interpreted as wikitext, and so should be |
467 | * encoded with htmlspecialchars() as necessary, but should avoid complex |
468 | * HTML for sanity of display in the "NewPP limit report" comment. |
469 | * |
470 | * @param string $key Message key |
471 | * @param mixed $value Appropriate for Message::params() |
472 | */ |
473 | public function setLimitReportData( string $key, $value ): void; |
474 | |
475 | /** |
476 | * Sets Table of Contents data for this page. |
477 | * |
478 | * Note that merging of TOCData is not supported; exactly one fragment |
479 | * should set TOCData. |
480 | * |
481 | * @param TOCData $tocData |
482 | */ |
483 | public function setTOCData( TOCData $tocData ): void; |
484 | |
485 | /** |
486 | * Set the content for an indicator. |
487 | * |
488 | * @param string $name |
489 | * @param string $content |
490 | * @param-taint $content exec_html |
491 | */ |
492 | public function setIndicator( $name, $content ): void; |
493 | |
494 | /** |
495 | * @return array<string,string> |
496 | */ |
497 | public function getIndicators(): array; |
498 | } |