Code Coverage

| | Lines | Functions and Methods | CRAP | Classes and Traits |
| Total | 0.00% (0 / 12) | 0.00% (0 / 2) | | 0.00% (0 / 1) |
| RunExtensionProcessors | 0.00% (0 / 12) | 0.00% (0 / 2) | 30 | 0.00% (0 / 1) |
| initialize | 0.00% (0 / 9) | 0.00% (0 / 1) | 12 | |
| run | 0.00% (0 / 3) | 0.00% (0 / 1) | 6 | |

1 | <?php |
2 | declare( strict_types = 1 ); |
3 | |
4 | namespace Wikimedia\Parsoid\Wt2Html\DOM\Processors; |
5 | |
6 | use Wikimedia\Parsoid\Config\Env; |
7 | use Wikimedia\Parsoid\DOM\Node; |
8 | use Wikimedia\Parsoid\Ext\DOMProcessor as ExtDOMProcessor; |
9 | use Wikimedia\Parsoid\Wt2Html\Wt2HtmlDOMProcessor; |
10 | |
11 | /** |
12 | * A wrapper to call extension-specific DOM processors. |
13 | * |
14 | * FIXME: There are two potential ordering problems here. |
15 | * |
16 | * 1. unpackDOMFragment should always run immediately |
17 | * before these extensionPostProcessors, which we do currently. |
18 | * This ensures packed content gets processed correctly by extensions |
19 | * before additional transformations are run on the DOM. |
20 | * |
21 | * This ordering issue is handled through documentation. |
22 | * |
23 | * 2. This problem has existed all along (in the PHP parser as well as Parsoid, |
24 | * which is probably how the ref-in-ref hack works: because of how |
25 | * parser functions and extension tags are processed, #tag:ref doesn't |
26 | * see a nested ref anymore), and this patch only exposes it |
27 | * more clearly with the unpackOutput property. |
28 | * |
29 | * * Consider the set of extensions that |
30 | * (a) process wikitext |
31 | * (b) provide an extensionPostProcessor |
32 | * (c) run the extensionPostProcessor only on the top-level |
33 | * As of today, there is exactly one extension (Cite) that has all |
34 | * these properties, so the problem below is only speculative |
35 | * today. But it could become a real problem in the future. |
36 | * |
37 | * * Let us say there are at least two of them, E1 and E2 that |
38 | * support extension tags <e1> and <e2> respectively. |
39 | * |
40 | * * Let us say <e2> is present in one instance of <e1> on the page, |
41 | * and <e1> is present in one instance of <e2> on the page. |
42 | * |
43 | * * In what order should E1's and E2's extensionPostProcessors be |
44 | * run on the top-level? Depending on what these handlers do, you |
45 | * could get potentially different results. You can see this quite |
46 | * starkly with the unpackOutput flag. |
47 | * |
48 | * * The ideal solution to this problem is to require that every extension's |
49 | * extensionPostProcessor be idempotent, which lets us run these |
50 | * post processors repeatedly till the DOM stabilizes. But, this |
51 | * still doesn't necessarily guarantee that ordering doesn't matter. |
52 | * It just guarantees that, with the unpackOutput flag set to false for |
53 | * multiple extensions, all sealed fragments get fully processed. |
54 | * So, we still need to worry about that problem. |
55 | * |
56 | * But, idempotence *could* potentially be a sufficient property in most cases. |
57 | * To see this, suppose there is a Footnotes extension which is similar |
58 | * to the Cite extension in that they both extract inline content in the |
59 | * page source to a separate section of output and leave behind pointers to |
60 | * the global section in the output DOM. Given this, the Cite and Footnotes |
61 | * extension post processors would essentially walk the DOM and |
62 | * move any existing inline content into that global section till it is |
63 | * done. So, even if a <footnote> has a <ref> and a <ref> has a <footnote>, |
64 | * we ultimately end up with all footnote content in the footnotes section |
65 | * and all ref content in the references section and the DOM stabilizes. |
66 | * Ordering is irrelevant here. |
67 | * |
68 | * So, perhaps one way of catching these problems would be in code review |
69 | * by analyzing what the DOM postprocessor does and checking whether it introduces |
70 | * potential ordering issues. |
71 | */ |
72 | class RunExtensionProcessors implements Wt2HtmlDOMProcessor { |
73 | private ?array $extProcessors = null; |
74 | |
75 | /** |
76 | * FIXME: We've lost the ability to dump the DOM before and after individual |
77 | * extension processors. We need to fix RunExtensionProcessors to |
78 | * reintroduce that granularity. |
79 | */ |
80 | private function initialize( Env $env ): array { |
81 | $extProcessors = []; |
82 | foreach ( $env->getSiteConfig()->getExtDOMProcessors() as $extName => $domProcs ) { |
83 | foreach ( $domProcs as $i => $classNameOrSpec ) { |
84 | // Extension post processor, object factory spec given |
85 | $objectFactory = $env->getSiteConfig()->getObjectFactory(); |
86 | $extProcessors[] = $objectFactory->createObject( $classNameOrSpec, [ |
87 | 'allowClassName' => true, |
88 | 'assertClass' => ExtDOMProcessor::class, |
89 | ] ); |
90 | } |
91 | } |
92 | |
93 | return $extProcessors; |
94 | } |
95 | |
96 | /** |
97 | * @inheritDoc |
98 | */ |
99 | public function run( |
100 | Env $env, Node $root, array $options = [], bool $atTopLevel = false |
101 | ): void { |
102 | $this->extProcessors ??= $this->initialize( $env ); |
103 | foreach ( $this->extProcessors as $ep ) { |
104 | $ep->wtPostprocess( $options['extApi'], $root, $options ); |
105 | } |
106 | } |
107 | } |
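
The class comment above argues that idempotent post processors can be run repeatedly until the DOM stabilizes, using Cite and a Footnotes extension as the example. The following is a minimal sketch of that idea, not Parsoid's implementation. It uses PHP's built-in DOM extension rather than Parsoid's DOM classes, and `hoistInlineContent`, `runUntilStable`, `countItems`, the `data-moved` marker, and the rule that content nested in an unprocessed extension tag is invisible (a rough model of sealed fragments) are all hypothetical names and assumptions made for illustration.

```php
<?php
declare( strict_types = 1 );

/**
 * Hypothetical post-processor: moves the children of each visible, not yet
 * processed <$tag> into a new <li> in <section id="$sectionId">. The emptied
 * <$tag> stays behind as the pointer and is marked data-moved="1".
 * An element is "visible" only if no unprocessed extension tag encloses it.
 * Returns true if the DOM changed, so a second call with nothing left is a no-op.
 */
function hoistInlineContent(
	DOMDocument $doc, string $tag, string $sectionId, array $extTags
): bool {
	$xpath = new DOMXPath( $doc );
	$section = $xpath->query( "//section[@id='$sectionId']" )->item( 0 );
	$anyExt = implode( ' or ', array_map( fn ( $t ) => "self::$t", $extTags ) );
	$query = "//{$tag}[not(@data-moved)" .
		" and not(ancestor::*[($anyExt) and not(@data-moved)])]";
	$changed = false;
	// Copy the matches into an array before mutating the tree.
	foreach ( iterator_to_array( $xpath->query( $query ) ) as $el ) {
		$li = $section->appendChild( $doc->createElement( 'li' ) );
		while ( $el->firstChild !== null ) {
			$li->appendChild( $el->firstChild );
		}
		$el->setAttribute( 'data-moved', '1' );
		$changed = true;
	}
	return $changed;
}

/**
 * Runs every processor in order, pass after pass, until a full pass changes
 * nothing. Returns the number of passes, including the final no-op pass.
 */
function runUntilStable( DOMDocument $doc, array $processors, int $maxPasses = 10 ): int {
	for ( $pass = 1; $pass <= $maxPasses; $pass++ ) {
		$changed = false;
		foreach ( $processors as $process ) {
			// The call comes first so that every processor runs in every pass.
			$changed = $process( $doc ) || $changed;
		}
		if ( !$changed ) {
			return $pass;
		}
	}
	throw new RuntimeException( 'DOM did not stabilize' );
}

/** Counts the items collected in one of the global sections. */
function countItems( DOMDocument $doc, string $sectionId ): int {
	$xpath = new DOMXPath( $doc );
	return $xpath->query( "//section[@id='$sectionId']/li" )->length;
}

// A <footnote> containing a <ref>, and a <ref> containing a <footnote>.
$xml = '<body><p>A<footnote>fn1 <ref>r-in-fn</ref></footnote></p>'
	. '<p>B<ref>r1 <footnote>fn-in-r</footnote></ref></p>'
	. '<section id="references"/><section id="footnotes"/></body>';
$extTags = [ 'ref', 'footnote' ];
$cite = fn ( DOMDocument $d ) => hoistInlineContent( $d, 'ref', 'references', $extTags );
$footnotes = fn ( DOMDocument $d ) => hoistInlineContent( $d, 'footnote', 'footnotes', $extTags );

// Order A: Cite first. Order B: Footnotes first.
$docA = new DOMDocument();
$docA->loadXML( $xml );
$passesA = runUntilStable( $docA, [ $cite, $footnotes ] );

$docB = new DOMDocument();
$docB->loadXML( $xml );
$passesB = runUntilStable( $docB, [ $footnotes, $cite ] );

printf( "order A: %d passes, order B: %d passes\n", $passesA, $passesB );
```

In this toy case both orders end with all ref content in the references section and all footnote content in the footnotes section. That only illustrates the Cite/Footnotes argument; as the comment notes, idempotence alone does not guarantee that ordering is irrelevant for processors that do other kinds of work.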