MediaWiki fundraising/REL1_35
Readahead helper for making large MediaWiki data dumps; reads in a previous XML dump to sequentially prefetch text records already normalized and decompressed.
Public Member Functions
__construct ( $infile )
close ()
prefetch ( $page, $rev, $slot = SlotRecord::MAIN )
    Attempts to fetch the text of a particular page revision from the dump stream.
Protected Member Functions
debug ( $str )
Protected Attributes
$atEnd = false
$atPageEnd = false
$infiles = null
$lastPage = 0
$lastRev = 0
XMLReader $reader = null
Private Member Functions
nextPage ()
nextRev ()
nextText ()
nodeContents ()
    Shouldn't something like this be built-in to XMLReader? Fetches text contents of the current element, assuming no sub-elements or such scary things.
skipTo ( $name, $parent = 'page' )
Detailed Description
Readahead helper for making large MediaWiki data dumps; reads in a previous XML dump to sequentially prefetch text records already normalized and decompressed.
This can save load on the external database servers, hopefully.
Assumes that dumps will be recorded in the canonical order: ascending by page ID, then ascending by revision ID within each page.
Definition at line 44 of file BaseDump.php.
Constructor & Destructor Documentation
BaseDump::__construct ( $infile )
Definition at line 53 of file BaseDump.php.
Member Function Documentation
BaseDump::close ( )
Definition at line 217 of file BaseDump.php.
Referenced by nextPage(), nodeContents(), and skipTo().
BaseDump::debug ( $str ) [protected]
Definition at line 113 of file BaseDump.php.
References wfDebug().
Referenced by prefetch() and skipTo().
BaseDump::nextPage ( ) [private]
Definition at line 119 of file BaseDump.php.
References close(), nodeContents(), and skipTo().
Referenced by prefetch().
BaseDump::nextRev ( ) [private]
Definition at line 136 of file BaseDump.php.
References nodeContents() and skipTo().
Referenced by prefetch().
BaseDump::nextText ( ) [private]
Definition at line 149 of file BaseDump.php.
References nodeContents() and skipTo().
Referenced by prefetch().
BaseDump::nodeContents ( ) [private]
Shouldn't something like this be built-in to XMLReader? Fetches text contents of the current element, assuming no sub-elements or such scary things.
Definition at line 191 of file BaseDump.php.
References close().
Referenced by nextPage(), nextRev(), nextText(), and prefetch().
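The text-collection idea described for nodeContents() can be sketched outside the class. The following is a minimal illustration under stated assumptions, not the code in BaseDump.php: the function name readElementText is hypothetical, and the reader is assumed to be positioned on an element that contains only text.

```php
<?php
/**
 * Hypothetical standalone helper (not the BaseDump method itself).
 * Collects the text of the element the reader is currently positioned on,
 * assuming the element holds only text and no child elements.
 */
function readElementText( XMLReader $reader ): ?string {
	if ( $reader->isEmptyElement ) {
		// A self-closing element has no content and no end tag to wait for.
		return '';
	}
	$buffer = '';
	while ( $reader->read() ) {
		switch ( $reader->nodeType ) {
			case XMLReader::TEXT:
			case XMLReader::CDATA:
			case XMLReader::SIGNIFICANT_WHITESPACE:
				$buffer .= $reader->value;
				break;
			case XMLReader::END_ELEMENT:
				// First end tag closes the element, since no children are expected.
				return $buffer;
		}
	}
	// The stream ended before the closing tag was seen.
	return null;
}

$reader = new XMLReader();
$reader->XML( '<revision><id>42</id><text>Hello</text></revision>' );
$reader->read(); // now on <revision>
$reader->read(); // now on <id>
$id = readElementText( $reader ); // $id is the string '42'
```

Returning null on a truncated stream is a choice made for this sketch; a caller has to distinguish it from the empty string returned for an empty element.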
BaseDump::prefetch ( $page, $rev, $slot = SlotRecord::MAIN )
Attempts to fetch the text of a particular page revision from the dump stream.
May return null if the page is unavailable.
Parameters
    int    $page  ID number of page to read
    int    $rev   ID number of revision to read
    string $slot  Role name of the slot to read
Definition at line 70 of file BaseDump.php.
References debug(), nextPage(), nextRev(), nextText(), nodeContents(), and skipTo().
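Because the previous dump is read strictly forward, a request can only be answered if the stream has not yet moved past it. The sketch below illustrates that forward-only matching with an in-memory list standing in for the XML stream. The class name OrderedPrefetcher and the [ page ID, revision ID, text ] record shape are hypothetical, slot handling is omitted, and this is not the implementation in BaseDump.php.

```php
<?php
/**
 * Hypothetical stand-in for a dump stream whose records are sorted by
 * page ID, then by revision ID within each page.
 */
final class OrderedPrefetcher {
	/** @var Generator yields [ pageId, revId, text ] */
	private $records;

	public function __construct( iterable $records ) {
		// Wrap the input in a generator so it can only be consumed forward.
		$this->records = ( static function () use ( $records ) {
			yield from $records;
		} )();
	}

	/** Returns the stored text, or null if the revision is not available. */
	public function prefetch( int $page, int $rev ): ?string {
		while ( $this->records->valid() ) {
			[ $curPage, $curRev, $text ] = $this->records->current();
			if ( $curPage > $page || ( $curPage === $page && $curRev > $rev ) ) {
				// The stream is already past the request, so it cannot appear later.
				return null;
			}
			$this->records->next();
			if ( $curPage === $page && $curRev === $rev ) {
				return $text;
			}
		}
		// Stream exhausted.
		return null;
	}
}

$old = new OrderedPrefetcher( [
	[ 1, 10, 'first' ],
	[ 1, 11, 'second' ],
	[ 3, 30, 'third' ],
] );
$text = $old->prefetch( 1, 11 ); // skips revision 10 and returns 'second'
```

A null result means the caller has to load the text from its normal source; requests are expected to arrive in the same ascending order as the stream.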
BaseDump::skipTo ( $name, $parent = 'page' ) [private]
Parameters
    string $name
    string $parent
Definition at line 162 of file BaseDump.php.
References close() and debug().
Referenced by nextPage(), nextRev(), nextText(), and prefetch().
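This page does not describe what skipTo() does. One plausible reading of its signature is that it advances the reader to the next element called $name and gives up when the closing tag of $parent is reached. The sketch below illustrates that reading only; the function name skipToElement is hypothetical and the behavior is an assumption, not the documented contract.

```php
<?php
/**
 * Hypothetical standalone helper (not the BaseDump method itself).
 * Advances the reader to the next <$name> start tag. Stops and returns
 * false if </$parent> is seen first or the stream ends.
 */
function skipToElement( XMLReader $reader, string $name, string $parent = 'page' ): bool {
	while ( $reader->read() ) {
		if ( $reader->nodeType === XMLReader::ELEMENT && $reader->name === $name ) {
			return true;
		}
		if ( $reader->nodeType === XMLReader::END_ELEMENT && $reader->name === $parent ) {
			// Left the enclosing element without finding the target.
			return false;
		}
	}
	return false;
}

$reader = new XMLReader();
$reader->XML( '<mediawiki><page><id>1</id><revision><id>10</id></revision></page></mediawiki>' );
$foundPage = skipToElement( $reader, 'page', 'mediawiki' );
$foundRev = skipToElement( $reader, 'revision' );
```

Bounding the search by the parent's end tag keeps a lookup for a missing child from running on into the next page.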
Member Data Documentation
BaseDump::$atEnd = false [protected]
Definition at line 47 of file BaseDump.php.
BaseDump::$atPageEnd = false [protected]
Definition at line 48 of file BaseDump.php.
BaseDump::$infiles = null [protected]
Definition at line 51 of file BaseDump.php.
BaseDump::$lastPage = 0 [protected]
Definition at line 49 of file BaseDump.php.
BaseDump::$lastRev = 0 [protected]
Definition at line 50 of file BaseDump.php.
XMLReader BaseDump::$reader = null [protected]
Definition at line 46 of file BaseDump.php.