MediaWiki 1.30.0
BaseDump Class Reference

Readahead helper for making large MediaWiki data dumps; reads in a previous XML dump to sequentially prefetch text records already normalized and decompressed.

Public Member Functions

 __construct ( $infile)
 
 close ()
 
 debug ( $str)
 
 nextPage ()
 
 nextRev ()
 
 nextText ()
 
 nodeContents ()
 Shouldn't something like this be built-in to XMLReader? Fetches text contents of the current element, assuming no sub-elements or such scary things.
 
 prefetch ( $page, $rev)
 Attempts to fetch the text of a particular page revision from the dump stream.
 
 skipTo ( $name, $parent='page')
 

Protected Attributes

 $atEnd = false
 
 $atPageEnd = false
 
 $infiles = null
 
 $lastPage = 0
 
 $lastRev = 0
 
XMLReader $reader = null
 

Detailed Description

Readahead helper for making large MediaWiki data dumps; reads in a previous XML dump to sequentially prefetch text records already normalized and decompressed.

This can save load on the external database servers, hopefully.

Assumes that dumps will be recorded in the canonical order:

  • ascending by page_id
  • ascending by rev_id within each page
  • text contents are immutable and should not change once recorded, so the previous dump is a reliable source

Definition at line 42 of file backupPrefetch.inc.
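
The ordering assumption above is what makes a single forward pass sufficient. The following is a minimal sketch of that idea only, not the class's actual code: `SortedRecordCursor` and its in-memory record array are hypothetical stand-ins for a previous dump, assumed to be sorted ascending by page ID and then by revision ID.

```php
<?php
/**
 * Hypothetical forward-only lookup over records sorted by (page, rev).
 * Not part of MediaWiki; it only illustrates the ordering assumption.
 */
class SortedRecordCursor {
	private $records;
	private $pos = 0;

	public function __construct( array $records ) {
		$this->records = $records;
	}

	/** Returns the text for ($page, $rev), or null if it is absent or already passed. */
	public function fetch( int $page, int $rev ): ?string {
		while ( $this->pos < count( $this->records ) ) {
			[ $p, $r, $text ] = $this->records[$this->pos];
			if ( $p > $page || ( $p === $page && $r > $rev ) ) {
				// The stream is already beyond the target; do not advance.
				return null;
			}
			$this->pos++;
			if ( $p === $page && $r === $rev ) {
				return $text;
			}
		}
		return null; // end of stream
	}
}

$cursor = new SortedRecordCursor( [
	[ 1, 10, 'first' ],
	[ 1, 12, 'second' ],
	[ 3, 20, 'third' ],
] );
$a = $cursor->fetch( 1, 12 ); // skips (1, 10)
$b = $cursor->fetch( 2, 15 ); // page 2 is not in the stream
$c = $cursor->fetch( 3, 20 );
$d = $cursor->fetch( 1, 10 ); // requested out of order, already passed
```

Because the cursor never rewinds, a request made out of order (the last call) cannot be served, which is why both the previous dump and the new requests are assumed to follow the same canonical order.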

Constructor & Destructor Documentation

◆ __construct()

BaseDump::__construct ( $infile )

Definition at line 51 of file backupPrefetch.inc.

Member Function Documentation

◆ close()

BaseDump::close ( )
Access: private
Returns
null

Definition at line 213 of file backupPrefetch.inc.

Referenced by nextPage(), nodeContents(), and skipTo().

◆ debug()

BaseDump::debug ( $str )

Definition at line 101 of file backupPrefetch.inc.

References wfDebug().

Referenced by prefetch(), and skipTo().

◆ nextPage()

BaseDump::nextPage ( )
Access: private

Definition at line 110 of file backupPrefetch.inc.

References close(), count(), nodeContents(), and skipTo().

Referenced by prefetch().

◆ nextRev()

BaseDump::nextRev ( )
Access: private

Definition at line 130 of file backupPrefetch.inc.

References nodeContents(), and skipTo().

Referenced by prefetch().

◆ nextText()

BaseDump::nextText ( )
Access: private
Returns
string

Definition at line 144 of file backupPrefetch.inc.

References nodeContents(), and skipTo().

Referenced by prefetch().

◆ nodeContents()

BaseDump::nodeContents ( )

Shouldn't something like this be built-in to XMLReader? Fetches text contents of the current element, assuming no sub-elements or such scary things.

Returns
string
Access: private

Definition at line 186 of file backupPrefetch.inc.

References $buffer, and close().

Referenced by nextPage(), nextRev(), and nextText().
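
As an illustration of what "fetching the text contents of the current element" can look like with PHP's XMLReader, here is a small standalone sketch. The function name `readElementText` is hypothetical and this is not the body of nodeContents(); it makes the same assumption that the element has no child elements.

```php
<?php
/**
 * Hypothetical helper: collect the text of the element the reader is
 * positioned on, assuming it contains no child elements.
 */
function readElementText( XMLReader $reader ): string {
	if ( $reader->isEmptyElement ) {
		return ''; // <tag/> has no content and no separate end tag
	}
	$buffer = '';
	while ( $reader->read() ) {
		switch ( $reader->nodeType ) {
			case XMLReader::TEXT:
			case XMLReader::CDATA:
			case XMLReader::SIGNIFICANT_WHITESPACE:
				$buffer .= $reader->value;
				break;
			case XMLReader::END_ELEMENT:
				return $buffer;
		}
	}
	return $buffer; // input ended before the closing tag
}

$reader = new XMLReader();
$reader->XML( '<page><id>42</id><title>Main Page</title></page>' );
$reader->read(); // now on <page>
$reader->read(); // now on <id>
$id = readElementText( $reader );
$reader->close();
```

The sketch stops at the first end tag it meets, so it would return early on nested markup; that limitation matches the "no sub-elements" caveat stated above.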

◆ prefetch()

BaseDump::prefetch ( $page, $rev )

Attempts to fetch the text of a particular page revision from the dump stream.

May return null if the page is unavailable.

Parameters
int $page: ID number of page to read
int $rev: ID number of revision to read
Returns
string|null

Definition at line 71 of file backupPrefetch.inc.

References $rev, debug(), nextPage(), nextRev(), and nextText().
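
Since prefetch() may return null, a caller needs some other source for the text. The sketch below shows one way to handle that under stated assumptions: `textWithFallback`, the `$loadFromDatabase` callback, and the stub object are all hypothetical, and the stub stands in for a BaseDump instance so the example runs outside MediaWiki.

```php
<?php
/**
 * Hypothetical wrapper: try the prefetch source first, then fall back.
 * $prefetcher is any object exposing prefetch( $page, $rev ).
 */
function textWithFallback( $prefetcher, int $pageId, int $revId, callable $loadFromDatabase ): string {
	$text = $prefetcher->prefetch( $pageId, $revId );
	if ( $text === null ) {
		// Not available in the previous dump; use the slower source.
		return $loadFromDatabase( $revId );
	}
	return $text;
}

// Stub with the same method signature as documented above.
$stub = new class {
	public function prefetch( $page, $rev ) {
		return ( $page === 1 && $rev === 10 ) ? 'cached text' : null;
	}
};
$loader = function ( int $revId ): string {
	return "db text for $revId";
};

$hit = textWithFallback( $stub, 1, 10, $loader );
$miss = textWithFallback( $stub, 2, 20, $loader );
```

The strict comparison against null matters here: an empty string is a legitimate revision text and should not trigger the fallback.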

◆ skipTo()

BaseDump::skipTo ( $name, $parent = 'page' )
Access: private
Parameters
string $name
string $parent
Returns
bool|null

Definition at line 156 of file backupPrefetch.inc.

References $name, close(), and debug().

Referenced by nextPage(), nextRev(), and nextText().
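
The $name and $parent parameters suggest a scan that advances to the next element called $name but gives up when the enclosing $parent element closes. A minimal standalone sketch of that pattern follows; `skipToElement` is a hypothetical function, it returns only bool (the documented method returns bool|null), and it omits the end-of-input bookkeeping that the class handles through close().

```php
<?php
/**
 * Hypothetical scan: advance to the next <$name> start tag, stopping
 * with false if </$parent> is reached first or the input ends.
 */
function skipToElement( XMLReader $reader, string $name, string $parent = 'page' ): bool {
	while ( $reader->read() ) {
		if ( $reader->nodeType === XMLReader::ELEMENT && $reader->name === $name ) {
			return true;
		}
		if ( $reader->nodeType === XMLReader::END_ELEMENT && $reader->name === $parent ) {
			return false;
		}
	}
	return false; // ran out of input
}

$reader = new XMLReader();
$reader->XML( '<mediawiki><page><id>7</id></page><page><title>T</title></page></mediawiki>' );

$foundPage = skipToElement( $reader, 'page', 'mediawiki' ); // first <page>
$foundId = skipToElement( $reader, 'id' );                  // <id> inside it
$foundRev = skipToElement( $reader, 'revision' );           // stops at </page>
$foundNext = skipToElement( $reader, 'page', 'mediawiki' ); // second <page>
$reader->close();
```

Bounding the scan by the parent's end tag keeps a missing child element in one page from consuming the start of the next page.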

Member Data Documentation

◆ $atEnd

BaseDump::$atEnd = false
protected

Definition at line 45 of file backupPrefetch.inc.

◆ $atPageEnd

BaseDump::$atPageEnd = false
protected

Definition at line 46 of file backupPrefetch.inc.

◆ $infiles

BaseDump::$infiles = null
protected

Definition at line 49 of file backupPrefetch.inc.

◆ $lastPage

BaseDump::$lastPage = 0
protected

Definition at line 47 of file backupPrefetch.inc.

◆ $lastRev

BaseDump::$lastRev = 0
protected

Definition at line 48 of file backupPrefetch.inc.

◆ $reader

XMLReader BaseDump::$reader = null
protected

Definition at line 44 of file backupPrefetch.inc.


The documentation for this class was generated from the following file: