MediaWiki master
BacklinkJobUtils Class Reference

Helper for a Job that updates links to a given page title.

Static Public Member Functions

static partitionBacklinkJob (Job $job, $bSize, $cSize, $opts=[])
 Break down $job into approximately ($bSize/$cSize) leaf jobs and a single partition job that covers the remaining backlink range (if needed).
 

Detailed Description

Helper for a Job that updates links to a given page title.

When an asset changes, a base job can be inserted to update all assets that depend on it. The base job splits into per-title "leaf" jobs and a "remnant" job to handle the remaining range of backlinks. This recurs until the remnant job's backlink range is small enough that only leaf jobs are created from it.
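The recursion described above can be sketched as a small Python simulation. This is not MediaWiki code: `split`, `run_to_completion`, the tuple-based job representation, and the sizes used are all hypothetical. It only illustrates how a base job turns into leaf jobs plus a remnant that gets smaller with each division.

```python
def split(backlinks, b_size, c_size):
    """Split a list of backlink titles into leaf jobs plus an optional remnant.

    Hypothetical model: a leaf job is ("leaf", [titles]) and a remnant job
    is ("remnant", [remaining titles]).
    """
    head, rest = backlinks[:b_size], backlinks[b_size:]
    # Collate the first b_size titles into leaf jobs of at most c_size titles.
    jobs = [("leaf", head[i:i + c_size]) for i in range(0, len(head), c_size)]
    if rest:
        # Anything beyond the first b_size titles goes into one remnant job.
        jobs.append(("remnant", rest))
    return jobs


def run_to_completion(backlinks, b_size, c_size):
    """Keep splitting remnants; return (leaf job count, number of divisions)."""
    pending = [("remnant", list(backlinks))]  # the base job covers everything
    leaves, divisions = 0, 0
    while pending:
        kind, titles = pending.pop(0)
        if kind == "leaf":
            leaves += 1
        else:
            divisions += 1
            pending.extend(split(titles, b_size, c_size))
    return leaves, divisions


leaves, divisions = run_to_completion(
    [f"title{i}" for i in range(10)], b_size=4, c_size=2
)
print(leaves, divisions)
```

With 10 backlinks, a partition size of 4, and 2 titles per leaf job, the base job divides three times before the remnant is small enough to yield only leaf jobs.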

For example, if templates A and B are edited (at the same time), the queue will have:
(A base, B base)
When these jobs run, the queue will have per-title and remnant partition jobs:
(titleX,titleY,titleZ,...,A remnant,titleM,titleN,titleO,...,B remnant)

This works best when the queue is FIFO, for several reasons:

  • a) Since the remnant jobs are enqueued after the leaf jobs, the slower leaf jobs have to get popped prior to the fast remnant jobs. This avoids flooding the queue with leaf jobs for every single backlink of widely used assets (which can be millions).
  • b) Other jobs going in the queue still get a chance to run after a widely used asset changes. This is because the large remnant job is pushed to the end of the queue with each division.

The size of the queues used in this manner depends on the number of asset changes and the number of workers. Also, with FIFO-per-partition queues, the queue size can be somewhat larger, depending on the number of queue partitions.
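The FIFO argument can be checked with a toy queue. The sketch below assumes a single worker, a strict FIFO queue, and evenly sized partitions. The `simulate` function and its job tuples are hypothetical stand-ins, not part of MediaWiki.

```python
from collections import deque


def simulate(total_backlinks, b_size, c_size):
    """Run a FIFO queue holding one base job plus one unrelated job.

    Returns (peak queue length, number of jobs that ran before "other").
    """
    queue = deque([("range", 0, total_backlinks), ("other",)])
    peak, ran, other_ran_at = len(queue), 0, None
    while queue:
        job = queue.popleft()
        if job[0] == "other":
            other_ran_at = ran
        elif job[0] == "range":
            _, start, end = job
            cut = min(start + b_size, end)
            # Leaf jobs are enqueued first, then the remnant, as in the text.
            for lo in range(start, cut, c_size):
                queue.append(("leaf", lo, min(lo + c_size, cut)))
            if cut < end:
                queue.append(("range", cut, end))
        ran += 1
        peak = max(peak, len(queue))
    return peak, other_ran_at


peak, other_ran_at = simulate(total_backlinks=1000, b_size=10, c_size=1)
print(peak, other_ran_at)
```

Under these assumptions the queue never holds more than 12 jobs even though 1000 backlinks are processed, and the unrelated job runs right after the first division instead of waiting behind 1000 leaf jobs.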

Since
1.23

Definition at line 51 of file BacklinkJobUtils.php.

Member Function Documentation

◆ partitionBacklinkJob()

static BacklinkJobUtils::partitionBacklinkJob ( Job $job, $bSize, $cSize, $opts = [] )

Break down $job into approximately ($bSize/$cSize) leaf jobs and a single partition job that covers the remaining backlink range (if needed).

Jobs for the first $bSize titles are collated ($cSize per job) into leaf jobs to do actual work. All the resulting jobs are of the same class as $job. No partition job is returned if the range covered by $job was less than $bSize, as the leaf jobs have full coverage.
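As a rough worked example of the "approximately ($bSize/$cSize)" figure, the arithmetic can be written out as below. The helper `expected_jobs` is hypothetical and assumes that the first partition contains exactly `b_size` titles whenever that many exist. Real partition sizes may differ, which is why the documented count is only approximate.

```python
import math


def expected_jobs(range_size, b_size, c_size):
    """Estimate (leaf job count, whether a partition job is also returned)."""
    # Only the first b_size titles are collated, c_size titles per leaf job.
    leaf_count = math.ceil(min(range_size, b_size) / c_size)
    # A partition job is only needed when some backlinks are left over.
    needs_partition_job = range_size > b_size
    return leaf_count, needs_partition_job


print(expected_jobs(250, 100, 10))  # large range: leaf jobs plus a partition job
print(expected_jobs(35, 100, 10))   # small range: leaf jobs only
```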

The leaf jobs have the 'pages' param set to a (<page ID>:(<namespace>,<DB key>),...) map so that the run() function knows what pages to act on. The leaf jobs will keep the same job title as the parent job (i.e. $job).
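The shape of the 'pages' map can be pictured with a Python dictionary. The helper `build_pages_param` and the page IDs and titles below are made up for illustration. Only the (page ID : (namespace, DB key)) layout comes from the description above.

```python
def build_pages_param(titles):
    """Map page ID -> (namespace, DB key).

    `titles` is a hypothetical iterable of (page_id, namespace, db_key) tuples.
    """
    return {page_id: (namespace, db_key) for page_id, namespace, db_key in titles}


# Illustrative values only: namespace 0 is the main namespace, 10 is Template.
leaf_params = {
    "pages": build_pages_param([(101, 0, "Main_Page"), (202, 10, "Infobox")])
}
print(leaf_params)
```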

The partition jobs have the 'range' parameter set to a map of the format (start:<integer>, end:<integer>, batchSize:<integer>, subranges:((<start>,<end>),...)), the 'table' parameter set to that of $job, and the 'recursive' parameter set to true. This method can be called on the resulting job to repeat the process again.
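A minimal sketch of the partition job parameters, written as a Python dictionary. The function `remnant_params` is hypothetical, and the way 'start' and 'end' are derived here (from the first and last remaining subrange) is an assumption made for illustration. Only the key names and the overall format come from the description above.

```python
def remnant_params(table, subranges, batch_size):
    """Build partition-job params covering every subrange after the first.

    Returns None when nothing is left over, i.e. no partition job is needed.
    """
    rest = subranges[1:]  # the first subrange is handled by the leaf jobs
    if not rest:
        return None
    return {
        "recursive": True,
        "table": table,
        "range": {
            "start": rest[0][0],      # assumed: start of first remaining subrange
            "end": rest[-1][1],       # assumed: end of last remaining subrange
            "batchSize": batch_size,
            "subranges": rest,
        },
    }


params = remnant_params("templatelinks", [(1, 100), (101, 200), (201, 300)], 100)
print(params)
```

Calling the same helper on `params["range"]["subranges"]` mirrors the repeated calls described above: each round peels off one subrange until nothing remains.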

The job provided ($job) must have the 'recursive' parameter set to true and the 'table' parameter must be set to a backlink table. The job title will be used as the title to find backlinks for. Any 'range' parameter must follow the same format as mentioned above. This should be managed by recursive calls to this method.

The first jobs returned are always the leaf jobs. This lets the caller use push() to put them directly into the queue and works well if the queue is FIFO. In such a queue, the leaf jobs have to get finished first before anything can resolve the next partition job, which keeps the queue very small.

$opts includes:

  • params : extra job parameters to include in each job
Parameters
  Job    $job
  int    $bSize  BacklinkCache partition size; usually $wgUpdateRowsPerJob
  int    $cSize  Max titles per leaf job; usually 1 or a modest value
  array  $opts   Optional parameter map
Returns
  Job[]

Definition at line 87 of file BacklinkJobUtils.php.

References $job, $params, and wfWarn().

Referenced by HTMLCacheUpdateJob\run(), and RefreshLinksJob\run().


The documentation for this class was generated from the following file: