MediaWiki  master
JobQueue

Files

file  ActivityUpdateJob.php
 This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
 
file  BacklinkJobUtils.php
 Job to update links for a given title.
 
file  CdnPurgeJob.php
 Job to purge a set of URLs from CDN.
 
file  ClearWatchlistNotificationsJob.php
 
file  DeleteLinksJob.php
 Job to update link tables for pages.
 
file  DoubleRedirectJob.php
 Job to fix double redirects after moving a page.
 
file  DuplicateJob.php
 No-op job that does nothing.
 
file  EmaillingJob.php
 Old job for notification emails.
 
file  EnotifNotifyJob.php
 Job for notification emails.
 
file  EnqueueJob.php
 Router job that takes jobs and enqueues them.
 
file  HTMLCacheUpdateJob.php
 HTML cache invalidation of all pages linking to a given title.
 
file  JobRunner.php
 Job queue runner utility methods.
 
file  NullJob.php
 Degenerate job that does nothing.
 
file  PublishStashedFileJob.php
 Upload a file from the upload stash into the local file repo.
 
file  RecentChangesUpdateJob.php
 
file  RefreshLinksJob.php
 Job to update link tables for pages.
 
file  ThumbnailRenderJob.php
 Job for asynchronous rendering of thumbnails.
 
file  UserGroupExpiryJob.php
 Job that purges expired user group memberships.
 
file  UserOptionsUpdateJob.php
 Job that updates a user's preferences.
 

Classes

class  ActivityUpdateJob
 Job for updating user activity like "last viewed" timestamps.
 
class  BacklinkJobUtils
 Class with backlink-related job helper methods.
 
class  CdnPurgeJob
 Job to purge a set of URLs from CDN.
 
class  ClearUserWatchlistJob
 Job to clear a user's watchlist in batches.
 
class  ClearWatchlistNotificationsJob
 Job for clearing all of the "last viewed" timestamps for a user's watchlist, or setting them all to the same value.
 
class  DeleteLinksJob
 Job to prune link tables for pages that were deleted.
 
class  DoubleRedirectJob
 Job to fix double redirects after moving a page.
 
class  DuplicateJob
 No-op job that does nothing.
 
class  EmaillingJob
 Old job used for sending single notification emails; kept for backwards-compatibility.
 
class  EnotifNotifyJob
 Job for notification emails.
 
class  EnqueueJob
 Router job that takes jobs and enqueues them to their proper queues.
 
interface  GenericParameterJob
 Interface for generic jobs that only use the parameters field and are JSON serializable.
 
class  HTMLCacheUpdateJob
 Job to purge the cache for all pages that link to or use another page or file.
 
interface  IJobSpecification
 Interface for serializable objects that describe a job queue task.
 
class  Job
 Class to both describe a background job and handle jobs.
 
class  JobQueue
 Class to handle enqueueing and running of background jobs.
 
class  JobQueueConnectionError
 
class  JobQueueDB
 Class to handle job queues stored in the DB.
 
class  JobQueueEnqueueUpdate
 Enqueue lazy-pushed jobs that have accumulated from JobQueueGroup.
 
class  JobQueueError
 
class  JobQueueFederated
 Class to handle enqueueing and running of background jobs for federated queues.
 
class  JobQueueGroup
 Class to handle enqueueing of background jobs.
 
class  JobQueueMemory
 Class to handle job queues stored in PHP memory for testing.
 
class  JobQueueReadOnlyError
 
class  JobQueueRedis
 Class to handle job queues stored in Redis.
 
class  JobRunner
 Job queue runner utility methods.
 
class  JobSpecification
 Job queue task description base code.
 
class  NullJob
 Degenerate job that does nothing, but can optionally replace itself in the queue and/or sleep for a brief time period.
 
class  PublishStashedFileJob
 Upload a file from the upload stash into the local file repo.
 
class  RecentChangesUpdateJob
 Job for pruning recent changes.
 
class  RefreshLinksJob
 Job to update link tables for pages.
 
interface  RunnableJob
 Job that has a run() method and metadata accessors for JobQueue::pop() and JobQueue::ack().
 
class  ThumbnailRenderJob
 Job for asynchronous rendering of thumbnails.
 

Detailed Description

README

JobQueue Architecture

Notes on the Job queuing system architecture.

Introduction

The data model consists of the following main components:

  • The Job object represents a particular deferred task that happens in the background. All jobs subclass the Job class and put their main logic in the function called run().
  • The JobQueue object represents a particular queue of jobs of a certain type. For example, there may be a queue for email jobs and a queue for CDN purge jobs.
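
The two components above can be pictured with a small, language-neutral sketch. It is written in Python purely for illustration and is not MediaWiki's PHP API: the EmailJob class, the constructor signatures, and the push()/pop() names are all assumptions made for this example.

```python
from collections import deque


class Job:
    """A deferred task; subclasses put their main logic in run()."""

    def __init__(self, job_type, params):
        self.type = job_type
        self.params = params

    def run(self):
        raise NotImplementedError


class EmailJob(Job):
    """Hypothetical job type used only for this illustration."""

    def __init__(self, params):
        super().__init__("email", params)

    def run(self):
        # A real job would send mail; the sketch only reports what it would do.
        return "mail to " + self.params["to"]


class JobQueue:
    """A queue that holds jobs of exactly one type."""

    def __init__(self, job_type):
        self.type = job_type
        self._jobs = deque()

    def push(self, job):
        if job.type != self.type:
            raise ValueError("wrong job type for this queue")
        self._jobs.append(job)

    def pop(self):
        return self._jobs.popleft() if self._jobs else None


email_queue = JobQueue("email")
email_queue.push(EmailJob({"to": "alice@example.org"}))
result = email_queue.pop().run()
```

The point of the sketch is the division of labour: the job carries its type, parameters, and logic, while the queue only stores and hands out jobs of one type.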

Job queues

Each job type has its own queue and is associated with a storage medium. One queue might save its jobs in Redis while another uses a database.

The storage medium is defined by the queue class. Before using a queue, you must define in $wgJobTypeConf a mapping of the job type to a queue class.

The factory class JobQueueGroup provides helper functions for:

  • getting the queue for a given job
  • routing new job insertions to the proper queue
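
As a rough illustration of the routing idea, the sketch below models a job-type-to-queue mapping and a group object that hands out one queue per type. It is a Python model, not the real $wgJobTypeConf format or the JobQueueGroup API: JOB_TYPE_CONF, the "backend" field, the "default" fallback entry, and the method names are assumptions of this example.

```python
class MemoryQueue:
    """Stand-in for a queue class; 'backend' only labels the storage medium."""

    def __init__(self, job_type, backend):
        self.type = job_type
        self.backend = backend
        self.jobs = []


# Hypothetical analogue of a job-type-to-queue mapping. The "default"
# fallback entry is an assumption of this sketch.
JOB_TYPE_CONF = {
    "default": {"backend": "database"},
    "cdnPurge": {"backend": "redis"},
}


class QueueGroup:
    """Factory that hands out one queue per job type and routes insertions."""

    def __init__(self, conf):
        self._conf = conf
        self._queues = {}

    def get(self, job_type):
        # Create the queue lazily, falling back to the default configuration.
        if job_type not in self._queues:
            conf = self._conf.get(job_type, self._conf["default"])
            self._queues[job_type] = MemoryQueue(job_type, conf["backend"])
        return self._queues[job_type]

    def push(self, jobs):
        # Route each (type, params) pair to the queue for its type.
        for job_type, params in jobs:
            self.get(job_type).jobs.append(params)


group = QueueGroup(JOB_TYPE_CONF)
group.push([("cdnPurge", {"urls": ["/wiki/Foo"]}), ("email", {"to": "bob"})])
```

Callers only name the job type; which storage medium ends up holding the job is decided by the mapping, so it can be changed without touching the code that enqueues jobs.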

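The following queue classes are available:

  • JobQueueDB (stores jobs in the DB)
  • JobQueueRedis (stores jobs in Redis)
  • JobQueueMemory (stores jobs in PHP memory, for testing)
  • JobQueueFederated (handles federated queues)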

All queue classes support some basic operations (though some may be no-ops):

  • enqueueing a batch of jobs
  • dequeueing a single job
  • acknowledging a job is completed
  • checking if the queue is empty

Some queue classes (like JobQueueDB) may dequeue jobs in random order while other queues might dequeue jobs in exact FIFO order. Callers should thus not assume jobs are executed in FIFO order.
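
The four basic operations and the ordering caveat can be modelled as follows. This is a minimal Python sketch under stated assumptions, not the JobQueue class itself: SketchQueue, its "order" option, and the drain() helper are hypothetical names invented for the example.

```python
import random


class SketchQueue:
    """In-memory queue supporting the four basic operations."""

    def __init__(self, order="fifo", seed=None):
        self._order = order  # "fifo" or "random"
        self._ready = []
        self._rng = random.Random(seed)
        self.acked = []

    def push_batch(self, jobs):
        self._ready.extend(jobs)

    def pop(self):
        if not self._ready:
            return None
        if self._order == "fifo":
            index = 0
        else:
            index = self._rng.randrange(len(self._ready))
        return self._ready.pop(index)

    def ack(self, job):
        # Some queue types would treat this as a no-op; here it is recorded.
        self.acked.append(job)

    def is_empty(self):
        return not self._ready


def drain(queue):
    """Pop and acknowledge every job, returning them in dequeue order."""
    done = []
    while not queue.is_empty():
        job = queue.pop()
        done.append(job)
        queue.ack(job)
    return done


fifo = SketchQueue("fifo")
fifo.push_batch(["a", "b", "c"])
fifo_result = drain(fifo)

shuffled = SketchQueue("random", seed=7)
shuffled.push_batch(["a", "b", "c"])
random_result = drain(shuffled)
```

Code that consumes random_result should only rely on every job appearing exactly once, never on the position of any job, which is the practical meaning of not assuming FIFO order.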

Also note that not all queue classes have the same reliability guarantees. In-memory queues may lose data when restarted, depending on snapshot and journal settings (including journal fsync() frequency). Some queue types remove jobs entirely when they are dequeued and leave the ack() function as a no-op; if a job runner dequeues such a job and then crashes before completing it, the job is lost. Some jobs, like purging CDN caches after a template change, may not require durable queues, whereas other jobs might be more important.
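
The difference between a queue whose ack() is a no-op and one that tracks dequeued jobs can be made concrete with a small simulation. The Python sketch below is illustrative only: LossyQueue, ClaimingQueue, recycle(), and crash_after_pop() are hypothetical names, and the recovery logic is an assumption about how such a queue could behave, not a description of any specific MediaWiki queue class.

```python
class LossyQueue:
    """Removes a job as soon as it is dequeued; ack() is a no-op."""

    def __init__(self, jobs):
        self._ready = list(jobs)

    def pop(self):
        return self._ready.pop(0) if self._ready else None

    def ack(self, job):
        pass

    def recycle(self):
        return 0  # nothing is tracked, so nothing can be recovered

    def size(self):
        return len(self._ready)


class ClaimingQueue:
    """Keeps dequeued jobs as 'claimed' until they are acknowledged."""

    def __init__(self, jobs):
        self._ready = list(jobs)
        self._claimed = []

    def pop(self):
        if not self._ready:
            return None
        job = self._ready.pop(0)
        self._claimed.append(job)
        return job

    def ack(self, job):
        self._claimed.remove(job)

    def recycle(self):
        # Return un-acknowledged jobs to the ready list for another attempt.
        count = len(self._claimed)
        self._ready.extend(self._claimed)
        self._claimed = []
        return count

    def size(self):
        return len(self._ready)


def crash_after_pop(queue):
    """Simulate a runner that dequeues a job and dies before calling ack()."""
    queue.pop()
    return queue.recycle()


lossy = LossyQueue(["purge"])
claiming = ClaimingQueue(["email"])
lost_recovered = crash_after_pop(lossy)
claimed_recovered = crash_after_pop(claiming)
```

After the simulated crash the lossy queue is empty and the job is gone, while the claiming queue holds the job again and will hand it to the next runner.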

Job queue aggregator

Since each job type has its own queue, and wiki farms may have many wikis, there might be a large number of queues to keep track of. To avoid wasting large amounts of time polling empty queues, aggregators exist to keep track of which queues are ready.

The aggregators are used by nextJobDB.php, a script that returns a random ready queue (on any wiki in the farm) that can be used with runJobs.php. This can be used in conjunction with any scripts that handle wiki farm job queues. Note that $wgLocalDatabases defines which wikis are in the wiki farm.

The following queue aggregator classes are available:

  • JobQueueAggregatorRedis (uses a redis server to track ready queues)

Some aggregators cache data for a few minutes while others may be always up to date. This can be an important factor for jobs that need a low pickup time (or latency).
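
To show why an aggregator saves polling work, the sketch below keeps a set of (wiki, job type) pairs that are known to have ready jobs and picks one at random. It is a Python model under stated assumptions: ReadyQueueAggregator, its method names, and pick_ready_queue() are hypothetical and do not reproduce the JobQueueAggregatorRedis interface or the nextJobDB.php script.

```python
import random


class ReadyQueueAggregator:
    """Tracks (wiki, job type) pairs that currently have ready jobs."""

    def __init__(self):
        self._ready = set()

    def notify_non_empty(self, wiki, job_type):
        self._ready.add((wiki, job_type))

    def notify_empty(self, wiki, job_type):
        self._ready.discard((wiki, job_type))

    def ready_queues(self):
        return sorted(self._ready)


def pick_ready_queue(aggregator, rng):
    """Return a random ready queue, or None when nothing is ready."""
    ready = aggregator.ready_queues()
    return rng.choice(ready) if ready else None


aggregator = ReadyQueueAggregator()
aggregator.notify_non_empty("enwiki", "refreshLinks")
aggregator.notify_non_empty("dewiki", "cdnPurge")
aggregator.notify_empty("enwiki", "refreshLinks")
picked = pick_ready_queue(aggregator, random.Random(0))
```

A runner consults the aggregator once instead of polling every queue on every wiki. If the aggregator's view is cached, a queue that just became non-empty may not be picked up until the cache refreshes, which is the latency trade-off mentioned above.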

Jobs

Callers should try to make jobs maintain correctness when executed twice. This matters for queues that actually implement ack(), since they may recycle dequeued but unacknowledged jobs back into the queue to be attempted again. If a runner dequeues a job, runs it, but then crashes before calling ack(), the job may be returned to the queue and run a second time. Jobs like cache purging can happen several times without any correctness problems. A pathological case, however, is a bug that makes the failure repeat systematically: for example, a job that always throws a DB error at the end of run() may be retried over and over. This is harder to solve and more disruptive for jobs with visible side effects, such as email jobs. For such jobs, it might be useful to use a queue that does not retry jobs.
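
The contrast between a naturally repeatable job and one that needs care can be sketched as follows. This Python example is illustrative only: PurgeJob, EmailJob, and the sent-ID bookkeeping are hypothetical, and deduplicating by job ID is just one possible way to make a job safe to run twice, not something the text above prescribes.

```python
class PurgeJob:
    """Idempotent: purging an already-purged URL changes nothing."""

    def __init__(self, cache, url):
        self.cache = cache
        self.url = url

    def run(self):
        self.cache.discard(self.url)


class EmailJob:
    """Not idempotent by nature, so it records which job IDs were handled."""

    def __init__(self, outbox, sent_ids, job_id, recipient):
        self.outbox = outbox
        self.sent_ids = sent_ids
        self.job_id = job_id
        self.recipient = recipient

    def run(self):
        # Skip the send when this job ID was already handled.
        if self.job_id in self.sent_ids:
            return
        self.outbox.append(self.recipient)
        self.sent_ids.add(self.job_id)


cache = {"/wiki/A", "/wiki/B"}
purge = PurgeJob(cache, "/wiki/A")
purge.run()
purge.run()  # simulated retry after a missed ack()

outbox, sent_ids = [], set()
email = EmailJob(outbox, sent_ids, "job-1", "alice@example.org")
email.run()
email.run()  # simulated retry after a missed ack()
```

Running the purge twice leaves the cache in the same state, and the email job sends only once because the second run sees its own ID. In a real system the record of sent IDs would itself have to survive a crash, which is part of why retrying such jobs is hard to get right.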