Miscellaneous helper functions (not wiki-dependent).


Bases: KeyError, IndexError

An error that gets caught by both KeyError and IndexError.

__module__ = ''
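
A minimal sketch of how an error can be caught by both KeyError and IndexError handlers, assuming plain multiple inheritance. The class name `CombinedErrorSketch` and the helper `lookup` are hypothetical and only serve the illustration.

```python
class CombinedErrorSketch(KeyError, IndexError):
    """Hypothetical stand-in: an error deriving from both lookup errors."""


def lookup(handler):
    """Raise the combined error and report whether `handler` catches it."""
    try:
        raise CombinedErrorSketch('missing')
    except handler:
        return True


# The same exception type is caught by either handler.
caught_as_key_error = lookup(KeyError)
caught_as_index_error = lookup(IndexError)
```

Code that subscripts an object without knowing whether it behaves like a mapping or a sequence can then keep whichever `except` clause it already has.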

Bases: object

Mixin class to allow comparing to other objects which are comparable.


Compare if self is equal to other.


Compare if self is greater than or equal to other.


Compare if self is greater than other.

__hash__ = None

Compare if self is less than or equal to other.


Compare if self is less than other.

__module__ = ''

Compare if self is not equal to other.

class, flags=0, name=None, instead=None, since=None)[source]


Regex object that issues a deprecation notice.


Issue deprecation warning.

__init__(pattern, flags=0, name=None, instead=None, since=None)[source]


If name is None, the regex pattern will be used as part of the deprecation warning.

  • name (str or None) – name of the object that is deprecated

  • instead (str) – if provided, will be used to specify the replacement of the deprecated name

__module__ = ''

Bases:, collections.deque

A generator that allows items to be added during generating.

__abstractmethods__ = frozenset({})
__module__ = ''

Iterator method.


Bases: object

Parent class of Revision() and FileInfo().

Provide: __getitem__() and __repr__().


Give access to class values by key.

The Revision class also gives access to its values by key, e.g. the revid parameter may be accessed as revision[‘revid’] as well as revision.revid. This makes formatting strings with the % operator easier.

__module__ = ''

Return a more complete string representation.


Bases: str,

A default for a nonexistent siteinfo property.

It should be chosen if there is no better default known. It acts like an empty collection, so it can be iterated through safely if treated as a list, tuple, set or dictionary. It is also basically an empty string.

Accessing a value via __getitem__ will result in a combined KeyError and IndexError.

__abstractmethods__ = frozenset({})

Always raise a CombinedError.


Initialise the default as an empty string.


An iterator which does nothing and drops the argument.

__module__ = ''
class, flags=0)[source]

Bases: object

Regex object that obtains and compiles the regex on usage.

Instances behave like the object created using re.compile.


Compile the regex and delegate all attributes to the regex.

__init__(pattern, flags=0)[source]


  • pattern (str or callable) – re regex pattern

  • flags (int) – re.compile flags

__module__ = ''
property flags

The flags property.

property raw

The raw property.
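
The deferred compilation described for this class can be pictured with the following sketch. It is not the library's implementation: `LazyRegexSketch` and `build_pattern` are hypothetical names, and the delegation through `__getattr__` is an assumption about how "compile on usage" might be achieved.

```python
import re


class LazyRegexSketch:
    """Hypothetical stand-in that compiles its pattern on first use."""

    def __init__(self, pattern, flags=0):
        self.raw = pattern      # a str, or a callable returning the pattern
        self.flags = flags
        self._compiled = None

    def __getattr__(self, attr):
        # Only reached for names not found normally, e.g. match or search.
        if attr.startswith('_'):
            raise AttributeError(attr)
        if self._compiled is None:
            raw = self.raw() if callable(self.raw) else self.raw
            self._compiled = re.compile(raw, self.flags)
        return getattr(self._compiled, attr)


calls = []


def build_pattern():
    """Record each call so the laziness can be observed."""
    calls.append(1)
    return r'(\d+)-(\d+)'


lazy = LazyRegexSketch(build_pattern)
compiled_before_use = bool(calls)        # nothing compiled yet
first = lazy.match('12-34').groups()     # triggers compilation
second = lazy.search('x 5-6').groups()   # reuses the compiled regex
```

Passing a callable as the pattern postpones even the construction of the pattern string until the regex is actually needed.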


Bases: distutils.version.Version

Version object to allow comparing ‘wmf’ versions with normal ones.

The version mainly consists of digits separated by periods. After that is a suffix which may only be ‘wmf<number>’, ‘alpha’, ‘beta<number>’ or ‘-rc.<number>’ (the - and . are optional). They are considered from old to new in that order, and a version number without a suffix is considered the newest. This secondary difference is stored in an internal _dev_version attribute.

Two versions are equal if their normal version and dev version are equal. A version is greater if the normal version or dev version is greater. Example:

1.24 < 1.24.1 < 1.25wmf1 < 1.25alpha < 1.25beta1 < 1.25beta2 < 1.25-rc-1 < 1.25-rc.2 < 1.25

Any other suffixes are considered invalid.
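
The ordering rule can be sketched as a sort key. The pattern below is the one listed as MEDIAWIKI_VERSION for this class; the helper name `version_key` and the numeric ranks are hypothetical, and rejecting every unknown suffix with a ValueError is only one possible reading of "considered invalid".

```python
import re

# Pattern as listed in the MEDIAWIKI_VERSION attribute of this class.
MEDIAWIKI_VERSION = re.compile(
    r'(\d+(?:\.\d+)+)(-?wmf\.?(\d+)|alpha|beta(\d+)|-?rc\.?(\d+)|.*)?$')


def version_key(vstring):
    """Return (release numbers, dev rank) for sorting (hypothetical helper)."""
    match = MEDIAWIKI_VERSION.match(vstring)
    if not match:
        raise ValueError('Invalid version number "{}"'.format(vstring))
    components = tuple(int(n) for n in match.group(1).split('.'))
    suffix = match.group(2)
    if match.group(3):            # wmf<number>
        dev = (0, int(match.group(3)))
    elif suffix == 'alpha':
        dev = (1,)
    elif match.group(4):          # beta<number>
        dev = (2, int(match.group(4)))
    elif match.group(5):          # rc<number>
        dev = (3, int(match.group(5)))
    elif not suffix:              # no suffix: newest of its release
        dev = (4,)
    else:
        raise ValueError('Invalid suffix in "{}"'.format(vstring))
    return components, dev


ordered = ['1.24', '1.24.1', '1.25wmf1', '1.25alpha', '1.25beta1',
           '1.25beta2', '1.25-rc.1', '1.25-rc.2', '1.25']
keys = [version_key(v) for v in ordered]

try:
    version_key('1.25foo')
    unknown_suffix_rejected = False
except ValueError:
    unknown_suffix_rejected = True
```

Because the key is a pair of tuples, ordinary tuple comparison gives the "normal version first, dev version second" behaviour without any custom comparison code.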

MEDIAWIKI_VERSION = re.compile('(\\d+(?:\\.\\d+)+)(-?wmf\\.?(\\d+)|alpha|beta(\\d+)|-?rc\\.?(\\d+)|.*)?$')
__module__ = ''

Return version number with optional suffix.

classmethod from_generator(generator)[source]

Create instance using the generator string.


Parse version string.


Bases: module

A wrapper for a module to deprecate classes or variables of it.


Return the attribute with a deprecation warning if required.


Initialise the wrapper.

It will automatically overwrite the module with this instance in sys.modules.


module (str or module) – The module name or instance

__module__ = ''
__setattr__(attr, value)[source]

Set the value of the wrapped module.


Bases:, dict

Dict with SelfCallMixin.

__module__ = ''

Bases: object

Return self when called.

When ‘_own_desc’ is defined it’ll also issue a deprecation warning using issue_deprecation_warning(‘Calling ‘ + _own_desc, ‘it directly’).


Do nothing and just return itself.

__module__ = ''

Bases:, str

String with SelfCallMixin.

__module__ = ''
class, wait_time=2, *args)[source]

Bases: list

A simple threadpool class to limit the number of simultaneous threads.

Any threading.Thread object can be added to the pool using the append() method. If the maximum number of simultaneous threads has not been reached, the Thread object will be started immediately; if not, the append() call will block until the thread is able to start.

>>> import threading
>>> import time
>>> pool = ThreadList(limit=10)
>>> def work():
...     time.sleep(1)
>>> for x in range(20):
...     pool.append(threading.Thread(target=work))
__init__(limit=128, wait_time=2, *args)[source]


  • limit (int) – the number of simultaneous threads

  • wait_time (int or float) – how long to wait if active threads exceeds limit

__module__ = ''

Return the number of alive threads and delete all non-alive ones.


Add a thread to the pool and start it.


Stop all threads in the pool.

class, target=None, name='GeneratorThread', args=(), kwargs=None, qsize=65536)[source]

Bases: threading.Thread

Look-ahead generator class.

Runs a generator in a separate thread and queues the results; can be called like a regular generator.

Subclasses should override self.generator, I{not}

Important: the generator thread will stop itself if the generator’s internal queue is exhausted; but, if the calling program does not use all the generated values, it must call the generator’s stop() method to stop the background thread. Example usage:

>>> gen = ThreadedGenerator(target=range, args=(20,))
>>> try:
...     data = list(gen)
... finally:
...     gen.stop()
>>> data
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
__init__(group=None, target=None, name='GeneratorThread', args=(), kwargs=None, qsize=65536)[source]

Initializer. Takes same keyword arguments as threading.Thread.

target must be a generator function (or other callable that returns an iterable object).


qsize (int) – The size of the lookahead queue. The larger the qsize, the more values will be computed in advance of use (which can eat up memory and processor time).


Iterate results from the queue.

__module__ = ''

Run the generator and store the results on the queue.


Stop the background thread.

, stacklevel=1)[source]

Extract full object name, including class, and store in __full_name__.

This must be done on all decorators that are chained together, otherwise the second decorator will have the wrong full name.

  • obj (object) – An object being decorated

  • stacklevel (int) – level to use

[source]

A decorator to add __full_name__ to the function being decorated.

This should be done for all decorators used in pywikibot, as any decorator that does not add __full_name__ will prevent other decorators in the same chain from being able to obtain it.

This can be used to monkey-patch decorators in other modules. e.g. <xyz>.foo = add_full_name(<xyz>.foo)


obj (callable) – The function to decorate


decorating function

Return type



Bases: object

Descriptor class to access a class method as a property.

This class may be used as a decorator:

class Foo:

    _bar = 'baz'  # a class property

    def bar(cls):  # a class property method
        return cls._bar

Foo.bar gives ‘baz’.

__get__(instance, owner)[source]

Get the attribute of the owner class by its method.


Hold the class method.
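
To make the Foo example above concrete, here is a minimal descriptor sketch. The name `classproperty_sketch` is hypothetical and the body is an assumption about how such a descriptor could work: it holds the method and calls it with the owner class on attribute access.

```python
class classproperty_sketch:
    """Hypothetical descriptor exposing a class-level method as a property."""

    def __init__(self, cls_method):
        # Hold the method until the attribute is read.
        self.method = cls_method

    def __get__(self, instance, owner):
        # Ignore the instance; always evaluate against the owner class.
        return self.method(owner)


class Foo:

    _bar = 'baz'  # a class property

    @classproperty_sketch
    def bar(cls):  # a class property method
        return cls._bar


from_class = Foo.bar
from_instance = Foo().bar
```

Since only `__get__` is defined, this is a non-data descriptor: assigning `Foo.bar = ...` would replace the descriptor instead of calling a setter.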

__module__ = ''

str, sha='sha1', bytes_to_read=None)[source]

Compute file hash.

Result is expressed as hexdigest().
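
A rough sketch of the hashing behaviour described in this entry, built only on hashlib. The function name `file_hash_sketch` is hypothetical, and reading the data in a single call is a simplification that would not suit very large files.

```python
import hashlib
import os
import tempfile


def file_hash_sketch(filename, sha='sha1', bytes_to_read=None):
    """Hypothetical helper: hex digest of the first bytes_to_read bytes."""
    digest = hashlib.new(sha)
    with open(filename, 'rb') as f:
        # read(-1) returns the whole file; a larger count than the file
        # size simply returns everything that is there.
        digest.update(f.read(-1 if bytes_to_read is None else bytes_to_read))
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.bin')
    with open(path, 'wb') as f:
        f.write(b'hello world')
    whole = file_hash_sketch(path)
    head = file_hash_sketch(path, sha='sha256', bytes_to_read=5)
    oversized = file_hash_sketch(path, bytes_to_read=1000)
```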

  • filename – filename path

  • sha (str) – hashing function among the following in hashlib: md5(), sha1(), sha224(), sha256(), sha384(), and sha512(). The function name shall be passed as a string, e.g. ‘sha1’.

  • bytes_to_read (None or int) – only the first bytes_to_read will be considered; if file size is smaller, the whole file will be considered.

str, new_arg)[source]

Decorator to declare old_arg deprecated and replace it with new_arg.


@deprecate_arg('foo', 'bar')
def my_function(bar='baz'): pass
# replaces 'foo' keyword by 'bar' used by my_function

@deprecate_arg('foo', None)
def my_function(): pass
# ignores 'foo' keyword no longer used by my_function

The deprecated_args decorator should be preferred over this deprecate_arg decorator. deprecate_arg is kept to deprecate arguments whose names become reserved words in future Python releases, where passing them as keywords to deprecated_args would be a syntax error.

  • old_arg – old keyword

  • new_arg (str or None or bool) – new keyword

*outer_args, **outer_kwargs)[source]

Outer wrapper.

The outer wrapper may be the replacement function if the decorated decorator was called without arguments, or the replacement decorator if the decorated decorator was called with arguments.

  • outer_args – args

  • outer_kwargs – kwargs

**arg_pairs)[source]

Decorator to declare multiple args deprecated.


@deprecated_args(foo='bar', baz=None)
def my_function(bar='baz'): pass
# replaces 'foo' keyword by 'bar' and ignores 'baz' keyword
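
The renaming and dropping of keywords in this entry can be sketched as follows. This is not the library's code: `deprecated_args_sketch`, `REMOVAL_CATEGORY` and the warning texts are hypothetical, and the "only once" behaviour of the FutureWarning case is not reproduced.

```python
import functools
import warnings

# Warning category used when an argument is dropped, keyed by the pair value.
REMOVAL_CATEGORY = {
    None: DeprecationWarning,
    False: PendingDeprecationWarning,
    True: FutureWarning,
}


def deprecated_args_sketch(**arg_pairs):
    """Hypothetical decorator mapping old keyword names to new ones."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for old, new in arg_pairs.items():
                if old not in kwargs:
                    continue
                value = kwargs.pop(old)
                if isinstance(new, str) and new:
                    # Renamed: warn and forward the value under the new name.
                    warnings.warn(
                        '{} argument of {} is deprecated; use {} instead.'
                        .format(old, func.__name__, new),
                        DeprecationWarning, stacklevel=2)
                    kwargs[new] = value
                else:
                    # Removed: the value is dropped; '' maps to no warning.
                    category = REMOVAL_CATEGORY.get(new)
                    if category is not None:
                        warnings.warn(
                            '{} argument of {} is deprecated.'
                            .format(old, func.__name__),
                            category, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated_args_sketch(foo='bar', baz=None)
def my_function(bar='baz'):
    return bar


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = my_function(foo='value', baz=1)

categories = [w.category for w in caught]
```

The call with the old names still works: `foo` arrives as `bar`, `baz` is discarded, and one warning is recorded for each.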


arg_pairs – Each entry points to the new argument name. If an argument is to be removed, the value may be one of the following:

  • None: shows a DeprecationWarning

  • False: shows a PendingDeprecationWarning

  • True: shows a FutureWarning (only once)

  • empty string: no warning is printed

[source]

An iterator which does nothing.

str, mode=384, quiet=False, create=False)[source]

Check file mode and update it, if needed.

  • filename – filename path

  • mode (int) – requested file mode

  • quiet (bool) – warn about file mode change if False.

  • create (bool) – create the file if it does not exist already


IOError – The file does not exist and create is False.

, container=None, key=None, add=None)[source]

Yield unique items from an iterable, omitting duplicates.

By default, to provide uniqueness, it puts the generated items into a set created as a local variable. It only yields items which are not already present in the local set.

For large collections, this is not memory efficient, as a strong reference to every item is kept in a local set which cannot be cleared.

Also, the local set can’t be re-used when chaining unique operations on multiple generators.

To avoid these issues, it is advisable for the caller to provide their own container and set the key parameter to be the function hash, or use a weakref as the key.

The container can be any object that supports __contains__. If the container is a set or dict, the method add or __setitem__ will be used automatically. Any other method may be provided explicitly using the add parameter.

Beware that key=id is only useful for cases where id() is not unique.

Note: This is not thread safe.
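
The container, key and add behaviour described above can be pictured with this sketch. `filter_unique_sketch` is a hypothetical name, and the choice between `add` and item assignment is guessed from the description, not taken from the library source.

```python
def filter_unique_sketch(iterable, container=None, key=None, add=None):
    """Hypothetical generator yielding each item only the first time."""
    if container is None:
        container = set()           # local set, as in the default case
    if add is None:
        if hasattr(container, 'add'):
            def add(k):
                container.add(k)    # set-like containers
        else:
            def add(k):
                container[k] = True  # dict-like containers
    for item in iterable:
        k = key(item) if key else item
        if k not in container:
            add(k)
            yield item


first = list(filter_unique_sketch([1, 2, 1, 3, 2]))

# A caller-supplied container is shared across chained generators.
seen = set()
chained = (list(filter_unique_sketch('abca', container=seen))
           + list(filter_unique_sketch('cde', container=seen)))

# The key function decides what counts as a duplicate.
by_key = list(filter_unique_sketch(['a', 'A', 'b'], key=str.lower))
```

Supplying the container from outside also lets the caller clear it or inspect it afterwards, which the default local set does not allow.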

  • iterable – the source iterable

  • container (type) – storage of seen items

  • key (callable) – function to convert the item to a key

  • add (callable) – function to add an item to the container

str) → str[source]

Return a string with the first character uncapitalized.

Empty strings are supported. The original string is not changed.

str) → str[source]

Return a string with the first character capitalized.

Empty strings are supported. The original string is not changed.


MediaWiki doesn’t capitalize some characters the same way as Python. This function tries to be close to MediaWiki’s capitalize function in title.php. See T179115 and T200357.

class, **kwargs)[source]


Frozen mapping, preventing write after initialisation.

__abstractmethods__ = frozenset({})
__init__(data=(), **kwargs)[source]

Initialize data in the same way as a dict.
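
One way to obtain a mapping that cannot be written after initialisation is to implement only the read side of the mapping protocol. The sketch below assumes that approach; `FrozenMapSketch` is a hypothetical name and the base class is a guess, since the documented bases are not shown in this entry.

```python
from collections.abc import Mapping


class FrozenMapSketch(Mapping):
    """Hypothetical read-only mapping initialised like a dict."""

    def __init__(self, data=(), **kwargs):
        self._data = dict(data, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return '{}({!r})'.format(type(self).__name__, self._data)


frozen = FrozenMapSketch({'a': 1}, b=2)

try:
    frozen['c'] = 3            # no __setitem__ is defined
    write_rejected = False
except TypeError:
    write_rejected = True
```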

__module__ = ''

Return repr(self).

[source]

Return depth of wrapper function.

, version=None)[source]

Check whether a module can be imported.

, allow_duplicates=False)[source]

Intersect generators listed in genlist.

Yield items only if they are yielded by all generators in genlist. Threads (via ThreadedGenerator) are used in order to run generators in parallel, so that items can be yielded before generators are exhausted.

Threads are stopped when they are either exhausted or Ctrl-C is pressed. Quitting before all generators are finished is attempted if there is no more chance of finding an item in all queues.

  • genlist (list) – list of page generators

  • allow_duplicates (bool) – allow duplicates if present in all generators

str) → bool[source]

Verify the IP address provided is valid.

No logging is performed. Use ip_address instead to catch errors.
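
A boolean check of this kind can be sketched on top of the standard library, assuming the `ip_address` mentioned above refers to `ipaddress.ip_address`. The function name `is_ip_sketch` is hypothetical.

```python
import ipaddress


def is_ip_sketch(value):
    """Hypothetical check: True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        # Swallow the error; callers wanting details call ip_address directly.
        return False
    return True


checks = {
    '192.0.2.1': is_ip_sketch('192.0.2.1'),
    '::1': is_ip_sketch('::1'),
    '999.1.1.1': is_ip_sketch('999.1.1.1'),
    'example.org': is_ip_sketch('example.org'),
}
```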


IP – IP address

, *args, marker='…')[source]

Generator which yields the first n elements of the iterable.

If more elements are available and marker is True, it returns an extra string marker as continuation mark.

The function takes the same arguments as itertools.islice and the additional keyword marker.
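
The continuation-mark behaviour can be sketched with itertools.islice plus one look-ahead step. `islice_with_marker_sketch` is a hypothetical name, and consuming one extra element to detect "more available" is an assumption of this sketch.

```python
import itertools


def islice_with_marker_sketch(iterable, *args, marker='…'):
    """Hypothetical generator: islice, then a marker if items remain."""
    it = iter(iterable)
    yield from itertools.islice(it, *args)
    if marker:
        try:
            next(it)               # probe for one more element
        except StopIteration:
            return                 # nothing left: no marker
        yield marker


truncated = list(islice_with_marker_sketch(range(5), 3))
complete = list(islice_with_marker_sketch(range(3), 3))
windowed = list(islice_with_marker_sketch(range(10), 2, 5))
unmarked = list(islice_with_marker_sketch(range(5), 3, marker=None))
```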

  • iterable (iterable) – the iterable to work on

  • args – same args as: - itertools.islice(iterable, stop) - itertools.islice(iterable, start, stop[, step])

  • marker (str) – element to yield if iterable still contains elements after showing the required number. Default value: ‘…’

str, instead=None, depth=2, warning_class=None, since=None)[source]

Issue a deprecation warning.

  • name – the name of the deprecated object

  • instead (str or None) – suggested replacement for the deprecated object

  • depth (int) – depth + 1 will be used as stacklevel for the warnings

  • warning_class (type) – a warning class (category) to be used, defaults to DeprecationWarning

  • since (str or None) – a timestamp string of the date when the method was deprecated (form ‘YYYYMMDD’) or a version string.

, size: int)[source]

Make an iterator that returns lists of (up to) size items from iterable.


>>> i = itergroup(range(25), 10)
>>> print(next(i))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(next(i))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> print(next(i))
[20, 21, 22, 23, 24]
>>> print(next(i))
Traceback (most recent call last):
 ...
StopIteration

, obj)[source]

Add attributes to wrapper and wrapped functions.

*args, **kwargs)[source]

Return a merged dict and make sure that the original dicts keys are unique.

The positional arguments are the dictionaries to be merged. It is also possible to define an additional dict using the keyword arguments.

→ Optional[str][source]

Normalize the username.

, mode='rb', use_extension=True)[source]

Open a file and uncompress it if needed.

This function supports bzip2, gzip, 7zip, lzma, and xz as compression containers. It uses the packages available in the standard library for bzip2, gzip, lzma, and xz so they are always available. 7zip is only available when a 7za program is available and only supports reading from it.

The compression is either selected via the magic number or file ending.
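
Selection by magic number can be pictured with this sketch, limited to the three formats that have standard-library openers. The names `MAGIC_OPENERS` and `open_by_magic` are hypothetical; 7zip handling, file-extension selection and write modes are left out, and the legacy .lzma container (which has no fixed magic number) is not covered.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Leading bytes that identify each container format.
MAGIC_OPENERS = (
    (b'BZh', bz2.open),
    (b'\x1f\x8b', gzip.open),
    (b'\xfd7zXZ\x00', lzma.open),
)


def open_by_magic(filename):
    """Hypothetical helper: open for binary reading, decompressing if needed."""
    with open(filename, 'rb') as f:
        head = f.read(6)
    for magic, opener in MAGIC_OPENERS:
        if head.startswith(magic):
            return opener(filename, 'rb')
    return open(filename, 'rb')    # no known magic number: plain file


contents = {}
with tempfile.TemporaryDirectory() as tmp:
    # The neutral .dat extension shows that only the content is inspected.
    writers = (('a.dat', gzip.open), ('b.dat', bz2.open),
               ('c.dat', lzma.open), ('d.dat', open))
    for name, writer in writers:
        path = os.path.join(tmp, name)
        with writer(path, 'wb') as f:
            f.write(b'payload')
        with open_by_magic(path) as f:
            contents[name] = f.read()
```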

  • filename (str) – The filename.

  • use_extension (bool) – Use the file extension instead of the magic number to determine the type of compression (default True). Must be True when writing or appending.

  • mode (str) – The mode in which the file should be opened. It may either be ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’ or ‘wb’. All modes open the file in binary mode. It defaults to ‘rb’.

  • ValueError – When 7za is not available or the opening mode is unknown or it tries to write a 7z archive.

  • FileNotFoundError – When the filename doesn’t exist and it tries to read from it or it tries to determine the compression algorithm.

  • OSError – When it’s not a 7z archive but the file extension is 7z. It is also raised by bz2 when its content is invalid. gzip does not immediately raise that error but only on reading it.

  • lzma.LZMAError – When error occurs during compression or decompression or when initializing the state with lzma or xz.

  • ImportError – When file is compressed with bz2 but neither bz2 nor bz2file is importable, or when file is compressed with lzma or xz but lzma is not importable.


A file-like object returning the uncompressed data in binary mode.

Return type

file-like object

, source_module: Optional[str] = None, target_module: Optional[str] = None, old_name: Optional[str] = None, class_name: Optional[str] = None, since: Optional[str] = None, future_warning=False)[source]

Return a function which can be used to redirect to ‘target’.

It also acts like marking that function deprecated and copies all parameters.

  • target (callable) – The targeted function which is to be executed.

  • source_module – The module of the old function. If ‘.’ defaults to target_module. If ‘None’ (default) it tries to guess it from the executing function.

  • target_module – The module of the target function. If ‘None’ (default) it tries to get it from the target. Might not work with nested classes.

  • old_name – The old function name. If None it uses the name of the new function.

  • class_name – The name of the class. It’s added to the target and source module (separated by a ‘.’).

  • since – a timestamp string of the date when the method was deprecated (form ‘YYYYMMDD’) or a version string.

  • future_warning (bool) – if True a FutureWarning will be thrown, otherwise it defaults to DeprecationWarning


A new function which adds a warning prior to each execution.

Return type


Decorator to declare all args additionally provided deprecated.

All positional arguments appearing after the normal arguments are marked deprecated. It marks also all keyword arguments present in arg_names as deprecated. Any arguments (positional or keyword) which are not present in arg_names are forwarded. For example a call with 3 parameters and the original function requests one and arg_names contain one name will result in an error, because the function got called with 2 parameters.

The decorated function may not use *args or **kwargs.


arg_names (iterable; for the most explanatory message it should retain the given order (so not a set for example)) – The names of all arguments.

*iterables)[source]

Yield simultaneously from each iterable.

Sample:

>>> tuple(roundrobin_generators('ABC', range(5)))
('A', 0, 'B', 1, 'C', 2, 3, 4)
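
The interleaving shown in the sample can be reproduced with itertools.zip_longest and a private sentinel. This is a sketch of the technique under the hypothetical name `roundrobin_sketch`, not necessarily how the function is implemented.

```python
import itertools


def roundrobin_sketch(*iterables):
    """Hypothetical generator taking one item from each iterable in turn."""
    sentinel = object()            # marks slots of exhausted iterables
    for group in itertools.zip_longest(*iterables, fillvalue=sentinel):
        for item in group:
            if item is not sentinel:
                yield item


combined = tuple(roundrobin_sketch('ABC', range(5)))
```

A fresh `object()` is used as the fill value so that legitimate items such as None are never mistaken for padding.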


iterables (iterable) – any iterable to combine in roundrobin way


the combined generator of iterables

Return type


class'', category=<class 'Warning'>, filename='')[source]

Bases: warnings.catch_warnings

A decorator/context manager that temporarily suppresses warnings.

Those suppressed warnings that do not match the parameters will be shown upon exit.


Decorate func to suppress warnings.


Catch all warnings and store them in self.log.

__exit__(exc_type, exc_val, exc_tb)[source]

Stop logging warnings and show those that do not match the parameters.

__init__(message='', category=<class 'Warning'>, filename='')[source]

Initialize the object.

The parameter semantics are similar to those of warnings.filterwarnings.

  • message (str) – A string containing a regular expression that the start of the warning message must match. (case-insensitive)

  • category (type) – A class (a subclass of Warning) of which the warning category must be a subclass in order to match.

  • filename (str) – A string containing a regular expression that the start of the path to the warning module must match. (case-sensitive)

__module__ = ''

Submodules module

Character based helper functions (not wiki-dependent).

[source]

Return True if the text contains any of the invisible characters.

[source]

Replace invisible characters by ‘<codepoint>’.

module

Wrapper around djvulibre to access djvu files properties and content.

class str, file_djvu='[deprecated name of file]')[source]

Bases: object

Wrapper around djvulibre to access djvu files properties and content.

Perform file existence checks.

Control characters in djvu text-layer are converted for convenience (see for control chars details).

__init__(file: str, file_djvu='[deprecated name of file]')[source]



file – filename (including path) to djvu file

__module__ = ''
__repr__() → str[source]

Return a more complete string representation.

__str__() → str[source]

Return a string representation.

static check_cache(fn)[source]

Decorator to check if cache shall be cleared.

static check_page_number(fn)[source]

Decorator to check if page number is valid.

:raises ValueError

delete_page(*args, **kwargs)[source]

Return most common size and dpi for pages in djvu file.

get_page(*args, **kwargs)[source]
has_text(*args, **kwargs)[source]
number_of_images(*args, **kwargs)[source]
page_info(*args, **kwargs)[source]
whiten_page(*args, **kwargs)[source]

module

Module containing various formatting related utilities.


Bases: object

A class formatting a list of items.

It is possible to customize the appearance by changing format_string which is used by str.format with index, width and item. Each line is joined by the separator and the complete text is surrounded by the prefix and the suffix. All three are by default a new line. The index starts at 1 and for the width it’s using the width of the sequence’s length written as a decimal number. So a length of 100 will result in a width of 3 and a length of 99 in a width of 2.

It is iterating over self.sequence to generate the text. That sequence can be any iterator but the result is better when it has an order.
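
The layout rules can be sketched as a single function. The default values are the ones listed for format_string, prefix, separator and suffix in this class; the function name `format_list_sketch` is hypothetical, and materialising the sequence into a list to learn its length is a simplification.

```python
def format_list_sketch(sequence,
                       format_string=' {index:>{width}} - {item}',
                       separator='\n', prefix='\n', suffix='\n'):
    """Hypothetical helper rendering one numbered line per item."""
    items = list(sequence)
    # Width of the largest index, e.g. 2 for 10 to 99 items.
    width = len(str(len(items)))
    lines = [format_string.format(index=i, width=width, item=item)
             for i, item in enumerate(items, start=1)]
    return prefix + separator.join(lines) + suffix


short = format_list_sketch(['a', 'b'])
long_lines = format_list_sketch(range(10)).splitlines()
```

With ten items the indexes are right-aligned in two columns, so the first line gains one leading space compared with the two-item case.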


Create a new instance with a reference to the sequence.

__module__ = ''

Create the text with one item on each line.

format_string = ' {index:>{width}} - {item}'

Output the text of the current sequence.

prefix = '\n'
separator = '\n'
suffix = '\n'

str, *args, **kwargs) → str[source]

Do str.format without having to worry about colors.

It automatically adds \03 in front of color fields, so it’s unnecessary to add them manually. Any other \03 in the text is disallowed.

You may use a variant {color} by assigning a valid color to a named parameter color.


text – The format template string


The formatted string