tools — Miscellaneous Helper Functions

Miscellaneous helper functions (not wiki-dependent).

class tools.ComparableMixin[source]#

Bases: object

Mixin class to allow comparing to other objects which are comparable.

New in version 3.0.

class tools.MediaWikiVersion(version_str)[source]#

Bases: object

Version object to allow comparing ‘wmf’ versions with normal ones.

The version mainly consists of digits separated by periods. It may be followed by a suffix, which may only be ‘wmf<number>’, ‘alpha’, ‘beta<number>’ or ‘-rc.<number>’ (the - and . are optional). Suffixes are ordered from old to new in that order, and a version number without a suffix is considered the newest. This secondary difference is stored in an internal _dev_version attribute.

Two versions are equal if their normal version and dev version are equal. A version is greater if the normal version or dev version is greater. For example:

1.34 < 1.34.1 < 1.35wmf1 < 1.35alpha < 1.35beta1 < 1.35beta2
< 1.35-rc-1 < 1.35-rc.2 < 1.35

Any other suffixes are considered invalid.
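The ordering rules above can be illustrated with a small standalone sketch. It reuses the MEDIAWIKI_VERSION pattern shown for this class, but version_key and its numeric ranks are hypothetical names and values chosen for this example; the class itself may store its state differently.

```python
import re

# Same pattern as the MEDIAWIKI_VERSION attribute documented for this class.
MEDIAWIKI_VERSION = re.compile(
    r'(\d+(?:\.\d+)+)(-?wmf\.?(\d+)|alpha|beta(\d+)|-?rc\.?(\d+)|.*)?$')


def version_key(version_str):
    """Return a sortable (components, dev) pair (hypothetical helper)."""
    match = MEDIAWIKI_VERSION.match(version_str)
    if not match:
        raise ValueError(f'Invalid version number {version_str!r}')
    components = tuple(int(n) for n in match[1].split('.'))
    suffix = match[2]
    if match[3]:              # wmf<number>: oldest
        dev = (0, int(match[3]))
    elif suffix == 'alpha':
        dev = (1,)
    elif match[4]:            # beta<number>
        dev = (2, int(match[4]))
    elif match[5]:            # rc<number>
        dev = (3, int(match[5]))
    elif not suffix:          # no suffix: newest
        dev = (4,)
    else:                     # any other suffix is invalid
        raise ValueError(f'Invalid suffix in {version_str!r}')
    return components, dev
```

Comparing the returned tuples gives 1.34 < 1.34.1 < 1.35wmf1 < 1.35alpha < 1.35beta1 < 1.35beta2 < 1.35-rc.2 < 1.35.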

New in version 3.0.

Changed in version 6.1: The dependency on distutils was dropped because the package will be removed with Python 3.12.

Parameters:

version_str (str) – version to parse

MEDIAWIKI_VERSION = re.compile('(\\d+(?:\\.\\d+)+)(-?wmf\\.?(\\d+)|alpha|beta(\\d+)|-?rc\\.?(\\d+)|.*)?$')#
static from_generator(generator)[source]#

Create instance from a site’s generator attribute.

Parameters:

generator (str) –

Return type:

MediaWikiVersion

class tools.ModuleDeprecationWrapper(module)[source]#

Bases: module

A wrapper for a module to deprecate classes or variables of it.

Initialise the wrapper.

It will automatically overwrite the module with this instance in sys.modules.

Parameters:

module (str or module) – The module name or instance

add_deprecated_attr(name, replacement=None, *, replacement_name=None, warning_message=None, since='', future_warning=True)[source]#

Add the name to the local deprecated names dict.

Changed in version 7.0: since parameter must be a release number, not a timestamp.

Parameters:
  • name (str) – The name of the deprecated class or variable. It may not be already deprecated.

  • replacement (Optional[Any]) – The replacement value which should be returned instead. If the name is already an attribute of that module this must be None. If None it’ll return the attribute of the module.

  • replacement_name (Optional[str]) – The name of the new replaced value. Required if replacement is not None and it has no __name__ attribute. If it contains a ‘.’, it will be interpreted as a Python dotted object name, and evaluated when the deprecated object is needed.

  • warning_message (Optional[str]) – The warning to display, with positional variables: {0} = module, {1} = attribute name, {2} = replacement.

  • since (str) – a version string indicating when the method was deprecated

  • future_warning (bool) – if True a FutureWarning will be thrown, otherwise it provides a DeprecationWarning
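As a rough illustration of the idea, a module subclass can intercept attribute access and warn for deprecated names. This is a simplified sketch, not the wrapper's actual code: DeprecatingModule is a hypothetical name, it does not replace the entry in sys.modules, and it ignores warning_message and since.

```python
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Hypothetical stand-in that warns when deprecated names are read."""

    def __init__(self, module):
        super().__init__(module.__name__)
        self._wrapped = module
        self._deprecated = {}

    def add_deprecated_attr(self, name, replacement=None, *,
                            replacement_name=None):
        if name in self._deprecated:
            raise ValueError(f'"{name}" is already deprecated.')
        self._deprecated[name] = (replacement, replacement_name)

    def __getattr__(self, name):
        # Called only when normal lookup fails, so every attribute of
        # the wrapped module passes through here.
        if name in self._deprecated:
            replacement, replacement_name = self._deprecated[name]
            hint = replacement_name or getattr(replacement, '__name__', None)
            message = f'{self.__name__}.{name} is deprecated'
            if hint:
                message += f'; use {hint} instead'
            warnings.warn(message, FutureWarning, stacklevel=2)
            if replacement is not None:
                return replacement
        return getattr(self._wrapped, name)
```

Reading a deprecated name through the wrapper emits a FutureWarning and then returns either the replacement or the wrapped module's own attribute.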

class tools.SelfCallDict[source]#

Bases: SelfCallMixin, dict

Dict with SelfCallMixin.

New in version 3.0.

Deprecated since version 6.2.

class tools.SelfCallMixin[source]#

Bases: object

Return self when called.

When ‘_own_desc’ is defined it’ll also issue a deprecation warning using issue_deprecation_warning('Calling ' + _own_desc, 'it directly').

New in version 3.0.

Deprecated since version 6.2.

class tools.SelfCallString[source]#

Bases: SelfCallMixin, str

String with SelfCallMixin.

New in version 3.0.

Deprecated since version 6.2.

class tools.Version(version)[source]#

Bases: Version

Version from pkg_resources vendor package.

This Version provides properties of vendor package 20.4 shipped with setuptools 49.4.0.

New in version 6.4.

Parameters:

version (str) –

tools.add_decorated_full_name(obj, stacklevel=1)[source]#

Extract full object name, including class, and store in __full_name__.

This must be done on all decorators that are chained together, otherwise the second decorator will have the wrong full name.

Parameters:
  • obj (object) – An object being decorated

  • stacklevel (int) – level to use

Return type:

None

tools.add_full_name(obj)[source]#

A decorator to add __full_name__ to the function being decorated.

This should be done for all decorators used in pywikibot, as any decorator that does not add __full_name__ will prevent other decorators in the same chain from being able to obtain it.

This can be used to monkey-patch decorators in other modules. e.g. <xyz>.foo = add_full_name(<xyz>.foo)

Parameters:

obj (callable) – The function to decorate

Returns:

decorating function

Return type:

function

tools.cached(*arg)[source]#

Decorator to cache information of an object.

The wrapper adds an attribute to the instance which holds the result of the decorated method. The attribute’s name is the method name with a leading underscore.

Usage:

@cached
def this_method(self):
    ...

@cached
def that_method(self, force=False):
    ...

No parameter may be used with this decorator. Only a force parameter may be used with the decorated method. Any other parameter leads to a TypeError.
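The caching scheme described here can be sketched in a few lines. This is an illustrative reimplementation under simplifying assumptions, not the library's code: force is keyword-only and is not passed on to the decorated method, and Page is a hypothetical class used only for the demonstration.

```python
import functools


def cached(method):
    """Cache a method's result on the instance (illustrative sketch)."""
    if not callable(method):
        raise TypeError('decorator must be used without arguments')
    attr = '_' + method.__name__  # e.g. text -> _text

    @functools.wraps(method)
    def wrapper(self, *, force=False):
        if force or not hasattr(self, attr):
            setattr(self, attr, method(self))
        return getattr(self, attr)

    return wrapper


class Page:
    """Hypothetical class used only to demonstrate the decorator."""

    def __init__(self):
        self.loads = 0

    @cached
    def text(self):
        self.loads += 1
        return f'content #{self.loads}'
```

The second call to Page().text() returns the stored value without running the method body again; text(force=True) recomputes and overwrites the stored value.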

Note

A property must be decorated on top of the property method below other decorators. This decorator must not be used with functions.

New in version 7.3.

Raises:

TypeError – decorator must be used without arguments

Parameters:

arg (Callable) –

Return type:

Any

class tools.classproperty(cls_method)[source]#

Bases: object

Descriptor class to access a class method as a property.

This class may be used as a decorator:

class Foo:

    _bar = 'baz'  # a class property

    @classproperty
    def bar(cls):  # a class property method
        return cls._bar

Foo.bar gives ‘baz’.
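How such a descriptor can work is shown in the following minimal sketch. It is one possible implementation written for this example and is not necessarily identical to the class documented here; SubFoo is a hypothetical subclass added to show that the lookup follows the class being accessed.

```python
class classproperty:
    """Minimal descriptor sketch: call the method with the owner class."""

    def __init__(self, cls_method):
        self.method = cls_method
        self.__doc__ = cls_method.__doc__

    def __get__(self, instance, owner):
        # 'owner' is the class through which the attribute was looked up,
        # whether the access came from the class or from an instance.
        return self.method(owner)


class Foo:
    _bar = 'baz'

    @classproperty
    def bar(cls):
        return cls._bar


class SubFoo(Foo):
    _bar = 'qux'
```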

New in version 3.0.

Hold the class method.

tools.compute_file_hash(filename, sha='sha1', bytes_to_read=None)[source]#

Compute file hash.

Result is expressed as hexdigest().

Parameters:
  • filename (str) – filename path

  • sha (str) – hashing function among the following in hashlib: md5(), sha1(), sha224(), sha256(), sha384(), and sha512() function name shall be passed as string, e.g. ‘sha1’.

  • bytes_to_read (None or int) – only the first bytes_to_read will be considered; if file size is smaller, the whole file will be considered.
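The documented behaviour can be approximated with hashlib alone. The sketch below is an assumption-laden stand-in: file_hash is a hypothetical name, and it reads the requested bytes in one call instead of streaming, which is only reasonable for small files.

```python
import hashlib
import os
import tempfile


def file_hash(filename, sha='sha1', bytes_to_read=None):
    """Return the hex digest of a file (sketch of the documented behaviour)."""
    digest = hashlib.new(sha)
    with open(filename, 'rb') as f:
        # read(None) reads to the end of the file; read(n) returns at
        # most n bytes, so a shorter file is simply hashed completely.
        digest.update(f.read(bytes_to_read))
    return digest.hexdigest()


# Usage: hash a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello world')
try:
    full = file_hash(tmp.name)             # whole file, SHA-1
    head = file_hash(tmp.name, 'md5', 5)   # first 5 bytes only, MD5
finally:
    os.remove(tmp.name)
```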

tools.deprecate_arg(old_arg, new_arg)[source]#

Decorator to declare old_arg deprecated and replace it with new_arg.

Usage:

@deprecate_arg('foo', 'bar')
def my_function(bar='baz'): pass
# replaces 'foo' keyword by 'bar' used by my_function

@deprecate_arg('foo', None)
def my_function(): pass
# ignores 'foo' keyword no longer used by my_function

The deprecated_args decorator should be used in favour of this deprecate_arg decorator, but deprecate_arg is kept for deprecating arguments whose names become reserved words in future Python releases, where using them as keywords would cause syntax errors.

Parameters:
  • old_arg (str) – old keyword

  • new_arg (str or None or bool) – new keyword

tools.deprecated(*outer_args, **outer_kwargs)[source]#

Outer wrapper.

The outer wrapper may be the replacement function if the decorated decorator was called without arguments, or the replacement decorator if the decorated decorator was called with arguments.

Parameters:
  • outer_args – args

  • outer_kwargs – kwargs

tools.deprecated_args(**arg_pairs)[source]#

Decorator to declare multiple args deprecated.

Usage:

@deprecated_args(foo='bar', baz=None)
def my_function(bar='baz'): pass
# replaces 'foo' keyword by 'bar' and ignores 'baz' keyword

Parameters:

arg_pairs – Each entry points to the new argument name. If an argument is to be removed, the value may be one of the following:

  • None: shows a DeprecationWarning

  • False: shows a PendingDeprecationWarning

  • True: shows a FutureWarning (only once)

  • empty string: no warning is printed
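The renaming mechanism can be sketched as a keyword-rewriting wrapper. This is a simplified illustration, not the decorator's real code: it handles only string and None values, and the choice of FutureWarning for renamed arguments is an assumption of this example.

```python
import functools
import warnings


def deprecated_args(**arg_pairs):
    """Rename or drop deprecated keyword arguments (simplified sketch)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for old, new in arg_pairs.items():
                if old not in kwargs:
                    continue
                value = kwargs.pop(old)
                if isinstance(new, str) and new:
                    # a non-empty string names the replacement keyword
                    warnings.warn(
                        f'{old} argument of {func.__name__} is deprecated; '
                        f'use {new} instead.', FutureWarning, stacklevel=2)
                    kwargs[new] = value
                elif new is None:
                    warnings.warn(
                        f'{old} argument of {func.__name__} is deprecated.',
                        DeprecationWarning, stacklevel=2)
                # False, True and '' are not modelled in this sketch
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated_args(foo='bar', baz=None)
def my_function(bar='baz'):
    return bar
```

Here my_function(foo='x') is forwarded as my_function(bar='x'), while my_function(baz=1) drops the keyword and uses the default.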

tools.file_mode_checker(filename, mode=384, quiet=False, create=False)[source]#

Check file mode and update it, if needed.

Parameters:
  • filename (str) – filename path

  • mode (int) – requested file mode (the default 384 is 0o600 in octal)

  • quiet (bool) – warn about file mode change if False.

  • create (bool) – create the file if it does not exist already

Raises:

IOError – The file does not exist and create is False.

tools.first_lower(string)[source]#

Return a string with the first character uncapitalized.

Empty strings are supported. The original string is not changed.

New in version 3.0.

Parameters:

string (str) –

Return type:

str

tools.first_upper(string)[source]#

Return a string with the first character capitalized.

Empty strings are supported. The original string is not changed.

New in version 3.0.

Note

MediaWiki doesn’t capitalize some characters the same way as Python. This function tries to be close to MediaWiki’s capitalize function in title.php. See T179115 and T200357.

Parameters:

string (str) –

Return type:

str

tools.get_wrapper_depth(wrapper)[source]#

Return depth of wrapper function.

New in version 3.0.

tools.has_module(module, version=None)[source]#

Check if a module can be imported.

New in version 3.0.

Changed in version 6.1: The dependency on distutils was dropped because the package will be removed with Python 3.12.

Return type:

bool

tools.is_ip_address(value)[source]#

Check if a value is a valid IPv4 or IPv6 address.

New in version 6.1: Was renamed from is_IP().

Parameters:

value (str) – value to check

Return type:

bool
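A check of this kind can be written with the standard ipaddress module. The sketch below is an assumption about one reasonable approach, with looks_like_ip as a hypothetical name; the documented function may validate differently.

```python
import ipaddress


def looks_like_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address (sketch)."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True
```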

tools.issue_deprecation_warning(name, instead='', depth=2, warning_class=None, since='')[source]#

Issue a deprecation warning.

Changed in version 7.0: since parameter must be a release number, not a timestamp.

Parameters:
  • name (str) – the name of the deprecated object

  • instead (str) – suggested replacement for the deprecated object

  • depth (int) – depth + 1 will be used as stacklevel for the warnings

  • warning_class (type) – a warning class (category) to be used, defaults to FutureWarning

  • since (str) – a version string indicating when the method was deprecated

Return type:

None

tools.manage_wrapping(wrapper, obj)[source]#

Add attributes to wrapper and wrapped functions.

New in version 3.0.

Return type:

None

tools.merge_unique_dicts(*args, **kwargs)[source]#

Return a merged dict and make sure that the original dicts keys are unique.

The positional arguments are the dictionaries to be merged. It is also possible to define an additional dict using the keyword arguments.
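A minimal sketch of such a merge follows. The name merge_unique and the use of ValueError for duplicate keys are assumptions of this example, since the exception type is not stated above.

```python
def merge_unique(*args, **kwargs):
    """Merge dicts, refusing keys that occur more than once (sketch)."""
    merged = {}
    for mapping in (*args, kwargs):
        duplicates = merged.keys() & mapping.keys()
        if duplicates:
            # ValueError is an assumption of this sketch
            raise ValueError(f'Duplicate keys: {duplicates}')
        merged.update(mapping)
    return merged
```

For example, merge_unique({'a': 1}, {'b': 2}, c=3) combines all three sources, while merge_unique({'a': 1}, {'a': 2}) raises.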

tools.normalize_username(username)[source]#

Normalize the username.

New in version 3.0.

Return type:

Optional[str]

tools.open_archive(filename, mode='rb', use_extension=True)[source]#

Open a file and uncompress it if needed.

This function supports bzip2, gzip, 7zip, lzma, and xz as compression containers. It uses the packages available in the standard library for bzip2, gzip, lzma, and xz so they are always available. 7zip is only available when a 7za program is available and only supports reading from it.

The compression is either selected via the magic number or file ending.

New in version 3.0.

Parameters:
  • filename (str) – The filename.

  • use_extension (bool) – Use the file extension instead of the magic number to determine the type of compression (default True). Must be True when writing or appending.

  • mode (str) – The mode in which the file should be opened. It may either be ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’ or ‘wb’. All modes open the file in binary mode. It defaults to ‘rb’.

Raises:
  • ValueError – When 7za is not available or the opening mode is unknown or it tries to write a 7z archive.

  • FileNotFoundError – When the filename doesn’t exist and it tries to read from it or it tries to determine the compression algorithm.

  • OSError – When it’s not a 7z archive but the file extension is 7z. It is also raised by bz2 when its content is invalid. gzip does not immediately raise that error but only on reading it.

  • lzma.LZMAError – When error occurs during compression or decompression or when initializing the state with lzma or xz.

  • ImportError – When file is compressed with bz2 but neither bz2 nor bz2file is importable, or when file is compressed with lzma or xz but lzma is not importable.

Returns:

A file-like object returning the uncompressed data in binary mode.

Return type:

file-like object
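Selection by magic number can be sketched with the standard library. The example below is a reduced illustration, not the function's implementation: open_by_magic is a hypothetical name, it only reads, it recognises bzip2, gzip and xz by their leading bytes, and it leaves out 7zip and legacy lzma files.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Leading bytes ("magic numbers") of each supported container format.
MAGIC = {
    b'BZh': bz2.open,
    b'\x1f\x8b': gzip.open,
    b'\xfd7zXZ\x00': lzma.open,
}


def open_by_magic(filename):
    """Open filename for binary reading, decompressing if needed (sketch)."""
    with open(filename, 'rb') as f:
        head = f.read(6)
    for magic, opener in MAGIC.items():
        if head.startswith(magic):
            return opener(filename, 'rb')
    return open(filename, 'rb')  # not compressed: plain binary file


# Usage: the file extension gives no hint, the content does.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.bin')
    with gzip.open(path, 'wb') as f:
        f.write(b'Pywikibot')
    with open_by_magic(path) as f:
        content = f.read()
```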

tools.redirect_func(target, source_module=None, target_module=None, old_name=None, class_name=None, since='', future_warning=True)[source]#

Return a function which can be used to redirect to ‘target’.

It also marks that function as deprecated and copies all parameters.

Changed in version 7.0: since parameter must be a release number, not a timestamp.

Parameters:
  • target (callable) – The targeted function which is to be executed.

  • source_module (Optional[str]) – The module of the old function. If ‘.’ defaults to target_module. If ‘None’ (default) it tries to guess it from the executing function.

  • target_module (Optional[str]) – The module of the target function. If ‘None’ (default) it tries to get it from the target. Might not work with nested classes.

  • old_name (Optional[str]) – The old function name. If None it uses the name of the new function.

  • class_name (Optional[str]) – The name of the class. It’s added to the target and source module (separated by a ‘.’).

  • since (str) – a version string indicating when the method was deprecated

  • future_warning (bool) – if True a FutureWarning will be thrown, otherwise it provides a DeprecationWarning

Returns:

A new function which adds a warning prior to each execution.

Return type:

callable

tools.remove_last_args(arg_names)[source]#

Decorator to declare all args additionally provided deprecated.

All positional arguments appearing after the normal arguments are marked deprecated. It also marks all keyword arguments present in arg_names as deprecated. Any arguments (positional or keyword) which are not present in arg_names are forwarded. For example, if the original function accepts one parameter and arg_names contains one name, a call with 3 parameters results in an error, because the function is then called with 2 parameters.

The decorated function may not use *args or **kwargs.

Parameters:

arg_names (iterable; for the most explanatory message it should retain the given order (so not a set for example).) – The names of all arguments.

tools.strtobool(val)[source]#

Convert a string representation of truth to True or False.

This is a reimplementation of distutils.util.strtobool, following the migration advice in PEP 632.

New in version 7.1.

Parameters:

val (str) – True values are ‘y’, ‘yes’, ‘t’, ‘true’, ‘on’, and ‘1’; false values are ‘n’, ‘no’, ‘f’, ‘false’, ‘off’, and ‘0’.

Raises:

ValueError – val is not a valid truth value

Return type:

bool
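The mapping listed above is small enough to sketch directly. Lower-casing the input is carried over from the distutils original and is an assumption here, as is the wording of the error message.

```python
def strtobool(val):
    """Convert a truth-value string to bool (sketch of the listed rules)."""
    val = val.lower()  # assumption: case-insensitive, as in distutils
    if val in ('y', 'yes', 't', 'true', 'on', '1'):
        return True
    if val in ('n', 'no', 'f', 'false', 'off', '0'):
        return False
    raise ValueError(f'invalid truth value {val!r}')
```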

class tools.suppress_warnings(message='', category=<class 'Warning'>, filename='')[source]#

Bases: catch_warnings

A decorator/context manager that temporarily suppresses warnings.

Those suppressed warnings that do not match the parameters will be shown upon exit.

New in version 3.0.

Initialize the object.

The parameter semantics are similar to those of warnings.filterwarnings.

Parameters:
  • message (str) – A string containing a regular expression that the start of the warning message must match. (case-insensitive)

  • category (type) – A class (a subclass of Warning) of which the warning category must be a subclass in order to match.

  • filename (str) – A string containing a regular expression that the start of the path to the warning module must match. (case-sensitive)
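Because the parameters follow warnings.filterwarnings, the matching rules can be tried out with the standard library alone. The sketch below does not use suppress_warnings itself; noisy is a hypothetical function, and the recording context manager stands in for whatever would normally display the warnings.

```python
import warnings


def noisy():
    """Hypothetical function emitting two different warnings."""
    warnings.warn('old_call is deprecated', FutureWarning)
    warnings.warn('disk almost full', ResourceWarning)
    return 'done'


with warnings.catch_warnings(record=True) as shown:
    warnings.simplefilter('always')  # let everything through by default
    # Ignore FutureWarnings whose message starts with this regex
    # (the message match is case-insensitive).
    warnings.filterwarnings('ignore', message='old_call',
                            category=FutureWarning)
    result = noisy()
```

Only the ResourceWarning ends up in shown; the FutureWarning matches both the message pattern and the category and is dropped.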

tools.chars — Character Based Helper Functions

Character based helper functions (not wiki-dependent).

tools.chars.contains_invisible(text)[source]#

Return True if the text contains any of the invisible characters.

tools.chars.replace_invisible(text)[source]#

Replace invisible characters by ‘<codepoint>’.

tools.chars.string2html(string, encoding)[source]#

Convert unicode string to requested HTML encoding.

Attempt to encode the string into the desired format; if that works, return it unchanged. Otherwise encode the non-ASCII characters into HTML &#; entities.

Parameters:
  • string (str) – String to update

  • encoding (str) – Encoding to use

Return type:

str

tools.chars.string_to_ascii_html(string)[source]#

Convert unicode chars of str to HTML entities if chars are not ASCII.

Parameters:

string (str) – String to update

Return type:

str
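Both string2html and string_to_ascii_html can be approximated with Python's built-in xmlcharrefreplace error handler. The helpers below are sketches with hypothetical names (to_ascii_html, to_html) and may differ in detail from the documented functions.

```python
def to_ascii_html(string):
    """Replace non-ASCII characters with numeric HTML entities (sketch)."""
    return string.encode('ascii', 'xmlcharrefreplace').decode('ascii')


def to_html(string, encoding):
    """Return string unchanged if encodable, else entity-encode it (sketch)."""
    try:
        string.encode(encoding)
    except UnicodeError:
        return to_ascii_html(string)
    return string
```

With these helpers, 'Kön' stays unchanged for 'latin-1', while a string containing '€' cannot be encoded in 'latin-1' and has all its non-ASCII characters replaced by entities.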

tools.chars.url2string(title, encodings='utf-8')[source]#

Convert URL-encoded text to unicode using several encodings.

Uses the first encoding that doesn’t cause an error.

Parameters:
  • title (str) – URL-encoded character data to convert

  • encodings (Union[str, List[str], Tuple[str, ...]]) – Encodings to attempt to use during conversion.

Raises:

UnicodeError – Could not convert using any encoding.

Return type:

str
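The try-each-encoding strategy can be sketched with urllib.parse. This is an illustrative stand-in named url_to_string, not the documented function; which error is re-raised when every encoding fails is an assumption of this example.

```python
from urllib.parse import unquote_to_bytes


def url_to_string(title, encodings='utf-8'):
    """Decode URL-encoded text with the first encoding that works (sketch)."""
    if isinstance(encodings, str):
        encodings = [encodings]
    raw = unquote_to_bytes(title)  # '%C3%B6' -> b'\xc3\xb6'
    errors = []
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeError as error:
            errors.append(error)
    # assumption: report the failure of the first encoding tried
    raise errors[0] if errors else UnicodeError('no encodings given')
```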

tools.collections — Container datatypes

Collections datatypes.

exception tools.collections.CombinedError[source]#

Bases: KeyError, IndexError

An error that gets caught by both KeyError and IndexError.

New in version 3.0.

class tools.collections.DequeGenerator[source]#

Bases: Iterator, deque

A generator that allows items to be added while it is being iterated.

New in version 3.0.

Changed in version 6.1: Provide a representation string.

tools.collections.EMPTY_DEFAULT = ''#
class tools.collections.EmptyDefault[source]#

Bases: str, Mapping

A default for a non-existent siteinfo property.

It should be chosen if there is no better default known. It acts like an empty collection, so it can be iterated over safely if treated as a list, tuple, set or dictionary. It is also basically an empty string.

Accessing a value via __getitem__ will result in a combined KeyError and IndexError.

New in version 3.0.

Changed in version 6.2: empty_iterator() was removed in favour of iter().

Initialise the default as an empty string.

class tools.collections.GeneratorWrapper[source]#

Bases: ABC, Generator

A Generator base class which wraps the internal generator property.

This generator iterator also has generator.close() mixin method and it can be used as Iterable and Iterator as well.

New in version 7.6.

Example:

>>> class Gen(GeneratorWrapper):
...     @property
...     def generator(self):
...         return (c for c in 'Pywikibot')
>>> gen = Gen()
>>> next(gen)  # can be used as Iterator ...
'P'
>>> next(gen)
'y'
>>> ''.join(c for c in gen)  # ... or as Iterable
'wikibot'
>>> next(gen)  # the generator is exhausted ...
Traceback (most recent call last):
    ...
StopIteration
>>> gen.restart()  # ... but can be restarted
>>> next(gen) + next(gen)
'Py'
>>> gen.close()  # the generator may be closed
>>> next(gen)
Traceback (most recent call last):
    ...
StopIteration
>>> gen.restart()  # restart a closed generator
>>> # also send() and throw() works
>>> gen.send(None) + gen.send(None)
'Py'
>>> gen.throw(RuntimeError('Foo'))
Traceback (most recent call last):
    ...
RuntimeError: Foo

See also

PEP 342

abstract property generator: Generator[Any, Any, Any]#

Abstract generator property.

restart()[source]#

Restart the generator.

Return type:

None

send(value)[source]#

Return next yielded value from generator or raise StopIteration.

The value parameter is currently ignored; usually it should be None. If the wrapped generator property exits without yielding another value this method raises StopIteration. The send method works like the next function with a GeneratorWrapper instance as parameter.

Refer to generator.send() for its usage.

Raises:

TypeError – generator property is not a generator

Parameters:

value (Any) –

Return type:

Any

throw(typ, val=None, tb=None)[source]#

Raise an exception inside the wrapped generator.

Refer to generator.throw() for various parameter usage.

Raises:

RuntimeError – No generator started

Parameters:

typ (Exception) –

Return type:

None

class tools.collections.SizedKeyCollection(keyattr)[source]#

Bases: Container, Iterable, Sized

Structure to hold values where the key is given by the value itself.

A structure like a defaultdict but the key is given by the value itself and cannot be assigned directly. It returns the number of all items with len() but not the number of keys.

Samples:

>>> from pywikibot.tools.collections import SizedKeyCollection
>>> data = SizedKeyCollection('title')
>>> data.append('foo')
>>> data.append('bar')
>>> data.append('Foo')
>>> list(data)
['foo', 'Foo', 'bar']
>>> len(data)
3
>>> 'Foo' in data
True
>>> 'foo' in data
False
>>> data['Foo']
['foo', 'Foo']
>>> list(data.keys())
['Foo', 'Bar']
>>> data.remove_key('Foo')
>>> list(data)
['bar']
>>> data.clear()
>>> list(data)
[]

New in version 6.1.

Parameters:

keyattr (str) – an attribute or method of the values to be held in this collection which will be used as key.

append(value)[source]#

Add a value to the collection.

Return type:

None

clear()[source]#

Remove all elements from SizedKeyCollection.

Return type:

None

filter(key)[source]#

Iterate over items for a given key.

iter_values_len()[source]#

Yield key, len(values) pairs.

remove(value)[source]#

Remove a value from the container.

Return type:

None

remove_key(key)[source]#

Remove all values for a given key.

Return type:

None

tools.deprecate — Deprecating Decorators and Classes

Module providing deprecation decorators.

Decorator functions without parameters are _invoked_ differently from decorator functions with function syntax. For example, @deprecated causes a different invocation to @deprecated().

The former is invoked with the decorated function as args[0]. The latter is invoked with the decorator arguments as *args & **kwargs, and it must return a callable which will be invoked with the decorated function as args[0].

The following deprecators may support both syntaxes, e.g. @deprecated and @deprecated() both work. To achieve that, the code inspects args[0] to see if it is callable. Therefore, a decorator must not accept only one arg that is a callable, as it would be detected as a deprecator without any arguments.
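The args[0] inspection can be illustrated with a reduced decorator that accepts both forms. This is a sketch of the technique only: the instead argument and the message format are assumptions of this example, and the module's real deprecators do considerably more.

```python
import functools
import warnings


def deprecated(*outer_args, **outer_kwargs):
    """Simplified deprecator usable as @deprecated or @deprecated(...)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            message = f'{func.__name__} is deprecated'
            if instead:
                message += f'; use {instead} instead'
            warnings.warn(message, FutureWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper

    # Bare form: the only argument is the decorated callable itself.
    if len(outer_args) == 1 and not outer_kwargs and callable(outer_args[0]):
        instead = None
        return decorator(outer_args[0])

    # Called form: the arguments configure the decorator, which is then
    # invoked with the decorated function.
    instead = outer_args[0] if outer_args else outer_kwargs.get('instead')
    return decorator


@deprecated
def old_a():
    return 'a'


@deprecated('new_b')
def old_b():
    return 'b'
```

The sketch also shows the stated limitation: passing a single callable as a configuration argument would be mistaken for the bare form.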

Changed in version 6.4: deprecation decorators moved to _deprecate submodule

class tools._deprecate.ModuleDeprecationWrapper(module)[source]#

Bases: module

A wrapper for a module to deprecate classes or variables of it.

Initialise the wrapper.

It will automatically overwrite the module with this instance in sys.modules.

Parameters:

module (str or module) – The module name or instance

add_deprecated_attr(name, replacement=None, *, replacement_name=None, warning_message=None, since='', future_warning=True)[source]#

Add the name to the local deprecated names dict.

Changed in version 7.0: since parameter must be a release number, not a timestamp.

Parameters:
  • name (str) – The name of the deprecated class or variable. It may not be already deprecated.

  • replacement (Optional[Any]) – The replacement value which should be returned instead. If the name is already an attribute of that module this must be None. If None it’ll return the attribute of the module.

  • replacement_name (Optional[str]) – The name of the new replaced value. Required if replacement is not None and it has no __name__ attribute. If it contains a ‘.’, it will be interpreted as a Python dotted object name, and evaluated when the deprecated object is needed.

  • warning_message (Optional[str]) – The warning to display, with positional variables: {0} = module, {1} = attribute name, {2} = replacement.

  • since (str) – a version string indicating when the method was deprecated

  • future_warning (bool) – if True a FutureWarning will be thrown, otherwise it provides a DeprecationWarning

tools._deprecate.add_decorated_full_name(obj, stacklevel=1)[source]#

Extract full object name, including class, and store in __full_name__.

This must be done on all decorators that are chained together, otherwise the second decorator will have the wrong full name.

Parameters:
  • obj (object) – An object being decorated

  • stacklevel (int) – level to use

Return type:

None

tools._deprecate.add_full_name(obj)[source]#

A decorator to add __full_name__ to the function being decorated.

This should be done for all decorators used in pywikibot, as any decorator that does not add __full_name__ will prevent other decorators in the same chain from being able to obtain it.

This can be used to monkey-patch decorators in other modules. e.g. <xyz>.foo = add_full_name(<xyz>.foo)

Parameters:

obj (callable) – The function to decorate

Returns:

decorating function

Return type:

function

tools._deprecate.deprecate_arg(old_arg, new_arg)[source]#

Decorator to declare old_arg deprecated and replace it with new_arg.

Usage:

@deprecate_arg('foo', 'bar')
def my_function(bar='baz'): pass
# replaces 'foo' keyword by 'bar' used by my_function

@deprecate_arg('foo', None)
def my_function(): pass
# ignores 'foo' keyword no longer used by my_function

The deprecated_args decorator should be used in favour of this deprecate_arg decorator, but deprecate_arg is kept for deprecating arguments whose names become reserved words in future Python releases, where using them as keywords would cause syntax errors.

Parameters:
  • old_arg (str) – old keyword

  • new_arg (str or None or bool) – new keyword

tools._deprecate.deprecated(*outer_args, **outer_kwargs)[source]#

Outer wrapper.

The outer wrapper may be the replacement function if the decorated decorator was called without arguments, or the replacement decorator if the decorated decorator was called with arguments.

Parameters:
  • outer_args – args

  • outer_kwargs – kwargs

tools._deprecate.deprecated_args(**arg_pairs)[source]#

Decorator to declare multiple args deprecated.

Usage:

@deprecated_args(foo='bar', baz=None)
def my_function(bar='baz'): pass
# replaces 'foo' keyword by 'bar' and ignores 'baz' keyword

Parameters:

arg_pairs – Each entry points to the new argument name. If an argument is to be removed, the value may be one of the following:

  • None: shows a DeprecationWarning

  • False: shows a PendingDeprecationWarning

  • True: shows a FutureWarning (only once)

  • empty string: no warning is printed

tools._deprecate.get_wrapper_depth(wrapper)[source]#

Return depth of wrapper function.

New in version 3.0.

tools._deprecate.issue_deprecation_warning(name, instead='', depth=2, warning_class=None, since='')[source]#

Issue a deprecation warning.

Changed in version 7.0: since parameter must be a release number, not a timestamp.

Parameters:
  • name (str) – the name of the deprecated object

  • instead (str) – suggested replacement for the deprecated object

  • depth (int) – depth + 1 will be used as stacklevel for the warnings

  • warning_class (type) – a warning class (category) to be used, defaults to FutureWarning

  • since (str) – a version string indicating when the method was deprecated

Return type:

None

tools._deprecate.manage_wrapping(wrapper, obj)[source]#

Add attributes to wrapper and wrapped functions.

New in version 3.0.

Return type:

None

tools._deprecate.redirect_func(target, source_module=None, target_module=None, old_name=None, class_name=None, since='', future_warning=True)[source]#

Return a function which can be used to redirect to ‘target’.

It also marks that function as deprecated and copies all parameters.

Changed in version 7.0: since parameter must be a release number, not a timestamp.

Parameters:
  • target (callable) – The targeted function which is to be executed.

  • source_module (Optional[str]) – The module of the old function. If ‘.’ defaults to target_module. If ‘None’ (default) it tries to guess it from the executing function.

  • target_module (Optional[str]) – The module of the target function. If ‘None’ (default) it tries to get it from the target. Might not work with nested classes.

  • old_name (Optional[str]) – The old function name. If None it uses the name of the new function.

  • class_name (Optional[str]) – The name of the class. It’s added to the target and source module (separated by a ‘.’).

  • since (str) – a version string indicating when the method was deprecated

  • future_warning (bool) – if True a FutureWarning will be thrown, otherwise it provides a DeprecationWarning

Returns:

A new function which adds a warning prior to each execution.

Return type:

callable

tools._deprecate.remove_last_args(arg_names)[source]#

Decorator to declare all args additionally provided deprecated.

All positional arguments appearing after the normal arguments are marked deprecated. It also marks all keyword arguments present in arg_names as deprecated. Any arguments (positional or keyword) which are not present in arg_names are forwarded. For example, if the original function accepts one parameter and arg_names contains one name, a call with 3 parameters results in an error, because the function is then called with 2 parameters.

The decorated function may not use *args or **kwargs.

Parameters:

arg_names (iterable; for the most explanatory message it should retain the given order (so not a set for example).) – The names of all arguments.

tools.djvu — DJVU files wrapper

Wrapper around djvulibre to access djvu files properties and content.

class tools.djvu.DjVuFile(file)[source]#

Bases: object

Wrapper around djvulibre to access djvu files properties and content.

Perform file existence checks.

Control characters in djvu text-layer are converted for convenience (see http://djvu.sourceforge.net/doc/man/djvused.html for control chars details).

Parameters:

file (str) – filename (including path) to djvu file

static check_cache(fn)[source]#

Decorator to check if cache shall be cleared.

static check_page_number(fn)[source]#

Decorator to check if page number is valid.

Raises:

ValueError – the page number is not valid

delete_page(*args, **kwargs)[source]#
get_most_common_info()[source]#

Return most common size and dpi for pages in djvu file.

get_page(*args, **kwargs)[source]#
has_text(*args, **kwargs)[source]#
number_of_images(*args, **kwargs)[source]#
page_info(*args, **kwargs)[source]#
whiten_page(*args, **kwargs)[source]#

tools.formatter — Formatting Related Functions and Classes

Module containing various formatting related utilities.

class tools.formatter.SequenceOutputter(sequence)[source]#

Bases: object

A class formatting a list of items.

It is possible to customize the appearance by changing format_string which is used by str.format with index, width and item. Each line is joined by the separator and the complete text is surrounded by the prefix and the suffix. All three are by default a new line. The index starts at 1 and the width is the number of digits of the sequence’s length written as a decimal number. So a length of 100 will result in a width of 3 and a length of 99 in a width of 2.

It is iterating over self.sequence to generate the text. That sequence can be any iterator but the result is better when it has an order.
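The formatting rules can be reproduced in a short function. The sketch below uses the default format_string, prefix, separator and suffix values listed for this class; format_sequence itself is a hypothetical helper and not part of the module.

```python
def format_sequence(sequence, format_string='  {index:>{width}} - {item}',
                    prefix='\n', separator='\n', suffix='\n'):
    """Number the items of a sequence, one per line (illustrative sketch)."""
    items = list(sequence)
    # 99 items give a width of 2, 100 items a width of 3
    width = len(str(len(items)))
    lines = (format_string.format(index=i, width=width, item=item)
             for i, item in enumerate(items, start=1))
    return prefix + separator.join(lines) + suffix
```

With ten items the index column is two characters wide, so single-digit indexes are right-aligned with a leading space.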

Create a new instance with a reference to the sequence.

format_list()[source]#

DEPRECATED: Create the text with one item on each line.

format_string = '  {index:>{width}} - {item}'#
property out#

Create the text with one item on each line.

output()[source]#

Output the text of the current sequence.

Return type:

None

prefix = '\n'#
separator = '\n'#
suffix = '\n'#
tools.formatter.color_format(text, *args, **kwargs)[source]#

Do str.format without having to worry about colors.

It automatically adds \03 in front of color fields, so it is unnecessary to add them manually. Any other \03 in the text is disallowed.

You may use a variant {color} by assigning a valid color to a named parameter color.

Deprecated since version 7.2: new color format pattern like <<color>>colored text<<default>> may be used instead.

Parameters:

text (str) – The format template string

Returns:

The formatted string

Raises:

ValueError – Wrong format string or wrong keywords

Return type:

str

tools.itertools — Iterators for Efficient Looping

Iterator functions.

Note

pairwise() function introduced in Python 3.10 is backported in backports

tools.itertools.filter_unique(iterable, container=None, key=None, add=None)[source]#

Yield unique items from an iterable, omitting duplicates.

By default, to provide uniqueness, it puts the generated items into a set created as a local variable. It only yields items which are not already present in the local set.

For large collections, this is not memory efficient, as a strong reference to every item is kept in a local set which cannot be cleared.

Also, the local set can’t be re-used when chaining unique operations on multiple generators.

To avoid these issues, it is advisable for the caller to provide their own container and set the key parameter to be the function hash, or use a weakref as the key.

The container can be any object that supports __contains__. If the container is a set or dict, the method add or __setitem__ will be used automatically. Any other method may be provided explicitly using the add parameter.

Beware that key=id is only useful for cases where id() is not unique.

Warning

This is not thread safe.

Parameters:
  • iterable (collections.abc.Iterable) – the source iterable

  • container (type) – storage of seen items

  • key (callable) – function to convert the item to a key

  • add (callable) – function to add an item to the container
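
The advice about supplying your own container can be sketched as follows. This is a simplified stand-in for illustration only: filter_unique_sketch is a hypothetical name, it supports only set-like containers with an add method, and it is not the library's code.

```python
def filter_unique_sketch(iterable, container=None, key=None):
    """Hypothetical, simplified version of the behaviour described above."""
    if container is None:
        container = set()  # local set, kept alive as long as the generator
    for item in iterable:
        k = key(item) if key else item
        if k not in container:
            container.add(k)
            yield item


# A caller-owned container can be shared across several generators
# and cleared when it is no longer needed.
seen = set()
first = list(filter_unique_sketch(['a', 'b', 'a', 'c'], container=seen))
second = list(filter_unique_sketch(['c', 'd'], container=seen))
print(first, second)
seen.clear()
```

Because the second call reuses the same container, 'c' is suppressed there even though it is the first item of that iterable.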

tools.itertools.intersect_generators(*iterables, allow_duplicates=False)[source]#

Generator yielding the intersection of iterables.

Yield items only if they are yielded by all iterables. zip_longest is used to retrieve items from all iterables in parallel, so that items can be yielded before iterables are exhausted.

Generator is stopped when all iterables are exhausted. Quitting before all iterables are finished is attempted if there is no more chance of finding an item in all of them.

Sample:

>>> iterables = 'mississippi', 'missouri'
>>> list(intersect_generators(*iterables))
['m', 'i', 's']
>>> list(intersect_generators(*iterables, allow_duplicates=True))
['m', 'i', 's', 's', 'i']

New in version 3.0.

Changed in version 5.0: Avoid duplicates (T263947).

Changed in version 6.4: genlist was renamed to iterables; pass the iterables as consecutive positional arguments, or use ‘*’ to unpack a list.

Deprecated since version 6.4: allow_duplicates as positional argument, iterables as list type

Changed in version 7.0: Reimplemented without threads which is up to 10’000 times faster

Parameters:
  • iterables – page generators

  • allow_duplicates (bool) – optional keyword argument to allow duplicates if present in all generators

tools.itertools.islice_with_ellipsis(iterable, *args, marker='…')[source]#

Generator which yields the first n elements of the iterable.

If more elements are available and marker is truthy, it yields marker as an extra continuation mark.

The function takes the same arguments as itertools.islice() plus the additional keyword marker.

Parameters:
  • iterable (iterable) – the iterable to work on

  • args – same args as itertools.islice(iterable, stop) or itertools.islice(iterable, start, stop[, step])

  • marker (str) – element to yield if iterable still contains elements after showing the required number. Default value: ‘…’
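
The behaviour described above can be sketched with the standard library alone. The function name islice_with_ellipsis_sketch is hypothetical, and the code is an illustration under the documented semantics rather than the library's implementation.

```python
import itertools


def islice_with_ellipsis_sketch(iterable, *args, marker='…'):
    """Hypothetical sketch: islice plus a continuation marker."""
    bounds = slice(*args)          # reuse slice() to parse start/stop/step
    iterator = iter(iterable)
    yield from itertools.islice(iterator, *args)
    if marker and bounds.stop is not None:
        # Probe for one more element to decide whether to emit the marker.
        try:
            next(iterator)
        except StopIteration:
            pass
        else:
            yield marker


truncated = list(islice_with_ellipsis_sketch(range(10), 3))
complete = list(islice_with_ellipsis_sketch(range(3), 3))
print(truncated, complete)
```

The marker appears only when the source still holds elements after the requested slice; passing marker=None suppresses it entirely.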

tools.itertools.itergroup(iterable, size, strict=False)[source]#

Make an iterator that returns lists of (up to) size items from iterable.

Example:

>>> i = itergroup(range(25), 10)
>>> print(next(i))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(next(i))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> print(next(i))
[20, 21, 22, 23, 24]
>>> print(next(i))
Traceback (most recent call last):
 ...
StopIteration

Parameters:
  • size (int) – How many items of the iterable to get in one chunk

  • strict (bool) – If True, raise a ValueError if length of iterable is not divisible by size.

Raises:

ValueError – iterable is not divisible by size

Return type:

Generator[Any, None, None]
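
The effect of the strict parameter is not covered by the example above. The following is a minimal sketch, assuming that strict=True raises ValueError once a trailing group shorter than size is found; itergroup_sketch is a hypothetical name and not the library's implementation.

```python
def itergroup_sketch(iterable, size, strict=False):
    """Hypothetical sketch of chunking with an optional strict check."""
    group = []
    for item in iterable:
        group.append(item)
        if len(group) == size:
            yield group
            group = []
    if group:
        # A leftover group means the length was not divisible by size.
        if strict:
            raise ValueError('iterable is not divisible by size')
        yield group


chunks = list(itergroup_sketch(range(5), 2))
print(chunks)

try:
    list(itergroup_sketch(range(5), 2, strict=True))
except ValueError as error:
    strict_error = str(error)
    print(strict_error)
```

In this sketch the error surfaces only after the complete groups have been yielded, because the length of an arbitrary iterable is unknown in advance.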

tools.itertools.roundrobin_generators(*iterables)[source]#

Yield items from each iterable in turn (round-robin).

Sample:

>>> tuple(roundrobin_generators('ABC', range(5)))
('A', 0, 'B', 1, 'C', 2, 3, 4)

New in version 3.0.

Changed in version 6.4: A sentinel variable is used to determine the end of an iterable instead of None.

Parameters:

iterables (iterable) – any iterable to combine in roundrobin way

Returns:

the combined generator of iterables

Return type:

generator
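
The sentinel mentioned in the 6.4 change note matters when an iterable legitimately yields None. The following is a minimal sketch of one way to implement this with itertools.zip_longest; roundrobin_sketch is a hypothetical name, and the library's implementation may differ.

```python
from itertools import zip_longest


def roundrobin_sketch(*iterables):
    """Hypothetical sketch: interleave iterables using a unique sentinel."""
    sentinel = object()  # cannot collide with any real item, unlike None
    for row in zip_longest(*iterables, fillvalue=sentinel):
        for item in row:
            if item is not sentinel:
                yield item


combined = tuple(roundrobin_sketch('ABC', range(5)))
with_none = list(roundrobin_sketch([None, 1], 'x'))
print(combined, with_none)
```

Using None as the fill value would silently drop the genuine None item in the second call; the private sentinel object avoids that.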

tools.threading — Thread-based Classes#

Classes which can be used for threading.

class tools.threading.RLock(*args, **kwargs)[source]#

Bases: object

Context manager which implements extended reentrant lock objects.

This RLock is implicitly derived from threading.RLock but provides a locked() method like in threading.Lock and a count attribute which gives the active recursion level of locks.

Usage:

>>> lock = RLock()
>>> lock.acquire()
True
>>> with lock: print(lock.count)  # nested lock
2
>>> lock.locked()
True
>>> lock.release()
>>> lock.locked()
False

New in version 6.2.

property count#

Return number of acquired locks.

locked()[source]#

Return true if the lock is acquired.

class tools.threading.ThreadList(limit=128, wait_time=2, *args)[source]#

Bases: list

A simple threadpool class to limit the number of simultaneous threads.

Any threading.Thread object can be added to the pool using the append() method. If the maximum number of simultaneous threads has not been reached, the Thread object will be started immediately; if not, the append() call will block until the thread is able to start.

>>> import threading
>>> import time
>>> pool = ThreadList(limit=10)
>>> def work():
...     time.sleep(1)
...
>>> for x in range(20):
...     pool.append(threading.Thread(target=work))
...

Parameters:
  • limit (int) – the number of simultaneous threads

  • wait_time (float) – how long to wait if the number of active threads exceeds the limit

active_count()[source]#

Return the number of alive threads and delete all non-alive ones.

append(thd)[source]#

Add a thread to the pool and start it.

class tools.threading.ThreadedGenerator(group=None, target=None, name='GeneratorThread', args=(), kwargs=None, qsize=65536)[source]#

Bases: Thread

Look-ahead generator class.

Runs a generator in a separate thread and queues the results; can be called like a regular generator.

Subclasses should override self.generator, not self.run

Important: the generator thread will stop itself if the generator’s internal queue is exhausted; but, if the calling program does not use all the generated values, it must call the generator’s stop() method to stop the background thread. Example usage:

>>> gen = ThreadedGenerator(target=range, args=(20,))
>>> try:
...     data = list(gen)
... finally:
...     gen.stop()
>>> data
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

New in version 3.0.

Initializer. Takes same keyword arguments as threading.Thread.

target must be a generator function (or other callable that returns an iterable object).

Parameters:
  • qsize (int) – The size of the lookahead queue. The larger the qsize, the more values will be computed in advance of use (which can eat up memory and processor time).

  • name (str) –

run()[source]#

Run the generator and store the results on the queue.

Return type:

None

stop()[source]#

Stop the background thread.

Return type:

None

tools._logging — logging.Formatter Subclass#

Logging tools.

class tools._logging.LoggingFormatter(fmt=None, datefmt=None, style='%')[source]#

Bases: Formatter

Format LogRecords for output to file.

Initialize the formatter with specified format strings.

Initialize the formatter either with the specified format string, or a default as described above. Allow for specialized date formatting with the optional datefmt argument. If datefmt is omitted, you get an ISO8601-like (or RFC 3339-like) format.

Use a style parameter of ‘%’, ‘{’ or ‘$’ to specify that you want to use one of %-formatting, str.format() ({}) formatting or string.Template formatting in your format string.

Changed in version 3.2: Added the style parameter.

format(record)[source]#

Strip trailing newlines before outputting text to file.
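
The idea of stripping trailing newlines in format() can be sketched with the standard logging module. NewlineStrippingFormatter is a hypothetical class written for illustration; it is an assumption about the general approach, not the LoggingFormatter source. The example also uses the style parameter described above with str.format() placeholders.

```python
import logging


class NewlineStrippingFormatter(logging.Formatter):
    """Hypothetical formatter that drops trailing newlines from output."""

    def format(self, record):
        # Let the base class build the text, then remove trailing newlines.
        return super().format(record).rstrip('\n')


# style='{' selects str.format()-style placeholders in the format string.
formatter = NewlineStrippingFormatter('{levelname}: {message}', style='{')
record = logging.LogRecord(
    name='demo', level=logging.INFO, pathname='demo.py', lineno=1,
    msg='page saved\n\n', args=(), exc_info=None)
line = formatter.format(record)
print(repr(line))
```

A handler configured with such a formatter writes one line per record even when the logged message itself ends with newlines.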