bot — Classes for Building Bots#

User-interface related functions for building bots.

This module supports several different bot classes which can be used in conjunction. Each bot should subclass at least one of the following classes:

  • BaseBot: Basic bot class for cases where the site is handled differently, such as working on multiple sites in parallel. No site attribute is provided; the site of the current page should be used instead. This class should normally not be used directly.

  • SingleSiteBot: Bot class which should only be run on a single site. Such bots usually store site-specific content and thus can’t easily be run when the generator returns a page on another site. The class has a site property which can also be changed. If the generator returns a page of a different site, that page is skipped.

  • MultipleSitesBot: An alias of BaseBot. Should not be used if any other bot class is used.

  • ConfigParserBot: Bot class which supports reading options from a scripts.ini configuration file. That file consists of sections, each led by a [section] header and followed by option: value or option=value entries. The section name is the script name without the .py suffix. All options found there must be predefined in the available_options dictionary.

  • Bot: The previous base class which should be avoided. This class is mainly used for bots which work with Wikibase or together with an image repository.
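The scripts.ini layout described for ConfigParserBot can be illustrated with Python's standard configparser module. This is only a minimal sketch and not the class's actual implementation; the script name myscript, the option names and the variable names are hypothetical.

```python
import configparser

# Hypothetical scripts.ini content for a script called myscript.py.
# Both "option: value" and "option=value" entries are accepted.
INI_TEXT = """
[myscript]
always: true
summary = Fixing typos
"""

# Stand-in for a bot's available_options dictionary (default values).
available_options = {'always': False, 'summary': ''}

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser['myscript']

options = dict(available_options)
for name in section:
    # Every option in the file must be predefined.
    if name not in available_options:
        raise KeyError(f'{name} is not a predefined option')
    # Convert to bool when the default value is a bool.
    if isinstance(available_options[name], bool):
        options[name] = section.getboolean(name)
    else:
        options[name] = section[name]
```

In this sketch the default value's type decides how the string from the file is converted; a real bot may apply different conversion rules.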

Additionally there is the CurrentPageBot class which automatically sets the current page to the page treated. It is recommended to use this class and to use treat_page instead of treat and put_current instead of userPut. By default it subclasses the BaseBot class.

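With CurrentPageBot it’s possible to subclass one of the following classes to filter the pages which are ultimately handled by CurrentPageBot.treat_page():

  • ExistingPageBot: Only handle pages which exist.

  • CreatingPageBot: Only handle pages which do not exist.

  • FollowRedirectPageBot: If the generator returns a redirect page, follow the redirect and work on the target page instead.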

It is possible to combine filters by subclassing several of them. They are new-style classes, so the order of the base classes matters: a class which subclasses ExistingPageBot first and FollowRedirectPageBot second will also work on pages which do not exist if a redirect pointed to them. If the order is reversed, redirects are followed first and the existence check is then applied to their targets.
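The effect of the base class order can be sketched with plain Python classes that chain through super(). Nothing here is imported from Pywikibot: FakePage, SketchBot, ExistingFilter and FollowRedirect are hypothetical stand-ins that only mimic the idea of the filter classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakePage:
    """Stand-in for a wiki page; not a Pywikibot class."""
    title: str
    exists: bool = True
    redirect_target: Optional['FakePage'] = None


class SketchBot:
    """Stand-in for the base class at the end of the chain."""

    def __init__(self) -> None:
        self.treated: List[str] = []

    def treat(self, page: FakePage) -> None:
        self.treated.append(page.title)


class ExistingFilter(SketchBot):
    """Pass a page on only if it exists."""

    def treat(self, page: FakePage) -> None:
        if page.exists:
            super().treat(page)


class FollowRedirect(SketchBot):
    """Replace a redirect by its target, then pass it on."""

    def treat(self, page: FakePage) -> None:
        if page.redirect_target is not None:
            page = page.redirect_target
        super().treat(page)


class CheckThenFollow(ExistingFilter, FollowRedirect):
    """Existence is checked on the redirect itself, then it is followed."""


class FollowThenCheck(FollowRedirect, ExistingFilter):
    """The redirect is followed first, then the target is checked."""


missing_target = FakePage('Target', exists=False)
redirect = FakePage('Redirect', redirect_target=missing_target)

first = CheckThenFollow()
first.treat(redirect)    # reaches the non-existing target
second = FollowThenCheck()
second.treat(redirect)   # target fails the existence check
```

With the first ordering the missing target is still treated, because only the redirect itself was checked for existence; with the second ordering nothing is treated.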

Additionally there is the AutomaticTWSummaryBot which subclasses CurrentPageBot and automatically defines the summary when put_current() is used.

Deprecated since version 7.2: The bot classes RedirectPageBot and NoRedirectPageBot are deprecated. Use use_redirects attribute instead.

Deprecated since version 9.2: The functions critical(), debug(), error(), exception(), log(), output(), stdout() and warning() as well as the constants CRITICAL, DEBUG, ERROR, INFO, INPUT, STDOUT, VERBOSE and WARNING imported from the logging module are deprecated within this module. Import them directly. These functions can also be used as pywikibot members.

class bot.BaseBot(**kwargs)[source]#

Bases: OptionHandler

Generic Bot to be subclassed.

Only accepts generator and options defined in available_options.

This class provides a run() method for basic processing of a generator one page at a time.

If the subclass places a page generator in generator, Bot will process each page in the generator, invoking the method treat() which must then be implemented by subclasses.

Each item processed by treat() must be a page.BasePage type. Use init_page() to upcast the type. To enable other types, set BaseBot.treat_page_type to an appropriate type; your bot should derive from BaseBot in that case and handle site properties.

If the subclass does not set a generator, or does not override treat() or run(), NotImplementedError is raised.

For bot option handling refer to the OptionHandler class.

Changed in version 7.0: A counter instance variable is provided.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

generator: Iterable#

Instance variable to hold the Iterable processed by the run() method. It is added to the class with the generator keyword argument and the proposed type is a Generator. If it is not, run() upcasts the generator attribute to a Generator type. If a BaseBot subclass has its own generator attribute, a warning is issued when an object is passed to the generator keyword parameter.

Warning

this is just a sample
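The upcast from an arbitrary Iterable to a Generator can be sketched as follows. The helper name as_generator is hypothetical and the code only illustrates why a Generator is useful here (it offers close()); it is not taken from the library.

```python
from collections.abc import Generator, Iterable


def as_generator(items: Iterable) -> Generator:
    """Return items unchanged if it is a generator, else wrap it."""
    if isinstance(items, Generator):
        return items
    # A generator expression turns any iterable into a generator.
    return (item for item in items)


pages = as_generator(['Page A', 'Page B', 'Page C'])
first = next(pages)
pages.close()            # only generators offer close()
remaining = list(pages)  # a closed generator yields nothing more
```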

_save_page(page, func, *args, **kwargs)[source]#

Helper function to handle page save-related option error handling.

Note

Do not use it directly. Use userPut() instead.

Parameters:
  • page (BasePage) – currently edited page

  • func (Callable[[...], Any]) – the function to call

  • args (Any) – passed to the function

  • kwargs (Any) – passed to the function

Keyword Arguments:
  • ignore_server_errors (bool) – if True, server errors will be reported and ignored (default: False)

  • ignore_save_related_errors (bool) – if True, errors related to page save will be reported and ignored (default: False)

Returns:

whether the page was saved successfully

Return type:

bool

available_options: dict[str, Any] = {'always': False}#

Handler configuration attribute. Only the keys of the dict can be passed as __init__ options. The values are the default values. Overwrite this in subclasses!

counter: Counter#

Instance variable which holds counters. The default counters are ‘read’, ‘write’ and ‘skip’. All of them are printed within exit(). You can use your own counters like:

self.counter['delete'] += 1

Added in version 7.0.

Changed in version 7.3: Your additional counters are also printed during exit()
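The counter behaves like a collections.Counter, so custom keys need no initialization. The following sketch works on a plain Counter outside of any bot; that the default keys start at zero and the report format are assumptions made for illustration.

```python
from collections import Counter

# Assumption: the three default keys start at zero.
counter = Counter(read=0, write=0, skip=0)

for title in ['Page A', 'Page B', 'Page C']:
    counter['read'] += 1
    if title == 'Page B':
        counter['skip'] += 1
        continue
    counter['delete'] += 1  # custom counter; missing keys default to 0

# Hypothetical report line listing every counter, custom ones included.
report = ', '.join(f'{key}: {value}' for key, value in counter.items())
```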

property current_page: BasePage#

Return the current working page as a property.

exit()[source]#

Cleanup and exit processing.

Invoked when run() is finished. Waits for pending threads, prints counter statistics and informs whether the script terminated gracefully or was halted by exception.

Note

Do not override it in subclasses; teardown() should be used instead.

Changed in version 7.3: Statistics are printed for all entries in counter

Changed in version 9.0: Print execution time with days, hours, minutes and seconds.

Return type:

None

generator_completed: bool#

Instance attribute which is True if the generator is completed.

It is False if the generator processing in run() was interrupted by KeyboardInterrupt or exited by QuitKeyboardInterrupt, whereas closing the generator, i.e. calling self.generator.close(), keeps the value True.

To check for an empty generator you may use:

if self.generator_completed and not self.counter['read']:
    print('generator was empty')

Note

An empty generator returns True.

Added in version 3.0.

Changed in version 7.4: renamed to generator_completed to become a public attribute.

init_page(item)[source]#

Initialize a generator item before treating.

Ensure that the result of init_page is always a pywikibot.Page object or any other type given by the treat_page_type even when the generator returns something else.

Also used to set the current site. This is called before skip_page() and treat().

Parameters:

item (Any) – any item from generator

Returns:

return the page object to be processed further

Return type:

BasePage

quit()[source]#

Cleanup and quit processing.

Return type:

None

run()[source]#

Process all pages in generator.

Call setup(), check for a valid Iterable type in generator, upcast it to a Generator type if necessary, then process every item of the generator as follows:

For each item call init_page(), check whether the result is of type treat_page_type, and call skip_page() to determine whether to skip the current page. If the page is not skipped, call treat() for it.

This method also adjusts the read and skip counters, and finally it calls exit() when leaving the method. In short, this method is implemented similar to this:

def run(self) -> None:
    '''Process all pages in generator.'''
    self.setup()

    if not hasattr(self, 'generator'):
        raise NotImplementedError('"generator" not set.')

    if self.generator is None:
        print('No generator was defined')
        return

    try:
        for item in self.generator:
            page = self.init_page(item)

            if self.skip_page(page):
                continue

            self.treat(page)

    except (QuitKeyboardInterrupt, KeyboardInterrupt):
        print('User canceled bot run.')

    finally:
        self.exit()

Changed in version 3.0: skip counter was added.; call setup() first.

Changed in version 6.0: upcast generator to a Generator type to enable generator.close() method.

Changed in version 6.1: Objects from generator may be different from pywikibot.Page but the type must be registered in treat_page_type.

Changed in version 9.2: leave method gracefully if generator is None using suggest_help() function.

Raises:
  • AssertionError – “page” is not a pywikibot.page.BasePage object

  • KeyboardInterrupt – KeyboardInterrupt occurred while config.verbose_output was set

  • NotImplementedError – generator is not set

  • TypeError – invalid generator type or page is not a treat_page_type

Return type:

None
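To make the interplay of run(), skip_page(), treat(), the counters and generator_completed more tangible, here is a self-contained toy loop. RunSketch is a hypothetical stand-in that imports nothing from Pywikibot; where exactly the real method increments its counters is an assumption, and the skip rule for titles starting with "Talk:" is invented for the example.

```python
from collections import Counter


class RunSketch:
    """Toy bot loop; names mirror BaseBot but nothing is imported."""

    def __init__(self, generator):
        self.generator = generator
        self.counter = Counter()
        self.generator_completed = False
        self.treated = []

    def skip_page(self, page):
        # Invented rule: skip anything that looks like a talk page.
        return page.startswith('Talk:')

    def treat(self, page):
        self.treated.append(page)

    def run(self):
        try:
            for page in self.generator:
                if self.skip_page(page):
                    self.counter['skip'] += 1
                    continue
                self.counter['read'] += 1
                self.treat(page)
            # Reached only if the loop was not interrupted.
            self.generator_completed = True
        except KeyboardInterrupt:
            pass  # generator_completed stays False


bot = RunSketch(iter(['Main', 'Talk:Main', 'Help']))
bot.run()

empty = RunSketch(iter([]))
empty.run()
was_empty = empty.generator_completed and not empty.counter['read']
```

The last line repeats the emptiness check suggested for generator_completed: the generator ran to completion but nothing was read.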

setup()[source]#

Some initial setup before run() operation starts.

This can be used for reading huge parts from a live wiki or for file operations which go beyond just initializing the instance. Invoked by run() before running through the generator loop.

Added in version 3.0.

Return type:

None

skip_page(page)[source]#

Return whether treat should be skipped for the page.

Added in version 3.0.

Changed in version 7.2: use use_redirects to handle redirects, use use_disambigs to handle disambigs

Parameters:

page (BasePage) – Page object to be processed

Return type:

bool

teardown()[source]#

Some cleanups after run() operation. Invoked by exit().

Added in version 3.0.

Return type:

None

treat(page)[source]#

Process one page (abstract method).

Parameters:

page (Any) – Object to be processed, usually a page.BasePage. For other page types the treat_page_type must be set.

Return type:

None

treat_page_type: Any#

Instance variable to hold the default page type used by run().

Added in version 6.1.

update_options: dict[str, Any] = {}#

update_options can be used to update available_options. Do not use it if the bot class is meant to be subclassed; in that case call self.available_options.update(<dict>) in the initializer instead.

Added in version 6.4.

use_disambigs: bool | None = None#

Attribute to determine whether to use disambiguation pages. Set it to True to use disambigs only, set it to False to skip disambigs. If None both are processed.

Added in version 7.2.

use_redirects: bool | None = None#

Attribute to determine whether to use redirect pages. Set it to True to use redirects only, set it to False to skip redirects. If None both are processed. For example to create a RedirectBot you may define:

class MyRedirectBot(ExistingPageBot):

    '''Bot who only works on existing redirects.'''

    use_redirects = True

Added in version 7.2.
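The three-valued logic shared by use_redirects and use_disambigs can be written as a small predicate. The function name skipped_by_flag is hypothetical; the sketch only restates the rule from the text and is not the library's skip_page() code.

```python
from typing import Optional


def skipped_by_flag(page_has_property: bool,
                    use_flag: Optional[bool]) -> bool:
    """Return True if a page should be skipped for the given flag.

    use_flag=True  -> only pages having the property are processed
    use_flag=False -> pages having the property are skipped
    use_flag=None  -> nothing is skipped
    """
    if use_flag is None:
        return False
    return page_has_property != use_flag


# A redirect page under each of the three settings.
redirect_results = [skipped_by_flag(True, flag)
                    for flag in (True, False, None)]
# A normal page under each of the three settings.
normal_results = [skipped_by_flag(False, flag)
                  for flag in (True, False, None)]
```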

userPut(page, oldtext, newtext, **kwargs)[source]#

Save a new revision of a page, with user confirmation as required.

Print differences, ask the user for confirmation, and put the page if needed.

Option used:

  • ‘always’

Keyword Arguments:
  • asynchronous – passed to page.save

  • summary – passed to page.save

  • show_diff – show changes between oldtext and newtext (enabled)

  • ignore_save_related_errors – report and ignore (disabled)

  • ignore_server_errors – report and ignore (disabled)

Returns:

whether the page was saved successfully

Parameters:
  • page (BasePage)

  • oldtext (str)

  • newtext (str)

  • kwargs (Any)

Return type:

bool

user_confirm(question)[source]#

Obtain user response if bot option ‘always’ is not enabled.

Parameters:

question (str)

Return type:

bool
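How the ‘always’ option short-circuits confirmation can be sketched without any user interface. The confirm function, the injected ask callable and the shortcut letters 'y', 'n' and 'a' are assumptions made for this illustration, not the library's code.

```python
from typing import Callable, Dict


def confirm(question: str, options: Dict[str, bool],
            ask: Callable[[str], str]) -> bool:
    """Return True if the change may be saved."""
    if options['always']:
        return True  # no prompt once 'always' is enabled
    answer = ask(question)
    if answer == 'a':
        # Assumed behaviour: "all" switches prompting off for good.
        options['always'] = True
        return True
    return answer == 'y'


# Scripted answers replace real user input in this sketch.
scripted_answers = iter(['n', 'a'])
asked = []


def scripted(question: str) -> str:
    asked.append(question)
    return next(scripted_answers)


options = {'always': False}
results = [confirm('Save?', options, scripted) for _ in range(3)]
```

The third call returns True without asking, because the second answer enabled ‘always’.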

class bot.LoggingFormatter(fmt=None, datefmt=None, style='%', validate=True, *, defaults=None)[source]#

Bases: Formatter

Format LogRecords for output to file.

Initialize the formatter with specified format strings.

Initialize the formatter either with the specified format string, or a default as described above. Allow for specialized date formatting with the optional datefmt argument. If datefmt is omitted, you get an ISO8601-like (or RFC 3339-like) format.

Use a style parameter of ‘%’, ‘{’ or ‘$’ to specify that you want to use one of %-formatting, str.format() ({}) formatting or string.Template formatting in your format string.

Changed in version 3.2: Added the style parameter.

format(record)[source]#

Strip trailing newlines before outputting text to file.
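A formatter that strips trailing newlines can be built on the standard logging.Formatter. This is a sketch under the assumption that only the formatted result is trimmed; the class name is hypothetical and the real LoggingFormatter may do more.

```python
import logging


class StripNewlineFormatter(logging.Formatter):
    """Format a record, then drop trailing line breaks."""

    def format(self, record: logging.LogRecord) -> str:
        return super().format(record).rstrip('\r\n')


# Build a record by hand so that no handler setup is needed.
record = logging.LogRecord(
    name='sketch', level=logging.INFO, pathname='sketch.py', lineno=1,
    msg='page saved\n\n', args=None, exc_info=None)

formatted = StripNewlineFormatter('%(levelname)s: %(message)s').format(record)
```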

bot.set_interface(module_name)[source]#

Configures any bots to use the given interface module.

Search for user interface module in the pywikibot.userinterfaces subdirectory and initialize UI. Calls init_handlers() to re-initialize if we were already initialized with another UI.

Added in version 6.4.

Parameters:

module_name (str)

Return type:

None

bot.init_handlers()[source]#

Initialize the handlers and formatters for the logging system.

This relies on the global variable ui which is a UI object.

See also

pywikibot.userinterfaces

Calls writelogheader() after handlers are initialized. This function must be called before using any input/output methods; and must be called again if ui handler is changed. Use set_interface() to set the new interface which initializes it.

Note

this function is called by any user input and output function, so it should normally not need to be called explicitly.

All user output is routed through the logging module. Each type of output is handled by an appropriate handler object. This structure is used to permit eventual development of other user interfaces (GUIs) without modifying the core bot code.

The following output levels are defined:

  • DEBUG: only for file logging; debugging messages.

  • STDOUT: output that must be sent to sys.stdout (for bots that may have their output redirected to a file or other destination).

  • VERBOSE: optional progress information for display to user.

  • INFO: normal (non-optional) progress information for display to user.

  • INPUT: prompts requiring user response.

  • WARNING: user warning messages.

  • ERROR: user error messages.

  • CRITICAL: fatal error messages.

Accordingly, do not use print statements in bot code; instead, use pywikibot.info() function and other functions from pywikibot.logging module.
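Routing output through handlers by level can be sketched with the standard logging module alone. The numeric value chosen for VERBOSE, the logger name and the helper make_logger are assumptions for illustration; Pywikibot's own values and handler setup may differ.

```python
import io
import logging

# Hypothetical numeric value between DEBUG (10) and INFO (20).
VERBOSE = 18
logging.addLevelName(VERBOSE, 'VERBOSE')


def make_logger(stream, threshold):
    """Create a logger whose single handler drops records below threshold."""
    logger = logging.getLogger('bot_sketch')
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setLevel(threshold)
    handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
logger = make_logger(stream, logging.INFO)
logger.log(VERBOSE, 'optional progress')  # below INFO, dropped
logger.info('normal progress')
logger.warning('something looks off')
captured = stream.getvalue()
```

Because every message passes through a handler, a different user interface only needs a different handler, which is the motivation given above for avoiding print statements.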

Changed in version 6.2: Different logfiles are used if multiple processes of the same script are running.

Return type:

None

bot.writelogheader()[source]#

Save additional version, system and status info to the log file in use.

This may help the user to track errors or report bugs.

Changed in version 9.0: ignore milliseconds with timestamp.

Return type:

None

bot.input(question, password=False, default='', force=False)[source]#

Ask the user a question, return the user’s answer.

Parameters:
  • question (str) – a string that will be shown to the user. Don’t add a space after the question mark/colon, this method will do this for you.

  • password (bool) – if True, hides the user’s input (for password entry).

  • default (str | None) – The default answer if none was entered. None to require an answer.

  • force (bool) – Automatically use the default

Return type:

str

bot.input_choice(question, answers, default=None, return_shortcut=True, automatic_quit=True, force=False)[source]#

Ask the user the question and return one of the valid answers.

Parameters:
  • question (str) – The question asked without trailing spaces.

  • answers (AnswerType) – The valid answers each containing a full length answer and a shortcut. Each value must be unique.

  • default (str | None) – The result if no answer was entered. It must not be in the valid answers and can be disabled by setting it to None. If it should be linked with the valid answers it must be its shortcut.

  • return_shortcut (bool) – Whether the shortcut or the index of the answer is returned.

  • automatic_quit (bool) – Adds the option ‘Quit’ (‘q’) and throws a bot.QuitKeyboardInterrupt if selected.

  • force (bool) – Automatically use the default.

Returns:

The selected answer shortcut or index. It is -1 if the default is selected, the index rather than the shortcut is returned and the default is not a valid shortcut.

Return type:

Any
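The return value rules, including the -1 case, can be sketched as a pure function that maps entered text to a result. resolve_choice is a hypothetical helper that performs no prompting and is not the library's implementation; raising ValueError for invalid input is a choice made for the sketch.

```python
from typing import List, Optional, Tuple, Union


def resolve_choice(entered: str,
                   answers: List[Tuple[str, str]],
                   default: Optional[str] = None,
                   return_shortcut: bool = True) -> Union[str, int]:
    """Map the text a user entered to a shortcut or an answer index."""
    shortcuts = [shortcut for _answer, shortcut in answers]
    value = entered.strip().lower() or default
    if value is None:
        raise ValueError('an answer is required')
    if value in shortcuts:
        return value if return_shortcut else shortcuts.index(value)
    if value == default:
        # The default is not one of the valid shortcuts.
        return value if return_shortcut else -1
    raise ValueError(f'invalid answer: {value!r}')


answers = [('Yes', 'y'), ('No', 'n')]
by_shortcut = resolve_choice('N', answers)
by_index = resolve_choice('n', answers, return_shortcut=False)
defaulted = resolve_choice('', answers, default='y')
unlinked_default = resolve_choice('', answers, default='x',
                                  return_shortcut=False)
```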

bot.input_yn(question, default=None, automatic_quit=True, force=False)[source]#

Ask the user a yes/no question and return the answer as a bool.

Example:

>>> input_yn('Do you like Pywikibot?', 'y', False, force=True)
Do you like Pywikibot? ([Y]es, [n]o)
True
>>> input_yn('More examples?', False, automatic_quit=False, force=True)
More examples? ([y]es, [N]o)
False

See also

input_choice()

Parameters:
  • question (str) – The question asked without trailing spaces.

  • default (bool | str | None) – The result if no answer was entered. It must be a bool or 'y', 'n', 0 or 1 and can be disabled by setting it to None.

  • automatic_quit (bool) – Adds the option ‘Quit’ (‘q’) and throws a bot.QuitKeyboardInterrupt if selected.

  • force (bool) – Automatically use the default.

Returns:

Return True if the user selected yes and False if the user selected no. If the default is not None it’ll return True if default is True or ‘y’ and False if default is False or ‘n’.

Return type:

bool
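The accepted forms of the default parameter can be normalized to a shortcut as sketched below. The helper name yn_default_to_shortcut is hypothetical; the sketch relies on the Python fact that 0 and 1 compare equal to False and True.

```python
from typing import Optional, Union


def yn_default_to_shortcut(
        default: Union[bool, int, str, None]) -> Optional[str]:
    """Translate a yes/no default into 'y', 'n' or None."""
    if default is None:
        return None  # no default: an answer is required
    if default in ('y', 'n'):
        return default
    if default in (True, False):  # also matches 1 and 0
        return 'y' if default else 'n'
    raise ValueError(f'invalid default: {default!r}')


normalized = [yn_default_to_shortcut(value)
              for value in (True, False, 'y', 'n', 1, 0, None)]
```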

bot.input_list_choice(question, answers, default=None, force=False)[source]#

Ask the user the question and return one of the valid answers.

Parameters:
  • question (str) – The question asked without trailing spaces.

  • answers (AnswerType) – The valid answers each containing a full length answer.

  • default (int | str | None) – The result if no answer was entered. It must not be in the valid answers and can be disabled by setting it to None.

  • force (bool) – Automatically use the default

Returns:

The selected answer.

Return type:

str

bot.ui: ABUIC | None = <pywikibot.userinterfaces.terminal_interface_unix.UnixUI object>#

Holds a user interface object defined in pywikibot.userinterfaces subpackage.

class bot.Option(stop=True)[source]#

Bases: ABC

A basic option for input_choice.

The following methods need to be implemented:

  • format(default=None)

  • result(value)

  • test(value)

The methods test and handled are related: whenever handled returns the option itself for a value, test must return True for that value. Conversely, if test returns False, handled must not return the option itself, although it may still return another option instead of None.

Also result only returns a sensible value when test returns True for the same value.

Parameters:

stop (bool)

static formatted(text, options, default=None)[source]#

Create a text with the options formatted into it.

This static method is used by pywikibot.input_choice(). It calls format for all options to combine the question for pywikibot.input().

Parameters:
  • text (str) – Text into which options are to be formatted

  • options (Iterable[Option]) – Option instances to be formatted

  • default (str | None) – filler for any option’s ‘default’ placeholder

Returns:

Text with the options formatted into it

Return type:

str

property stop: bool#

Return whether this option stops asking.

handled(value)[source]#

Return the Option object that applies to the given value.

If this Option object doesn’t know which applies it returns None.

Parameters:

value (str)

Return type:

Option | None

format(default=None)[source]#

Return a formatted string for that option.

Parameters:

default (str | None)

Return type:

str

test(value)[source]#

Return whether this option applies.

Parameters:

value (str)

Return type:

bool

abstract result(value)[source]#

Return the actual value which is associated with the given one.

Added in version 6.2: result() is an abstract method and must be defined in subclasses

Parameters:

value (str)

Return type:

Any

class bot.StandardOption(option, shortcut, **kwargs)[source]#

Bases: Option

An option with a description and shortcut and returning the shortcut.

Parameters:
  • option (str) – option string

  • shortcut (str) – Shortcut of the option

  • kwargs (Any)

format(default=None)[source]#

Return a formatted string for that option.

Parameters:

default (str | None)

Return type:

str

result(value)[source]#

Return the lowercased shortcut.

Parameters:

value (str)

Return type:

Any

test(value)[source]#

Return whether this option applies.

Parameters:

value (str)

Return type:

bool

class bot.NestedOption(option, shortcut, description, options)[source]#

Bases: OutputOption, StandardOption

An option containing other options.

It will return True in test if this option applies but False if a sub option applies, while handled returns the sub option.

Parameters:
  • option (str)

  • shortcut (str)

  • description (str)

  • options (Iterable[Option])

format(default=None)[source]#

Return a formatted string for that option.

Parameters:

default (str | None)

Return type:

str

handled(value)[source]#

Return itself if it applies or the applying sub option.

Parameters:

value (str)

Return type:

Option | None

property out: str#

Output of suboptions.

class bot.IntegerOption(minimum=1, maximum=None, prefix='', **kwargs)[source]#

Bases: Option

An option allowing a range of integers.

Parameters:
  • minimum (int)

  • maximum (int | None)

  • prefix (str)

  • kwargs (Any)

test(value)[source]#

Return whether the value is an int and in the specified range.

Parameters:

value (str)

Return type:

bool

property minimum: int#

Return the lower bound of the range of allowed values.

property maximum: int | None#

Return the upper bound of the range of allowed values.

format(default=None)[source]#

Return a formatted string showing the range.

Parameters:

default (str | None)

Return type:

str

parse(value)[source]#

Return integer from value with prefix removed.

Parameters:

value (str)

Return type:

int

result(value)[source]#

Return a tuple with the prefix and value converted into an int.

Parameters:

value (str)

Return type:

Any
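The relationship between test(), parse() and result() for a prefixed integer range can be sketched in a standalone class. IntRangeSketch is a hypothetical stand-in, not IntegerOption itself; case-insensitive prefix matching is an assumption.

```python
from typing import Optional, Tuple


class IntRangeSketch:
    """Accept values like 'r3': a prefix followed by an integer in range."""

    def __init__(self, minimum: int = 1, maximum: Optional[int] = None,
                 prefix: str = '') -> None:
        self.minimum = minimum
        self.maximum = maximum
        self.prefix = prefix

    def parse(self, value: str) -> int:
        """Return the integer with the prefix removed."""
        return int(value[len(self.prefix):])

    def test(self, value: str) -> bool:
        """Return whether value has the prefix and lies in the range."""
        if not value.lower().startswith(self.prefix.lower()):
            return False
        try:
            number = self.parse(value)
        except ValueError:
            return False
        if number < self.minimum:
            return False
        return self.maximum is None or number <= self.maximum

    def result(self, value: str) -> Tuple[str, int]:
        """Return the prefix and the parsed integer."""
        return self.prefix, self.parse(value)


option = IntRangeSketch(minimum=1, maximum=5, prefix='r')
checks = [option.test(value) for value in ('r3', 'r9', 'x3', 'r')]
selected = option.result('r3')
```

As with Option in general, result() is only meaningful for values that passed test().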

class bot.ContextOption(option, shortcut, text, context, delta=100, start=0, end=0)[source]#

Bases: OutputOption, StandardOption

An option to show more and more context.

Parameters:
  • option (str)

  • shortcut (str)

  • text (str)

  • context (int)

  • delta (int)

  • start (int)

  • end (int)

result(value)[source]#

Add the delta to the context.

Parameters:

value (str)

Return type:

Any

property out: str#

Output section of the text.
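Growing the visible context around a text range can be sketched with simple slicing. The function context_window is hypothetical, and treating context as a character count on both sides clamped to the text boundaries is an assumption about how such an option might work.

```python
def context_window(text: str, start: int, end: int, context: int) -> str:
    """Return text[start:end] plus `context` characters on each side."""
    lower = max(0, start - context)
    upper = min(len(text), end + context)
    return text[lower:upper]


text = '0123456789'
context, delta = 2, 3

narrow = context_window(text, 4, 6, context)
context += delta  # what selecting the option again would do
wide = context_window(text, 4, 6, context)
```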

class bot.ListOption(sequence, prefix='', **kwargs)[source]#

Bases: IntegerOption

An option to select something from a list.

Parameters:
  • sequence (Sequence[str])

  • prefix (str)

  • kwargs (Any)

format(default=None)[source]#

Return a string showing the range.

Parameters:

default (str | None)

Return type:

str

property maximum: int#

Return the maximum value.

result(value)[source]#

Return a tuple with the prefix and selected value.

Parameters:

value (str)

Return type:

Any

class bot.ShowingListOption(sequence, prefix='', pre=None, post=None, **kwargs)[source]#

Bases: ListOption, OutputOption

An option to show a list and select an item.

Added in version 3.0.

Parameters:
  • pre (str | None) – Additional comment printed before the list.

  • post (str | None) – Additional comment printed after the list.

  • sequence (Sequence[str])

  • prefix (str)

  • kwargs (Any)

before_question: bool = True#

Place output before or after the question

property stop: bool#

Return whether this option stops asking.

property out: str#

Output text of the enumerated list.

class bot.MultipleChoiceList(sequence, prefix='', **kwargs)[source]#

Bases: ListOption

An option to select multiple items from a list.

Added in version 3.0.

Parameters:
  • sequence (Sequence[str])

  • prefix (str)

  • kwargs (Any)

test(value)[source]#

Return whether the values are int and in the specified range.

Parameters:

value (str)

Return type:

bool

result(value)[source]#

Return a tuple with the prefix and selected values as a list.

Parameters:

value (str)

Return type:

Any

class bot.ShowingMultipleChoiceList(sequence, prefix='', pre=None, post=None, **kwargs)[source]#

Bases: ShowingListOption, MultipleChoiceList

An option to show a list and select multiple items.

Added in version 3.0.

Parameters:
  • pre (str | None) – Additional comment printed before the list.

  • post (str | None) – Additional comment printed after the list.

  • sequence (Sequence[str])

  • prefix (str)

  • kwargs (Any)

class bot.OutputProxyOption(option, shortcut, output, **kwargs)[source]#

Bases: OutputOption, StandardOption

An option which calls out property of the given output class.

Create a new option for the given sequence.

Parameters:
  • option (str)

  • shortcut (str)

  • output (OutputOption)

  • kwargs (Any)

property out: str#

Return the contents.

class bot.HighlightContextOption(option, shortcut, text, context, delta=100, start=0, end=0)[source]#

Bases: ContextOption

Show the original region highlighted.

Parameters:
  • option (str)

  • shortcut (str)

  • text (str)

  • context (int)

  • delta (int)

  • start (int)

  • end (int)

color = 'lightred'#

property out: str#

Highlighted output section of the text.

exception bot.ChoiceException(option, shortcut, **kwargs)[source]#

Bases: StandardOption, Exception

A choice for input_choice which results in this exception.

Parameters:
  • option (str) – option string

  • shortcut (str) – Shortcut of the option

  • kwargs (Any)

Return type:

None

result(value)[source]#

Return itself to raise the exception.

Parameters:

value (Any)

Return type:

Any

exception bot.UnhandledAnswer[source]#

Bases: Exception

The given answer didn’t suffice.

class bot.Choice(option, shortcut, replacer)[source]#

Bases: StandardOption

A simple choice consisting of an option, shortcut and handler.

Parameters:
  • option (str)

  • shortcut (str)

  • replacer (InteractiveReplace | None)

property replacer: InteractiveReplace | None#

The replacer.

abstract handle()[source]#

Handle this choice. Must be implemented.

Return type:

Any

handle_link()[source]#

The current link will be handled by this choice.

Return type:

bool

class bot.StaticChoice(option, shortcut, result)[source]#

Bases: Choice

A static choice which just returns the given value.

Create instance with replacer set to None.

Parameters:
  • option (str)

  • shortcut (str)

  • result (Any)

handle()[source]#

Return the predefined value.

Return type:

Any

class bot.LinkChoice(option, shortcut, replacer, replace_section, replace_label)[source]#

Bases: Choice

A choice returning a mix of the new link and the current link.

Parameters:
  • option (str)

  • shortcut (str)

  • replacer (InteractiveReplace | None)

  • replace_section (bool)

  • replace_label (bool)

handle()[source]#

Handle by either applying the new section or label.

Return type:

Any

class bot.AlwaysChoice(replacer, option='always', shortcut='a')[source]#

Bases: Choice

Add an option to always apply the default.

Parameters:
  • replacer (InteractiveReplace | None)

  • option (str)

  • shortcut (str)

handle()[source]#

Handle the custom shortcut.

Return type:

Any

handle_link()[source]#

Directly return answer whether it’s applying it always.

Return type:

bool

property answer: Any#

Get the actual default answer instructing the replacement.

exception bot.QuitKeyboardInterrupt[source]#

Bases: ChoiceException, KeyboardInterrupt

The user has cancelled processing at a prompt.

Constructor using the ‘quit’ (‘q’) in input_choice.

Return type:

None

class bot.InteractiveReplace(old_link, new_link, default=None, automatic_quit=True)[source]#

Bases: object

A callback class for textlib’s replace_links.

It shows various options which can be switched on and off (the boolean values are the default values):

  • allow_skip_link = True (skip the current link)

  • allow_unlink = True (unlink)

  • allow_replace = False (just replace target, keep section and label)

  • allow_replace_section = False (replace target and section, keep label)

  • allow_replace_label = False (replace target and label, keep section)

  • allow_replace_all = False (replace target, section and label)

It also has a context attribute which must be a non-negative integer. If it is greater than 0 it shows that many characters before and after the link in question. The context_delta attribute can be defined too and adds an option to increase context by the given amount each time the option is selected.

Additional choices can be defined using the additional_choices attribute and are appended to the choices defined by this class. This list is mutable, and so are the Choice instances created and returned by this class.

Parameters:
  • old_link (Link | Page) – The old link which is searched. The label and section are ignored.

  • new_link (Link | Page | Literal[False]) – The new link with which it should be replaced. Depending on the replacement mode it’ll use this link’s label and section. If False it’ll unlink all and the attributes beginning with allow_replace are ignored.

  • default (str | None) – The default answer as the shortcut

  • automatic_quit (bool) – Add an option to quit and raise a QuitKeyboardInterrupt.

handle_answer(choice)[source]#

Return the result for replace_links.

Parameters:

choice (str)

Return type:

Any

property choices: tuple[StandardOption, ...]#

Return the tuple of choices.

handle_link()[source]#

Handle the currently given replacement.

Return type:

Any

property current_link#

Get the current link when it’s handling one currently.

property current_text: str#

Get the current text when it’s handling one currently.

property current_groups: Mapping[str, str]#

Get the current groups when it’s handling one currently.

property current_range: tuple[int, int]#

Get the current range when it’s handling one currently.

bot.calledModuleName()[source]#

Return the name of the module calling this function.

This is required because the -help option loads the module’s docstring and because the module name will be used for the filename of the log.

Return type:

str

bot.handle_args(args=None, do_help=True)[source]#

Handle global command line arguments and return the rest as a list.

Takes the command line arguments as strings, processes all global parameters such as -lang or -log, initialises the logging layer, which emits startup information into log at level ‘verbose’. This function makes sure that global arguments are applied first, regardless of the order in which the arguments were given. args may be passed as an argument, thereby overriding sys.argv.

>>> local_args = pywikibot.handle_args()  # sys.argv is used
>>> local_args  
[]
>>> local_args = pywikibot.handle_args(['-simulate', '-myoption'])
>>> local_args  # global options are handled, show the remaining
['-myoption']
>>> for arg in local_args: pass  # do whatever is wanted with local_args

Caution

Global options might be introduced without a warning period. It is up to developers to verify that global options do not interfere with local script options of private scripts.

Tip

Avoid using this method in your private scripts and use the pwb wrapper instead. In directory mode:

python pwb.py <global options> <name_of_script> <local options>

With installed site package:

pwb <global options> <name_of_script> <local options>

Note

the pwb wrapper can be used even if the handle_args method is used within the script.

Changed in version 5.2: -site global option was added

Changed in version 7.1: -cosmetic_changes and -cc may be set directly instead of toggling the value. Refer to tools.strtobool() for valid values.

Changed in version 7.7: -config global option was added.

Changed in version 8.0: Short site value can be given if site code is equal to family like -site:meta.

Changed in version 8.1: -nolog option also discards command.log.

Parameters:
  • args (Iterable[str] | None) – Command line arguments. If None, pywikibot.argvu is used which is a copy of sys.argv

  • do_help (bool) – Handle parameter ‘-help’ to show help and invoke sys.exit

Returns:

list of arguments not recognised globally

Return type:

list[str]
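Separating global from local arguments regardless of their order can be sketched with a plain function. split_args and the small GLOBAL_NAMES set are hypothetical; the real function recognises many more global options and also applies them, which this sketch does not attempt.

```python
from typing import Iterable, List, Tuple

# Hypothetical subset of global option names, used for illustration only.
GLOBAL_NAMES = {'-lang', '-log', '-simulate'}


def split_args(args: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (global_args, local_args), keeping the relative order."""
    global_args: List[str] = []
    local_args: List[str] = []
    for arg in args:
        name = arg.split(':', 1)[0]  # '-lang:de' -> '-lang'
        if name in GLOBAL_NAMES:
            global_args.append(arg)
        else:
            local_args.append(arg)
    return global_args, local_args


global_args, local_args = split_args(['-myoption', '-lang:de', '-simulate'])
```

A script would then loop over local_args only, as in the doctest above.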

bot.show_help(module_name=None, show_global=False)[source]#

Show help for the Bot.

Changed in version 4.0: Renamed from showHelp() to show_help().

Changed in version 8.0: Do not show version changes.

Parameters:
  • module_name (str | None)

  • show_global (bool)

Return type:

None

bot.suggest_help(missing_parameters=None, missing_generator=False, unknown_parameters=None, exception=None, missing_action=False, additional_text='', missing_dependencies=None)[source]#

Output error message to use -help with additional text before it.

Parameters:
  • missing_parameters (Sequence[str] | None) – A list of parameters which are missing.

  • missing_generator (bool) – Whether a generator is missing.

  • unknown_parameters (Sequence[str] | None) – A list of parameters which are unknown.

  • exception (Exception | None) – An exception thrown.

  • missing_action (bool) – Add an entry that no action was defined.

  • additional_text (str) – Additional text added to the end.

  • missing_dependencies (Sequence[str] | None) – A list of dependencies which cannot be imported.

Returns:

True if an error message was printed, False otherwise

Return type:

bool

bot.writeToCommandLogFile()[source]#

Save name of the called module along with all params to logfile.

This can be used by the user later to track errors or report bugs.

Return type:

None

bot.open_webbrowser(page)[source]#

Open the web browser displaying the page and wait for input.

Parameters:

page (BasePage)

Return type:

None

class bot.OptionHandler(**kwargs)[source]#

Bases: object

Class to get and set options.

How to use options of OptionHandler and its BaseBot subclasses: First define an available_options class attribute for your own option handler to define all available options:

>>> default_options = {'foo': 'bar', 'bar': 42, 'baz': False}
>>> class MyHandler(OptionHandler): available_options = default_options

Or you may update the predefined setting in the class initializer. BaseBot predefines an ‘always’ option and sets it to False:

self.available_options.update(always=True, another_option='Yes')

Now you can instantiate an OptionHandler or BaseBot class passing options other than default values:

>>> bot = MyHandler(baz=True)

You can access bot options either as keyword item or attribute:

>>> bot.opt.foo
'bar'
>>> bot.opt['bar']
42
>>> bot.opt.baz  # default was overridden
True

You can set the options in the same way:

>>> bot.opt.bar = 4711
>>> bot.opt['baz'] = None
>>>

Or you can use the option as a dict:

>>> 'Option opt.{foo} is {bar}'.format_map(bot.opt)
'Option opt.bar is 4711'

Warning

You must not access bot options as an attribute if the keyword is a dict method.
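
The reason for this warning can be illustrated with a small stand-in. The sketch below assumes that the options object behaves like a dict whose keys are additionally readable as attributes. The Options class is hypothetical and is not the pywikibot implementation:

```python
class Options(dict):
    """Hypothetical dict with attribute access, for illustration only."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so real dict methods always win over option names.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


opt = Options(foo='bar', items=3)
print(opt.foo)              # attribute access returns the option value
print(opt['items'])         # item access returns the option value
print(callable(opt.items))  # attribute access finds the dict method instead
```

For an option named like a dict method, item access such as opt['items'] is the safe way to read it.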

Only accept options defined in available_options.

Parameters:

kwargs (Any) – bot options

available_options: dict[str, Any] = {}#

Handler configuration attribute. Only the keys of the dict can be passed as __init__ options. The values are the default values. Overwrite this in subclasses!

set_options(**options)[source]#

Set the instance options.

Parameters:

options (Any)

Return type:

None

class bot.BaseBot(**kwargs)[source]#

Bases: OptionHandler

Generic Bot to be subclassed.

Only accepts generator and options defined in available_options.

This class provides a run() method for basic processing of a generator one page at a time.

If the subclass places a page generator in generator, Bot will process each page in the generator, invoking the method treat() which must then be implemented by subclasses.

Each item processed by treat() must be a page.BasePage type. Use init_page() to upcast the type. To enable other types, set BaseBot.treat_page_type to an appropriate type; your bot should derive from BaseBot in that case and handle site properties.

If the subclass does not set a generator, or does not override treat() or run(), NotImplementedError is raised.

For bot options handling refer to the OptionHandler class above.

Changed in version 7.0: A counter instance variable is provided.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

use_disambigs: bool | None = None#

Attribute to determine whether to use disambiguation pages. Set it to True to use disambigs only, set it to False to skip disambigs. If None both are processed.

Added in version 7.2.

use_redirects: bool | None = None#

Attribute to determine whether to use redirect pages. Set it to True to use redirects only, set it to False to skip redirects. If None both are processed. For example to create a RedirectBot you may define:

class MyRedirectBot(ExistingPageBot):

    '''Bot who only works on existing redirects.'''

    use_redirects = True

Added in version 7.2.

available_options: dict[str, Any] = {'always': False}#

Handler configuration attribute. Only the keys of the dict can be passed as __init__ options. The values are the default values. Overwrite this in subclasses!

update_options: dict[str, Any] = {}#

update_options can be used to update available_options; do not use it if the bot class is to be derived from, but call self.available_options.update(<dict>) in the initializer in that case.

Added in version 6.4.

counter: Counter#

Instance variable which holds counters. The default counters are ‘read’, ‘write’ and ‘skip’. All of them are printed within exit(). You can use your own counters like:

self.counter['delete'] += 1

Added in version 7.0.

Changed in version 7.3: Your additional counters are also printed during exit()
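
A minimal sketch of how such counters behave, assuming the attribute acts like a collections.Counter. The loop and the page titles are made up for illustration and the local variable stands in for self.counter:

```python
from collections import Counter

counter = Counter()  # stand-in for self.counter

for title in ['A', 'B', 'C']:
    counter['read'] += 1
    if title == 'B':
        counter['skip'] += 1
        continue
    # A custom counter needs no initialisation; missing keys start at 0.
    counter['delete'] += 1

for name, count in counter.items():
    print(f'{count} {name} operations')
```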

generator_completed: bool#

Instance attribute which is True if the generator is completed.

It is False if the generator processing in run() is either interrupted by KeyboardInterrupt or exited by QuitKeyboardInterrupt, whereas closing the generator, i.e. calling self.generator.close(), keeps the value True.

To check for an empty generator you may use:

if self.generator_completed and not self.counter['read']:
    print('generator was empty')

Note

An empty generator returns True.

Added in version 3.0.

Changed in version 7.4: renamed to generator_completed to become a public attribute.

treat_page_type: Any#

Instance variable to hold the default page type used by run().

Added in version 6.1.

property current_page: BasePage#

Return the current working page as a property.

user_confirm(question)[source]#

Obtain user response if bot option ‘always’ not enabled.

Parameters:

question (str)

Return type:

bool

userPut(page, oldtext, newtext, **kwargs)[source]#

Save a new revision of a page, with user confirmation as required.

Print differences, ask user for confirmation, and puts the page if needed.

Option used:

  • ‘always’

Keyword Arguments:
  • asynchronous – passed to page.save

  • summary – passed to page.save

  • show_diff – show changes between oldtext and newtext (enabled)

  • ignore_save_related_errors – report and ignore (disabled)

  • ignore_server_errors – report and ignore (disabled)

Returns:

whether the page was saved successfully

Parameters:
  • page (BasePage)

  • oldtext (str)

  • newtext (str)

  • kwargs (Any)

Return type:

bool

_save_page(page, func, *args, **kwargs)[source]#

Helper function to handle page save-related option error handling.

Note

Do not use it directly. Use userPut() instead.

Parameters:
  • page (BasePage) – currently edited page

  • func (Callable[[...], Any]) – the function to call

  • args (Any) – passed to the function

  • kwargs (Any) – passed to the function

Keyword Arguments:
  • ignore_server_errors (bool) – if True, server errors will be reported and ignored (default: False)

  • ignore_save_related_errors (bool) – if True, errors related to page save will be reported and ignored (default: False)

Returns:

whether the page was saved successfully

Return type:

bool

quit()[source]#

Cleanup and quit processing.

Return type:

None

exit()[source]#

Cleanup and exit processing.

Invoked when run() is finished. Waits for pending threads, prints counter statistics and informs whether the script terminated gracefully or was halted by exception.

Note

Do not overwrite it by subclasses; teardown() should be used instead.

Changed in version 7.3: Statistics are printed for all entries in counter

Changed in version 9.0: Print execution time with days, hours, minutes and seconds.

Return type:

None

init_page(item)[source]#

Initialize a generator item before treating.

Ensure that the result of init_page is always a pywikibot.Page object or any other type given by the treat_page_type even when the generator returns something else.

Also used to arrange the current site. This is called before skip_page() and treat().

Parameters:

item (Any) – any item from generator

Returns:

return the page object to be processed further

Return type:

BasePage

skip_page(page)[source]#

Return whether treat should be skipped for the page.

Added in version 3.0.

Changed in version 7.2: use use_redirects to handle redirects, use use_disambigs to handle disambigs

Parameters:

page (BasePage) – Page object to be processed

Return type:

bool
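
The three-state semantics of use_redirects and use_disambigs (True, False, None) can be sketched as follows. The function names and the boolean page properties are hypothetical; this illustrates the documented behaviour and is not the actual skip_page() code:

```python
def skip_by_flag(flag, page_matches):
    """Return True if a page should be skipped for one filter attribute.

    flag: True = only matching pages, False = skip matching pages,
    None = no filtering. Hypothetical helper for illustration.
    """
    if flag is None:
        return False
    return flag != page_matches


def should_skip(is_redirect, is_disambig,
                use_redirects=None, use_disambigs=None):
    """Combine both filters; a page is skipped if either filter rejects it."""
    return (skip_by_flag(use_redirects, is_redirect)
            or skip_by_flag(use_disambigs, is_disambig))


# A bot with use_redirects = True treats redirects only.
print(should_skip(is_redirect=True, is_disambig=False, use_redirects=True))
print(should_skip(is_redirect=False, is_disambig=False, use_redirects=True))
```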

treat(page)[source]#

Process one page (abstract method).

Parameters:

page (Any) – Object to be processed, usually a page.BasePage. For other page types the treat_page_type must be set.

Return type:

None

setup()[source]#

Some initial setup before run() operation starts.

This can be used for reading huge parts from the live wiki or for file operations which go beyond just initializing the instance. Invoked by run() before running through the generator loop.

Added in version 3.0.

Return type:

None

teardown()[source]#

Some cleanups after run() operation. Invoked by exit().

Added in version 3.0.

Return type:

None

run()[source]#

Process all pages in generator.

Call setup(), check for a valid Iterable type in generator, upcast it to a Generator type if necessary, and then process every item of the generator as follows:

For each item, call init_page(), check that the result is of type treat_page_type, and call skip_page() to determine whether to skip the current page; if it is not skipped, call treat() with it.

This method also adjusts the read and skip counters, and finally it calls exit() when leaving the method. In short, this method is implemented similar to this:

def run(self) -> None:
    '''Process all pages in generator.'''
    self.setup()

    if not hasattr(self, 'generator'):
        raise NotImplementedError('"generator" not set.')

    if self.generator is None:
        print('No generator was defined')
        return  # leave gracefully, there is nothing to process

    try:
        for item in self.generator:
            page = self.init_page(item)

            if self.skip_page(page):
                continue

            self.treat(page)

    except (QuitKeyboardInterrupt, KeyboardInterrupt):
        print('User canceled bot run.')

    finally:
        self.exit()

Changed in version 3.0: skip counter was added; setup() is called first.

Changed in version 6.0: upcast generator to a Generator type to enable generator.close() method.

Changed in version 6.1: Objects from generator may be different from pywikibot.Page but the type must be registered in treat_page_type.

Changed in version 9.2: leave method gracefully if generator is None using suggest_help() function.

Raises:
  • AssertionError – “page” is not a pywikibot.page.BasePage object

  • KeyboardInterrupt – KeyboardInterrupt occurred while config.verbose_output was set

  • NotImplementedError – generator is not set

  • TypeError – invalid generator type or page is not a treat_page_type

Return type:

None
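
The order in which run() invokes the hook methods can be traced with a self-contained stand-in. MiniBot below is hypothetical, works on plain strings instead of page objects and omits counters and error handling; it only mirrors the call sequence described above:

```python
class MiniBot:
    """Hypothetical stand-in that mimics the call order of run()."""

    def __init__(self, generator):
        self.generator = generator
        self.calls = []  # records the hook calls in order

    def setup(self):
        self.calls.append('setup')

    def init_page(self, item):
        self.calls.append(f'init_page({item})')
        return item

    def skip_page(self, page):
        # Made-up rule for the example: skip talk pages.
        return page.startswith('Talk:')

    def treat(self, page):
        self.calls.append(f'treat({page})')

    def exit(self):
        self.calls.append('exit')

    def run(self):
        self.setup()
        try:
            for item in self.generator:
                page = self.init_page(item)
                if self.skip_page(page):
                    continue
                self.treat(page)
        finally:
            self.exit()


bot = MiniBot(['Foo', 'Talk:Foo'])
bot.run()
print(bot.calls)
```

The skipped item still passes through init_page(), but never reaches treat().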

class bot.Bot(site=None, **kwargs)[source]#

Bases: BaseBot

Generic bot subclass for multiple sites.

If possible the MultipleSitesBot or SingleSiteBot classes should be used instead which specifically handle multiple or single sites.

Create a Bot instance and initialize cached sites.

Parameters:
  • site

  • kwargs

property site: BaseSite | None#

Get the current site.

run()[source]#

Check if it automatically updates the site before run.

Return type:

None

init_page(item)[source]#

Update site before calling treat.

Parameters:

item (Any)

Return type:

BasePage

class bot.ConfigParserBot(**kwargs)[source]#

Bases: BaseBot

A bot class that can read options from scripts.ini file.

All options must be predefined in the available_options dictionary. The type of each predefined default determines how the corresponding value from the .ini file is interpreted: as bool, int, float or str (default). The settings file may look like:

[add_text]
# edit summary for the bot.
summary = Bot: Aggiungo template Categorizzare

[commonscat] ; commonscat options
always: true

The option values are interpreted in this order:

  1. available_options default setting

  2. scripts.ini options settings

  3. command line arguments

Added in version 3.0.
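
The precedence and the type interpretation described above can be sketched with the standard configparser module. read_options and its parameters are hypothetical names chosen for this example, and the code is an illustration under these assumptions rather than the ConfigParserBot implementation:

```python
import configparser

INI_TEXT = """\
[add_text]
# edit summary for the bot.
summary = Bot: Aggiungo template Categorizzare

[commonscat]
always: true
"""


def read_options(ini_text, section, available_options, cli_options=None):
    """Merge defaults, .ini settings and command line options in that order."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)

    options = dict(available_options)  # 1. default settings
    if parser.has_section(section):    # 2. settings from the .ini text
        for name, default in available_options.items():
            if not parser.has_option(section, name):
                continue
            # The type of the default selects the conversion;
            # bool is tested first because bool is a subclass of int.
            if isinstance(default, bool):
                options[name] = parser.getboolean(section, name)
            elif isinstance(default, int):
                options[name] = parser.getint(section, name)
            elif isinstance(default, float):
                options[name] = parser.getfloat(section, name)
            else:
                options[name] = parser.get(section, name)

    options.update(cli_options or {})  # 3. command line arguments
    return options


defaults = {'always': False, 'summary': ''}
print(read_options(INI_TEXT, 'commonscat', defaults))
print(read_options(INI_TEXT, 'add_text', defaults))
```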

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

INI = 'scripts.ini'#

set_options(**kwargs)[source]#

Read settings from scripts.ini file.

Parameters:

kwargs (Any)

Return type:

None

class bot.SingleSiteBot(site=True, **kwargs)[source]#

Bases: BaseBot

A bot only working on one site and ignoring the others.

If no site is given from the start it’ll use the first page’s site. Any page after the site has been defined and is not on the defined site will be ignored.

Create a SingleSiteBot instance.

Parameters:
  • site (BaseSite | bool | None) – If True it’ll be set to the configured site using pywikibot.Site.

  • kwargs (Any)

property site: BaseSite#

Site that the bot is using.

init_page(item)[source]#

Set site if not defined.

Parameters:

item (Any)

Return type:

BasePage

skip_page(page)[source]#

Skip page if it is not on the defined site.

Parameters:

page (BasePage)

Return type:

bool
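
The interplay of init_page() and skip_page() in this class can be sketched with a stand-in. SingleSiteFilter is hypothetical, and pages are represented as plain dicts with a 'site' key instead of real page objects:

```python
class SingleSiteFilter:
    """Hypothetical stand-in for the site handling described above."""

    def __init__(self, site=None):
        self.site = site

    def init_page(self, page):
        # The first page defines the site if none was given.
        if self.site is None:
            self.site = page['site']
        return page

    def skip_page(self, page):
        # Pages on any other site are ignored.
        return page['site'] != self.site


pages = [
    {'title': 'A', 'site': 'wikipedia:en'},
    {'title': 'B', 'site': 'wikipedia:de'},
    {'title': 'C', 'site': 'wikipedia:en'},
]
bot = SingleSiteFilter()
treated = [page['title'] for page in pages
           if not bot.skip_page(bot.init_page(page))]
print(treated)
```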

class bot.MultipleSitesBot(**kwargs)[source]#

Bases: BaseBot

A bot class working on multiple sites.

The bot should accommodate for that case and not store site specific information on only one site.

Changed in version 6.2: Site attribute has been dropped.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

class bot.CurrentPageBot(**kwargs)[source]#

Bases: BaseBot

A bot which automatically sets ‘current_page’ on each treat().

This class should always be used together with either the MultipleSitesBot or SingleSiteBot class as there is no site management in this class.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

ignore_server_errors = False#

treat_page()[source]#

Process one page (Abstract method).

Return type:

None

treat(page)[source]#

Set page to current page and treat that page.

Parameters:

page (BasePage)

Return type:

None

put_current(new_text, ignore_save_related_errors=None, ignore_server_errors=None, **kwargs)[source]#

Call Bot.userPut but use the current page.

It compares the new_text to the current page text.

Parameters:
  • new_text (str) – The new text

  • ignore_save_related_errors (bool | None) – Ignore save related errors and automatically print a message. If None uses this instance’s default.

  • ignore_server_errors (bool | None) – Ignore server errors and automatically print a message. If None uses this instance’s default.

  • kwargs (Any) – Additional parameters directly given to Bot.userPut.

Returns:

whether the page was saved successfully

Return type:

bool

class bot.AutomaticTWSummaryBot(**kwargs)[source]#

Bases: CurrentPageBot

A class which automatically defines summary for put_current.

The class must define a summary_key string which contains the i18n key for i18n.twtranslate. It can also override the summary_parameters property to specify any parameters for the translated message.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

summary_key: str | None = None#

Must be defined in subclasses.

property summary_parameters: dict[str, str]#

A dictionary of all parameters for i18n.

put_current(*args, **kwargs)[source]#

Define a summary if not already defined and then call the original method.

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

None

class bot.ExistingPageBot(**kwargs)[source]#

Bases: CurrentPageBot

A CurrentPageBot class which only treats existing pages.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

skip_page(page)[source]#

Treat page if it exists and handle NoPageError.

Warning

If subclassed, call super().skip_page() first to ensure that nonexistent pages are filtered before other calls are made.

Parameters:

page (BasePage)

Return type:

bool

class bot.FollowRedirectPageBot(**kwargs)[source]#

Bases: CurrentPageBot

A CurrentPageBot class which follows the redirect.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

treat(page)[source]#

Treat target if page is redirect and the page otherwise.

Parameters:

page (BasePage)

Return type:

None

class bot.CreatingPageBot(**kwargs)[source]#

Bases: CurrentPageBot

A CurrentPageBot class which only treats nonexistent pages.

Parameters:

kwargs (Any) – bot options

Keyword Arguments:

generator – a generator processed by run() method

skip_page(page)[source]#

Treat page if it doesn’t exist.

Parameters:

page (BasePage)

Return type:

bool

class bot.RedirectPageBot(*args, **kwargs)[source]#

Bases: CurrentPageBot

A RedirectPageBot class which only treats redirects.

Deprecated since version 7.2: use BaseBot attribute use_redirects = True instead

Deprecate RedirectPageBot.

skip_page(page)[source]#

Treat only redirect pages and handle IsNotRedirectPageError.

Parameters:

page (BasePage)

Return type:

bool

class bot.NoRedirectPageBot(*args, **kwargs)[source]#

Bases: CurrentPageBot

A NoRedirectPageBot class which only treats non-redirects.

Deprecated since version 7.2: use BaseBot attribute use_redirects = False instead

Deprecate NoRedirectPageBot.

skip_page(page)[source]#

Treat only non-redirect pages and handle IsRedirectPageError.

Parameters:

page (BasePage)

Return type:

bool

class bot.WikidataBot(**kwargs)[source]#

Bases: Bot, ExistingPageBot

Generic Wikidata Bot to be subclassed.

Source claims (P143) can be created for specific sites.

Variables:
  • use_from_page – If True (default) it will apply ItemPage.fromPage for every item. If False it assumes that the pages are actually already ItemPage (page in treat_page_and_item will be None). If None it’ll use ItemPage.fromPage when the page is not in the site’s item namespace.

  • treat_missing_item – Whether pages without items should be treated. Note that this is checked after create_missing_item.

  • create_missing_item – If True, new items will be created if the current page doesn’t have one. Subclasses should override this in the initializer with a bool value or using self.opt attribute.
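
The three possible values of use_from_page can be summarised in a small decision function. needs_from_page and its page_in_item_namespace argument are hypothetical names; the sketch only restates the rule given above and does not call any Wikibase API:

```python
def needs_from_page(use_from_page, page_in_item_namespace):
    """Return True if ItemPage.fromPage would be applied to the page.

    Hypothetical helper: True = always convert, False = never convert,
    None = convert unless the page is in the item namespace.
    """
    if use_from_page is None:
        return not page_in_item_namespace
    return bool(use_from_page)


for setting in (True, False, None):
    for in_item_ns in (True, False):
        print(setting, in_item_ns, needs_from_page(setting, in_item_ns))
```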

Parameters:

kwargs (Any)

Initializer of the WikidataBot.

use_from_page = True#

treat_missing_item = False#

cacheSources()[source]#

Fetch the sources from the list on Wikidata.

It is stored internally and reused by getSource()

Return type:

None

get_property_by_name(property_name)[source]#

Find given property and return its ID.

Method first uses site.search() and if the property isn’t found, then asks user to provide the property ID.

Parameters:

property_name (str) – property to find

Return type:

str

user_edit_entity(entity, data=None, ignore_save_related_errors=None, ignore_server_errors=None, **kwargs)[source]#

Edit entity with data provided, with user confirmation as required.

Parameters:
  • entity (WikibasePage) – page to be edited

  • data (dict[str, str] | None) – data to be saved, or None if the diff should be created automatically

  • ignore_save_related_errors (bool | None) – Ignore save related errors and automatically print a message. If None uses this instance’s default.

  • ignore_server_errors (bool | None) – Ignore server errors and automatically print a message. If None uses this instance’s default.

  • kwargs (Any)

Keyword Arguments:
  • summary – revision comment, passed to ItemPage.editEntity

  • show_diff – show changes between oldtext and newtext (default: True)

Returns:

whether the item was saved successfully

Return type:

bool

user_add_claim(item, claim, source=None, bot=True, **kwargs)[source]#

Add a claim to an item, with user confirmation as required.

Parameters:
  • item (pywikibot.page.ItemPage) – page to be edited

  • claim (pywikibot.page.Claim) – claim to be saved

  • source (BaseSite | None) – site where the claim comes from

  • bot (bool) – whether to flag as bot (if possible)

  • kwargs (Any)

Keyword Arguments:
  • ignore_server_errors – if True, server errors will be reported and ignored (default: False)

  • ignore_save_related_errors – if True, errors related to page save will be reported and ignored (default: False)

Returns:

whether the item was saved successfully

Return type:

bool

Note

Calling this method sets the current_page property to the item, which also changes the site property.

Note

Calling this method with the ‘source’ argument modifies the provided claim object in place.

getSource(site)[source]#

Create a Claim usable as a source for Wikibase statements.

Parameters:

site (BaseSite) – site that is the source of assertions.

Returns:

pywikibot.Claim or None

Return type:

pywikibot.page.Claim | None

user_add_claim_unless_exists(item, claim, exists_arg='', source=None, logger_callback=<function deprecated_args.<locals>.decorator.<locals>.wrapper>, **kwargs)[source]#

Decorator of user_add_claim.

Before adding a new claim, it checks if we can add it, using provided filters.

See also

documentation of claimit.py

Parameters:
  • exists_arg (Container) – pattern for merging existing claims with new ones

  • logger_callback (Callable[[str], Any]) – function logging the output of the method

  • item (pywikibot.page.ItemPage)

  • claim (pywikibot.page.Claim)

  • source (BaseSite | None)

  • kwargs (Any)

Returns:

whether the claim could be added

Return type:

bool

Note

Calling this method may change the current_page property to the item, which will also change the site property.

Note

Calling this method with the ‘source’ argument modifies the provided claim object in place.

create_item_for_page(page, data=None, summary=None, **kwargs)[source]#

Create an ItemPage with the provided page as the sitelink.

Parameters:
  • page (BasePage) – the page for which the item will be created

  • data (dict[str, Any] | None) – additional data to be included in the new item (optional). Note that data created from the page have higher priority.

  • summary (str | None) – optional edit summary to replace the default one

  • kwargs (Any)

Returns:

pywikibot.ItemPage or None

Return type:

ItemPage | None

treat_page()[source]#

Treat a page.

Return type:

None

treat_page_and_item(page, item)[source]#

Treat page together with its item (if it exists).

Must be implemented in subclasses.

Parameters:
  • page

  • item

Return type:

None