bot - Classes for Building Bots
User-interface related functions for building bots.
This module supports several different bot classes which can be used in conjunction. Each bot should subclass at least one of these four classes:
BaseBot: Basic bot class for cases where the site is handled differently, such as working on multiple sites in parallel. No site attribute is provided; the site of the current page should be used instead. This class should normally not be used directly.
SingleSiteBot: Bot class which should only be run on a single site. Such bots usually store site-specific content and thus can’t easily be run when the generator returns a page on another site. It has a property site which can also be changed. If the generator returns a page of a different site, it’ll skip that page.
MultipleSitesBot: An alias of BaseBot. Should not be used if any other bot class is used.
ConfigParserBot: Bot class which supports reading options from a scripts.ini configuration file. That file consists of sections, led by a [section] header and followed by option: value or option=value entries. The section is the script name without the .py suffix. All options identified must be predefined in the available_options dictionary.
Bot: The previous base class which should be avoided. This class is mainly used for bots which work with Wikibase or together with an image repository.
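The scripts.ini layout described for ConfigParserBot can be sketched with Python’s standard configparser module, which accepts both option: value and option=value entries. This only illustrates the file format and is not ConfigParserBot’s actual implementation; the script name myscript, the summary option and the load_options helper are assumptions made for the example.

```python
import configparser

# Hypothetical scripts.ini content for a script called "myscript.py".
INI_TEXT = """
[myscript]
always: true
summary = Fixing typos
"""

# Stand-in for a bot's available_options; 'summary' is an assumed option.
AVAILABLE_OPTIONS = {'always': False, 'summary': ''}


def load_options(ini_text, section, available_options):
    """Return the defaults overridden by the entries of one ini section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    options = dict(available_options)
    if not parser.has_section(section):
        return options
    for key in parser[section]:
        # Only predefined options are accepted.
        if key not in available_options:
            raise KeyError(f'{key!r} is not a predefined option')
        # Convert the text value according to the type of the default.
        if isinstance(available_options[key], bool):
            options[key] = parser.getboolean(section, key)
        else:
            options[key] = parser.get(section, key)
    return options


options = load_options(INI_TEXT, 'myscript', AVAILABLE_OPTIONS)
print(options)
```

Converting each value by the type of its default is a design choice of this sketch; the real class may convert values differently.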
Additionally there is the CurrentPageBot class which automatically sets the current page to the page treated. It is recommended to use this class and to use treat_page instead of treat and put_current instead of userPut. By default it subclasses the BaseBot class.
With CurrentPageBot it’s possible to subclass one of the following classes to filter the pages which are ultimately handled by CurrentPageBot.treat_page():
ExistingPageBot: Only handle pages which do exist.
CreatingPageBot: Only handle pages which do not exist.
FollowRedirectPageBot: If the generator returns a redirect page, it’ll follow the redirect and work on the redirect target instead.
It is possible to combine filters by subclassing several of them. They are new-style classes, so when a class subclasses ExistingPageBot first and then FollowRedirectPageBot, it will also work on pages which do not exist when a redirect pointed to them. If the order is reversed, it’ll first follow the redirects and then check whether the targets exist.
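The effect of the base class order can be sketched with small stand-in classes that chain through super(). These are not Pywikibot’s classes: the WIKI dictionary, the class names and the use of treat() for filtering are assumptions chosen only to show how the method resolution order decides which check runs first.

```python
# Stand-in "wiki": a title maps to a redirect target (str) or True (exists).
# 'New name' is deliberately missing, i.e. the redirect target does not exist.
WIKI = {'Old name': 'New name', 'Real page': True}


class Base:
    """Collects every title that finally gets treated."""

    def __init__(self):
        self.treated = []

    def treat(self, title):
        self.treated.append(title)


class ExistingFilter(Base):
    """Pass the title on only if it exists in the stand-in wiki."""

    def treat(self, title):
        if title in WIKI:
            super().treat(title)


class FollowRedirectFilter(Base):
    """Replace a redirect by its target, then pass it on."""

    def treat(self, title):
        target = WIKI.get(title)
        if isinstance(target, str):
            title = target
        super().treat(title)


class ExistingThenFollow(ExistingFilter, FollowRedirectFilter):
    """Existence is checked on the redirect, then the redirect is followed."""


class FollowThenExisting(FollowRedirectFilter, ExistingFilter):
    """The redirect is followed first, then the target is checked."""


first = ExistingThenFollow()
first.treat('Old name')
second = FollowThenExisting()
second.treat('Old name')
print(first.treated, second.treated)
```

In the first ordering the missing target is still treated because only the redirect itself was checked; in the second ordering it is filtered out.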
Additionally there is the AutomaticTWSummaryBot which subclasses CurrentPageBot and automatically defines the summary when put_current() is used.
Deprecated since version 7.2: The bot classes RedirectPageBot and NoRedirectPageBot are deprecated. Use the use_redirects attribute instead.
Deprecated since version 9.2: The functions critical(), debug(), error(), exception(), log(), output(), stdout() and warning() as well as the constants CRITICAL, DEBUG, ERROR, INFO, INPUT, STDOUT, VERBOSE and WARNING imported from the logging module are deprecated within this module. Import them directly. These functions can also be used as pywikibot members.
Imports in pywikibot module
The following classes and functions are imported in the pywikibot module and can also be used as pywikibot members:
- class bot.BaseBot(**kwargs)[source]#
Bases:
OptionHandler
Generic Bot to be subclassed.
Only accepts generator and options defined in available_options.
This class provides a run() method for basic processing of a generator one page at a time.
If the subclass places a page generator in generator, Bot will process each page in the generator, invoking the method treat() which must then be implemented by subclasses.
Each item processed by treat() must be a page.BasePage type. Use init_page() to upcast the type. To enable other types, set BaseBot.treat_page_type to an appropriate type; your bot should derive from BaseBot in that case and handle site properties.
If the subclass does not set a generator, or does not override treat() or run(), NotImplementedError is raised.
For bot options handling refer to the OptionHandler class.
Changed in version 7.0: A counter instance variable is provided.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
- generator: Iterable#
Instance variable to hold the Iterable processed by the run() method. It is added to the class with the generator keyword argument and the proposed type is a Generator. If not, run() upcasts the generator attribute to become a Generator type. If a BaseBot subclass has its own generator attribute, a warning will be thrown when an object is passed to the generator keyword parameter.
Warning
this is just a sample
- _save_page(page, func, *args, **kwargs)[source]#
Helper function to handle page save-related option error handling.
Note
Do not use it directly. Use userPut() instead.
- Parameters:
page (BasePage) – currently edited page
func (Callable[[...], Any]) – the function to call
args (Any) – passed to the function
kwargs (Any) – passed to the function
- Keyword Arguments:
ignore_server_errors (bool) – if True, server errors will be reported and ignored (default: False)
ignore_save_related_errors (bool) – if True, errors related to page save will be reported and ignored (default: False)
- Returns:
whether the page was saved successfully
- Return type:
bool
- available_options: dict[str, Any] = {'always': False}#
Handler configuration attribute. Only the keys of the dict can be passed as __init__ options. The values are the default values. Overwrite this in subclasses!
- counter: Counter#
Instance variable which holds counters. The default counters are ‘read’, ‘write’ and ‘skip’. All of them are printed within exit(). You can use your own counters like:
    self.counter['delete'] += 1
Added in version 7.0.
Changed in version 7.3: Your additional counters are also printed during exit().
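How additional counters behave can be sketched with a plain collections.Counter standing in for self.counter. The page titles and the summary format printed at the end are assumptions for illustration; they do not reproduce the output of exit().

```python
from collections import Counter

counter = Counter()  # stand-in for the bot's self.counter

for title in ['A', 'B', 'C']:
    counter['read'] += 1
    if title == 'B':
        counter['skip'] += 1
        continue
    counter['write'] += 1

# A custom counter needs no registration: missing keys start at 0.
counter['delete'] += 1

# Hypothetical summary loop over every counter, default or custom.
for name, count in counter.items():
    print(f'{count} {name} operation(s)')
```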
- exit()[source]#
Cleanup and exit processing.
Invoked when run() is finished. Waits for pending threads, prints counter statistics and informs whether the script terminated gracefully or was halted by exception.
Note
Do not overwrite it in subclasses; teardown() should be used instead.
Changed in version 7.3: Statistics are printed for all entries in counter
Changed in version 9.0: Print execution time with days, hours, minutes and seconds.
- Return type:
None
- generator_completed: bool#
Instance attribute which is True if the generator is completed.
It gives False if the generator processing in run() is either interrupted by KeyboardInterrupt or exited by QuitKeyboardInterrupt, whereas closing the generator, i.e. self.generator.close(), keeps the value True.
To check for an empty generator you may use:
    if self.generator_completed and not self.counter['read']:
        print('generator was empty')
Note
An empty generator returns True.
Added in version 3.0.
Changed in version 7.4: renamed to generator_completed to become a public attribute.
- init_page(item)[source]#
Initialize a generator item before treating.
Ensure that the result of init_page is always a pywikibot.Page object or any other type given by the treat_page_type even when the generator returns something else.
Also used to arrange the current site. This is called before skip_page() and treat().
- run()[source]#
Process all pages in generator.
Call setup(), check for a valid Iterable type in generator, upcast it to a Generator type if necessary, and process every item of the generator as follows:
For each item call init_page(), check whether the result is a treat_page_type type, and call skip_page() to determine whether to skip the current page. Otherwise call treat() for each item.
This method also adjusts the read and skip counter, and finally it calls exit() when leaving the method. In short this method is implemented similar to this:
    def run(self) -> None:
        '''Process all pages in generator.'''
        self.setup()
        if not hasattr(self, 'generator'):
            raise NotImplementedError('"generator" not set.')
        if self.generator is None:
            # leave gracefully when no generator was defined
            print('No generator was defined')
            return
        try:
            for item in self.generator:
                page = self.init_page(item)
                if self.skip_page(page):
                    continue
                self.treat(page)
        except (QuitKeyboardInterrupt, KeyboardInterrupt):
            print('User canceled bot run.')
        finally:
            self.exit()
Changed in version 3.0: skip counter was added; call setup() first.
Changed in version 6.0: upcast generator to a Generator type to enable the generator.close() method.
Changed in version 6.1: Objects from generator may be different from pywikibot.Page but the type must be registered in treat_page_type.
Changed in version 9.2: leave method gracefully if generator is None using suggest_help() function.
- Raises:
AssertionError – “page” is not a pywikibot.page.BasePage object
KeyboardInterrupt – KeyboardInterrupt occurred while config.verbose_output was set
NotImplementedError – generator is not set
TypeError – invalid generator type or page is not a treat_page_type
- Return type:
None
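The call order described for run() can be tried out with a small stand-in class that needs no wiki access. MiniBot below is not BaseBot: its method bodies, the Talk: skip rule and the exact points where the counters are incremented are assumptions that only mirror the sequence setup(), init_page(), skip_page(), treat() and exit().

```python
from collections import Counter


class MiniBot:
    """Stand-in that mirrors the call order described for run()."""

    def __init__(self, generator):
        self.generator = generator
        self.counter = Counter()
        self.calls = []

    def setup(self):
        self.calls.append('setup')

    def init_page(self, item):
        # "Upcast" every generator item to a single type (here: str).
        return str(item)

    def skip_page(self, page):
        # Hypothetical rule: skip everything in the Talk namespace.
        return page.startswith('Talk:')

    def treat(self, page):
        self.calls.append(f'treat {page}')

    def exit(self):
        self.calls.append('exit')

    def run(self):
        self.setup()
        try:
            for item in self.generator:
                page = self.init_page(item)
                if self.skip_page(page):
                    self.counter['skip'] += 1
                    continue
                self.counter['read'] += 1
                self.treat(page)
        finally:
            # exit() runs even if treat() raises.
            self.exit()


bot = MiniBot(iter(['Main', 'Talk:Main', 42]))
bot.run()
print(bot.calls, dict(bot.counter))
```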
- setup()[source]#
Some initial setup before the run() operation starts.
This can be used for reading huge parts from the live wiki or for file operations which go beyond initializing the instance. Invoked by run() before running through the generator loop.
Added in version 3.0.
- Return type:
None
- skip_page(page)[source]#
Return whether treat should be skipped for the page.
Added in version 3.0.
Changed in version 7.2: use use_redirects to handle redirects, use use_disambigs to handle disambigs
- Parameters:
page (BasePage) – Page object to be processed
- Return type:
bool
- teardown()[source]#
Some cleanups after the run() operation. Invoked by exit().
Added in version 3.0.
- Return type:
None
- treat(page)[source]#
Process one page (abstract method).
- Parameters:
page (Any) – Object to be processed, usually a page.BasePage. For other page types the treat_page_type must be set.
- Return type:
None
- treat_page_type: Any#
Instance variable to hold the default page type used by run().
Added in version 6.1.
- update_options: dict[str, Any] = {}#
update_options can be used to update available_options; do not use it if the bot class is to be derived from, but use self.available_options.update(<dict>) in the initializer in such a case.
Added in version 6.4.
- use_disambigs: bool | None = None#
Attribute to determine whether to use disambiguation pages. Set it to True to use disambigs only, set it to False to skip disambigs. If None both are processed.
Added in version 7.2.
- use_redirects: bool | None = None#
Attribute to determine whether to use redirect pages. Set it to True to use redirects only, set it to False to skip redirects. If None both are processed. For example to create a RedirectBot you may define:
    class MyRedirectBot(ExistingPageBot):
        '''Bot who only works on existing redirects.'''
        use_redirects = True
Added in version 7.2.
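The three states of use_redirects and use_disambigs (True, False, None) can be summarised as a tiny decision function. The helper name skip_by_flag and its signature are hypothetical; it only restates the rule given above and is not taken from the library.

```python
def skip_by_flag(flag, has_property):
    """Return True if a page should be skipped.

    flag is the tri-state attribute (True, False or None);
    has_property tells whether the page is a redirect (or disambig).
    """
    if flag is None:
        # None: both kinds of pages are processed.
        return False
    # True: keep only pages with the property; False: keep only the others.
    return flag != has_property


for flag in (True, False, None):
    for is_redirect in (True, False):
        print(flag, is_redirect, skip_by_flag(flag, is_redirect))
```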
- userPut(page, oldtext, newtext, **kwargs)[source]#
Save a new revision of a page, with user confirmation as required.
Print differences, ask the user for confirmation, and put the page if needed.
Option used:
‘always’
- Keyword Arguments:
asynchronous – passed to page.save
summary – passed to page.save
show_diff – show changes between oldtext and newtext (enabled)
ignore_save_related_errors – report and ignore (disabled)
ignore_server_errors – report and ignore (disabled)
- Returns:
whether the page was saved successfully
- Parameters:
page (BasePage)
oldtext (str)
newtext (str)
kwargs (Any)
- Return type:
bool
- class bot.LoggingFormatter(fmt=None, datefmt=None, style='%', validate=True, *, defaults=None)[source]#
Bases:
Formatter
Format LogRecords for output to file.
Initialize the formatter with specified format strings.
Initialize the formatter either with the specified format string, or a default as described above. Allow for specialized date formatting with the optional datefmt argument. If datefmt is omitted, you get an ISO8601-like (or RFC 3339-like) format.
Use a style parameter of ‘%’, ‘{’ or ‘$’ to specify that you want to use one of %-formatting, str.format() ({}) formatting or string.Template formatting in your format string.
Changed in version 3.2: Added the style parameter.
- bot.set_interface(module_name)[source]#
Configures any bots to use the given interface module.
Search for the user interface module in the pywikibot.userinterfaces subdirectory and initialize the UI. Calls init_handlers() to re-initialize if we were already initialized with another UI.
Added in version 6.4.
- Parameters:
module_name (str)
- Return type:
None
- bot.init_handlers()[source]#
Initialize the handlers and formatters for the logging system.
This relies on the global variable ui which is a UI object.
See also
pywikibot.userinterfaces
Calls writelogheader() after handlers are initialized. This function must be called before using any input/output methods; and must be called again if the ui handler is changed. Use set_interface() to set the new interface which initializes it.
Note
this function is called by any user input and output function, so it should normally not need to be called explicitly.
All user output is routed through the logging module. Each type of output is handled by an appropriate handler object. This structure is used to permit eventual development of other user interfaces (GUIs) without modifying the core bot code.
The following output levels are defined:
DEBUG: only for file logging; debugging messages.
STDOUT: output that must be sent to sys.stdout (for bots that may have their output redirected to a file or other destination).
VERBOSE: optional progress information for display to user.
INFO: normal (non-optional) progress information for display to user.
INPUT: prompts requiring user response.
WARN: user warning messages.
ERROR: user error messages.
CRITICAL: fatal error messages.
Accordingly, do not use print statements in bot code; instead, use the pywikibot.info() function and other functions from the pywikibot.logging module.
Changed in version 6.2: Different logfiles are used if multiple processes of the same script are running.
- Return type:
None
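Routing output through handlers with an additional level such as VERBOSE can be sketched with the standard logging module alone. The numeric value 18, the logger name bot_sketch and the format string are assumptions for this example and are not claimed to match what init_handlers() configures.

```python
import io
import logging

# Assumed numeric value: placed between DEBUG (10) and INFO (20).
VERBOSE = 18
logging.addLevelName(VERBOSE, 'VERBOSE')

# Capture the output in memory instead of writing to a terminal or file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))

logger = logging.getLogger('bot_sketch')  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep the example independent of the root logger

logger.log(VERBOSE, 'fetched 10 pages')
logger.warning('page is protected')

output = stream.getvalue()
print(output, end='')
```

Swapping the handler is enough to send the same messages to a different user interface, which is the point of keeping print statements out of bot code.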
- bot.writelogheader()[source]#
Save additional version, system and status info to the log file in use.
This may help the user to track errors or report bugs.
Changed in version 9.0: ignore milliseconds with timestamp.
- Return type:
None
- bot.input(question, password=False, default='', force=False)[source]#
Ask the user a question, return the user’s answer.
- Parameters:
question (str) – a string that will be shown to the user. Don’t add a space after the question mark/colon, this method will do this for you.
password (bool) – if True, hides the user’s input (for password entry).
default (str | None) – The default answer if none was entered. None to require an answer.
force (bool) – Automatically use the default
- Return type:
str
- bot.input_choice(question, answers, default=None, return_shortcut=True, automatic_quit=True, force=False)[source]#
Ask the user the question and return one of the valid answers.
See also
- Parameters:
question (str) – The question asked without trailing spaces.
answers (AnswerType) – The valid answers each containing a full length answer and a shortcut. Each value must be unique.
default (str | None) – The result if no answer was entered. It must not be in the valid answers and can be disabled by setting it to None. If it should be linked with the valid answers it must be its shortcut.
return_shortcut (bool) – Whether the shortcut or the index of the answer is returned.
automatic_quit (bool) – Adds the option ‘Quit’ (‘q’) and throws a bot.QuitKeyboardInterrupt if selected.
force (bool) – Automatically use the default.
- Returns:
The selected answer shortcut or index. Is -1 if the default is selected, it does not return the shortcut and the default is not a valid shortcut.
- Return type:
Any
- bot.input_yn(question, default=None, automatic_quit=True, force=False)[source]#
Ask the user a yes/no question and return the answer as a bool.
Example:
    >>> input_yn('Do you like Pywikibot?', 'y', False, force=True)
    ...
    Do you like Pywikibot? ([Y]es, [n]o)
    True
    >>> input_yn('More examples?', False, automatic_quit=False, force=True)
    ...
    More examples? ([y]es, [N]o)
    False
See also
- Parameters:
question (str) – The question asked without trailing spaces.
default (bool | str | None) – The result if no answer was entered. It must be a bool or 'y', 'n', 0 or 1 and can be disabled by setting it to None.
automatic_quit (bool) – Adds the option ‘Quit’ (‘q’) and throws a bot.QuitKeyboardInterrupt if selected.
force (bool) – Automatically use the default.
- Returns:
Return True if the user selected yes and False if the user selected no. If the default is not None it’ll return True if default is True or ‘y’ and False if default is False or ‘n’.
- Return type:
bool
- bot.input_list_choice(question, answers, default=None, force=False)[source]#
Ask the user the question and return one of the valid answers.
- Parameters:
question (str) – The question asked without trailing spaces.
answers (AnswerType) – The valid answers each containing a full length answer.
default (int | str | None) – The result if no answer was entered. It must not be in the valid answers and can be disabled by setting it to None.
force (bool) – Automatically use the default
- Returns:
The selected answer.
- Return type:
str
- bot.ui: ABUIC | None = <pywikibot.userinterfaces.terminal_interface_unix.UnixUI object>#
Holds a user interface object defined in the pywikibot.userinterfaces subpackage.
- class bot.Option(stop=True)[source]#
Bases:
ABC
A basic option for input_choice.
The following methods need to be implemented:
format(default=None)
result(value)
test(value)
The methods test and handled are related such that when handled returns itself, test must return True for that value. So if test returns False, handled may not return itself, but it may still return something other than None.
Also result only returns a sensible value when test returns True for the same value.
- Parameters:
stop (bool)
- static formatted(text, options, default=None)[source]#
Create a text with the options formatted into it.
This static method is used by pywikibot.input_choice(). It calls format for all options to combine the question for pywikibot.input().
- Parameters:
text (str) – Text into which options are to be formatted
options (Iterable[Option]) – Option instances to be formatted
default (str | None) – filler for any option’s ‘default’ placeholder
- Returns:
Text with the options formatted into it
- Return type:
str
- property stop: bool#
Return whether this option stops asking.
- handled(value)[source]#
Return the Option object that applies to the given value.
If this Option object doesn’t know which applies it returns None.
- Parameters:
value (str)
- Return type:
Option | None
- format(default=None)[source]#
Return a formatted string for that option.
- Parameters:
default (str | None)
- Return type:
str
- class bot.StandardOption(option, shortcut, **kwargs)[source]#
Bases:
Option
An option with a description and shortcut and returning the shortcut.
- Parameters:
option (str) – option string
shortcut (str) – Shortcut of the option
kwargs (Any)
- class bot.NestedOption(option, shortcut, description, options)[source]#
Bases:
OutputOption, StandardOption
An option containing other options.
It will return True in test if this option applies but False if a sub option applies, while handled returns the sub option.
- Parameters:
option (str)
shortcut (str)
description (str)
options (Iterable[Option])
- format(default=None)[source]#
Return a formatted string for that option.
- Parameters:
default (str | None)
- Return type:
str
- handled(value)[source]#
Return itself if it applies or the applying sub option.
- Parameters:
value (str)
- Return type:
Option | None
- property out: str#
Output of suboptions.
- class bot.IntegerOption(minimum=1, maximum=None, prefix='', **kwargs)[source]#
Bases:
Option
An option allowing a range of integers.
- Parameters:
minimum (int)
maximum (int | None)
prefix (str)
kwargs (Any)
- test(value)[source]#
Return whether the value is an int and in the specified range.
- Parameters:
value (str)
- Return type:
bool
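The check described for test(), a value that is an integer and lies in the allowed range, can be sketched as a standalone function. The name in_range is hypothetical, and treating both bounds as inclusive is an assumption of this sketch, not something the description above states.

```python
def in_range(value, minimum=1, maximum=None):
    """Return whether value parses as an int within [minimum, maximum].

    A maximum of None means there is no upper bound.
    """
    try:
        number = int(value)
    except ValueError:
        # Not an integer at all, e.g. 'abc' or '3.5'.
        return False
    if number < minimum:
        return False
    return maximum is None or number <= maximum


for candidate in ('3', '0', '7', 'abc'):
    print(candidate, in_range(candidate, minimum=1, maximum=5))
```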
- property minimum: int#
Return the lower bound of the range of allowed values.
- property maximum: int | None#
Return the upper bound of the range of allowed values.
- format(default=None)[source]#
Return a formatted string showing the range.
- Parameters:
default (str | None)
- Return type:
str
- class bot.ContextOption(option, shortcut, text, context, delta=100, start=0, end=0)[source]#
Bases:
OutputOption, StandardOption
An option to show more and more context.
- Parameters:
option (str)
shortcut (str)
text (str)
context (int)
delta (int)
start (int)
end (int)
- property out: str#
Output section of the text.
- class bot.ListOption(sequence, prefix='', **kwargs)[source]#
Bases:
IntegerOption
An option to select something from a list.
- Parameters:
sequence (Sequence[str])
prefix (str)
kwargs (Any)
- format(default=None)[source]#
Return a string showing the range.
- Parameters:
default (str | None)
- Return type:
str
- property maximum: int#
Return the maximum value.
- class bot.ShowingListOption(sequence, prefix='', pre=None, post=None, **kwargs)[source]#
Bases:
ListOption, OutputOption
An option to show a list and select an item.
Added in version 3.0.
- Parameters:
pre (str | None) – Additional comment printed before the list.
post (str | None) – Additional comment printed after the list.
sequence (Sequence[str])
prefix (str)
kwargs (Any)
- before_question: bool = True#
Place output before or after the question
- property stop: bool#
Return whether this option stops asking.
- property out: str#
Output text of the enumerated list.
- class bot.MultipleChoiceList(sequence, prefix='', **kwargs)[source]#
Bases:
ListOption
An option to select multiple items from a list.
Added in version 3.0.
- Parameters:
sequence (Sequence[str])
prefix (str)
kwargs (Any)
- class bot.ShowingMultipleChoiceList(sequence, prefix='', pre=None, post=None, **kwargs)[source]#
Bases:
ShowingListOption, MultipleChoiceList
An option to show a list and select multiple items.
Added in version 3.0.
- Parameters:
pre (str | None) – Additional comment printed before the list.
post (str | None) – Additional comment printed after the list.
sequence (Sequence[str])
prefix (str)
kwargs (Any)
- class bot.OutputProxyOption(option, shortcut, output, **kwargs)[source]#
Bases:
OutputOption, StandardOption
An option which calls out property of the given output class.
Create a new option for the given sequence.
- Parameters:
option (str)
shortcut (str)
output (OutputOption)
kwargs (Any)
- property out: str#
Return the contents.
- class bot.HighlightContextOption(option, shortcut, text, context, delta=100, start=0, end=0)[source]#
Bases:
ContextOption
Show the original region highlighted.
- Parameters:
option (str)
shortcut (str)
text (str)
context (int)
delta (int)
start (int)
end (int)
- color = 'lightred'#
- property out: str#
Highlighted output section of the text.
- exception bot.ChoiceException(option, shortcut, **kwargs)[source]#
Bases:
StandardOption, Exception
A choice for input_choice which results in this exception.
- Parameters:
option (str) – option string
shortcut (str) – Shortcut of the option
kwargs (Any)
- Return type:
None
- class bot.Choice(option, shortcut, replacer)[source]#
Bases:
StandardOption
A simple choice consisting of an option, shortcut and handler.
- Parameters:
option (str)
shortcut (str)
replacer (InteractiveReplace | None)
- property replacer: InteractiveReplace | None#
The replacer.
- class bot.StaticChoice(option, shortcut, result)[source]#
Bases:
Choice
A static choice which just returns the given value.
Create instance with replacer set to None.
- Parameters:
option (str)
shortcut (str)
result (Any)
- class bot.LinkChoice(option, shortcut, replacer, replace_section, replace_label)[source]#
Bases:
Choice
A choice returning a mix of the new link and the current link.
- Parameters:
option (str)
shortcut (str)
replacer (InteractiveReplace | None)
replace_section (bool)
replace_label (bool)
- class bot.AlwaysChoice(replacer, option='always', shortcut='a')[source]#
Bases:
Choice
Add an option to always apply the default.
- Parameters:
replacer (InteractiveReplace | None)
option (str)
shortcut (str)
- property answer: Any#
Get the actual default answer instructing the replacement.
- exception bot.QuitKeyboardInterrupt[source]#
Bases:
ChoiceException, KeyboardInterrupt
The user has cancelled processing at a prompt.
Constructor using the ‘quit’ (‘q’) in input_choice.
- Return type:
None
- class bot.InteractiveReplace(old_link, new_link, default=None, automatic_quit=True)[source]#
Bases:
object
A callback class for textlib’s replace_links.
It shows various options which can be switched on and off:
* allow_skip_link = True (skip the current link)
* allow_unlink = True (unlink)
* allow_replace = False (just replace target, keep section and label)
* allow_replace_section = False (replace target and section, keep label)
* allow_replace_label = False (replace target and label, keep section)
* allow_replace_all = False (replace target, section and label)
(The boolean values are the default values)
It also has a context attribute which must be a non-negative integer. If it is greater than 0, it shows that many characters before and after the link in question. The context_delta attribute can be defined too and adds an option to increase context by the given amount each time the option is selected.
Additional choices can be defined using ‘additional_choices’ and will be appended to the choices defined by this class. This list is mutable, and so are the Choice instances returned and created by this class.
- Parameters:
old_link (Link | Page) – The old link which is searched. The label and section are ignored.
new_link (Link | Page | Literal[False]) – The new link with which it should be replaced. Depending on the replacement mode it’ll use this link’s label and section. If False it’ll unlink all and the attributes beginning with allow_replace are ignored.
default (str | None) – The default answer as the shortcut
automatic_quit (bool) – Add an option to quit and raise a QuitKeyboardInterrupt.
- handle_answer(choice)[source]#
Return the result for replace_links.
- Parameters:
choice (str)
- Return type:
Any
- property choices: tuple[StandardOption, ...]#
Return the tuple of choices.
- property current_text: str#
Get the current text when it’s handling one currently.
- property current_groups: Mapping[str, str]#
Get the current groups when it’s handling one currently.
- property current_range: tuple[int, int]#
Get the current range when it’s handling one currently.
- bot.calledModuleName()[source]#
Return the name of the module calling this function.
This is required because the -help option loads the module’s docstring and because the module name will be used for the filename of the log.
- Return type:
str
- bot.handle_args(args=None, do_help=True)[source]#
Handle global command line arguments and return the rest as a list.
Takes the command line arguments as strings, processes all global parameters such as -lang or -log, and initialises the logging layer, which emits startup information into the log at level ‘verbose’. This function makes sure that global arguments are applied first, regardless of the order in which the arguments were given.
args may be passed as an argument, thereby overriding sys.argv.
    >>> local_args = pywikibot.handle_args()  # sys.argv is used
    >>> local_args
    []
    >>> local_args = pywikibot.handle_args(['-simulate', '-myoption'])
    >>> local_args  # global options are handled, show the remaining
    ['-myoption']
    >>> for arg in local_args: pass  # do whatever is wanted with local_args
Caution
Global options might be introduced without warning period. It is up to developers to verify that global options do not interfere with local script options of private scripts.
Tip
Avoid using this method in your private scripts and use the pwb wrapper instead. In directory mode:
    python pwb.py <global options> <name_of_script> <local options>
With installed site package:
    pwb <global options> <name_of_script> <local options>
Note
the pwb wrapper can be used even if the handle_args method is used within the script.
Changed in version 5.2: -site global option was added
Changed in version 7.1: -cosmetic_changes and -cc may be set directly instead of toggling the value. Refer to tools.strtobool() for valid values.
Changed in version 7.7: -config global option was added.
Changed in version 8.0: Short site value can be given if site code is equal to family like -site:meta.
Changed in version 8.1: -nolog option also discards command.log.
- Parameters:
args (Iterable[str] | None) – Command line arguments. If None, pywikibot.argvu is used which is a copy of sys.argv
do_help (bool) – Handle parameter ‘-help’ to show help and invoke sys.exit
- Returns:
list of arguments not recognised globally
- Return type:
list[str]
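The idea of consuming global arguments first and returning the rest can be sketched without Pywikibot. The split_args helper and the GLOBAL_NAMES tuple are hypothetical; the tuple lists only three options mentioned on this page and is not the full set that handle_args recognises.

```python
# Assumed subset of global option names, for illustration only.
GLOBAL_NAMES = ('-simulate', '-lang', '-log')


def split_args(args):
    """Separate global options from script-local ones, keeping order."""
    global_args, local_args = [], []
    for arg in args:
        # '-lang:de' and '-lang' both belong to the option '-lang'.
        name = arg.split(':', 1)[0]
        if name in GLOBAL_NAMES:
            global_args.append(arg)
        else:
            local_args.append(arg)
    return global_args, local_args


global_args, local_args = split_args(['-myoption', '-lang:de', '-simulate'])
print(global_args, local_args)
```

Because the two groups are separated before anything is applied, the position of a global option on the command line does not matter.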
- bot.show_help(module_name=None, show_global=False)[source]#
Show help for the Bot.
Changed in version 4.0: Renamed from showHelp() to show_help().
Changed in version 8.0: Do not show version changes.
- Parameters:
module_name (str | None)
show_global (bool)
- Return type:
None
- bot.suggest_help(missing_parameters=None, missing_generator=False, unknown_parameters=None, exception=None, missing_action=False, additional_text='', missing_dependencies=None)[source]#
Output error message to use -help with additional text before it.
- Parameters:
missing_parameters (Sequence[str] | None) – A list of parameters which are missing.
missing_generator (bool) – Whether a generator is missing.
unknown_parameters (Sequence[str] | None) – A list of parameters which are unknown.
exception (Exception | None) – An exception thrown.
missing_action (bool) – Add an entry that no action was defined.
additional_text (str) – Additional text added to the end.
missing_dependencies (Sequence[str] | None) – A list of dependencies which cannot be imported.
- Returns:
True if an error message was printed, False otherwise
- Return type:
bool
- bot.writeToCommandLogFile()[source]#
Save name of the called module along with all params to logfile.
This can be used by user later to track errors or report bugs.
- Return type:
None
- bot.open_webbrowser(page)[source]#
Open the web browser displaying the page and wait for input.
- Parameters:
page (BasePage)
- Return type:
None
- class bot.OptionHandler(**kwargs)[source]#
Bases:
object
Class to get and set options.
How to use options of OptionHandler and its BaseBot subclasses: First define an available_options class attribute for your own option handler to define all available options:
    >>> default_options = {'foo': 'bar', 'bar': 42, 'baz': False}
    >>> class MyHandler(OptionHandler):
    ...     available_options = default_options
Or you may update the predefined setting in the class initializer. BaseBot predefines an ‘always’ option and sets it to False:
    self.available_options.update(always=True, another_option='Yes')
Now you can instantiate an OptionHandler or BaseBot class passing options other than default values:
    >>> bot = MyHandler(baz=True)
You can access bot options either as keyword item or attribute:
    >>> bot.opt.foo
    'bar'
    >>> bot.opt['bar']
    42
    >>> bot.opt.baz  # default was overridden
    True
You can set the options in the same way:
    >>> bot.opt.bar = 4711
    >>> bot.opt['baz'] = None
You can also use the options as a dict:
    >>> 'Option opt.{foo} is {bar}'.format_map(bot.opt)
    'Option opt.bar is 4711'
Warning
You must not access bot options as an attribute if the keyword is a dict method.
Only accept options defined in available_options.
- Parameters:
kwargs (Any) – bot options
- available_options: dict[str, Any] = {}#
Handler configuration attribute. Only the keys of the dict can be passed as __init__ options. The values are the default values. Overwrite this in subclasses!
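Why attribute access fails for option names that collide with dict methods can be seen with a small stand-in. The Opts class below is hypothetical and is not the type behind bot.opt; it is a dict subclass with attribute access, which is enough to reproduce the behaviour the warning describes.

```python
class Opts(dict):
    """Dict with attribute-style access to its keys (illustrative only)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


# 'items' is an assumed option name that clashes with dict.items().
opts = Opts(foo='bar', bar=42, items=3)

opts.bar = 4711                      # attribute-style assignment
by_attribute = opts.foo              # same as opts['foo']
clash = opts.items                   # the dict method, not the option
by_key = opts['items']               # key access still reaches the option
text = 'Option opt.{foo} is {bar}'.format_map(opts)
print(by_attribute, callable(clash), by_key, text)
```

Key access is therefore the safe form whenever an option name might also be a dict method.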
- class bot.BaseBot(**kwargs)[source]#
Bases:
OptionHandler
Generic Bot to be subclassed.
Only accepts generator and options defined in available_options.
This class provides a run() method for basic processing of a generator one page at a time.
If the subclass places a page generator in generator, Bot will process each page in the generator, invoking the method treat() which must then be implemented by subclasses.
Each item processed by treat() must be a page.BasePage type. Use init_page() to upcast the type. To enable other types, set BaseBot.treat_page_type to an appropriate type; your bot should derive from BaseBot in that case and handle site properties.
If the subclass does not set a generator, or does not override treat() or run(), NotImplementedError is raised.
For bot options handling refer to the OptionHandler class above.
Changed in version 7.0: A counter instance variable is provided.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
- use_disambigs: bool | None = None#
Attribute to determine whether to use disambiguation pages. Set it to True to use disambigs only, set it to False to skip disambigs. If None both are processed.
Added in version 7.2.
- use_redirects: bool | None = None#
Attribute to determine whether to use redirect pages. Set it to True to use redirects only, set it to False to skip redirects. If None both are processed. For example to create a RedirectBot you may define:
class MyRedirectBot(ExistingPageBot):
    '''Bot who only works on existing redirects.'''
    use_redirects = True
Added in version 7.2.
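The tri-state behaviour of use_disambigs and use_redirects (True, False or None) can be sketched as a plain function. The helper name `skip_by_flag` and the tuple-based page list are assumptions made for this illustration; the real check happens inside the bot and works on page objects.

```python
from typing import Optional


def skip_by_flag(is_special: bool, use_flag: Optional[bool]) -> bool:
    """Return True if a page should be skipped under a tri-state flag.

    use_flag=True  -> keep only pages where is_special is True
    use_flag=False -> keep only pages where is_special is False
    use_flag=None  -> keep everything
    """
    if use_flag is None:
        return False
    return is_special != use_flag


# (title, is_redirect) pairs standing in for real page objects
pages = [('A', False), ('B', True), ('C', False)]

for flag in (True, False, None):
    kept = [title for title, is_redirect in pages
            if not skip_by_flag(is_redirect, flag)]
    print(flag, kept)
```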
- available_options: dict[str, Any] = {'always': False}#
Handler configuration attribute. Only the keys of the dict can be passed as __init__ options. The values are the default values. Overwrite this in subclasses!
- update_options: dict[str, Any] = {}#
update_options can be used to update available_options; do not use it if the bot class is to be derived from, but use self.available_options.update(<dict>) in the initializer in such a case.
Added in version 6.4.
- counter: Counter#
Instance variable which holds counters. The default counters are 'read', 'write' and 'skip'. All of them are printed within exit(). You can use your own counters like:
self.counter['delete'] += 1
Added in version 7.0.
Changed in version 7.3: Your additional counters are also printed during exit()
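Since counter is a Counter, custom keys need no prior setup. The following sketch uses a module-level `counter` as a stand-in for `self.counter`; the sample data and the output format are assumptions for illustration and do not reproduce what exit() prints.

```python
from collections import Counter

counter = Counter()  # stand-in for self.counter

# (title, skipped) pairs standing in for pages from a generator
for title, skipped in [('A', False), ('B', True), ('C', False)]:
    if skipped:
        counter['skip'] += 1
        continue
    counter['read'] += 1
    counter['write'] += 1

counter['delete'] += 1  # a custom counter; missing keys start at 0

for name, count in counter.items():
    print(f'{count} {name} operation(s)')
```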
- generator_completed: bool#
Instance attribute which is True if the generator is completed.
It gives False if the generator processing in run() is either interrupted by KeyboardInterrupt or exited by QuitKeyboardInterrupt, while closing the generator, i.e. self.generator.close(), keeps the value True.
To check for an empty generator you may use:
if self.generator_completed and not self.counter['read']:
    print('generator was empty')
Note
An empty generator returns True.
Added in version 3.0.
Changed in version 7.4: renamed to generator_completed to become a public attribute.
- treat_page_type: Any#
Instance variable to hold the default page type used by run().
Added in version 6.1.
- user_confirm(question)[source]#
Obtain user response if bot option ‘always’ not enabled.
- Parameters:
question (str)
- Return type:
bool
- userPut(page, oldtext, newtext, **kwargs)[source]#
Save a new revision of a page, with user confirmation as required.
Print differences, ask user for confirmation, and puts the page if needed.
Option used:
‘always’
- Keyword Arguments:
asynchronous – passed to page.save
summary – passed to page.save
show_diff – show changes between oldtext and newtext (enabled)
ignore_save_related_errors – report and ignore (disabled)
ignore_server_errors – report and ignore (disabled)
- Returns:
whether the page was saved successfully
- Parameters:
page (BasePage)
oldtext (str)
newtext (str)
kwargs (Any)
- Return type:
bool
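The flow described for userPut() (print differences, ask for confirmation unless 'always' is set, then save) can be sketched without a wiki. `put_with_confirmation` and its `save` and `confirm` callbacks are hypothetical names for this illustration; skipping the save when both texts are equal is also an assumption of this sketch.

```python
import difflib
from typing import Callable


def put_with_confirmation(
    oldtext: str,
    newtext: str,
    save: Callable[[str], None],
    confirm: Callable[[str], bool],
    always: bool = False,
    show_diff: bool = True,
) -> bool:
    """Hypothetical helper: show a diff, confirm, then save."""
    if oldtext == newtext:
        print('No changes were needed.')
        return False
    if show_diff:
        diff = difflib.unified_diff(
            oldtext.splitlines(), newtext.splitlines(), lineterm='')
        print('\n'.join(diff))
    # With always=True the question is never asked.
    if not always and not confirm('Do you want to accept these changes?'):
        return False
    save(newtext)
    return True


saved = []
result = put_with_confirmation(
    'Hello wrold', 'Hello world',
    save=saved.append,
    confirm=lambda question: True,  # stands in for an interactive prompt
)
print(result, saved)
```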
- _save_page(page, func, *args, **kwargs)[source]#
Helper function to handle page save-related option error handling.
Note
Do not use it directly. Use userPut() instead.
- Parameters:
page (BasePage) – currently edited page
func (Callable[[...], Any]) – the function to call
args (Any) – passed to the function
kwargs (Any) – passed to the function
- Keyword Arguments:
ignore_server_errors (bool) – if True, server errors will be reported and ignored (default: False)
ignore_save_related_errors (bool) – if True, errors related to page save will be reported and ignored (default: False)
- Returns:
whether the page was saved successfully
- Return type:
bool
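The two ignore flags can be read as "report the error and return False instead of raising". A minimal sketch of that pattern follows; `SaveRelatedError`, `ServerError` and `guarded_save` are hypothetical names introduced here and are not the exception classes or helper the library uses.

```python
from typing import Any, Callable


class SaveRelatedError(Exception):
    """Hypothetical error for problems such as edit conflicts."""


class ServerError(Exception):
    """Hypothetical error for server-side failures."""


def guarded_save(
    func: Callable[..., Any],
    *args: Any,
    ignore_save_related_errors: bool = False,
    ignore_server_errors: bool = False,
    **kwargs: Any,
) -> bool:
    """Call func and report whether it finished without an error."""
    try:
        func(*args, **kwargs)
    except SaveRelatedError as error:
        if not ignore_save_related_errors:
            raise
        print(f'Skipping save-related error: {error}')
        return False
    except ServerError as error:
        if not ignore_server_errors:
            raise
        print(f'Skipping server error: {error}')
        return False
    return True


def failing_save(text: str) -> None:
    """Pretend save function that always hits a server problem."""
    raise ServerError('503 Service Unavailable')


ok = guarded_save(failing_save, 'new text', ignore_server_errors=True)
print(ok)
```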
- exit()[source]#
Cleanup and exit processing.
Invoked when run() is finished. Waits for pending threads, prints counter statistics and informs whether the script terminated gracefully or was halted by exception.
Note
Do not override it in subclasses; teardown() should be used instead.
Changed in version 7.3: Statistics are printed for all entries in counter
Changed in version 9.0: Print execution time with days, hours, minutes and seconds.
- Return type:
None
- init_page(item)[source]#
Initialize a generator item before treating.
Ensure that the result of init_page is always a pywikibot.Page object or any other type given by the treat_page_type even when the generator returns something else.
Also used to arrange the current site. This is called before skip_page() and treat().
- skip_page(page)[source]#
Return whether treat should be skipped for the page.
Added in version 3.0.
Changed in version 7.2: use use_redirects to handle redirects, use use_disambigs to handle disambigs
- Parameters:
page (BasePage) – Page object to be processed
- Return type:
bool
- treat(page)[source]#
Process one page (abstract method).
- Parameters:
page (Any) – Object to be processed, usually a page.BasePage. For other page types the treat_page_type must be set.
- Return type:
None
- setup()[source]#
Some initial setup before run() operation starts.
This can be used for reading large amounts of data from the live wiki or for file operations that go beyond initializing the instance. Invoked by run() before running through the generator loop.
Added in version 3.0.
- Return type:
None
- teardown()[source]#
Some cleanups after run() operation. Invoked by exit().
Added in version 3.0.
- Return type:
None
- run()[source]#
Process all pages in generator.
Call setup(), check for a valid Iterable type in generator, upcast it to a Generator type if necessary, then process every item of the generator as follows:
For each item call init_page(), check whether the result is a treat_page_type type, and call skip_page() to determine whether to skip the current page. Otherwise call treat() for the item.
This method also adjusts the read and skip counter, and finally it calls exit() when leaving the method. In short this method is implemented similar to this:
def run(self) -> None:
    '''Process all pages in generator.'''
    self.setup()
    if not hasattr(self, 'generator'):
        raise NotImplementedError('"generator" not set.')
    if self.generator is None:
        print('No generator was defined')
        return  # leave gracefully, nothing to iterate over
    try:
        for item in self.generator:
            page = self.init_page(item)
            if self.skip_page(page):
                continue
            self.treat(page)
    except (QuitKeyboardInterrupt, KeyboardInterrupt):
        print('User canceled bot run.')
    finally:
        self.exit()
Changed in version 3.0: skip counter was added; call setup() first.
Changed in version 6.0: upcast generator to a Generator type to enable generator.close() method.
Changed in version 6.1: Objects from generator may be different from pywikibot.Page but the type must be registered in treat_page_type.
Changed in version 9.2: leave method gracefully if generator is None using suggest_help() function.
- Raises:
AssertionError – “page” is not a pywikibot.page.BasePage object
KeyboardInterrupt – KeyboardInterrupt occurred while config.verbose_output was set
NotImplementedError – generator is not set
TypeError – invalid generator type or page is not a treat_page_type
- Return type:
None
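The call order of run() (setup, then init_page, skip_page and treat per item, then exit with teardown) can be traced with a stand-in class that records each step. `MiniBot` is a hypothetical class that works on plain strings rather than pages; where exactly the counters are incremented and when generator_completed is set are assumptions of this sketch.

```python
from collections import Counter
from typing import Any, Iterable


class MiniBot:
    """Simplified stand-in that mirrors the call order of run()."""

    def __init__(self, generator: Iterable[Any]) -> None:
        self.generator = generator
        self.counter = Counter()
        self.generator_completed = False
        self.log: list[str] = []

    def setup(self) -> None:
        self.log.append('setup')

    def init_page(self, item: Any) -> str:
        # Normalize whatever the generator yields to one type.
        return str(item).strip()

    def skip_page(self, page: str) -> bool:
        return not page  # skip empty titles

    def treat(self, page: str) -> None:
        self.log.append(f'treat {page}')

    def teardown(self) -> None:
        self.log.append('teardown')

    def exit(self) -> None:
        self.teardown()
        self.log.append('exit')

    def run(self) -> None:
        self.setup()
        try:
            for item in self.generator:
                page = self.init_page(item)
                if self.skip_page(page):
                    self.counter['skip'] += 1
                    continue
                self.counter['read'] += 1
                self.treat(page)
            self.generator_completed = True
        except KeyboardInterrupt:
            self.log.append('interrupted')
        finally:
            self.exit()


bot = MiniBot(['Foo ', '', 'Bar'])
bot.run()
print('\n'.join(bot.log))
```

With an empty generator the loop body never runs, so generator_completed is True while the 'read' counter stays at 0, which matches the empty-generator check shown for generator_completed.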
- class bot.Bot(site=None, **kwargs)[source]#
Bases:
BaseBot
Generic bot subclass for multiple sites.
If possible the MultipleSitesBot or SingleSiteBot classes should be used instead which specifically handle multiple or single sites.
Create a Bot instance and initialize cached sites.
- Parameters:
site (BaseSite | None)
kwargs (Any)
- class bot.ConfigParserBot(**kwargs)[source]#
Bases:
BaseBot
A bot class that can read options from scripts.ini file.
All options must be predefined in available_options dictionary. The type of these options is responsible for the correct interpretation of the options type given by the .ini file. They can be interpreted as bool, int, float or str (default). The settings file may be like:
[add_text]
# edit summary for the bot.
summary = Bot: Aggiungo template Categorizzare

[commonscat]
; commonscat options
always: true
The option values are interpreted in this order:
available_options default settings
scripts.ini options settings
command line arguments
Added in version 3.0.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- INI = 'scripts.ini'#
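How the type of a default value can steer the interpretation of an .ini entry, and how the three sources stack, can be sketched with the standard configparser module. `read_ini_options`, `INI_TEXT` and the `command_line` dict are hypothetical names for this illustration; the sketch reads the text from a string rather than a scripts.ini file and is not the bot's actual loading code.

```python
import configparser
from typing import Any

INI_TEXT = """
[add_text]
# edit summary for the bot.
summary = Bot: Aggiungo template Categorizzare

[commonscat]
; commonscat options
always: true
"""


def read_ini_options(text: str, section: str,
                     defaults: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: read typed options for one script section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    options = dict(defaults)
    if not parser.has_section(section):
        return options
    for name, default in defaults.items():
        if not parser.has_option(section, name):
            continue
        # bool must be tested before int because bool subclasses int.
        if isinstance(default, bool):
            options[name] = parser.getboolean(section, name)
        elif isinstance(default, int):
            options[name] = parser.getint(section, name)
        elif isinstance(default, float):
            options[name] = parser.getfloat(section, name)
        else:
            options[name] = parser.get(section, name)
    return options


defaults = {'always': False, 'summary': ''}
ini_options = read_ini_options(INI_TEXT, 'commonscat', defaults)
command_line = {'summary': 'Bot: test'}  # pretend command line arguments
final = {**ini_options, **command_line}  # later sources win
print(final)
```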
- class bot.SingleSiteBot(site=True, **kwargs)[source]#
Bases:
BaseBot
A bot only working on one site and ignoring the others.
If no site is given from the start it’ll use the first page’s site. Any page after the site has been defined and is not on the defined site will be ignored.
Create a SingleSiteBot instance.
- Parameters:
site (BaseSite | bool | None) – If True it’ll be set to the configured site using pywikibot.Site.
kwargs (Any)
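The rule "use the first page's site if none is given, then ignore pages from other sites" can be sketched with sites modeled as plain strings. `MiniSingleSiteFilter` is a hypothetical class for this illustration; the site=True default, which selects the configured site, is not modeled here.

```python
from typing import Optional


class MiniSingleSiteFilter:
    """Stand-in: fix the site on first use, then skip foreign pages."""

    def __init__(self, site: Optional[str] = None) -> None:
        self.site = site

    def skip_page(self, page_site: str) -> bool:
        if self.site is None:
            # No site given from the start: adopt the first page's site.
            self.site = page_site
        return page_site != self.site


site_filter = MiniSingleSiteFilter()
pages = [('wikipedia:en', 'A'), ('wikipedia:de', 'B'), ('wikipedia:en', 'C')]
treated = [title for site, title in pages
           if not site_filter.skip_page(site)]
print(treated)
```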
- class bot.MultipleSitesBot(**kwargs)[source]#
Bases:
BaseBot
A bot class working on multiple sites.
The bot should accommodate for that case and not store site specific information on only one site.
Changed in version 6.2: Site attribute has been dropped.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- class bot.CurrentPageBot(**kwargs)[source]#
Bases:
BaseBot
A bot which automatically sets ‘current_page’ on each treat().
This class should be always used together with either the MultipleSitesBot or SingleSiteBot class as there is no site management in this class.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- ignore_server_errors = False#
- treat(page)[source]#
Set page to current page and treat that page.
- Parameters:
page (BasePage)
- Return type:
None
- put_current(new_text, ignore_save_related_errors=None, ignore_server_errors=None, **kwargs)[source]#
Call Bot.userPut but use the current page.
It compares the new_text to the current page text.
- Parameters:
new_text (str) – The new text
ignore_save_related_errors (bool | None) – Ignore save related errors and automatically print a message. If None uses this instance's default.
ignore_server_errors (bool | None) – Ignore server errors and automatically print a message. If None uses this instance's default.
kwargs (Any) – Additional parameters directly given to Bot.userPut.
- Returns:
whether the page was saved successfully
- Return type:
bool
- class bot.AutomaticTWSummaryBot(**kwargs)[source]#
Bases:
CurrentPageBot
A class which automatically defines summary for put_current.
The class must define a summary_key string which contains the i18n key for i18n.twtranslate. It can also override the summary_parameters property to specify any parameters for the translated message.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- summary_key: str | None = None#
Must be defined in subclasses.
- property summary_parameters: dict[str, str]#
A dictionary of all parameters for i18n.
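The interplay of summary_key and summary_parameters can be sketched with a local message table in place of the translation system. `MESSAGES`, `MiniSummaryBot`, `build_summary` and the key 'example-fixing' are hypothetical names for this illustration; the real class obtains the message through i18n.twtranslate.

```python
from typing import Optional

# Hypothetical message table; real messages come from the i18n system.
MESSAGES = {'example-fixing': 'Bot: fixing %(count)s links on %(title)s'}


class MiniSummaryBot:
    """Stand-in that builds an edit summary from a key and parameters."""

    summary_key: Optional[str] = None

    @property
    def summary_parameters(self) -> dict[str, str]:
        return {}

    def build_summary(self) -> str:
        if self.summary_key is None:
            raise ValueError('summary_key must be defined in subclasses')
        return MESSAGES[self.summary_key] % self.summary_parameters


class LinkFixer(MiniSummaryBot):
    summary_key = 'example-fixing'

    @property
    def summary_parameters(self) -> dict[str, str]:
        return {'count': '3', 'title': 'Sandbox'}


print(LinkFixer().build_summary())
```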
- class bot.ExistingPageBot(**kwargs)[source]#
Bases:
CurrentPageBot
A CurrentPageBot class which only treats existing pages.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- class bot.FollowRedirectPageBot(**kwargs)[source]#
Bases:
CurrentPageBot
A CurrentPageBot class which follows the redirect.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- class bot.CreatingPageBot(**kwargs)[source]#
Bases:
CurrentPageBot
A CurrentPageBot class which only treats nonexistent pages.
- Parameters:
kwargs (Any) – bot options
- Keyword Arguments:
generator – a generator processed by run() method
- class bot.RedirectPageBot(*args, **kwargs)[source]#
Bases:
CurrentPageBot
A RedirectPageBot class which only treats redirects.
Deprecated since version 7.2: use BaseBot attribute use_redirects = True instead
Deprecate RedirectPageBot.
- class bot.NoRedirectPageBot(*args, **kwargs)[source]#
Bases:
CurrentPageBot
A NoRedirectPageBot class which only treats non-redirects.
Deprecated since version 7.2: use BaseBot attribute use_redirects = False instead
Deprecate NoRedirectPageBot.
- class bot.WikidataBot(**kwargs)[source]#
Bases:
Bot, ExistingPageBot
Generic Wikidata Bot to be subclassed.
Source claims (P143) can be created for specific sites
- Variables:
use_from_page – If True (default) it will apply ItemPage.fromPage for every item. If False it assumes that the pages are actually already ItemPage (page in treat_page_and_item will be None). If None it’ll use ItemPage.fromPage when the page is not in the site’s item namespace.
treat_missing_item – Whether pages without items should be treated. Note that this is checked after create_missing_item.
create_missing_item – If True, new items will be created if the current page doesn’t have one. Subclasses should override this in the initializer with a bool value or using self.opt attribute.
- Parameters:
kwargs (Any)
Initializer of the WikidataBot.
- use_from_page = True#
- treat_missing_item = False#
- cacheSources()[source]#
Fetch the sources from the list on Wikidata.
It is stored internally and reused by getSource()
- Return type:
None
- get_property_by_name(property_name)[source]#
Find given property and return its ID.
Method first uses site.search() and if the property isn’t found, then asks user to provide the property ID.
- Parameters:
property_name (str) – property to find
- Return type:
str
- user_edit_entity(entity, data=None, ignore_save_related_errors=None, ignore_server_errors=None, **kwargs)[source]#
Edit entity with data provided, with user confirmation as required.
- Parameters:
entity (WikibasePage) – page to be edited
data (dict[str, str] | None) – data to be saved, or None if the diff should be created automatically
ignore_save_related_errors (bool | None) – Ignore save related errors and automatically print a message. If None uses this instance's default.
ignore_server_errors (bool | None) – Ignore server errors and automatically print a message. If None uses this instance's default.
kwargs (Any)
- Keyword Arguments:
summary – revision comment, passed to ItemPage.editEntity
show_diff – show changes between oldtext and newtext (default: True)
- Returns:
whether the item was saved successfully
- Return type:
bool
- user_add_claim(item, claim, source=None, bot=True, **kwargs)[source]#
Add a claim to an item, with user confirmation as required.
- Parameters:
item (pywikibot.page.ItemPage) – page to be edited
claim (pywikibot.page.Claim) – claim to be saved
source (BaseSite | None) – site where the claim comes from
bot (bool) – whether to flag as bot (if possible)
kwargs (Any)
- Keyword Arguments:
ignore_server_errors – if True, server errors will be reported and ignored (default: False)
ignore_save_related_errors – if True, errors related to page save will be reported and ignored (default: False)
- Returns:
whether the item was saved successfully
- Return type:
bool
Note
calling this method sets the current_page property to the item which changes the site property
Note
calling this method with the ‘source’ argument modifies the provided claim object in place
- getSource(site)[source]#
Create a Claim usable as a source for Wikibase statements.
- Parameters:
site (BaseSite) – site that is the source of assertions.
- Returns:
pywikibot.Claim or None
- Return type:
pywikibot.page.Claim | None
- user_add_claim_unless_exists(item, claim, exists_arg='', source=None, logger_callback=<function deprecated_args.<locals>.decorator.<locals>.wrapper>, **kwargs)[source]#
Decorator of user_add_claim.
Before adding a new claim, it checks if we can add it, using provided filters.
See also
documentation of claimit.py
- Parameters:
exists_arg (Container) – pattern for merging existing claims with new ones
logger_callback (Callable[[str], Any]) – function logging the output of the method
item (pywikibot.page.ItemPage)
claim (pywikibot.page.Claim)
source (BaseSite | None)
kwargs (Any)
- Returns:
whether the claim could be added
- Return type:
bool
Note
calling this method may change the current_page property to the item which will also change the site property
Note
calling this method with the ‘source’ argument modifies the provided claim object in place
- create_item_for_page(page, data=None, summary=None, **kwargs)[source]#
Create an ItemPage with the provided page as the sitelink.
- Parameters:
page (BasePage) – the page for which the item will be created
data (dict[str, Any] | None) – additional data to be included in the new item (optional). Note that data created from the page have higher priority.
summary (str | None) – optional edit summary to replace the default one
kwargs (Any)
- Returns:
pywikibot.ItemPage or None
- Return type:
ItemPage | None