scripts package

This directory holds bot scripts for the new framework.

Submodules

scripts.add_text script

A bot to add text at the end of a page's content.

By default it adds the text above categories and interwiki links.

Alternatively, it can add the text at the top of the page. The following command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-talkpage         Put the text onto the talk page instead of the generated
-talk             one

-text             Define which text to add. "\n" is interpreted as a newline.

-textfile         Define a textfile name which contains the text to add

-summary          Define the summary to use

-except           Use a regex to check if the text is already in the page

-excepturl        Use the html page as text to check whether the text is
                  already present, instead of the wiki page.

-newimages        Add text to new images

-always           If used, the bot won't ask if it should add the text
                  specified

-up               If used, put the text at the top of the page

-noreorder        Do not reorder categories and interwiki links

Example

1. Add a template to the top of pages in category:catname. Warning: put the command on one line, otherwise it won't work correctly:

python pwb.py add_text -cat:catname -summary:"Bot: Adding a template" \
    -text:"{{Something}}" -except:"\{\{([Tt]emplate:\|)[Ss]omething" -up

2. Command used on it.wikipedia to put the template on pages without any category. Warning: put the command on one line, otherwise it won't work correctly:

python pwb.py add_text -except:"\{\{([Tt]emplate:\|)[Cc]ategorizzare" \
    -text:"{{Categorizzare}}" -excepturl:"class='catlinks'>" -uncat \
    -summary:"Bot: Aggiungo template Categorizzare"
scripts.add_text.add_text(page, addText, summary=None, regexSkip=None, regexSkipUrl=None, always=False, up=False, putText=True, oldTextGiven=None, reorderEnabled=True, create=False)[source]

Add text to a page.

Return type:tuple of (text, newtext, always)
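
A minimal sketch of calling this function from another script, based on the signature above (the site and page are hypothetical):

import pywikibot
from scripts.add_text import add_text

site = pywikibot.Site('en', 'wikipedia')  # hypothetical target wiki
page = pywikibot.Page(site, 'Sandbox')    # hypothetical page

# Returns a (text, newtext, always) tuple and saves the page unless
# putText=False is passed.
text, newtext, always = add_text(
    page, '{{Something}}',
    summary='Bot: Adding a template',
    regexSkip=r'\{\{[Ss]omething',  # skip pages that already have it
    up=True)                        # put the text at the top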
scripts.add_text.get_text(page, old, create)[source]

Get the old text.

scripts.add_text.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.add_text.put_text(page, new, summary, count, asynchronous=False)[source]

Save the new text.

scripts.archivebot script

archivebot.py - discussion page archiving bot.

usage:

python pwb.py archivebot [OPTIONS] TEMPLATE_PAGE

The bot examines backlinks (Special:WhatLinksHere) to TEMPLATE_PAGE, then goes through all pages (unless a specific page is specified using options) and archives old discussions. This is done by breaking a page into threads, then scanning each thread for timestamps. Threads older than a specified threshold are then moved to another page (the archive), which can be named either based on the thread's name, or the name can contain a counter which is incremented when the archive reaches a certain size.

The transcluded template may contain the following parameters:

{{TEMPLATE_PAGE
|archive =
|algo =
|counter =
|maxarchivesize =
|minthreadsleft =
|minthreadstoarchive =
|archiveheader =
|key =
}}

Meanings of parameters are:

archive              Name of the page to which archived threads will be put.
                     Must be a subpage of the current page. Variables are
                     supported.
algo                 Specifies the maximum age of a thread. Must be
                     in the form old(<delay>) where <delay> specifies
                     the age in seconds (s), hours (h), days (d),
                     weeks (w), or years (y) like 24h or 5d. Default is
                     old(24h).
counter              The current value of a counter which could be assigned as
                     variable. Will be updated by bot. Initial value is 1.
maxarchivesize       The maximum archive size before incrementing the counter.
                     Value can be given with appending letter like K or M
                     which indicates KByte or MByte. Default value is 200K.
minthreadsleft       Minimum number of threads that should be left on a page.
                     Default value is 5.
minthreadstoarchive  The minimum number of threads to archive at once. Default
                     value is 2.
archiveheader        Content that will be put on new archive pages as the
                     header. This parameter supports the use of variables.
                     Default value is {{talkarchive}}
key                  A secret key that (if valid) allows archives not to be
                     subpages of the page being archived.

Variables below can be used in the value for “archive” in the template above:

%(counter)d          the current value of the counter
%(year)d             year of the thread being archived
%(isoyear)d          ISO year of the thread being archived
%(isoweek)d          ISO week number of the thread being archived
%(semester)d         semester term of the year of the thread being archived
%(quarter)d          quarter of the year of the thread being archived
%(month)d            month (as a number 1-12) of the thread being archived
%(monthname)s        localized name of the month above
%(monthnameshort)s   first three letters of the name above
%(week)d             week number of the thread being archived
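
For instance, a discussion page might transclude the configuration like this (an illustrative setup combining the parameters and variables above):

{{TEMPLATE_PAGE
|archive = Talk:Example/Archive %(counter)d
|algo = old(30d)
|counter = 1
|maxarchivesize = 150K
|minthreadsleft = 4
|archiveheader = {{talkarchive}}
}}

With this configuration, threads on Talk:Example older than 30 days are moved to Talk:Example/Archive 1, and the counter is incremented whenever the current archive grows beyond 150 KByte.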

The ISO calendar starts with the Monday of the week which has at least four days in the new Gregorian year. If January 1st falls between Monday and Thursday (inclusive), the first week of that year started on the Monday of that week, which lies in the previous year if January 1st is not a Monday. If it falls between Friday and Sunday (inclusive), the following week is the first week of the year. So up to three days may still be counted as belonging to the previous year.

Options (may be omitted):

-help           show this help message and exit
-calc:PAGE      calculate key for PAGE and exit
-file:FILE      load list of pages from FILE
-force          override security options
-locale:LOCALE  switch to locale LOCALE
-namespace:NS   only archive pages from a given namespace
-page:PAGE      archive a single PAGE, default ns is a user talk page
-salt:SALT      specify salt
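
For example, to archive a single talk page against a given configuration template (the page and template names are hypothetical):

python pwb.py archivebot -page:"User talk:Example" Template:Archiver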
exception scripts.archivebot.AlgorithmError(arg)[source]

Bases: scripts.archivebot.MalformedConfigError

Invalid specification of archiving algorithm.

__module__ = 'scripts.archivebot'
exception scripts.archivebot.ArchiveBotSiteConfigError(arg)[source]

Bases: pywikibot.exceptions.Error

An error originating from archivebot's on-site configuration.

__module__ = 'scripts.archivebot'
exception scripts.archivebot.ArchiveSecurityError(arg)[source]

Bases: scripts.archivebot.ArchiveBotSiteConfigError

Page title is not a valid archive of page being archived.

The page title is neither a subpage of the page being archived, nor does it match the key specified in the archive configuration template.

__module__ = 'scripts.archivebot'
class scripts.archivebot.DiscussionPage(source, archiver, params=None)[source]

Bases: pywikibot.page.Page

A class that represents a single page of discussion threads.

Feed threads to it and run an update() afterwards.

__init__(source, archiver, params=None)[source]

Initializer.

__module__ = 'scripts.archivebot'
feed_thread(thread, max_archive_size=(256000, 'B'))[source]

Check whether archive size exceeded.

load_page()[source]

Load the page to be archived and break it up into threads.

size()[source]

Return size of talk page threads.

update(summary, sort_threads=False)[source]

Recombine threads and save page.

class scripts.archivebot.DiscussionThread(title, now, timestripper)[source]

Bases: object

An object representing a discussion thread on a page.

It represents something that is of the form:

== Title of thread ==

Thread content here. ~~~~
:Reply, etc. ~~~~

__init__(title, now, timestripper)[source]

Initializer.

__module__ = 'scripts.archivebot'
__repr__()[source]

Return a string representation.

feed_line(line)[source]

Add a line to the content and find the newest timestamp.

should_be_archived(archiver)[source]

Check whether thread has to be archived.

Returns:the archiving reason as a tuple of localization args
Return type:tuple
size()[source]

Return size of discussion thread.

to_text()[source]

Return wikitext discussion thread.

exception scripts.archivebot.MalformedConfigError(arg)[source]

Bases: scripts.archivebot.ArchiveBotSiteConfigError

There is an error in the configuration template.

__module__ = 'scripts.archivebot'
exception scripts.archivebot.MissingConfigError(arg)[source]

Bases: scripts.archivebot.ArchiveBotSiteConfigError

The config is missing in the header.

It’s in one of the threads or transcluded from another page.

__module__ = 'scripts.archivebot'
class scripts.archivebot.PageArchiver(page, template, salt, force=False)[source]

Bases: object

A class that encapsulates all archiving methods.

__init__(page, template, salt, force=False)[source]

Initializer.

Parameters:
  • page (pywikibot.Page) – a page object to be archived
  • template (pywikibot.Page) – a template with configuration settings
  • salt (str) – salt value
  • force (bool) – override security value

__module__ = 'scripts.archivebot'
algo = 'none'
analyze_page()[source]

Analyze DiscussionPage.

attr2text()[source]

Return a template with archiver saveable attributes.

feed_archive(archive, thread, max_archive_size, params=None)[source]

Feed the thread to one of the archives.

If it doesn’t exist yet, create it. Also check for security violations.

get_attr(attr, default='')[source]

Get an archiver attribute.

key_ok()[source]

Return whether key is valid.

load_config()[source]

Load and validate archiver template.

run()[source]

Process a single DiscussionPage object.

saveables()[source]

Return a list of saveable attributes.

set_attr(attr, value, out=True)[source]

Set an archiver attribute.

class scripts.archivebot.TZoneUTC[source]

Bases: datetime.tzinfo

Class building a UTC tzinfo object.

__module__ = 'scripts.archivebot'
__repr__()[source]

Return a string representation.

dst(dt)[source]

Subclass implementation, return timedelta(0).

tzname(dt)[source]

Subclass implementation.

utcoffset(dt)[source]

Subclass implementation, return timedelta(0).

scripts.archivebot.calc_md5_hexdigest(txt, salt)[source]

Return md5 hexdigest computed from text and salt.

scripts.archivebot.checkstr(string)[source]

Return the key and duration extracted from the string.

Parameters:string (str) – a string defining a time period: 300s - 300 seconds; 36h - 36 hours; 7d - 7 days; 2w - 2 weeks (14 days); 1y - 1 year
Returns:key and duration extracted from the string
Return type:(str, str)
scripts.archivebot.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.archivebot.str2localized_duration(site, string)[source]

Localise a shorthand duration.

Translates a duration written in the shorthand notation (ex. “24h”, “7d”) into an expression in the local wiki language (“24 hours”, “7 days”).

scripts.archivebot.str2size(string)[source]

Return a size for a shorthand size.

Accepts a string defining a size: 1337 - 1337 bytes; 150K - 150 kilobytes; 2M - 2 megabytes. Returns a tuple (size, unit), where size is an integer and unit is 'B' (bytes) or 'T' (threads).

scripts.archivebot.str2time(string, timestamp=None)[source]

Return a timedelta for a shorthand duration.

Parameters:
  • string (str) – a string defining a time period: 300s - 300 seconds; 36h - 36 hours; 7d - 7 days; 2w - 2 weeks (14 days); 1y - 1 year
  • timestamp (datetime.datetime) – a timestamp to calculate a more accurate duration offset used by years
Returns:

the corresponding timedelta object

Return type:

datetime.timedelta
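
A short sketch of these helpers (return values shown in comments follow the descriptions above; the exact byte multiplier for the K suffix is an assumption):

from scripts.archivebot import str2size, str2time

delta = str2time('7d')   # datetime.timedelta(days=7)
size = str2size('150K')  # a (bytes, 'B') tuple, e.g. (153600, 'B') if K = 1024
limit = str2size('10T')  # (10, 'T'): a limit expressed in threads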

scripts.archivebot.template_title_regex(tpl_page)[source]

Return a regex that matches to variations of the template title.

It supports the transcluding variant as well as localized namespaces and case-insensitivity depending on the namespace.

Parameters:tpl_page (Page) – The template page

scripts.basic script

An incomplete sample script.

This is not a complete bot; rather, it is a template from which simple bots can be made. You can rename it to mybot.py, then edit it in whatever way you want.

Use global -simulate option for test purposes. No changes to live wiki will be done.

The following parameters are supported:

-always           The bot won't ask for confirmation when putting a page

-text:            Use this text to be added; otherwise 'Test' is used

-replace:         Don't add text but replace it

-top              Place additional text on top of the page

-summary:         Set the action summary message for the edit.

The following generators and filters are supported:

This script supports use of pywikibot.pagegenerators arguments.
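
For example, a test run that prepends text to a single page without saving (a hypothetical invocation using the global -simulate option mentioned above):

python pwb.py basic -page:Sandbox -text:"Hello" -top \
    -summary:"Bot: test edit" -simulate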

class scripts.basic.BasicBot(generator, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot, pywikibot.bot.AutomaticTWSummaryBot

An incomplete sample bot.

Variables:summary_key – Edit summary message key. The message that should be used is placed in the /i18n subdirectory. The file containing these messages should have the same name as the caller script (i.e. basic.py in this case). Use summary_key to set a default edit summary message.
__init__(generator, **kwargs)[source]

Initializer.

Parameters:generator (generator) – the page generator that determines on which pages to work
__module__ = 'scripts.basic'
summary_key = 'basic-changing'
treat_page()[source]

Load the given page, do some changes, and save it.
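
Since basic.py is meant to be copied and edited, a treat_page() body might look like this (a sketch only, assuming the 'text' and 'top' option keys set up in __init__ and the getOption/put_current helpers from pywikibot.bot):

def treat_page(self):
    """Append or prepend the configured text, then save the page."""
    text = self.current_page.text
    extra = self.getOption('text') or 'Test'  # from the -text: parameter
    if self.getOption('top'):                 # from the -top parameter
        text = extra + '\n' + text
    else:
        text += '\n' + extra
    # put_current saves the new text with an edit summary taken from
    # -summary: or the summary_key i18n message.
    self.put_current(text, summary=self.getOption('summary'))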

scripts.basic.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.blockpageschecker script

A bot to remove stale protection templates from pages that are not protected.

Very often sysops protect pages for a set time but then forget to remove the warning afterwards. This script is useful if you want to remove those stale warnings left on these pages.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-protectedpages  Check all the protected pages; useful when you don't have
                 categories or when you have problems with them. (Add the
                 namespace after ":" where you want to check - default checks
                 all protected pages.)

-moveprotected   Same as -protectedpages, for moveprotected pages

-always          Doesn't ask every time whether the bot should make the change.
                 Do it always.

-show            When the bot can't delete the template from the page (wrong
                 regex or something like that) it will ask you if it should
                 show the page in your browser.
                 (attention: pages included may give false positives!)

-move            The bot will also check whether the page is protected against
                 moving, not only against editing

Examples

python pwb.py blockpageschecker -always

python pwb.py blockpageschecker -cat:Geography -always

python pwb.py blockpageschecker -show -protectedpages:4

scripts.blockpageschecker.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.blockpageschecker.showQuest(page)[source]

Ask for an editor and invoke it.

scripts.blockpageschecker.understandBlock(text, TTP, TSP, TSMP, TTMP, TU)[source]

Understand if the page is blocked and if it has the right template.

scripts.capitalize_redirects script

Bot to create capitalized redirects.

It creates redirects where the first character of the first word is uppercase and the remaining characters and words are lowercase.

Command-line arguments:

-always           Don't prompt to make changes, just do them.

-titlecase        creates a titlecased redirect version of a given page
                  where all words of the title start with an uppercase
                  character and the remaining characters are lowercase.

This script supports use of pywikibot.pagegenerators arguments.

Example

python pwb.py capitalize_redirects -start:B -always

class scripts.capitalize_redirects.CapitalizeBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot, pywikibot.bot.FollowRedirectPageBot, pywikibot.bot.ExistingPageBot

Capitalization Bot.

__init__(generator, **kwargs)[source]

Initializer.

Parameters:
  • generator – The page generator that determines on which pages to work.
  • titlecase – create a titlecased redirect page instead of a capitalized one.
__module__ = 'scripts.capitalize_redirects'
treat_page()[source]

Capitalize redirects of the current page.

scripts.capitalize_redirects.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.casechecker script

Bot to find all pages on the wiki with mixed Latin and Cyrillic alphabets.

class scripts.casechecker.CaseChecker[source]

Bases: object

Case checker.

AddNoSuggestionTitle(title)[source]

Add backlinks to log.

AppendLineToLog(filename, text)[source]

Write text to logfile.

ColorCodeWord(word, toScreen=False)[source]

Colorize code word.

FindBadWords(title)[source]

Retrieve bad words.

Create a colored link string.

MakeMoveSummary(fromTitle, toTitle)[source]

Move summary from i18n.

OpenLogFile(filename)[source]

Open logfile.

Page(title)[source]

Create Page object from title.

PickTarget(title, original, candidates)[source]

Pick target from candidates.

ProcessDataBlock(data)[source]

Process data block given by RunQuery().

ProcessTitle(title)[source]

Process title.

PutNewPage(pageObj, pageTxt, msg)[source]

Save new page.

Replace links.

Run()[source]

Run the bot.

RunQuery(params)[source]

API query.

WikiLog(text)[source]

Write log.

__init__()[source]

Initializer with arg parsing.

__module__ = 'scripts.casechecker'
alwaysInLatin = ['II', 'III']
alwaysInLocal = ['СССР', 'Как', 'как']
apfrom = ''
aplimit = None
autonomous = False
colorFormatLatinColor = '{red}'
colorFormatLocalColor = '{green}'
colorFormatSuffix = '{default}'
doFailed = False
failedTitles = 'failedTitles.txt'
filterredir = 'nonredirects'
latClrFnt = '<font color=brown>'
latinKeyboard = 'qwertyuiopasdfghjklzxcvbnm'
latinSuspects = 'ABEKMHOPCTXIËÏaeopcyxiëï'
lclClrFnt = '<font color=green>'
localKeyboard = 'йцукенгшщзфывапролдячсмить'
localLowerLtr = 'ёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ'
localLtr = 'ЁІЇЎАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯҐёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ'
localSuspects = 'АВЕКМНОРСТХІЁЇаеорсухіёї'
localUpperLtr = 'ЁІЇЎАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯҐ'
namespaces = []
nosuggestions = 'nosuggestions.txt'
replace = False
romanNumChars = 'IVXLCDM'
romanNumSfxPtrn = re.compile('[IVXLCDM]+[ёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ]+$')
romannumSuffixes = 'ёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ'
stopAfter = -1
stripChars = ' \t,'
suffixClr = '</font>'
title = None
titleList = None
titles = True
whitelists = {'ru': 'ВП:КЛ/Проверенные'}
wikilog = None
wikilogfile = 'wikilog.txt'
wordBreaker = re.compile('[ _\\-/\\|#[\\]():]')

scripts.catall script

This script shows the categories on each page and lets you change them.

For each page in the target wiki

  • If the page contains no categories, you can specify a list of categories to add to the page.
  • If the page already contains one or more categories, you can specify a new list of categories to replace the current list of categories of the page.

Usage:

python pwb.py catall [start]

If no starting name is provided, the bot starts at ‘A’.

Options:

-onlynew : Only run on pages that do not yet have a category.
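
For example, to start from pages beginning at 'M' and only treat pages that have no categories yet (a hypothetical invocation):

python pwb.py catall M -onlynew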
scripts.catall.choosecats(pagetext)[source]

Choose categories.

scripts.catall.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.catall.make_categories(page, list, site=None)[source]

Make categories.

scripts.category script

Script to manage categories.

Syntax:

python pwb.py category action [-option]

where action can be one of these

  • add - mass-add a category to a list of pages.
  • remove - remove category tag from all pages in a category.
  • move - move all pages in a category to another category.
  • tidy - tidy up a category by moving its pages into subcategories.
  • tree - show a tree of subcategories of a given category.
  • listify - make a list of all of the articles that are in a category.

and option can be one of these

Options for “add” action:

-person      - Sort persons by their last name.
-create      - If a page doesn't exist, do not skip it, create it instead.
-redirect    - Follow redirects.

If action is “add”, the following options are supported:

This script supports use of pywikibot.pagegenerators arguments.

Options for “listify” action:

-append      - This appends the list to the current page that is already
               existing (appending to the bottom by default).
-overwrite   - This overwrites the current page with the list even if
               something is already there.
-showimages  - This displays images rather than linking them in the list.
-talkpages   - This outputs the links to talk pages of the pages to be
               listified in addition to the pages themselves.
-prefix:#    - You may specify a list prefix like "#" for a numbered list or
               any other prefix. Default is a bullet list with prefix "*".

Options for “remove” action:

-nodelsum    - This specifies not to use the custom edit summary as the
               deletion reason. Instead, it uses the default deletion reason
               for the language, which is "Category was disbanded" in
               English.

Options for “move” action:

-hist        - Creates a nice wikitable on the talk page of target category
               that contains detailed page history of the source category.
-nodelete    - Don't delete the old category after move.
-nowb        - Don't update the wikibase repository.
-allowsplit  - If that option is not set, it only moves the talk and main
               page together.
-mvtogether  - Only move the pages/subcategories of a category, if the
               target page (and talk page, if -allowsplit is not set)
               doesn't exist.
-keepsortkey - Use sortKey of the old category also for the new category.
               If not specified, sortKey is removed.
               An alternative method to keep sortKey is to use -inplace
               option.

Options for “tidy” action:

-namespaces    Filter the articles in the specified namespaces. Separate
-namespace     multiple namespace numbers or names with commas. Examples:
-ns            -ns:0,2,4
               -ns:Help,MediaWiki

Options for several actions:

-rebuild     - Reset the database.
-from:       - The category to move from (for the move option)
               Also, the category to remove from in the remove option
               Also, the category to make a list of in the listify option.
-to:         - The category to move to (for the move option).
             - Also, the name of the list to make in the listify option.
      NOTE: If the category names have spaces in them you may need to use
      a special syntax in your shell so that the names aren't treated as
      separate parameters. For instance, in BASH, use single quotes,
      e.g. -from:'Polar bears'.
-batch       - Don't prompt to delete emptied categories (do it
               automatically).
-summary:    - Pick a custom edit summary for the bot.
-inplace     - Use this flag to change categories in place rather than
               rearranging them.
-recurse     - Recurse through all subcategories of categories.
-pagesonly   - While removing pages from a category, keep the subpage links
               and do not remove them.
-match       - Only work on pages whose titles match the given regex (for
               move and remove actions).
-depth:      - The max depth limit beyond which no subcategories will be
               listed.

For the actions tidy and tree, the bot will store the category structure locally in category.dump. This saves time and server load, but if it uses these data later, they may be outdated; use the -rebuild parameter in this case.

For example, to create a new category from a list of persons, type:

python pwb.py category add -person

and follow the on-screen instructions.

Or to do it all from the command-line, use the following syntax:

python pwb.py category move -from:US -to:"United States"

This will move all pages in the category US to the category United States.

class scripts.category.CategoryAddBot(generator, newcat=None, sort_by_last_name=False, create=False, comment='', follow_redirects=False, dry=NotImplemented, editSummary='[deprecated name of comment]')[source]

Bases: pywikibot.bot.MultipleSitesBot, scripts.category.CategoryPreprocess

A robot to mass-add a category to a list of pages.

__init__(generator, newcat=None, sort_by_last_name=False, create=False, comment='', follow_redirects=False, dry=NotImplemented, editSummary='[deprecated name of comment]')[source]

Initializer.

__module__ = 'scripts.category'
sorted_by_last_name(catlink, pagelink)[source]

Return a Category with key that sorts persons by their last name.

Parameters:
  • catlink – The Category to be linked.
  • pagelink – The Page to be placed in the category.

Trailing words in brackets will be removed. Example: If the category is 'Author' and pagelink is a Page to [[Alexandre Dumas (senior)]], this function will return this Category: [[Category:Author|Dumas, Alexandre]].

treat(page)[source]

Process one page.

class scripts.category.CategoryDatabase(rebuild=False, filename='category.dump.bz2')[source]

Bases: object

Temporary database saving pages and subcategories for each category.

This prevents loading the category pages over and over again.

__init__(rebuild=False, filename='category.dump.bz2')[source]

Initializer.

__module__ = 'scripts.category'
dump(filename=None)[source]

Save the dictionaries to disk if not empty.

Pickle the contents of the dictionaries superclassDB and catContentDB if at least one is not empty. If both are empty, the file is removed from disk.

If the filename is None, it’ll use the filename determined in __init__.

getArticles(cat)[source]

Return the list of pages for a given category.

Saves this list in a temporary database so that it won’t be loaded from the server next time it’s required.

getSubcats(supercat)[source]

Return the list of subcategories for a given supercategory.

Saves this list in a temporary database so that it won’t be loaded from the server next time it’s required.

getSupercats(subcat)[source]

Return the supercategory (or a set of them) for a given subcategory.

is_loaded

Return whether the contents have been loaded.

rebuild()[source]

Rebuild the dabatase.

class scripts.category.CategoryListifyRobot(catTitle, listTitle, editSummary, append=False, overwrite=False, showImages=False, subCats=False, talkPages=False, recurse=False, prefix='*')[source]

Bases: object

Create a list containing all of the members in a category.

__init__(catTitle, listTitle, editSummary, append=False, overwrite=False, showImages=False, subCats=False, talkPages=False, recurse=False, prefix='*')[source]

Initializer.

__module__ = 'scripts.category'
run()[source]

Start bot.

class scripts.category.CategoryMoveRobot(oldcat, newcat=None, batch=False, comment='', inplace=False, move_oldcat=True, delete_oldcat=True, title_regex=None, history=False, pagesonly=False, deletion_comment=0, move_comment=None, wikibase=True, allow_split=False, move_together=False, keep_sortkey=None, newCatTitle='[deprecated name of newcat]', inPlace='[deprecated name of inplace]', batchMode='[deprecated name of batch]', moveCatPage='[deprecated name of move_oldcat]', titleRegex='[deprecated name of title_regex]', withHistory='[deprecated name of history]', deleteEmptySourceCat='[deprecated name of delete_oldcat]', editSummary='[deprecated name of comment]', oldCatTitle='[deprecated name of oldcat]')[source]

Bases: scripts.category.CategoryPreprocess

Change or remove the category from the pages.

If the new category is given, the category is changed from the old one to the new one. Otherwise the category is removed from the page, and the category page is deleted if it's empty.

By default the operation applies to pages and subcategories.

DELETION_COMMENT_AUTOMATIC = 0
DELETION_COMMENT_SAME_AS_EDIT_COMMENT = 1
__init__(oldcat, newcat=None, batch=False, comment='', inplace=False, move_oldcat=True, delete_oldcat=True, title_regex=None, history=False, pagesonly=False, deletion_comment=0, move_comment=None, wikibase=True, allow_split=False, move_together=False, keep_sortkey=None, newCatTitle='[deprecated name of newcat]', inPlace='[deprecated name of inplace]', batchMode='[deprecated name of batch]', moveCatPage='[deprecated name of move_oldcat]', titleRegex='[deprecated name of title_regex]', withHistory='[deprecated name of history]', deleteEmptySourceCat='[deprecated name of delete_oldcat]', editSummary='[deprecated name of comment]', oldCatTitle='[deprecated name of oldcat]')[source]

Store all given parameters in the objects attributes.

Parameters:
  • oldcat – The move source.
  • newcat – The move target.
  • batch – If True, the user does not have to confirm the deletion.
  • comment – The edit summary for all pages where the category is changed, and also for moves and deletions if not overridden.
  • inplace – If True the categories are not reordered.
  • move_oldcat – If True the category page (and talkpage) is copied to the new category.
  • delete_oldcat – If True the oldcat page and talkpage are deleted (or nominated for deletion) if it is empty.
  • title_regex – Only pages (and subcats) with a title that matches the regex are moved.
  • history – If True the history of the oldcat is posted on the talkpage of newcat.
  • pagesonly – If True only move pages, not subcategories.
  • deletion_comment – Either string or special value: DELETION_COMMENT_AUTOMATIC: use a generated message, DELETION_COMMENT_SAME_AS_EDIT_COMMENT: use the same message for delete that is used for the edit summary of the pages whose category was changed (see the comment param above). If the value is not recognized, it’s interpreted as DELETION_COMMENT_AUTOMATIC.
  • move_comment – If set, uses this as the edit summary on the actual move of the category page. Otherwise, defaults to the value of the comment parameter.
  • wikibase – If True, update the Wikibase item of the old category.
  • allow_split – If False only moves page and talk page together.
  • move_together – If True moves the pages/subcategories only if page and talk page could be moved or both source page and target page don’t exist.
__module__ = 'scripts.category'
static check_move(name, old_page, new_page)[source]

Return whether the old page can be safely moved to the new page.

run()[source]

The main bot function that does all the work.

For readability it is split into several helper functions:
  • _movecat()
  • _movetalk()
  • _hist()
  • _change()
  • _delete()

class scripts.category.CategoryPreprocess(follow_redirects=False, edit_redirects=False, create=False, **kwargs)[source]

Bases: pywikibot.bot.BaseBot

A class to prepare a list of pages for robots.

__init__(follow_redirects=False, edit_redirects=False, create=False, **kwargs)[source]

Initializer.

__module__ = 'scripts.category'
determine_template_target(page)[source]

Return template page to be categorized.

Categories for templates can be included in <includeonly> section of template doc page.

Also the doc page can be changed by doc template parameter.

TODO: decide if/how to enable/disable this feature.

Parameters:page (pywikibot.Page) – Page to be processed.
Returns:Page to be categorized.
Return type:pywikibot.Page
determine_type_target(page)[source]

Return page to be categorized by type.

Parameters:page (pywikibot.Page) – Existing, missing or redirect page to be processed.
Returns:Page to be categorized.
Return type:pywikibot.Page or None
class scripts.category.CategoryRemoveRobot(catTitle, batchMode=False, editSummary='', useSummaryForDeletion=0, titleRegex=None, inPlace=False, pagesonly=False)[source]

Bases: scripts.category.CategoryMoveRobot

Removes the category tag for a given category.

It always removes the category tag for all pages in that given category.

If the pagesonly parameter is False it also removes the category from all subcategories, without prompting. If the category is empty, it will be tagged for deletion. Does not remove category tags pointing at subcategories.

Deprecated:Using CategoryRemoveRobot is deprecated, use CategoryMoveRobot without newcat param instead.
__init__(catTitle, batchMode=False, editSummary='', useSummaryForDeletion=0, titleRegex=None, inPlace=False, pagesonly=False)[source]

Deprecated; use CategoryMoveRobot without newcat parameter instead.

Initializer.

__module__ = 'scripts.category'
class scripts.category.CategoryTidyRobot(cat_title, cat_db, namespaces=None, comment=None)[source]

Bases: pywikibot.bot.Bot, scripts.category.CategoryPreprocess

Robot to move members of a category into sub- or super-categories.

Specify the category title on the command line. The robot will pick up the page, look for all sub- and super-categories, and show them as numbered options to move the page into. It will ask you to type the number of the appropriate replacement, and perform the change automatically. It will then loop over all pages in the category.

If you don’t want to move the member to a sub- or super-category, but to another category, you can use the ‘j’ (jump) command.

By typing ‘s’ you can leave the complete page unchanged.

By typing ‘m’ you can show more content of the current page, helping you to find out what the page is about and in which other categories it currently is.

Parameters:
  • cat_title (str) – a title of the category to process.
  • cat_db (CategoryDatabase object) – a CategoryDatabase object.
  • namespaces (iterable of pywikibot.Namespace) – namespaces to focus on.
  • comment (str) – a custom summary for edits.

__init__(cat_title, cat_db, namespaces=None, comment=None)[source]

Initializer.

__module__ = 'scripts.category'
move_to_category(member, original_cat, current_cat, article='[deprecated name of member]')[source]

Ask whether to move it to one of the sub- or super-categories.

Given a page in the original_cat category, ask the user whether to move it to one of original_cat’s sub- or super-categories. Recursively run through subcategories’ subcategories. NOTE: current_cat is only used for internal recursion. You should always use current_cat = original_cat.

Parameters:
  • member (pywikibot.Page) – a page to process.
  • original_cat (pywikibot.Category) – original category to replace.
  • current_cat (pywikibot.Category) – a category which is questioned.

teardown()[source]

Clean up after the run operation.

treat(page)[source]

Process page.

class scripts.category.CategoryTreeRobot(catTitle, catDB, filename=None, maxDepth=10)[source]

Bases: object

Robot to create tree overviews of the category structure.

Parameters:
  • catTitle – The category which will be the tree's root.
  • catDB – A CategoryDatabase object.
  • maxDepth – The limit beyond which no subcategories will be listed. This also guarantees that loops in the category structure won't be a problem.
  • filename – The textfile where the tree should be saved; None to print the tree to stdout.
__init__(catTitle, catDB, filename=None, maxDepth=10)[source]

Initializer.

__module__ = 'scripts.category'
run()[source]

Handle the multi-line string generated by treeview.

After the string is generated by treeview, it is either printed to the console or saved to a file.

treeview(cat, currentDepth=0, parent=None)[source]

Return a tree view of all subcategories of cat.

The multi-line string contains a tree view of all subcategories of cat, up to level maxDepth. Recursively calls itself.

Parameters:
  • cat – the Category of the node we're currently opening.
  • currentDepth – the current level in the tree.
  • parent – the Category of the category we're coming from.
scripts.category.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments.

scripts.category_redirect script

This bot will move pages out of redirected categories.

The bot will look for categories that are marked with a category redirect template, take the first parameter of the template as the target of the redirect, and move all pages and subcategories of the category there. It also changes hard redirects into soft redirects, and fixes double redirects. A log is written under <userpage>/category_redirect_log. Only category pages that haven’t been edited for a certain cooldown period (currently 7 days) are taken into account.

The following parameters are supported:

-delay:#          Set a number of days. If the category was edited more
                  recently than the given number of days, ignore it.
                  Default is 7.

-tiny             Only loops over Category:Non-empty_category_redirects and
                  moves all images, pages and categories in redirect categories
                  to the target category.

Usage:

python pwb.py category_redirect [options]
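
For example, to ignore categories edited within the last 14 days (a hypothetical invocation):

python pwb.py category_redirect -delay:14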
class scripts.category_redirect.CategoryRedirectBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot

Page category update bot.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.category_redirect'
check_hard_redirect()[source]

Check for hard-redirected categories.

Check categories that are not already marked with an appropriate softredirect template.

get_cat()[source]

Specify the category page.

get_log_text()[source]

Rotate log text and return the most recent text.

move_contents(oldCatTitle, newCatTitle, editSummary)[source]

The worker function that moves pages out of oldCat into newCat.

readyToEdit(cat)[source]

Return True if cat not edited during cooldown period, else False.

run()[source]

Run the bot.

scripts.category_redirect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.cfd script

This script processes the Categories for discussion working page.

It parses out the actions that need to be taken as a result of CFD discussions (as posted to the working page by an administrator) and performs them.

Syntax:

python pwb.py cfd
class scripts.cfd.ReCheck[source]

Bases: object

Helper class.

__init__()[source]

Initializer.

__module__ = 'scripts.cfd'
check(pattern, text)[source]

Search pattern.

scripts.cfd.findDay(pageTitle, oldDay)[source]

Find day link from CFD template.

This function grabs the wiki source of a category page and attempts to extract a link to the CFD per-day discussion page from the CFD template. If the CFD template is not there, it will return the value of the second parameter, which is essentially a fallback that is extracted from the per-day subheadings on the working page.

scripts.cfd.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.checkimages script

Script to check recently uploaded files.

This script checks if a file description is present and if there are other problems in the image’s description.

This script will have to be configured for each language. Please submit translations as addition to the Pywikibot framework.

Everything that needs customisation is indicated by comments.

This script understands the following command-line arguments:

-limit              The number of images to check (default: 80)

-commons            The bot will check if an image with the same name exists
                    on Commons, and if so it reports the image.

-duplicates[:#]     Check if the image has duplicates (if an argument is
                    given, set how many rollbacks to wait before reporting
                    the image in the report instead of tagging it).
                    Default: 1 rollback.

-duplicatesreport   Report the duplicates in a log *AND* put the template in
                    the images.

-maxusernotify      Maximum notifications added to a user talk page in a single
                    check, to avoid email spamming.

-sendemail          Send an email after tagging.

-break              Stop the bot after the first check (default: recursive)

-sleep[:#]          Time in seconds between repeat runs (default: 30)

-wait[:#]           Wait x seconds before checking the images (default: 0)

-skip[:#]           Skip the first [:#] images (default: 0)

-start[:#]          Use allimages() as generator
                    (it starts already from File:[:#])

-cat[:#]            Use a category as generator

-regex[:#]          Use regex, must be used with -url or -page

-page[:#]           Define the name of the wiki page containing the images

-url[:#]            Define the URL containing the images

-nologerror         If given, this option will disable the error that is raised
                    when the log is full.
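
For example, to check the 50 most recently uploaded files once and then stop (a hypothetical invocation):

python pwb.py checkimages -limit:50 -break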

Instructions for the real-time settings. For every new block you have to add:

<------- ------->

This is how the bot knows where a block starts, so it can read the right parameters (see the illustrative block after the field list below).

  • Name= Set the name of the block
  • Find= search this text in the image’s description
  • Findonly= search for exactly this text in the image’s description
  • Summary= That’s the summary that the bot will use when it will notify the
    problem.
  • Head= That’s the incipit that the bot will use for the message.
  • Text= This is the template that the bot will use when it will report the
    image’s problem.
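
An illustrative settings block might look like this (all field values are hypothetical; the real settings live on a per-wiki configuration page):

<------- ------->
Name= Missing license
Find= no license
Summary= Bot: Notifying about a problem with the file
Head= File problem
Text= Please add a license template to the file description page.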
exception scripts.checkimages.LogIsFull(arg)[source]

Bases: pywikibot.exceptions.Error

The log is full and the bot cannot add more data; raised to prevent errors.

__module__ = 'scripts.checkimages'
class scripts.checkimages.checkImagesBot(site, logFulNumber=25000, sendemailActive=False, duplicatesReport=False, logFullError=True, max_user_notify=None)[source]

Bases: object

A robot to check recently uploaded files.

__init__(site, logFulNumber=25000, sendemailActive=False, duplicatesReport=False, logFullError=True, max_user_notify=None)[source]

Initializer, define some instance variables.

__module__ = 'scripts.checkimages'
checkImageDuplicated(duplicates_rollback)[source]

Function to check the duplicated files.

checkImageOnCommons()[source]

Check if the file is on Commons.

checkStep()[source]

Check a single file page.

findAdditionalProblems()[source]

Extract additional settings from configuration page.

important_image(listGiven)[source]

Get tuples of image and time, return the most used or oldest image.

Parameters:listGiven (list) – a list of tuples which hold seconds and FilePage
Returns:the most used or oldest image
Return type:FilePage
isTagged()[source]

Understand if a file is already tagged or not.

load(raw)[source]

Load a list of objects from a string using regex.

loadHiddenTemplates()[source]

Function to load the whitelisted templates.

load_licenses()[source]

Load the list of the licenses.

miniTemplateCheck(template)[source]

Check if template is in allowed licenses or in licenses to skip.

put_mex_in_talk()[source]

Function to put the warning on the talk page of the uploader.

regexGenerator(regexp, textrun)[source]

Find page to yield using regex to parse text.

report(newtext, image_to_report, notification=None, head=None, notification2=None, unver=True, commTalk=None, commImage=None)[source]

Function to make the reports easier.

report_image(image_to_report, rep_page=None, com=None, rep_text=None, addings=True)[source]

Report the files to the report page when needed.

setParameters(image)[source]

Set parameters.

skipImages(skip_number, limit)[source]

Given a number of files, skip the first -number- files.

smartDetection()[source]

Detect templates.

Instead of only checking whether there is a simple template in the image's description, the bot also checks whether that template is a license or something else. In this sense this type of check is smart.

tag_image(put=True)[source]

Add template to the Image page and find out the uploader.

takesettings()[source]

Function to take the settings from the wiki.

templateInList()[source]

Check if template is in list.

The problem is the calls to the MediaWiki system, because they can be pretty slow, while searching in a list of objects is really fast. So first of all let's see if we can find something in the info that we already have, then make a deeper check.

uploadBotChangeFunction(reportPageText, upBotArray)[source]

Detect the user that has uploaded the file through upload bot.

static wait(generator, wait_time)[source]

Skip images uploaded less than x seconds ago.

Let the users fix the image's problems on their own during the first x seconds.

scripts.checkimages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.checkimages.printWithTimeZone(message)[source]

Print the messages followed by the timezone, correctly encoded.

scripts.claimit script

A script that adds claims to Wikidata items based on a list of pages.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Usage:

python pwb.py claimit [pagegenerators] P1 Q2 P123 Q456

You can use any typical pagegenerator (like categories) to provide a list of pages. Then list the property->target pairs to add.

For geographic coordinates:

python pwb.py claimit [pagegenerators] P625 [lat-dec],[long-dec],[prec]

[lat-dec] and [long-dec] represent the latitude and longitude respectively, and [prec] represents the precision. All values are in decimal degrees, not DMS. If [prec] is omitted, the default precision is 0.0001 degrees.

Example

python pwb.py claimit [pagegenerators] P625 -23.3991,-52.0910,0.0001

By default, claimit.py does not add a claim if one with the same property already exists on the page. To override this behavior, use the ‘exists’ option:

python pwb.py claimit [pagegenerators] P246 "string example" -exists:p

Suppose the claim you want to add has the same property as an existing claim and the “-exists:p” argument is used. Now, claimit.py will not add the claim if it has the same target, source, and/or the existing claim has qualifiers. To override this behavior, add ‘t’ (target), ‘s’ (sources), or ‘q’ (qualifiers) to the ‘exists’ argument.

For instance, to add the claim to each page even if one with the same property and target and some qualifiers already exists:

python pwb.py claimit [pagegenerators] P246 "string example" -exists:ptq

Note that the ordering of the letters in the ‘exists’ argument does not matter, but ‘p’ must be included.

class scripts.claimit.ClaimRobot(generator, claims, exists_arg='')[source]

Bases: pywikibot.bot.WikidataBot

A bot to add Wikidata claims.

__init__(generator, claims, exists_arg='')[source]

Initializer.

Parameters:
  • generator (iterator) – A generator that yields Page objects.
  • claims (list) – A list of wikidata claims
  • exists_arg (str) – String specifying how to handle duplicate claims
__module__ = 'scripts.claimit'
treat_page_and_item(page, item)[source]

Treat each page.

use_from_page = None
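
A sketch of driving ClaimRobot from Python rather than from the command line (the property, item, and category here are illustrative only):

import pywikibot
from pywikibot import pagegenerators
from scripts.claimit import ClaimRobot

site = pywikibot.Site('en', 'wikipedia')
repo = site.data_repository()

claim = pywikibot.Claim(repo, 'P31')             # hypothetical property
claim.setTarget(pywikibot.ItemPage(repo, 'Q5'))  # hypothetical target

gen = pagegenerators.CategorizedPageGenerator(
    pywikibot.Category(site, 'Category:Example'))  # hypothetical category
# exists_arg='p' mirrors -exists:p: add the claim even if one with the
# same property already exists.
bot = ClaimRobot(gen, [claim], exists_arg='p')
bot.run()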
scripts.claimit.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
Return type:bool

scripts.clean_sandbox script

This bot resets a (user) sandbox with predefined text.

This script understands the following command-line arguments:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-hours:#       Use this parameter to make the script repeat itself
               after # hours. Hours can be defined as a decimal: 0.01
               hours is 36 seconds; 0.1 is 6 minutes.

-delay:#       Use this parameter for a wait time after the last edit
               was made. If no parameter is given it takes it from
               hours and limits it between 5 and 15 minutes.
               The minimum delay time is 5 minutes.

-text          The text used to reset the sandbox; you can use this
               when you haven't configured clean_sandbox for your wiki.

-summary       Summary of the edit made by bot.
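
For example, to reset the sandbox every half hour (a hypothetical invocation; the sandbox page itself is taken from the per-wiki configuration):

python pwb.py clean_sandbox -hours:0.5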
class scripts.clean_sandbox.SandboxBot(**kwargs)[source]

Bases: pywikibot.bot.Bot, pywikibot.bot.ConfigParserBot

Sandbox reset bot.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.clean_sandbox'
availableOptions = {'delay': -1, 'delay_td': None, 'hours': 1.0, 'no_repeat': True, 'summary': '', 'text': ''}
run()[source]

Run bot.

scripts.clean_sandbox.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.commonscat script

With this tool you can add the template {{commonscat}} to categories.

The tool works by following the interwiki links. If the template is present on another language page, the bot will use it.

You could probably use it on articles as well, but this isn't tested.

The following parameters are supported:

-always           Don't prompt for each replacement. The warning message
                  does not have to be confirmed. ATTENTION: Use this with care!

-summary:XYZ      Set the action summary message for the edit to XYZ,
                  otherwise it uses messages from add_text.py as default.

-checkcurrent     Work on all category pages that use the primary commonscat
                  template.

This bot uses pagegenerators to get a list of pages. The following options are supported:

This script supports use of pywikibot.pagegenerators arguments.

For example to go through all categories:

python pwb.py commonscat -start:Category:!
class scripts.commonscat.CommonscatBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot

Commons categorisation bot.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.commonscat'
changeCommonscat(page=None, oldtemplate='', oldcat='', newtemplate='', newcat='', linktitle='', description=NotImplemented)[source]

Change the current commonscat template and target.

Return the name of a valid commons category.

If the page is a redirect, this function tries to follow it. If the page doesn't exist, the function returns an empty string.

Find CommonsCat template on interwiki pages.

In Pywikibot >=2.0, page.interwiki() now returns Link objects, not Page objects

Returns:name of a valid commons category
Return type:str

Find CommonsCat template on page.

Return type:tuple of (<templatename>, <target>, <linktext>, <note>)
skipPage(page)[source]

Determine if the page should be skipped.

skip_page(page)[source]

Skip category redirects or disambigs.

treat_page()[source]

Add CommonsCat template to page.

Take a page and go through all its interwiki pages looking for a commonscat template. When all the interwiki links are checked and a proper category is found, add it to the page.

scripts.commonscat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.coordinate_import script

Coordinate importing script.

Usage:

python pwb.py coordinate_import -lang:en -family:wikipedia \
    -cat:Category:Coordinates_not_on_Wikidata

This will work on all pages in the category “coordinates not on Wikidata” and will import the coordinates on these pages to Wikidata.

The data from the "GeoData" extension (https://www.mediawiki.org/wiki/Extension:GeoData) is used, so that extension has to be set up properly. You can look at the [[Special:Nearby]] page on your local wiki to see if it's populated.

You can use any typical pagegenerator to provide a list of pages:

python pwb.py coordinate_import -lang:it -family:wikipedia \
    -namespace:0 -transcludes:Infobox_stazione_ferroviaria

You can also run over a set of items on the repo without coordinates and try to import them from any connected page. To do this, you have to explicitly provide the repo as the site using the -lang and -family arguments.

Example:

python pwb.py coordinate_import -lang:wikidata -family:wikidata \
    -namespace:0 -querypage:Deadendpages

The following command line parameters are supported:

-create           Create items for pages without one.

This script supports use of pywikibot.pagegenerators arguments.

class scripts.coordinate_import.CoordImportRobot(generator, **kwargs)[source]

Bases: pywikibot.bot.WikidataBot

A bot to import coordinates to Wikidata.

__init__(generator, **kwargs)[source]

Initializer.

Parameters:generator – A generator that yields Page objects.
__module__ = 'scripts.coordinate_import'
has_coord_qualifier(claims)[source]

Check if self.prop is used as property for a qualifier.

Parameters:claims (dict) – the Wikibase claims to check in
Returns:the first property for which self.prop is used as qualifier, or None if there is none
Return type:unicode or None
item_has_coordinates(item)[source]

Check if the item has coordinates.

Returns:whether the item has coordinates
Return type:bool
treat_page_and_item(page, item)[source]

Treat page/item.

try_import_coordinates_from_page(page, item)[source]

Try to import coordinates from the given page to the given item.

Returns:whether any coordinates were found and the import was successful
Return type:bool
use_from_page = None
scripts.coordinate_import.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.cosmetic_changes script

This module can do slight modifications to tidy a wiki page’s source code.

The changes are not supposed to change the look of the rendered wiki page.

The following parameters are supported:

-always           Don't prompt for each replacement. The warning (see below)
                  does not have to be confirmed. ATTENTION: Use this with care!

-async            Put page on queue to be saved to wiki asynchronously.

-summary:XYZ      Set the summary message text for the edit to XYZ, bypassing
                  the predefined message texts with original and replacements
                  inserted.

-ignore:          If an error occurs, either skip the page or only the failing
                  method. It can be set to 'page' or 'method'.

The following generators and filters are supported:

This script supports use of pywikibot.pagegenerators arguments.

ATTENTION: You can run this script as a stand-alone for testing purposes. However, the changes that are made are only minor, and other users might get angry if you fill the version histories and watchlists with such irrelevant changes. Some wikis prohibit stand-alone running.

For further information see pywikibot/cosmetic_changes.py
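
For example, to tidy a single page and save asynchronously (a hypothetical invocation):

python pwb.py cosmetic_changes -page:Sandbox -async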

class scripts.cosmetic_changes.CosmeticChangesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot

Cosmetic changes bot.

__init__(generator, **kwargs)[source]

Initializer.

__module__ = 'scripts.cosmetic_changes'
treat_page()[source]

Treat page with the cosmetic toolkit.

scripts.cosmetic_changes.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.create_categories script

Program to batch create categories.

The program expects a generator of category titles, which are used as suffixes for creating new categories with a given base name.

The following command line parameters are supported:

-always         Don't ask, just do the edit.

-parent         The name of the parent category.

-basename       The base to be used for the new category names.

-overwrite:     Existing category is skipped by default. Use this option to
                overwrite a category.

Example

python pwb.py create_categories -lang:commons -family:commons \
    -links:User:Multichill/Wallonia \
    -parent:"Cultural heritage monuments in Wallonia" \
    -basename:"Cultural heritage monuments in"

The page ‘User:Multichill/Wallonia’ on commons contains category links like [[Category:Hensies]], causing this script to create [[Category:Cultural heritage monuments in Hensies]].

class scripts.create_categories.CreateCategoriesBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.AutomaticTWSummaryBot

Category creator bot.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.create_categories'
init_page(item)[source]

Create a category to be processed with the given page title.

skip_page(page)[source]

Skip page if it is not overwritten.

summary_key = 'create_categories-create'
treat_page()[source]

Create category in local site for that page.

scripts.create_categories.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.data_ingestion script

A generic bot to do data ingestion (batch uploading).

usage:

python pwb.py data_ingestion -csvdir:local_dir/ -page:config_page
scripts.data_ingestion.CSVReader(fileobj, urlcolumn, site=None, *args, **kwargs)[source]

Yield Photo objects for each row of a CSV file.

class scripts.data_ingestion.DataIngestionBot(reader, titlefmt, pagefmt, site='deprecated_default_commons')[source]

Bases: pywikibot.bot.Bot

Data ingestion bot.

__init__(reader, titlefmt, pagefmt, site='deprecated_default_commons')[source]

Initializer.

Parameters:
  • reader (Photo page generator) – Generator of Photos to process.
  • titlefmt (basestring) – Title format
  • pagefmt (basestring) – Page format
  • site (APISite, 'deprecated_default_commons' or None) – Target site for image upload. Use None to determine the site from the pages treated. Defaults to ‘deprecated_default_commons’ to use Wikimedia Commons for backwards compatibility reasons. Deprecated.
__module__ = 'scripts.data_ingestion'
classmethod parseConfigurationPage(configurationPage)[source]

Parse a Page which contains the configuration.

Parameters:configurationPage (pywikibot.Page) – page with configuration
treat(photo)[source]

Process each page.

class scripts.data_ingestion.Photo(URL, metadata, site=None)[source]

Bases: pywikibot.page.FilePage

Represents a Photo (or other file), with metadata, to be uploaded.

__init__(URL, metadata, site=None)[source]

Initializer.

Parameters:
  • URL (str) – URL of photo
  • metadata (dict) – metadata about the photo that can be referred to from the title & template
  • site (APISite) – target site
__module__ = 'scripts.data_ingestion'
downloadPhoto()[source]

Download the photo and store it in a io.BytesIO object.

TODO: Add exception handling

findDuplicateImages(site=NotImplemented)[source]

Find duplicates of the photo.

Calculates the SHA1 hash and asks the MediaWiki API for a list of duplicates.

TODO: Add exception handling, fix site thing

getDescription(template, extraparams={})[source]

Generate a description for a file.

getTitle(fmt)[source]

Populate format string with %(name)s entries using metadata.

Note: this does not clean the title, so it may be unusable as a MediaWiki page title, and cause an API exception when used.

Parameters:fmt (str) – format string
Returns:formatted string
Return type:str
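As an illustration of the substitution getTitle performs (the metadata keys here are hypothetical; in practice they come from the CSV columns and the Photo's own metadata), consider this minimal sketch:

# %(key)s placeholders in the format string are filled from the
# metadata dict, exactly as Python's "%" operator does:
metadata = {'name': 'Sunset over the bay', 'author': 'Alice'}
title = '%(name)s by %(author)s.jpg' % metadata
print(title)  # Sunset over the bay by Alice.jpg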
scripts.data_ingestion.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.delete script

This script can be used to delete and undelete pages en masse.

Of course, you will need an admin account on the relevant wiki.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-always           Don't prompt to delete pages, just do it.

-summary:XYZ      Set the summary message text for the edit to XYZ.

-undelete         Actually undelete pages instead of deleting.
                  Obviously makes sense only with -page and -file.

-isorphan         Alert if there are pages that link to the page to be
                  deleted (check 'What links here').
                  By default it is active and only the summary per namespace
                  is given.
                  If given as -isorphan:n, n pages per namespace will be shown.
                  If given as -isorphan:0, only the summary per namespace will
                  be shown.
                  If given as -isorphan:n, with n < 0, the option is disabled.
                  This option is disregarded if -always is set.

-orphansonly:     Specified namespaces. Separate multiple namespace
                  numbers or names with commas.
                  Examples:

                  -orphansonly:0,2,4
                  -orphansonly:Help,MediaWiki

                  Note that Main ns can be indicated either with a 0 or a ',':

                  -orphansonly:0,1
                  -orphansonly:,Talk

Usage:

python pwb.py delete [-category categoryName]

Examples

Delete everything in the category “To delete” without prompting:

python pwb.py delete -cat:"To delete" -always
class scripts.delete.DeletionRobot(generator, summary, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot, pywikibot.bot.CurrentPageBot

This robot allows deletion of pages en masse.

__init__(generator, summary, **kwargs)[source]

Initializer.

Parameters:
  • generator (iterable) – the pages to work on
  • summary (str) – the reason for the (un)deletion
__module__ = 'scripts.delete'
display_references()[source]

Display pages that link to the current page, sorted per namespace.

The number of pages to display per namespace is provided by self.getOption('isorphan').

skip_page(page)[source]

Skip the page under some conditions.

treat_page()[source]

Process one page from the generator.

class scripts.delete.PageWithRefs(source, title='', ns=0)[source]

Bases: pywikibot.page.Page

A subclass of Page with convenience methods for reference checking.

Supports the same interface as Page, with some added methods.

__init__(source, title='', ns=0)[source]

Initializer.

__module__ = 'scripts.delete'
get_ref_table(*args, **kwargs)[source]

Build a mapping table of pages which link to the current page.

namespaces_with_ref_to_page(namespaces=None)[source]

Check if current page has links from pages in namespaces.

If namespaces is None, all namespaces are checked. Returns a set with namespaces where a ref to page is present.

Parameters:namespaces (iterable of Namespace objects) – namespaces to check
Returns:set of namespaces where a ref to the page is present
ref_table

Build link reference table lazily.

This property gives a default table without any parameter set for getReferences(), whereas self.get_ref_table() is able to accept parameters.

scripts.delete.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.disambredir script

User assisted updating redirect links on disambiguation pages.

Usage:

python pwb.py disambredir [start]

If no starting name is provided, the bot starts at ‘!’.

class scripts.disambredir.DisambiguationRedirectBot(**kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot, pywikibot.bot.AutomaticTWSummaryBot

Change redirects from disambiguation pages.

__module__ = 'scripts.disambredir'
summary_key = 'disambredir-msg'
treat_page()[source]

Iterate over linked pages and replace redirects conditionally.

scripts.disambredir.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.djvutext script

This bot uploads text from djvu files onto pages in the “Page” namespace.

It is intended to be used for Wikisource.

The following parameters are supported:

-index:...     name of the index page (without the Index: prefix)
-djvu:...      path to the djvu file; it shall be:
               - a path to a file, or
               - a dir where a djvu file named as the index is located
               optional, defaults to the current dir '.'
-pages:<start>-<end>,...<start>-<end>,<start>-<end>
               Page range to upload;
               optional, start=1, end=djvu file number of images.
               Page ranges can be specified as:
                 A-B -> pages A until B
                 A-  -> pages A until number of images
                 A   -> just page A
                 -B  -> pages 1 until B
-summary:      custom edit summary.
               Use quotes if edit summary contains spaces.
-force         overwrites existing text
               optional, default False
-always        don't bother asking to confirm any of the changes.
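For example, a plausible invocation (the index and file names are illustrative) could be:

python pwb.py djvutext -index:"Example.djvu" -djvu:Example.djvu -pages:1-5,8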
class scripts.djvutext.DjVuTextBot(djvu, index, pages=None, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot

A bot that uploads text-layer from djvu files to Page:namespace.

Works only on sites with Proofread Page extension installed.

__init__(djvu, index, pages=None, **kwargs)[source]

Initializer.

Parameters:
  • djvu (DjVuFile object) – djvu from where to fetch the text layer
  • index (Page object) – index page in the Index: namespace
  • pages (tuple) – page interval to upload (start, end)
__module__ = 'scripts.djvutext'
gen()[source]

Generate pages from specified page interval.

page_number_gen()[source]

Generate pages numbers from specified page intervals.

treat(page)[source]

Process one page.

scripts.djvutext.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.editarticle script

Edit a Wikipedia article with your favourite editor.

The following parameters are supported:

-r                Edit redirect pages without following them
--edit_redirect   automatically.
--edit-redirect

-p P              Choose which page to edit.
--page P          This argument can be passed positionally.

-w                Add the page to the user's watchlist after editing.
--watch
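For example, either form below edits the same page (the title is illustrative):

python pwb.py editarticle -p Sandbox
python pwb.py editarticle Sandbox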
class scripts.editarticle.ArticleEditor(*args)[source]

Bases: object

Edit a wiki page.

__init__(*args)[source]

Initializer.

__module__ = 'scripts.editarticle'
handle_edit_conflict(new)[source]

When an edit conflict occurs save the new text to a file.

run()[source]

Run the bot.

set_options(*args)[source]

Parse commandline and set options attribute.

setpage()[source]

Set page and page title.

scripts.editarticle.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.fixing_redirects script

Correct all redirect links in featured pages or only one page of each wiki.

Can be used with:

-featured         Run over featured pages (for some wikimedia wikis only)

This script supports use of pywikibot.pagegenerators arguments.
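For example, plausible invocations (the page title is illustrative) could be:

python pwb.py fixing_redirects -page:Example
python pwb.py fixing_redirects -featured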

class scripts.fixing_redirects.FixingRedirectBot(site=True, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot, pywikibot.bot.AutomaticTWSummaryBot

Run over pages and resolve redirect links.

__module__ = 'scripts.fixing_redirects'
ignore_server_errors = True

Replace all source links by target.

summary_key = 'fixing_redirects-fixing'
treat_page()[source]

Change all redirects from the current page to actual links.

scripts.fixing_redirects.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.flickrripper script

A tool to transfer flickr photos to Wikimedia Commons.

The following parameters are supported:

-group_id:         specify group ID of the pool
-photoset_id:      specify a photoset id
-user_id:          give the user id of the Flickr user
-start_id:         the photo id to start with
-end_id:           the photo id to end with
-tags:             a tag to filter photo items (only one is supported)
-flickrreview      add a flickr review template to the description
-reviewer:         specify the reviewer
-override:         override text for licence
-addcategory:      specify a category
-removecategories  remove all categories
-autonomous        run bot in autonomous mode
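For example, a plausible invocation (the user id and reviewer name are illustrative) could be:

python pwb.py flickrripper -user_id:12345678@N00 -flickrreview -reviewer:"Example User"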
scripts.flickrripper.buildDescription(flinfoDescription='', flickrreview=False, reviewer='', override='', addCategory='', removeCategories=False)[source]

Build the final description for the image.

The description is based on the info from flickrinfo and improved.

scripts.flickrripper.cleanUpTitle(title)[source]

Clean up the title of a potential MediaWiki page.

Otherwise the title of the page might not be allowed by the software.

scripts.flickrripper.downloadPhoto(photoUrl)[source]

Download the photo and store it in a io.BytesIO object.

TODO: Add exception handling

scripts.flickrripper.findDuplicateImages(photo, site=None)[source]

Find duplicate images.

Take the photo, calculate the SHA1 hash and ask the MediaWiki API for a list of duplicates.

TODO: Add exception handling.

Parameters:
  • photo (io.BytesIO) – Photo
  • site (APISite or None) – Site to search for duplicates. Defaults to using Wikimedia Commons if not supplied.
scripts.flickrripper.getFilename(photoInfo, site=None, project='Flickr')[source]

Build a good filename for the upload based on the username and title.

Prevents naming collisions.

scripts.flickrripper.getFlinfoDescription(photo_id)[source]

Get the description from http://wikipedia.ramselehof.de/flinfo.php.

TODO: Add exception handling, try a couple of times

scripts.flickrripper.getPhoto(flickr, photo_id)[source]

Get the photo info and the photo sizes so we can use these later on.

TODO: Add exception handling

scripts.flickrripper.getPhotoUrl(photoSizes)[source]

Get the url of the jpg file with the highest resolution.

scripts.flickrripper.getPhotos(flickr, user_id='', group_id='', photoset_id='', start_id='', end_id='', tags='')[source]

Loop over a set of Flickr photos.

Get a set to work on (start with just a username).
  • Make it possible to delimit the set (from/to)
scripts.flickrripper.getTags(photoInfo)[source]

Get all the tags on a photo.

scripts.flickrripper.isAllowedLicense(photoInfo)[source]

Check if the image contains the right license.

TODO: Maybe add more licenses

scripts.flickrripper.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.flickrripper.processPhoto(flickr, photo_id='', flickrreview=False, reviewer='', override='', addCategory='', removeCategories=False, autonomous=False)[source]

Process a single Flickr photo.

For each image:
  • Check the license
  • Check if it isn’t already on Commons
  • Build suggested filename
    • Check for name collision and maybe alter it
  • Pull description from Flinfo
  • Show image and description to user
    • Add a nice hotcat lookalike for the adding of categories
    • Filter the categories
  • Upload the image

scripts.followlive script

Periodically grab list of new articles and analyze to blank or flag them.

Script to follow new articles on a wikipedia and flag them with a template or eventually blank them.

There must be A LOT of bugs! Use with caution and verify what it is doing!

The following parameters are supported:

This script supports use of pywikibot.pagegenerators arguments.

class scripts.followlive.CleaningBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.CurrentPageBot

Bot meant to facilitate customized cleaning of the page.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.followlive'
could_be_bad()[source]

Check whether the page could be bad.

handle_bad_page(*values)[source]

Process one bad page.

init_page(item)[source]

Init the page tuple before processing and return a page object.

Set newpages generator result as instance properties:
  • page (pywikibot.Page) – new page
  • date (str, ISO 8601 format) – creation date
  • length (int) – content length
  • user (pywikibot.User) – creator of page

setup()[source]

Setup bot before running.

show_page_info()[source]

Display information about an article.

treat_page()[source]

Process one page.

scripts.followlive.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.freebasemappingupload script

Script to upload the mappings of Freebase to Wikidata.

It can easily be adapted to upload other string identifiers as well.

This bot needs the dump from https://developers.google.com/freebase/data#freebase-wikidata-mappings

The script takes a single parameter:

-filename: the filename to read the freebase-wikidata mappings from;
           default: fb2w.nt.gz
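For example, with the default dump file in the current directory:

python pwb.py freebasemappingupload -filename:fb2w.nt.gz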
class scripts.freebasemappingupload.FreebaseMapperRobot(filename)[source]

Bases: object

Freebase Mapping bot.

__init__(filename)[source]

Initializer.

__module__ = 'scripts.freebasemappingupload'
processLine(line)[source]

Process a single line.

run()[source]

Run the bot.

scripts.freebasemappingupload.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.harvest_template script

Template harvesting script.

Usage (see below for explanations and examples):

python pwb.py harvest_template -transcludes:"..." \
   [default optional arguments] \
   template_parameter PID [local optional arguments] \
   [template_parameter PID [local optional arguments]]
python pwb.py harvest_template [generators] -template:"..." \
   [default optional arguments] \
   template_parameter PID [local optional arguments] \
   [template_parameter PID [local optional arguments]]

This will work on all pages that transclude the template in the article namespace.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

You can also use additional parameters:

-create             Create missing items before importing.

The following command line parameters can be used to change the bot’s behavior. If you specify them before all parameters, they are global and are applied to all param-property pairs. If you specify them after a param-property pair, they are local and are only applied to this pair. If you specify the same argument as both local and global, the local argument overrides the global one (see also examples):

-islink           Treat plain text values as links ("text" -> "[[text]]").

-exists           If set to 'p', add a new value, even if the item already
                  has the imported property but not the imported value.
                  If set to 'pt', add a new value, even if the item already
                  has the imported property with the imported value and
                  some qualifiers.

-multi            If set, try to match multiple values from parameter.

Examples

This will try to import existing images from “image” parameter of “Infobox person” on English Wikipedia as Wikidata property “P18” (image):

python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
    -template:"Infobox person" image P18

This will behave the same as the previous example and also try to import [[links]] from “birth_place” parameter of the same template as Wikidata property “P19” (place of birth):

python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
    -template:"Infobox person" image P18 birth_place P19

This will import both “birth_place” and “death_place” params with the -islink modifier, i.e. the bot will try to import values even if it doesn’t find a [[link]]:

python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
    -template:"Infobox person" -islink birth_place P19 death_place P20

This will do the same but only “birth_place” can be imported without a link:

python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
    -template:"Infobox person" birth_place P19 -islink death_place P20

This will import an occupation from “occupation” parameter of “Infobox person” on English Wikipedia as Wikidata property “P106” (occupation). The page won’t be skipped if the item already has that property but lacks the new value:

python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
    -template:"Infobox person" occupation P106 -exists:p

This will import band members from the “current_members” parameter of “Infobox musical artist” on English Wikipedia as Wikidata property “P527” (has part). This will only extract multiple band members if each is linked, and will not add duplicate claims for the same member:

python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
    -template:"Infobox musical artist" current_members P527 -exists:p \
    -multi
class scripts.harvest_template.HarvestRobot(generator, template_title, fields, **kwargs)[source]

Bases: pywikibot.bot.WikidataBot

A bot to add Wikidata claims.

__init__(generator, template_title, fields, **kwargs)[source]

Initializer.

Parameters:
  • generator (iterator) – A generator that yields Page objects
  • template_title (str) – The template to work on
  • fields (dict) – A dictionary of fields that are of use to us
Keyword Arguments:
 
  • islink – Whether non-linked values should be treated as links
  • create – Whether to create a new item if it’s missing
  • exists – pattern for merging existing claims with harvested values
  • multi – Whether multiple values should be extracted from a single parameter
__module__ = 'scripts.harvest_template'
getTemplateSynonyms(title)[source]

Fetch redirects of the title, so we can check against them.

treat_page_and_item(page, item)[source]

Process a single page/item.

class scripts.harvest_template.PropertyOptionHandler(**kwargs)[source]

Bases: pywikibot.bot.OptionHandler

Class holding options for a param-property pair.

__module__ = 'scripts.harvest_template'
availableOptions = {'exists': '', 'islink': False, 'multi': False}
scripts.harvest_template.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.illustrate_wikidata script

Bot to add images to Wikidata items.

The image is extracted from the page_props. For this to be available, the PageImages extension (https://www.mediawiki.org/wiki/Extension:PageImages) needs to be installed.

Usage:

python pwb.py illustrate_wikidata <some generator>

This script supports use of pywikibot.pagegenerators arguments.
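For example, a plausible invocation (the category name is illustrative) could be:

python pwb.py illustrate_wikidata -lang:en -family:wikipedia -cat:"Lighthouses"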

class scripts.illustrate_wikidata.IllustrateRobot(generator, wdproperty='P18')[source]

Bases: pywikibot.bot.WikidataBot

A bot to add Wikidata image claims.

__init__(generator, wdproperty='P18')[source]

Initializer.

Parameters:
  • generator (generator) – A generator that yields Page objects
  • wdproperty (str) – The property to add. Should be of type commonsMedia
__module__ = 'scripts.illustrate_wikidata'
treat_page_and_item(page, item)[source]

Treat a page / item.

scripts.illustrate_wikidata.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.image script

This script can be used to change one image to another or remove an image.

Syntax:

python pwb.py image image_name [new_image_name]

If only one command-line parameter is provided then that image will be removed; if two are provided, then the first image will be replaced by the second one on all pages.

Command line options:

-summary:  Provide a custom edit summary. If the summary includes spaces,
           surround it with single quotes, such as:
           -summary:'My edit summary'
-always    Don't prompt to make changes, just do them.
-loose     Do loose replacements. This will replace all occurrences of the name
           of the image (and not just explicit image syntax). This should work
           to catch all instances of the image, including where it is used as a
           template parameter or in image galleries. However, it can also make
           more mistakes. This only works with image replacement, not image
           removal.

Examples

The image “FlagrantCopyvio.jpg” is about to be deleted, so let’s first remove it from everything that displays it:

python pwb.py image FlagrantCopyvio.jpg

The image “Flag.svg” has been uploaded, making the old “Flag.jpg” obsolete:

python pwb.py image Flag.jpg Flag.svg
class scripts.image.ImageRobot(generator, old_image, new_image=None, **kwargs)[source]

Bases: scripts.replace.ReplaceRobot

This bot will replace or remove all occurrences of an old image.

__init__(generator, old_image, new_image=None, **kwargs)[source]

Initializer.

Parameters:
  • generator (iterable) – the pages to work on
  • old_image (str) – the title of the old image (without namespace)
  • new_image (str or None) – the title of the new image (without namespace), or None if you want to remove the image
__module__ = 'scripts.image'
scripts.image.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.imagecopy script

Script to copy files from a local Wikimedia wiki to Wikimedia Commons.

It uses CommonsHelper to not leave any information out and CommonSense to automatically categorise the file. After copying, a NowCommons template is added to the local wiki’s file. It uses a local exclusion list to skip files with templates not allowed on Wikimedia Commons. If no categories have been found, the file will be tagged on Commons.

This bot uses a graphical interface and may not work in a command-line-only environment.

Requests for improvement for CommonsHelper output should be directed to Magnus Manske at his talk page. Please be very specific in your request (describe current output and expected output) and note an example file, so he can test at: [[de:Benutzer Diskussion:Magnus Manske]]. You can write him in German and English.

Command line options:

-always      Skip the GUI validation

-setcat:     Set the category of the copied image

-delete      Delete the image after the image has been transferred. This will
             only work if the user has sysop privileges; otherwise the image
             will only be marked for deletion.

This script supports use of pywikibot.pagegenerators arguments.

Examples

Work on a single image:

python pwb.py imagecopy -page:Image:<imagename>

Work on the 100 newest images:

python pwb.py imagecopy -newimages:100

Work on all images in a category:<cat>:

python pwb.py imagecopy -cat:<cat>

Work on all images which transclude a template:

python pwb.py imagecopy -transcludes:<template>

Work on a single image and delete it when the transfer is complete (this only works if the user has sysop privileges; otherwise the image will be marked for deletion):

python pwb.py imagecopy -page:Image:<imagename> -delete

By default the bot works on your home wiki (set in user-config).

class scripts.imagecopy.TkdialogIC(image_title, content, uploader, url, templates)[source]

Bases: object

The dialog window for image info.

__init__(image_title, content, uploader, url, templates)[source]

Initializer.

__module__ = 'scripts.imagecopy'
add2_auto_skip()[source]

The user pressed the Add to AutoSkip button.

getnewname()[source]

Activate dialog.

Returns:the new name and whether the image is skipped
Return type:tuple
ok_file()[source]

The user pressed the OK button.

open_in_browser()[source]

The user pressed the View in browser button.

scripts.imagecopy.doiskip(pagetext)[source]

Skip this image or not.

Returns True if the image is on the skip list, otherwise False

scripts.imagecopy.getautoskip()[source]

Get a list of templates to skip.

class scripts.imagecopy.imageTransfer(imagePage, newname, category, delete_after_done=False)[source]

Bases: threading.Thread

Facilitate transfer of image/file to commons.

__init__(imagePage, newname, category, delete_after_done=False)[source]

Initializer.

__module__ = 'scripts.imagecopy'
fixAuthor(pageText)[source]

Fix the author field in the information template.

run()[source]

Run the bot.

scripts.imagecopy.load_global_archivo()[source]

Load/create Uploadbot.localskips.txt and save the path in archivo.

scripts.imagecopy.main(*args)[source]

Process command line arguments and invoke bot.

scripts.imagecopy.pageTextPost(url, parameters)[source]

Get data from commons helper page.

Parameters:
  • url – This parameter is not used here, we keep it here to avoid user scripts from breaking.
  • parameters (dict) – Data that will be submitted to CommonsHelper.
Returns:A CommonsHelper description message.
Return type:str

scripts.imagecopy_self script

Script to copy self published files from English Wikipedia to Commons.

This bot is based on imagecopy.py and intended to be used to empty out http://en.wikipedia.org/wiki/Category:Self-published_work

This bot uses a graphical interface and may not work in a command-line-only environment.

Examples

Work on a single file:

python pwb.py imagecopy_self -page:file:<filename>

Work on all images in a category:<cat>:

python pwb.py imagecopy_self -cat:<cat>

Work on all images which transclude a template:

python pwb.py imagecopy_self -transcludes:<template>

See pagegenerators.py for more ways to get a list of images. By default the bot works on your home wiki (set in user-config).

This is a first test version and should be used with care.

Use -nochecktemplate if you don’t want to add the check template. Be sure to check it yourself.

class scripts.imagecopy_self.TkdialogICS(fields)[source]

Bases: object

The dialog window for image info.

__init__(fields)[source]

Initializer.

fields:
imagepage, description, date, source, author, licensetemplate, categories
__module__ = 'scripts.imagecopy_self'
getnewmetadata()[source]

Activate the dialog and return the new name and whether the image is skipped.

ok_file()[source]

The user pressed the OK button.

open_in_browser()[source]

The user pressed the View in browser button.

class scripts.imagecopy_self.imageFetcher(pagegenerator, prefetchQueue)[source]

Bases: threading.Thread

Tries to fetch information for all images in the generator.

__init__(pagegenerator, prefetchQueue)[source]

Initializer.

__module__ = 'scripts.imagecopy_self'

Convert links from the current wiki to Commons.

doiskip(imagepage)[source]

Skip this image or not.

Returns True if the image is on the skip list, otherwise False

getAuthor(imagepage)[source]

Get the first uploader.

getAuthorText(imagepage)[source]

Get uploader to put in the author field of information template.

getNewCategories(imagepage)[source]

Get categories for the image.

Don’t forget to filter.

getNewFields(imagepage)[source]

Build a new description based on the imagepage.

getNewFieldsFromFreetext(imagepage)[source]

Extract fields from free text for the new information template.

getNewFieldsFromInformation(imagepage)[source]

Extract fields from current information template for new template.

Parameters:imagepage (pywikibot.FilePage) – The file page to get the template.
getNewLicensetemplate(imagepage)[source]

Get a license template to put on the image to be uploaded.

getSource(imagepage, source='')[source]

Get text to put in the source field of new information template.

getUploadDate(imagepage)[source]

Get the original upload date for usage.

The date is put in the date field of the new information template if we really have nothing better.

processImage(page)[source]

Work on a single image.

run()[source]

Run imageFetcher.

scripts.imagecopy_self.main(*args)[source]

Process command line arguments and invoke bot.

scripts.imagecopy_self.supportedSite()[source]

Check if this site is supported.

class scripts.imagecopy_self.uploader(uploadQueue)[source]

Bases: threading.Thread

Upload all images.

__init__(uploadQueue)[source]

Initializer.

__module__ = 'scripts.imagecopy_self'
buildNewImageDescription(fields)[source]

Build a new information template.

getOriginalUploadLog(imagepage)[source]

Get upload log to put at the bottom of the image description page.

Parameters:imagepage (pywikibot.FilePage) – The file page to retrieve the log.
nochecktemplate()[source]

Don’t want to add {{BotMoveToCommons}}.

processImage(fields)[source]

Work on a single image.

replaceUsage(imagepage, filename)[source]

Replace all usage if image is uploaded under a different name.

run()[source]

Run uploader.

tagNowcommons(imagepage, filename)[source]

Tag the image which has been moved to Commons for deletion.

class scripts.imagecopy_self.userInteraction(prefetchQueue, uploadQueue)[source]

Bases: threading.Thread

Present all images to the user.

__init__(prefetchQueue, uploadQueue)[source]

Initializer.

__module__ = 'scripts.imagecopy_self'
processImage(fields)[source]

Work on a single image.

run()[source]

Run thread.

setAutonomous()[source]

Don’t do any user interaction.

scripts.imageharvest script

Bot for getting multiple images from an external site.

It takes a URL as an argument and finds all images (and other files specified by the extensions in ‘fileformats’) that URL is referring to, asking whether to upload them. If further arguments are given, they are considered to be the text that is common to the descriptions. BeautifulSoup is needed only in this case.

A second use is to get a number of images that have URLs only differing in numbers. To do this, use the command line option “-pattern”, and give the URL with the variable part replaced by ‘$’ (if that character occurs in the URL itself, you will have to change the bot code, my apologies).

Other options:

-shown      Choose images shown on the page as well as linked from it
-justshown  Choose _only_ images shown on the page, not those linked
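For example, plausible invocations (the URLs are illustrative, and the -pattern form is one reading of the description above) could be:

python pwb.py imageharvest http://www.example.org/gallery.html
python pwb.py imageharvest -pattern http://www.example.org/pics/image$.jpg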

Given a URL, get all images linked to by the page at that URL.

scripts.imageharvest.main(*args)[source]

Process command line arguments and invoke bot.

scripts.imageharvest.run_bot(give_url, image_url, desc)[source]

Run the bot.

scripts.imagerecat script

Program to (re)categorize images at commons.

The program uses CommonsHelper for category suggestions. It takes the suggestions and the current categories, puts the categories through some filters, and adds the result (see the sketch below).
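A minimal sketch of that filtering step, composing the filter functions documented below (the exact set and order used by applyAllFilters may differ, and initLists() must have loaded the country and blacklist data first):

from scripts import imagerecat

def filter_categories(categories):
    # Chain the documented per-step filters over the category list.
    for filter_func in (imagerecat.followRedirects,
                        imagerecat.filterBlacklist,
                        imagerecat.filterDisambiguation,
                        imagerecat.filterCountries):
        categories = filter_func(categories)
    return categories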

The following command line parameters are supported:

-onlyfilter     Don't use Commonsense to get categories, just filter the
                current categories

-onlyuncat      Only work on uncategorized images. Will prevent the bot from
                working on an image multiple times.

-hint           Give Commonsense a hint.
                For example -hint:li.wikipedia.org

-onlyhint       Give Commonsense a hint. And only work on this hint.
                Syntax is the same as -hint. Some special hints are possible:
                _20 : Work on the top 20 wikipedias
                _80 : Work on the top 80 wikipedias
                wps : Work on all wikipedias
scripts.imagerecat.applyAllFilters(categories)[source]

Apply all filters on categories.

scripts.imagerecat.categorizeImages(generator, onlyFilter, onlyUncat)[source]

Loop over all images in generator and try to categorize them.

Get category suggestions from CommonSense.

scripts.imagerecat.filterBlacklist(categories)[source]

Filter out categories which are on the blacklist.

scripts.imagerecat.filterCountries(categories)[source]

Try to filter out …by country categories.

First make a list of any …by country categories and try to find some countries. If a by-country category has a subcategory containing one of the countries found, add it. The …by country categories remain in the set and should be filtered out by filterParents.

scripts.imagerecat.filterDisambiguation(categories)[source]

Filter out disambiguation categories.

scripts.imagerecat.filterParents(categories)[source]

Remove all parent categories from the set to prevent overcategorization.

DEPRECATED: Toolserver script isn’t available anymore (T78462). This method is kept for compatibility and may be restored sometime by a new implementation.

scripts.imagerecat.followRedirects(categories)[source]

If a category is a redirect, replace the category with the target.

scripts.imagerecat.getCategoryByName(name, parent='', grandparent='')[source]

Get category by name.

scripts.imagerecat.getCheckCategoriesTemplate(usage, galleries, ncats)[source]

Build the check categories template with all parameters.

scripts.imagerecat.getCommonshelperCats(imagepage)[source]

Get category suggestions from CommonSense.

Return type:list of unicode
scripts.imagerecat.getCurrentCats(imagepage)[source]

Get the categories currently on the image.

scripts.imagerecat.getOpenStreetMap(latitude, longitude)[source]

Get the result from https://nominatim.openstreetmap.org/reverse.

Return type:list of tuples
scripts.imagerecat.getOpenStreetMapCats(latitude, longitude)[source]

Get a list of location categories based on the OSM Nominatim tool.

scripts.imagerecat.getUsage(use)[source]

Parse the Commonsense output to get the usage.

scripts.imagerecat.initLists()[source]

Get the list of countries & the blacklist from Commons.

scripts.imagerecat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.imagerecat.removeTemplates(oldtext='')[source]

Remove {{Uncategorized}} and {{Check categories}} templates.

scripts.imagerecat.saveImagePage(imagepage, newcats, usage, galleries, onlyFilter)[source]

Remove the old categories and add the new categories to the image.

scripts.imagetransfer script

Script to copy images to Wikimedia Commons, or to another wiki.

Syntax:

python pwb.py imagetransfer {<pagename>|<generator>} [<options>]

The following parameters are supported:

-interwiki   Look for images in pages found through interwiki links.

-keepname    Keep the filename and do not verify description while replacing

-tolang:x    Copy the image to the wiki in language x

-tofamily:y  Copy the image to a wiki in the family y

-file:z      Upload many files from textfile: [[Image:x]]
                                              [[Image:y]]

If pagename is an image description page, offers to copy the image to the target site. If it is a normal page, it will offer to copy any of the images used on that page, or if the -interwiki argument is used, any of the images used on a page reachable via interwiki links.

This script supports use of pywikibot.pagegenerators arguments.
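For example, a plausible invocation copying one file description page to Commons (the file name is illustrative) could be:

python pwb.py imagetransfer Image:Example.jpg -tofamily:commons -tolang:commons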

class scripts.imagetransfer.ImageTransferBot(generator, targetSite=None, interwiki=False, keep_name=False, ignore_warning=False)[source]

Bases: object

Image transfer bot.

__init__(generator, targetSite=None, interwiki=False, keep_name=False, ignore_warning=False)[source]

Initializer.

__module__ = 'scripts.imagetransfer'
run()[source]

Run the bot.

showImageList(imagelist)[source]

Print image list.

transferImage(sourceImagePage)[source]

Download image and its description, and upload it to another site.

Returns:the filename which was used to upload the image
scripts.imagetransfer.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.imageuncat script

Program to add uncat template to images without categories at commons.

See imagerecat.py to add these images to categories.

This script works on the given site, so if Commons should be handled, the Commons site must be given, not a Wikipedia or similar wiki.

This script supports use of pywikibot.pagegenerators arguments.
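For example, a plausible invocation working on recent changes on Commons (the generator choice is illustrative) could be:

python pwb.py imageuncat -family:commons -lang:commons -recentchanges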

scripts.imageuncat.addUncat(page)[source]

Add the uncat template to the page.

Parameters:page (pywikibot.Page) – Page to be modified
scripts.imageuncat.isUncat(page)[source]

Check whether we want to skip this page.

If we find a category which is not in the ignore list, the page is categorized, so skip it. If we find a template which is in the ignore list, skip the page.

scripts.imageuncat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.imageuncat.uploadedYesterday(site)[source]

Return a pagegenerator containing all the pictures uploaded yesterday.

DEPRECATED. Only used by a deprecated option.

scripts.interwiki script

Script to check language links for general pages.

Uses existing translations of a page, plus hints from the command line, to download the equivalent pages from other languages. All of such pages are downloaded as well and checked for interwiki links recursively until there are no more links that are encountered. A rationalization process then selects the right interwiki links, and if this is unambiguous, the interwiki links in the original page will be automatically updated and the modified page uploaded.

These command-line arguments can be used to specify which pages to work on:

-days:         Like -years, but runs through all date pages. Stops at
               Dec 31. If the argument is given in the form -days:X,
               it will start at month no. X through Dec 31. If the
               argument is simply given as -days, it will run from
               Jan 1 through Dec 31. E.g. for -days:9 it will run
               from Sep 1 through Dec 31.

-years:        run on all year pages in numerical order. Stop at year 2050.
               If the argument is given in the form -years:XYZ, it
               will run from [[XYZ]] through [[2050]]. If XYZ is a
               negative value, it is interpreted as a year BC. If the
               argument is simply given as -years, it will run from 1
               through 2050.

               This implies -noredirect.

-new:          Work on the 100 newest pages. If given as -new:x, will work
               on the x newest pages.
               When multiple -namespace parameters are given, x pages are
               inspected, and only the ones in the selected name spaces are
               processed. Use -namespace:all for all namespaces. Without
               -namespace, only article pages are processed.

               This implies -noredirect.

-restore:      restore a set of "dumped" pages the bot was working on
               when it terminated. The dump file will be subsequently
               removed.

-restore:all   restore a set of "dumped" pages of all dumpfiles to a given
               family remaining in the "interwiki-dumps" directory. All
               these dump files will be subsequently removed. If restoring
               process interrupts again, it saves all unprocessed pages in
               one new dump file of the given site.

-continue:     like restore, but after having gone through the dumped
               pages, continue alphabetically starting at the last of the
               dumped pages. The dump file will be subsequently removed.

-warnfile:     used as -warnfile:filename, reads all warnings from the
               given file that apply to the home wiki language,
               and reads the rest of the warning as a hint. Then
               treats all the mentioned pages. A quicker way to
               implement warnfile suggestions without verifying them
               against the live wiki is using the warnfile.py
               script.

This script supports use of pywikibot.pagegenerators arguments.

Additionally, these arguments can be used to restrict the bot to certain pages:

-namespace:n   Number or name of namespace to process. The parameter can be
               used multiple times. It works in combination with all other
               parameters, except for the -start parameter. If you e.g.
               want to iterate over all categories starting at M, use
               -start:Category:M.

-number:       used as -number:#, specifies that the bot should process
               that amount of pages and then stop. This is only useful in
               combination with -start. The default is not to stop.

-until:        used as -until:title, specifies that the bot should
               process pages in wiki default sort order up to, and
               including, "title" and then stop. This is only useful in
               combination with -start. The default is not to stop.
               Note: do not specify a namespace, even if -start has one.

-bracket       only work on pages that have (in the home language)
               parenthesis in their title. All other pages are skipped.
               (note: without ending colon)

-skipfile:     used as -skipfile:filename, skip all links mentioned in
               the given file. This does not work with -number!

-skipauto      use to skip all pages that can be translated automatically,
               like dates, centuries, months, etc.
               (note: without ending colon)

-lack:         used as -lack:xx with xx a language code: only work on pages
               without links to language xx. You can also add a number nn
               like -lack:xx:nn, so that the bot only works on pages with
               at least nn interwiki links (the default value for nn is 1).

These arguments control miscellaneous bot behaviour:

-quiet         Use this option to get less output
               (note: without ending colon)

-async         Put page on queue to be saved to wiki asynchronously. This
               enables loading pages while saving is throttled and gives
               better performance.
               NOTE: For post-processing it always assumes that saving
               the pages was successful.
               (note: without ending colon)

-summary:      Set an additional action summary message for the edit. This
               could be used to further explain the bot action. This will
               only be used in non-autonomous mode.

-hintsonly     The bot does not ask for a page to work on, even if none of
               the above page sources was specified. This will make the
               first existing page of -hint or -hintfile slip in as start
               page, determining properties like namespace, disambiguation
               state, and so on. When no existing page is found in the
               hints, the bot does nothing.
               Hitting return without input on the "Which page to check:"
               prompt has the same effect as using -hintsonly.
               Options like -back, -same or -wiktionary are in effect only
               after a page has been found to work on.
               (note: without ending colon)

These arguments are useful to provide hints to the bot:

-hint:         used as -hint:de:Anweisung to give the bot a hint
               where to start looking for translations. If no text
               is given after the second ':', the name of the page
               itself is used as the title for the hint, unless the
               -hintnobracket command line option (see there) is also
               selected.

               There are some special hints, trying a number of languages
               at once:

                  * all:       All languages with at least ca. 100 articles
                  * 10:        The 10 largest languages (sites with most
                               articles). Analogous for any other natural
                               number
                  * arab:      All languages using the Arabic alphabet
                  * cyril:     All languages that use the Cyrillic alphabet
                  * chinese:   All Chinese dialects
                  * latin:     All languages using the Latin script
                  * scand:     All Scandinavian languages

               Names of families that forward their interlanguage links
               to the wiki family being worked upon can be used, they are:

                  * commons:   Interlanguage links of Wikimedia Commons
                  * incubator: Links in pages on the Wikimedia Incubator
                  * meta:      Interlanguage links of named pages on Meta
                  * species:   Interlanguage links of the wikispecies wiki
                  * strategy:  Links in pages on Wikimedia's strategy wiki
                  * test:      Take interwiki links from Test Wikipedia
                  * wikimania: Interwiki links of Wikimania

               Languages, groups and families having the same page title
               can be combined, as -hint:5,scand,sr,pt,commons:New_York

-hintfile:     similar to -hint, except that hints are taken from the given
               file, enclosed in [[]] each, instead of the command line.

-askhints:     for each page one or more hints are asked. See hint: above
               for the format, one can for example give "en:something" or
               "20:" as hint.

-repository    Include data repository

-same          looks over all 'serious' languages for the same title.
               -same is equivalent to -hint:all:
               (note: without ending colon)

-wiktionary:   similar to -same, but will ONLY accept names that are
               identical to the original. Also, if the title is not
               capitalized, it will only go through other wikis without
               automatic capitalization.

-untranslated: works normally on pages with at least one interlanguage
               link; asks for hints for pages that have none.

-untranslatedonly: same as -untranslated, but pages which already have a
               translation are skipped. Hint: do NOT use this in
               combination with -start without a -number limit, because
               you will go through the whole alphabet before any queries
               are performed!

-showpage      when asking for hints, show the first bit of the text
               of the page always, rather than doing so only when being
               asked for (by typing '?'). Only useful in combination
               with a hint-asking option like -untranslated, -askhints
               or -untranslatedonly.
               (note: without ending colon)

-noauto        Do not use the automatic translation feature for years and
               dates, only use found links and hints.
               (note: without ending colon)

-hintnobracket used to make the bot strip everything in last brackets,
               and surrounding spaces from the page name, before it is
               used in a -hint:xy: where the page name has been left out,
               or -hint:all:, -hint:10:, etc. without a name, or
               an -askhint reply, where only a language is given.

These arguments define how much user confirmation is required:

-autonomous    run automatically, do not ask any questions. If a question
-auto          to an operator is needed, write the name of the page
               to autonomous_problems.dat and continue on the next page.
               (note: without ending colon)

-confirm       ask for confirmation before any page is changed on the
               live wiki. Without this argument, additions and
               unambiguous modifications are made without confirmation.
               (note: without ending colon)

-force         do not ask permission to make "controversial" changes,
               like removing a language because none of the found
               alternatives actually exists.
               (note: without ending colon)

-cleanup       like -force but only removes interwiki links to non-existent
               or empty pages.

-select        ask for each link whether it should be included before
               changing any page. This is useful if you want to remove
               invalid interwiki links and if you do multiple hints of
               which some might be correct and others incorrect. Combining
               -select and -confirm is possible, but seems like overkill.
               (note: without ending colon)

These arguments specify in which way the bot should follow interwiki links:

-noredirect    do not follow redirects nor category redirects.
               (note: without ending colon)

-initialredirect  work on its target if a redirect or category redirect is
               entered on the command line or by a generator (note: without
               ending colon). It is recommended to use this option with the
               -movelog pagegenerator.

-neverlink:    used as -neverlink:xx where xx is a language code:
               Disregard any links found to language xx. You can also
               specify a list of languages to disregard, separated by
               commas.

-ignore:       used as -ignore:xx:aaa where xx is a language code, and
               aaa is a page title to be ignored.

-ignorefile:   similar to -ignore, except that the pages are taken from
               the given file instead of the command line.

-localright    do not follow interwiki links from other pages than the
               starting page. (Warning! Should be used very sparingly,
               only when you are sure you have first gotten the interwiki
               links on the starting page exactly right).
               (note: without ending colon)

-hintsareright do not follow interwiki links to sites for which hints
               on existing pages are given. Note that hints given
               interactively via the -askhint command line option
               are only effective once they have been entered; thus
               interwiki links on the starting page are followed
               regardless of hints given when prompted.
               (Warning! Should be used with caution!)
               (note: without ending colon)

-back          only work on pages that have no backlink from any other
               language; if a backlink is found, all work on the page
               will be halted.  (note: without ending colon)

The following arguments are only important for users who have accounts for multiple languages, and specify on which sites the bot should modify pages:

-localonly     only work on the local wiki, not on other wikis in the
               family I have a login at. (note: without ending colon)

-limittwo      only update two pages - one in the local wiki (if logged-in)
               and one in the top available one.
               For example, if the local page has links to de and fr,
               this option will make sure that only the local site and
               the de: (larger) sites are updated. This option is useful
               to quickly set two way links without updating all of the
               wiki family's sites.
               (note: without ending colon)

-whenneeded    works like limittwo, but other languages are changed in the
               following cases:

               * If there are no interwiki links at all on the page
               * If an interwiki link must be removed
               * If an interwiki link must be changed and there has been
                 a conflict for this page

               Optionally, -whenneeded can be given an additional number
               (for example -whenneeded:3), in which case other languages
               will be changed if there are that number or more links to
               change or add. (note: without ending colon)

The following arguments influence how many pages the bot works on at once:

-array:        The number of pages the bot tries to be working on at once.
               If the number of pages loaded is lower than this number,
               a new set of pages is loaded from the starting wiki. The
               default is 100, but can be changed in the config variable
               interwiki_min_subjects

-query:        The maximum number of pages that the bot will load at once.
               Default value is 50.

Some configuration options can be used to change the working of this bot:

interwiki_min_subjects: the minimum amount of subjects that should be
                    processed at the same time.

interwiki_backlink: if set to True, all problems in foreign wikis will
                    be reported

interwiki_shownew:  should interwiki.py display every new link it discovers?

interwiki_graph:    output a graph PNG file on conflicts? You need pydot for
                    this: https://pypi.org/project/pydot/

interwiki_graph_format: the file format for interwiki graphs

without_interwiki:  save file with local articles without interwikis

All these options can be changed through the user-config.py configuration file.
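
For instance, a user-config.py fragment might look like this (a sketch; the values shown are illustrative, not recommendations, and 'png' is an assumed graph format):

interwiki_min_subjects = 100      # minimum number of subjects handled at once
interwiki_backlink = True         # report problems found in foreign wikis
interwiki_shownew = True          # display every new link the bot discovers
interwiki_graph = False           # set to True to output a conflict graph (needs pydot)
interwiki_graph_format = 'png'    # file format for interwiki graphs
without_interwiki = False         # set to True to save local articles without interwikis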

If interwiki.py is terminated before it is finished, it will write a dump file to the interwiki-dumps subdirectory. The program will read it if invoked with the “-restore” or “-continue” option, and finish all the subjects in that list. After finishing, the dump file will be deleted. To run the interwiki bot on all pages of a language, run it with the option “-start:!”, and if it takes so long that you have to break it off, use “-continue” next time.
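
For example, an illustrative pair of invocations using only the options described above:

python pwb.py interwiki -start:!

and, after an interruption:

python pwb.py interwiki -continue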

exception scripts.interwiki.GiveUpOnPage(arg)[source]

Bases: pywikibot.exceptions.Error

User chose not to work on this page and its linked pages any more.

__module__ = 'scripts.interwiki'
class scripts.interwiki.InterwikiBot(conf=None)[source]

Bases: object

A class keeping track of a list of subjects.

It controls which pages are queried from which languages when.

__init__(conf=None)[source]

Initializer.

__len__()[source]

Return length of subjects.

__module__ = 'scripts.interwiki'
add(page, hints=None)[source]

Add a single subject to the list.

dump(append=True)[source]

Write dump file.

firstSubject()[source]

Return the first subject that is still being worked on.

generateMore(number)[source]

Generate more subjects.

This is called internally when the list of subjects becomes too small, but only if there is a PageGenerator.

isDone()[source]

Check whether there is still more work to do.

maxOpenSite()[source]

Return the site that has the most open queries plus the number.

If there is nothing left, return None. Only languages that are TODO for the first Subject are returned.

minus(site, count=1)[source]

Helper routine that the Subject class expects in a counter.

oneQuery()[source]

Perform one step in the solution process.

Returns True if pages could be preloaded, or False otherwise.

plus(site, count=1)[source]

Helper routine that the Subject class expects in a counter.

queryStep()[source]

Delete the ones that are done now.

run()[source]

Start the process until finished.

selectQuerySite()[source]

Select the site the next query should go out for.

setPageGenerator(pageGenerator, number=None, until=None)[source]

Add a generator of subjects.

Once the list of subjects gets too small, this generator is called to produce more Pages.

class scripts.interwiki.InterwikiBotConfig[source]

Bases: object

Container class for interwikibot’s settings.

__module__ = 'scripts.interwiki'
always = False
askhints = False
asynchronous = False
auto = True
autonomous = False
cleanup = False
confirm = False
contentsondisk = False
followinterwiki = True
followredirect = True
force = False
hintnobracket = False
hints = []
hintsareright = False
ignore = []
initialredirect = False
lacklanguage = None
limittwo = False
localonly = False
maxquerysize = 50
minsubjects = 100
needlimit = 0
nobackonly = False
note(text)[source]

Output a notification message.

The text will be printed only if conf.quiet isn’t set.

Parameters:text (str) – text to be shown

parenthesesonly = False
quiet = False
readOptions(option)[source]

Read all commandline parameters for the global container.

rememberno = False
remove = []
repository = False
restore_all = False
same = False
select = False
showtextlinkadd = 300
skip = {}
skipauto = False
strictlimittwo = False
summary = ''
untranslated = False
untranslatedonly = False
exception scripts.interwiki.LinkMustBeRemoved(arg)[source]

Bases: scripts.interwiki.SaveError

An interwiki link has to be removed, but this can’t be done because of user preferences or because the user chose not to change the page.

__module__ = 'scripts.interwiki'
class scripts.interwiki.PageTree[source]

Bases: object

Structure to manipulate a set of pages.

Allows filtering efficiently by Site.

__init__()[source]

Initializer.

While using dict values would be faster for the remove() operation, keeping list values is important, because the order in which the pages were found matters: the earlier a page is found, the closer it is to the Subject.originPage. Chances are that pages found within 2 interwiki distance from the originPage are more related to the original topic than pages found later on, after 3, 4, 5 or more interwiki hops.

Keeping this order is hence important to display an ordered list of pages to the user when they are asked to resolve conflicts.

Variables:tree – dictionary with Site as keys and lists of pages as values. All pages found within a Site are kept in self.tree[site].
__iter__()[source]

Iterate through all items of the tree.

__len__()[source]

Length of the object.

__module__ = 'scripts.interwiki'
add(page)[source]

Add a page to the tree.

filter(site)[source]

Iterate over pages that are in Site site.

remove(page)[source]

Remove a page from the tree.

removeSite(site)[source]

Remove all pages from Site site.

siteCounts()[source]

Yield (Site, number of pages in site) pairs.

exception scripts.interwiki.SaveError(arg)[source]

Bases: pywikibot.exceptions.Error

An attempt to save a page with changed interwiki has failed.

__module__ = 'scripts.interwiki'
class scripts.interwiki.StoredPage(page)[source]

Bases: pywikibot.page.Page

Store the Page contents on disk.

This is to avoid using too much memory when a large number of Page objects are loaded at the same time.

SPcopy = ['_editrestriction', '_site', '_namespace', '_section', '_title', 'editRestriction', 'moveRestriction', '_permalink', '_userName', '_ipedit', '_editTime', '_startTime', '_revisionId', '_deletedRevs']
SPdelContents()[source]

Delete stored content.

static SPdeleteStore()[source]

Delete SPStore.

SPgetContents()[source]

Get stored content.

SPpath = None
SPsetContents(contents)[source]

Store content.

SPstore = None
__init__(page)[source]

Initializer.

__module__ = 'scripts.interwiki'
class scripts.interwiki.Subject(originPage=None, hints=None, conf=None)[source]

Bases: pywikibot.interwiki_graph.Subject

Class to follow the progress of a single ‘subject’.

(i.e. a page with all its translations)

Subject is a transitive closure of the binary relation on Page: “has_a_langlink_pointing_to”.

A formal way to compute that closure would be:

With P a set of pages, NL (‘NextLevel’) a function on sets defined as:

NL(P) = { target | ∃ source ∈ P, target ∈ source.langlinks() }

pseudocode:

todo <- [originPage]
done <- []
while todo != []:
    pending <- todo
    todo <- NL(pending) / done
    done <- NL(pending) U done
return done

There is, however, one limitation that is induced by implementation: to compute efficiently NL(P), one has to load the page contents of pages in P. (Not only the langlinks have to be parsed from each Page, but we also want to know if the Page is a redirect, a disambiguation, etc…)

Because of this, the pages in pending have to be preloaded. However, because the pages in pending are likely to be in several sites we cannot “just” preload them as a batch.

Instead of doing “pending <- todo” at each iteration, we have to elect a Site, and we put in pending all the pages from todo that belong to that Site:

Code becomes:

todo <- {originPage.site:[originPage]}
done <- []
while todo != {}:
    site <- electSite()
    pending <- todo[site]

    preloadpages(site, pending)

    todo[site] <- NL(pending) / done
    done <- NL(pending) U done
return done

Subject objects only operate on pages that should have been preloaded before. In fact, at any time:

  • todo contains new Pages that have not been loaded yet
  • done contains Pages that have been loaded, and that have been treated.
  • If batch preloadings are successful, Page._get() is never called from this Object.
__init__(originPage=None, hints=None, conf=None)[source]

Initializer.

Takes as arguments the Page on the home wiki plus optionally a list of hints for translation

__module__ = 'scripts.interwiki'
addIfNew(page, counter, linkingPage)[source]

Add the pagelink given to the todo list, if it hasn’t been seen yet.

If it is added, update the counter accordingly.

Also remembers where we found the page, regardless of whether it had already been found before or not.

Returns True if the page is new.

askForHints(counter)[source]

Ask for hints to other sites.

assemble()[source]

Assemble language links.

batchLoaded(counter)[source]

Notify that the promised batch of pages was loaded.

This is called by a worker to tell us that the promised batch of pages was loaded. In other words, all the pages in self.pending have already been preloaded.

The only argument is an instance of a counter class, that has methods minus() and plus() to keep counts of the total work todo.

clean()[source]

Delete the contents that are stored on disk for this Subject.

We cannot afford to define this in a StoredPage destructor because StoredPage instances can get referenced cyclically: that would stop the garbage collector from destroying some of those objects.

It’s also not necessary to set these lines as a Subject destructor: deleting all stored content entry by entry when bailing out after a KeyboardInterrupt, for example, is redundant, because the whole storage file will eventually be removed.

disambigMismatch(page, counter)[source]

Check whether the given page has a different disambiguation status.

Returns a tuple (skip, alternativePage).

skip is True if the pages have mismatching statuses and the bot is either in autonomous mode, or the user chose not to use the given page.

alternativePage is either None, or a page that the user has chosen to use instead of the given page.

finish()[source]

Round up the subject, making any necessary changes.

This should be called exactly once after the todo list has gone empty.

getFoundDisambig(site)[source]

Return the first disambiguation found.

If we found a disambiguation on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.

getFoundInCorrectNamespace(site)[source]

Return the first page in the extended namespace.

If we found a page that has the expected namespace on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.

getFoundNonDisambig(site)[source]

Return the first non-disambiguation found.

If we found a non-disambiguation on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.

isDone()[source]

Return True if all the work for this subject has completed.

isIgnored(page)[source]

Return True if the page is to be ignored.

makeForcedStop(counter)[source]

End work on the page before the normal end.

namespaceMismatch(linkingPage, linkedPage, counter)[source]

Check whether or not the given page has a different namespace.

Returns True if the namespaces are different and the user has selected not to follow the linked page.

openSites()[source]

Iterator.

Yields (site, count) pairs:

  • site is a site where we still have work to do
  • count is the number of items in that Site that need work

problem(txt, createneed=True)[source]

Report a problem with the resolution of this subject.

replaceLinks(page, newPages)[source]

Return True if saving was successful.

reportBacklinks(new, updatedSites)[source]

Report missing back links. This will be called from finish() if needed.

updatedSites is a list that contains all sites we changed, to avoid reporting of missing backlinks for pages we already fixed

reportInterwikilessPage(page)[source]

Report interwikiless page.

skipPage(page, target, counter)[source]

Return whether page has to be skipped.

translate(hints=None, keephintedsites=False)[source]

Add the given translation hints to the todo list.

whatsNextPageBatch(site)[source]

Return the next page batch.

By calling this method, you ‘promise’ this instance that you will preload all the ‘site’ Pages that are in the todo list.

This routine will return a list of pages that can be treated.

whereReport(page, indent=4)[source]

Report found interlanguage links with conflicts.

wiktionaryMismatch(page)[source]

Check for ignoring pages.

scripts.interwiki.botMayEdit(page)[source]

Test for allowed edits.

scripts.interwiki.compareLanguages(old, new, insite, summary)[source]

Compare changes and set up the i18n message.

scripts.interwiki.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.interwiki.page_empty_check(page)[source]

Return True if page should be skipped as it is almost empty.

Pages in content namespaces are considered empty if they contain less than 50 characters, and other pages are considered empty if they are not category pages and contain less than 4 characters excluding interlanguage links and categories.

Return type:bool
scripts.interwiki.readWarnfile(filename, bot)[source]

Read old interlanguage conflicts.

scripts.interwikidata script

Script to handle interwiki links based on Wikibase.

This script connects pages to Wikibase items using language links on the page. If multiple language links are present and they are connected to different items, the bot skips the page. After connecting the page to an item, language links can be removed from the page.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-clean            Clean pages.

-create           Create items.

-merge            Merge items.

-summary:         Use your own edit summary for cleaning the page.
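
Example (illustrative; -cat is a pywikibot.pagegenerators argument and the category name is a placeholder):

python pwb.py interwikidata -clean -create -cat:"Example category"
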
class scripts.interwikidata.IWBot(**kwargs)[source]

Bases: pywikibot.bot.ExistingPageBot, pywikibot.bot.SingleSiteBot

The bot for interwiki.

__init__(**kwargs)[source]

Initialize the bot.

__module__ = 'scripts.interwikidata'
clean_page()[source]

Clean interwiki links from the page.

create_item()[source]

Create item in repo for current_page.

get_items()[source]

Return all items of pages linked through the interwiki.

handle_complicated()[source]

Handle pages when they have interwiki conflict.

When this method returns True, it means the conflict has been resolved and it’s okay to clean old interwiki links. This method should change self.current_item and fix conflicts. Override it in subclasses.

treat_page()[source]

Check page.

try_to_add()[source]

Add current page in repo.

try_to_merge(item)[source]

Merge two items.

scripts.interwikidata.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.isbn script

This script reports and fixes invalid ISBN numbers.

Additionally, it can convert all ISBN-10 codes to the ISBN-13 format, and correct the ISBN format by placing hyphens.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-to13             Converts all ISBN-10 codes to ISBN-13.
                  NOTE: This needn't be done, as MediaWiki still supports
                  (and will keep supporting) ISBN-10, and all libraries and
                  bookstores will most likely do so as well.

-format           Corrects the hyphenation.
                  NOTE: This is in here for testing purposes only. Usually
                  it's not worth creating an edit for such a minor issue.
                  The recommended way of doing this is enabling
                  cosmetic_changes, so that these changes are made on-the-fly
                  to all pages that are modified.

-always           Don't prompt you for each replacement.

-prop-isbn-10     Sets the ISBN-10 property ID, so the script does not try
                  to find it automatically.
                  The usage is as follows: -prop-isbn-10:propid

-prop-isbn-13     Sets the ISBN-13 property ID. The format and purpose are
                  the same as in -prop-isbn-10.
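
Example (illustrative; -start is a pywikibot.pagegenerators argument):

python pwb.py isbn -to13 -always -start:A
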
class scripts.isbn.ISBN[source]

Bases: object

Abstract superclass.

__module__ = 'scripts.isbn'
format()[source]

Put hyphens into this ISBN number.

class scripts.isbn.ISBN10(code)[source]

Bases: scripts.isbn.ISBN

ISBN 10.

__init__(code)[source]

Initializer.

__module__ = 'scripts.isbn'
checkChecksum()[source]

Raise an InvalidIsbnException if the ISBN checksum is incorrect.

checkValidity()[source]

Check validity of ISBN.

digits()[source]

Return a list of the digits and Xs in the ISBN code.

format()[source]

Format ISBN number.

possiblePrefixes()[source]

Return possible prefixes.

toISBN13()[source]

Create a 13-digit ISBN from this 10-digit ISBN.

Adds the GS1 prefix ‘978’ and recalculates the checksum. The hyphenation structure is taken from the format of the original ISBN number.

Return type:ISBN13
class scripts.isbn.ISBN13(code, checksumMissing=False)[source]

Bases: scripts.isbn.ISBN

ISBN 13.

__init__(code, checksumMissing=False)[source]

Initializer.

__module__ = 'scripts.isbn'
calculateChecksum()[source]

Calculate checksum.

See https://en.wikipedia.org/wiki/ISBN#Check_digit_in_ISBN_13

checkValidity()[source]

Check validity of ISBN.

digits()[source]

Return a list of the digits in the ISBN code.

possiblePrefixes()[source]

Return possible prefixes.

exception scripts.isbn.InvalidIsbnException(arg)[source]

Bases: pywikibot.exceptions.Error

Invalid ISBN.

__module__ = 'scripts.isbn'
class scripts.isbn.IsbnBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

ISBN bot.

__init__(generator, **kwargs)[source]

Initializer.

__module__ = 'scripts.isbn'
run()[source]

Run the bot.

treat(page)[source]

Treat a page.

class scripts.isbn.IsbnWikibaseBot(generator, **kwargs)[source]

Bases: pywikibot.bot.WikidataBot

ISBN bot to be run on Wikibase sites.

__init__(generator, **kwargs)[source]

Initializer.

__module__ = 'scripts.isbn'
treat_page_and_item(page, item)[source]

Treat a page.

use_from_page = None
scripts.isbn.convertIsbn10toIsbn13(text)[source]

Helper function to convert ISBN 10 to ISBN 13.

scripts.isbn.getIsbn(code)[source]

Return an ISBN object for the code.

scripts.isbn.hyphenateIsbnNumbers(text, *, match_func=<function _hyphenateIsbnNumber>)

Reformat ISBNs.

Parameters:
  • text (str) – text containing ISBNs
  • match_func (callable) – function to reformat matched ISBNs
Returns:

reformatted text

Return type:

str

scripts.isbn.is_valid(isbn)[source]

Check whether an ISBN 10 or 13 is valid.
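
As an illustration, the helper functions documented above can also be used from Python directly. This is a minimal sketch: it assumes a Pywikibot checkout where scripts.isbn is importable, and that convertIsbn10toIsbn13 accepts wikitext containing ISBNs, as its text parameter suggests:

from scripts.isbn import convertIsbn10toIsbn13, is_valid

# 0-306-40615-2 is a well-known valid ISBN-10; its ISBN-13 form is
# 978-0-306-40615-7.
print(is_valid('0-306-40615-2'))                    # expected: True
print(convertIsbn10toIsbn13('ISBN 0-306-40615-2'))  # ISBN-13 substituted in the text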

scripts.isbn.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.listpages script

Print a list of pages, as defined by page generator parameters.

Optionally, it also prints page content to STDOUT or save it to a file in the current directory.

These parameters are supported to specify which pages titles to print:

-format     Defines the output format.

            Can be a custom string according to python string.format() notation
            or can be selected by a number from following list
            (1 is default format):
            1 - '{num:4d} {page.title}'
                --> 10 PageTitle

            2 - '{num:4d} [[{page.title}]]'
                --> 10 [[PageTitle]]

            3 - '{page.title}'
                --> PageTitle

            4 - '[[{page.title}]]'
                --> [[PageTitle]]

            5 - '{num:4d} \03{{lightred}}{page.loc_title:<40}\03{{default}}'
                --> 10 localised_Namespace:PageTitle (colorised in lightred)

            6 - '{num:4d} {page.loc_title:<40} {page.can_title:<40}'
                --> 10 localised_Namespace:PageTitle
                       canonical_Namespace:PageTitle

            7 - '{num:4d} {page.loc_title:<40} {page.trs_title:<40}'
                --> 10 localised_Namespace:PageTitle
                       outputlang_Namespace:PageTitle
                (*) requires "outputlang:lang" set.

            num is the sequential number of the listed page.

            An empty format is equal to -notitle and just shows the total
            number of pages.

-outputlang Language for translation of namespaces.

-notitle    Page title is not printed.

-get        Page content is printed.

-save       Save Page content to a file named as page.title(as_filename=True).
            Directory can be set with -save:dir_name
            If no dir is specified, current directory will be used.

-encode     File encoding can be specified with '-encode:name' (name must be
            a valid python encoding: utf-8, etc.).
            If not specified, it defaults to config.textfile_encoding.

-put:       Save the list to the defined page of the wiki. By default it does
            not overwrite an existing page.

-overwrite  Overwrite the page if it exists. Can only be applied with -put.

-summary:   The summary text when the page is written. If it's one word
            containing only letters, dashes and underscores, it is used as
            a translation key.

Custom format can be applied to the following items extrapolated from a page object:

site: obtained from page._link._site.

title: obtained from page._link._title.

loc_title: obtained from page._link.canonical_title().

can_title: obtained from page._link.ns_title().
    based either on the canonical namespace name or on the namespace name
    in the language specified by the -trans param;
    a default value '******' will be used if no ns is found.

onsite: obtained from pywikibot.Site(outputlang, self.site.family).

trs_title: obtained from page._link.ns_title(onsite=onsite).
    If selected format requires trs_title, outputlang must be set.

This script supports use of pywikibot.pagegenerators arguments.
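
Example: list all pages of a category as wikilinks and save their content to the current directory (illustrative; the category name is a placeholder):

python pwb.py listpages -cat:"Example category" -format:4 -save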

class scripts.listpages.Formatter(page, outputlang=None, default='******')[source]

Bases: object

Structure with Page attributes exposed for formatting from cmd line.

__init__(page, outputlang=None, default='******')[source]

Initializer.

Parameters:
  • page (Page object.) – the page to be formatted.
  • outputlang (str or None, if no translation is wanted.) –

    language code in which namespace before title should be translated.

    Page ns will be searched in Site(outputlang, page.site.family) and, if found, its custom name will be used in page.title().

  • default – default string to be used if no corresponding namespace is found when outputlang is not None.
__module__ = 'scripts.listpages'
fmt_need_lang = ['7']
fmt_options = {'1': '{num:4d} {page.title}', '2': '{num:4d} [[{page.title}]]', '3': '{page.title}', '4': '[[{page.title}]]', '5': '{num:4d} \x03{{lightred}}{page.loc_title:<40}\x03{{default}}', '6': '{num:4d} {page.loc_title:<40} {page.can_title:<40}', '7': '{num:4d} {page.loc_title:<40} {page.trs_title:<40}'}
output(num=None, fmt=1)[source]

Output formatted string.

scripts.listpages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.login script

Script to log the bot in to a wiki account.

Suggestion is to make a special account to use for bot use only. Make sure this bot account is well known on your home wiki before using.

The following parameters are supported:

-family:FF
-lang:LL     Log in to the LL language of the FF family.
             Example: -family:wiktionary -lang:fr will log you in at
             fr.wiktionary.org.

-all         Try to log in on all sites where a username is defined in
             user-config.py.

-logout      Log out of the current site. Combine with -all to log out of
             all sites, or with -family and -lang to log out of a specific
             site.

-force       Ignores if the user is already logged in, and tries to log in.

-pass        Useful in combination with -all when you have accounts for
             several sites and use the same password for all of them.
             Asks you for the password, then logs in on all given sites.

-pass:XXXX   Uses XXXX as password. Be careful if you use this
             parameter because your password will be shown on your
             screen, and will probably be saved in your command line
             history. This is NOT RECOMMENDED for use on computers
             where others have either physical or remote access.
             Use -pass instead.

-sysop       Log in with your sysop account.

-oauth       Generate OAuth authentication information.
             NOTE: Need to copy OAuth tokens to your user-config.py
             manually. -logout, -pass, -force, -pass:XXXX and -sysop are not
             compatible with -oauth.

-autocreate  Auto-create an account using unified login when necessary.
             Note: the global account must exist already before using this.

If not given as parameter, the script will ask for your username and password (password entry will be hidden), log in to your home wiki using this combination, and store the resulting cookies (containing your password hash, so keep it secured!) in a file in the data subdirectory.

All scripts in this library will be looking for this cookie file and will use the login information if it is present.

To log out, throw away the *.lwp file that is created in the data subdirectory.
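
Examples (illustrative invocations of the options above):

python pwb.py login -family:wiktionary -lang:fr

python pwb.py login -all -pass

python pwb.py login -logout -all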

scripts.login.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.lonelypages script

This is a script written to add the template “orphan” to pages.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-enable:          Enable or disable the bot via a Wiki Page.

-disambig:        Set a page where the bot saves the name of the disambig
                  pages found (default: skip the pages)

-always           Always say yes, won't ask

Example

python pwb.py lonelypages -enable:User:Bot/CheckBot -always

class scripts.lonelypages.LonelyPagesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot

Orphan page tagging bot.

__init__(generator, **kwargs)[source]

Initializer.

__module__ = 'scripts.lonelypages'
enable_page()[source]

Enable or disable bot via wiki page.

settings

Return the settings for the configured site.

setup()[source]

Set up the bot.

If the enable page is set to disable, set an empty generator which turns off the bot (useful when the bot is run on a server).

treat(page)[source]

Check if the page is applicable and not already marked, and then add the template.

class scripts.lonelypages.OrphanTemplate(site, name, parameters, aliases=None, subst=False)[source]

Bases: object

The orphan template configuration.

__init__(site, name, parameters, aliases=None, subst=False)[source]

Initializer.

__module__ = 'scripts.lonelypages'
scripts.lonelypages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.makecat script

Bot to add new or existing categories to pages.

This bot takes as its argument the name of a new or existing category. Multiple categories may be given. It will then try to find new articles for these categories (pages linked to and from pages already in the category), asking the user which pages to include and which not.

The following command line parameters are supported:

-nodates     Automatically skip all pages that are years or dates
             (years only work AD, dates only for certain languages).

-forward     Only check pages linked from pages already in the category,
             not pages linking to them. Is less precise but quite a bit faster.

-exist       Only ask about pages that do actually exist;
             drop any titles of non-existing pages silently.
             If -forward is chosen, -exist is automatically implied.

-keepparent  Do not remove parent categories of the category to be worked on.

-all         Work on all pages (default: only main namespace)

When running the bot, you will be shown a number of pages one by one. You can choose:

  • [y]es - include the page
  • [n]o - do not include the page or
  • [i]gnore - do not include the page, but if you meet it again, ask again.

Other possibilities

  • [m]ore - show more content of the page starting from the beginning
  • sort [k]ey - add with sort key like [[Category|Title]]
  • [s]kip - add the page, but skip checking links to and from it
  • [c]heck - check links to and from the page, but do not add the page itself
  • [o]ther - add another page, which may have been included before
  • [l]ist - show current list of pages to include or to check
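
An illustrative invocation, using the options described above (the category name is a placeholder):

python pwb.py makecat "Fictional characters" -forward -nodates
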
class scripts.makecat.MakeCatBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.NoRedirectPageBot

Bot tries to find new articles for a given category.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.makecat'
asktoadd(pl, summary)[source]

Work on current page and ask to add article to category.

change_category(page, catlist)[source]

Change the category of page.

include(pl, checklinks=True, realinclude=True, linkterm=None, summary='')[source]

Include the current page to the working category.

needcheck(page)[source]

Verify whether the current page may be processed.

skip_page(page)[source]

Check whether the page is to be skipped.

scripts.makecat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.match_images script

Program to match two images based on histograms.

Usage:

python pwb.py match_images ImageA ImageB

It is essential to provide two images to work on.

Furthermore, the following command line parameters are supported:

-otherfamily        The family given with this parameter will be preferred
                    for fetching file usage details instead of the default
                    family retrieved from the user-config.py script.

-otherlang          The lang given with this parameter will be preferred for
                    fetching file usage details instead of the default
                    mylang retrieved from the user-config.py script.

This is just a first version so that other people can play around with it. Expect the code to change a lot!

scripts.match_images.get_image_from_image_page(imagePage)[source]

Get the image object to work based on an imagePage object.

scripts.match_images.main(*args)[source]

Extract file page information and initiate matching.

scripts.match_images.match_image_pages(imagePageA, imagePageB)[source]

The function expects two image page objects.

It will return True if the images are the same, and False otherwise.

scripts.match_images.match_images(imageA, imageB)[source]

Match two image objects. Return the ratio of pixels that match.

scripts.misspelling script

This script fixes links that contain common spelling mistakes.

This is only possible on wikis that have a template for these misspellings.

Command line options:

-always:XY  instead of asking the user what to do, always perform the same
            action. For example, XY can be "r0", "u" or "2". Be careful with
            this option, and check the changes made by the bot. Note that
            some choices for XY don't make sense and will result in a loop,
            e.g. "l" or "m".

-start:XY   goes through all misspellings in the category on your wiki
            that is defined (to the bot) as the category containing
            misspelling pages, starting at XY. If the -start argument is not
            given, it starts at the beginning.

-main       only check pages in the main namespace, not in the talk,
            wikipedia, user, etc. namespaces.
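
Example: work through the misspelling category starting at "B", checking only the main namespace (illustrative):

python pwb.py misspelling -start:B -main
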
class scripts.misspelling.MisspellingRobot(always, firstPageTitle, main_only)[source]

Bases: scripts.solve_disambiguation.DisambiguationRobot

Spelling bot.

__init__(always, firstPageTitle, main_only)[source]

Initializer.

__module__ = 'scripts.misspelling'
createPageGenerator(firstPageTitle)[source]

Generator to retrieve misspelling pages or misspelling redirects.

Return type:generator
findAlternatives(disambPage)[source]

Append link target to a list of alternative links.

Overrides the DisambiguationRobot method.

Returns:True if alternate link was appended
Return type:bool or None
misspellingCategory = {'da': 'Omdirigeringer af fejlstavninger', 'de': ('Kategorie:Wikipedia:Falschschreibung', 'Kategorie:Wikipedia:Obsolete Schreibung'), 'en': 'Redirects from misspellings', 'hu': 'Átirányítások hibás névről', 'nl': 'Categorie:Wikipedia:Redirect voor spelfout'}
misspellingTemplate = {'de': ('Falschschreibung', 'Obsolete Schreibung')}
setSummaryMessage(disambPage, *args, **kwargs)[source]

Set up the summary message.

Overrides the DisambiguationRobot method.

scripts.misspelling.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.movepages script

This script can move pages.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-from and -to     The page to move from and the page to move to.

-noredirect       Leave no redirect behind.

-notalkpage       Do not move this page's talk page (if it exists)

-prefix           Move pages by adding a namespace prefix to the names of the
                  pages. (Will remove the old namespace prefix if any)
                  Argument can also be given as "-prefix:namespace:".

-always           Don't prompt to make changes, just do them.

-skipredirects    Skip redirect pages (Warning: increases server load)

-summary          Prompt for a custom summary, bypassing the predefined message
                  texts. Argument can also be given as "-summary:XYZ".

-pairsfile        Read pairs of file names from a file. The file must be in a
                  format [[frompage]] [[topage]] [[frompage]] [[topage]] ...
                  Argument can also be given as "-pairsfile:filename"
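
Example: move a single page and leave no redirect behind (the titles are placeholders):

python pwb.py movepages -from:"Old title" -to:"New title" -noredirect
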
class scripts.movepages.MovePagesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot

Page move bot.

__init__(generator, **kwargs)[source]

Initializer.

__module__ = 'scripts.movepages'
moveOne(page, newPageTitle)[source]

Move one page to newPageTitle.

treat(page)[source]

Treat a single page.

scripts.movepages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.ndashredir script

A script to create hyphenated redirects for n or m dash pages.

This script collects pages with n or m dash in their title and creates a redirect from the corresponding hyphenated version. If the redirect already exists, it is skipped.

Use -reversed option to create n dash redirects for hyphenated pages. Some communities can decide to use hyphenated titles for templates, modules or categories and in this case this option can be handy.

The following parameters are supported:

-always           don't ask for confirmation when putting a page

-reversed         create n dash redirects for hyphenated pages

-summary:         set custom summary message for the edit

The following generators and filters are supported:

This script supports use of pywikibot.pagegenerators arguments.
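
Example: create hyphenated redirects for all dash pages of a wiki without prompting (illustrative; -start is a pagegenerators argument):

python pwb.py ndashredir -start:! -always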

class scripts.ndashredir.DashRedirectBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot

Bot to create hyphenated or dash redirects.

__init__(generator, **kwargs)[source]

Initializer.

Parameters:generator (generator) – the page generator that determines which pages to work on
__module__ = 'scripts.ndashredir'
treat_page()[source]

Do the magic.

scripts.ndashredir.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.newitem script

This script creates new items on Wikidata based on certain criteria.

  • When was the (Wikipedia) page created?
  • When was the last edit on the page?
  • Does the page contain interwikis?

This script understands various command-line arguments:

-lastedit         The minimum number of days that has passed since the page was
                  last edited.

-pageage          The minimum number of days that has passed since the page was
                  created.

-touch            Do a null edit on every page which has a wikibase item.
                  Be careful, this option can trigger edit rate limits or
                  captchas if your account is not autoconfirmed.
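
Example: create items for pages that are at least 21 days old and have not been edited for a week (illustrative; -start is a pagegenerators argument):

python pwb.py newitem -start:! -pageage:21 -lastedit:7
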
class scripts.newitem.NewItemRobot(generator, **kwargs)[source]

Bases: pywikibot.bot.WikidataBot

A bot to create new items.

__init__(generator, **kwargs)[source]

Only accepts options defined in availableOptions.

__module__ = 'scripts.newitem'
treat_missing_item = True
treat_page_and_item(page, item)[source]

Treat page/item.

scripts.newitem.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.noreferences script

This script adds a missing references section to pages.

It goes over multiple pages, searches for pages where <references /> is missing although a <ref> tag is present, and in that case adds a new references section.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-xml          Retrieve information from a local XML dump (pages-articles
              or pages-meta-current, see https://dumps.wikimedia.org).
              Argument can also be given as "-xml:filename".

-always       Don't prompt you for each replacement.

-quiet        Use this option to get less output

If neither a page title nor a page generator is given, it takes all pages from the default maintenance category.

It is strongly recommended not to run this script over the entire article namespace (using the -start parameter), as that would consume too much bandwidth. Instead, use the -xml parameter, or use another way to generate a list of affected articles.
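
For example, working from a local dump instead of the live wiki (the filename is a placeholder):

python pwb.py noreferences -xml:dump.xml -always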

class scripts.noreferences.NoReferencesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

References section bot.

__init__(generator, **kwargs)[source]

Initializer.

__module__ = 'scripts.noreferences'
addReferences(oldText)[source]

Add a references tag into an existing section where it fits.

If there is no such section, creates a new section containing the references tag. Also repairs malformed references tags and sets the edit summary accordingly.

Parameters:oldText (str) – page text to be modified
Returns:The modified pagetext
Return type:str
createReferenceSection(oldText, index, ident='==')[source]

Create a reference section and insert it into the given text.

Parameters:
  • oldText (str) – page text that is going to be amended
  • index (int) – the index of oldText where the reference section should be inserted at
  • ident (str) – symbols to be inserted before and after reference section title
Returns:

the amended page text with reference section added

Return type:

str

lacksReferences(text)[source]

Check whether or not the page is lacking a references tag.

run()[source]

Run the bot.

scripts.noreferences.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.nowcommons script

Script to delete files that are also present on Wikimedia Commons.

Do not run this script on Wikimedia Commons itself. It works based on a given array of templates defined below.

Files are downloaded and compared. If the files match, the local file can be deleted on the source wiki. If multiple versions of the file exist, or if the SHA1 comparison does not match, the script will not delete it.

A sysop account on the local wiki is required if you want all features of this script to work properly.

This script understands various command-line arguments:

-always         run automatically, do not ask any questions. All files
                that qualify for deletion are deleted. Reduced screen
                output.

-replace        replace links if the files are equal and the file names
                differ

-replacealways  replace links if the files are equal and the file names
                differ without asking for confirmation

-replaceloose   Do loose replacements. This will replace all occurrences
                of the name of the image (and not just explicit image
                syntax).  This should work to catch all instances of the
                file, including where it is used as a template parameter
                or in galleries. However, it can also make more mistakes.

-replaceonly    Use this if you do not have a local sysop account, but do
                wish to replace links from the NowCommons template.

Example

python pwb.py nowcommons -replaceonly -replaceloose -replacealways -replace

class scripts.nowcommons.NowCommonsDeleteBot(**kwargs)[source]

Bases: pywikibot.bot.Bot

Bot to delete migrated files.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.nowcommons'
findFilenameOnCommons(localImagePage)[source]

Find filename on Commons.

generator

Generator method.

ncTemplates()[source]

Return nowcommons templates.

nc_templates

A set of now commons template Page instances.

run()[source]

Run the bot.

scripts.nowcommons.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.pagefromfile script

Bot to upload pages from a text file.

This bot takes its input from the UTF-8 text file that contains a number of pages to be put on the wiki. The pages should all have the same beginning and ending text (which may not overlap). The beginning and ending text is not uploaded with the page content by default.

By default, the page name is taken from the first text block of the page content that is marked in bold (wrapped between ''' and '''). If you expect the page title not to be present in the text, or to be marked by different markers, use the -titlestart, -titleend, and -notitle parameters.

Specific arguments:

-file:xxx       The filename we are getting our material from,
                the default value is "dict.txt"
-begin:xxx      The text that marks the beginning of a page,
                the default value is "{{-start-}}"
-end:xxx        The text that marks the end of the page,
                the default value is "{{-stop-}}"
-include        Include the beginning and end markers to the page
-textonly       Text is given without markers. Only one page text is given.
                -begin and -end options are ignored.
-titlestart:xxx The text used in place of ''' for identifying
                the beginning of a page title
-titleend:xxx   The text used in place of ''' for identifying
                the end of the page title
-notitle        Do not include the page title, including titlestart
                and titleend, to the page. Can be used to specify unique
                page title above the page content
-title:xxx      The page title is given directly. Ignores -titlestart,
                -titleend and -notitle options
-nocontent:xxx  If the existing page contains specified statement,
                the page is skipped from editing
-noredirect     Do not upload on redirect pages
-summary:xxx    The text used as an edit summary for the upload.
                If the page exists, standard messages for prepending,
                appending, or replacement are appended after it
-autosummary    Use MediaWiki's autosummary when creating a new page,
                overrides -summary
-minor          Set the minor edit flag on page edits
-showdiff       Show difference between current page and page to upload,
                also forces the bot to ask for confirmation
                on every edit

If the page to be uploaded already exists, it is skipped by default. But you can override this behavior if you want to:

-appendtop      Add the text to the top of the existing page
-appendbottom   Add the text to the bottom of the existing page
-force          Overwrite the existing page

It is possible to define a separator after the ‘append’ modes which is added between the existing and the new text. For example, the parameter -appendtop:foo would add ‘foo’ between them. A new line can be added between them by specifying ‘\n’ as a value.
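
For illustration, a minimal input file in the default format could look like this (title and text are placeholders; the page title is taken from the bold block):

{{-start-}}
'''Example title'''
Text of the page goes here.
{{-stop-}}

It could then be uploaded with (filename and summary are placeholders):

python pwb.py pagefromfile -file:pages.txt -summary:"Bot: creating pages from text file"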

exception scripts.pagefromfile.NoTitle(offset)[source]

Bases: Exception

No title found.

__init__(offset)[source]

Initializer.

__module__ = 'scripts.pagefromfile'
class scripts.pagefromfile.PageFromFileReader(filename, **kwargs)[source]

Bases: pywikibot.bot.OptionHandler

Generator class, responsible for reading the file.

__init__(filename, **kwargs)[source]

Initializer.

Check if self.file name exists. If not, ask for a new filename. User can quit.

__iter__()[source]

Read file and yield a tuple of page title and content.

__module__ = 'scripts.pagefromfile'
availableOptions = {'begin': '{{-start-}}', 'end': '{{-stop-}}', 'include': False, 'notitle': False, 'textonly': False, 'title': None, 'titleend': "'''", 'titlestart': "'''"}
findpage(text)[source]

Find page to work on.

class scripts.pagefromfile.PageFromFileRobot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.CurrentPageBot

Responsible for writing pages to the wiki.

Titles and contents are given by a PageFromFileReader.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.pagefromfile'
init_page(item)[source]

Get the tuple and return the page object to be processed.

treat_page()[source]

Upload page content.

scripts.pagefromfile.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.patrol script

The bot is meant to mark edits as patrolled based on info obtained from a whitelist.

This bot obtains a list of recent changes and newpages and marks the edits as patrolled based on a whitelist.

WHITELIST FORMAT

The whitelist is formatted as a number of list entries. Any links outside of lists are ignored and can be used for documentation. In a list, the first link must be to the username which should be whitelisted, and any other link following adds that page to the whitelist of that username. If the user edited a page on their whitelist, it gets patrolled. It will also patrol pages which start with the mentioned link (e.g. [[foo]] will also patrol [[foobar]]).

To avoid redlinks it’s possible to use Special:PrefixIndex as a prefix so that it will list all pages which will be patrolled. The page after the slash will be used then.

On Wikisource, it’ll also check if the page is in the author namespace, in which case it’ll also patrol pages which are linked from that page.

An example can be found at https://en.wikisource.org/wiki/User:Wikisource-bot/patrol_whitelist

Commandline parameters:

-namespace         Filter the page generator to only yield pages in
                   specified namespaces
-ask               If True, confirm each patrol action
-whitelist         page title for whitelist (optional)
-autopatroluserns  Takes user consent to automatically patrol
-versionchecktime  Check versionchecktime lapse in sec
-repeat            Repeat run after 60 seconds
-newpages          Run on unpatrolled new pages
                   (default for Wikipedia Projects)
-recentchanges     Run on complete unpatrolled recentchanges
                   (default for any project except Wikipedia Projects)
-usercontribs      Filter generators above to the given user
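
Example: patrol unpatrolled new pages in the main namespace, repeating the run every 60 seconds (illustrative):

python pwb.py patrol -namespace:0 -newpages -repeat
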
class scripts.patrol.LinkedPagesRule(page_title)[source]

Bases: object

Matches of page site title and linked pages title.

__init__(page_title)[source]

Initializer.

Parameters:page_title (pywikibot.Page) – The page title for this rule
__module__ = 'scripts.patrol'
match(page_title)[source]

Match page_title to linkedpages elements.

title()[source]

Obtain page title.

class scripts.patrol.PatrolBot(site=True, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot

Bot marks the edits as patrolled based on info obtained by whitelist.

__init__(site=True, **kwargs)[source]

Initializer.

Keyword Arguments:
 
  • ask – If True, confirm each patrol action
  • whitelist – page title for whitelist (optional)
  • autopatroluserns – Takes user consent to automatically patrol
  • versionchecktime – Check versionchecktime lapse in sec
__module__ = 'scripts.patrol'
in_list(pagelist, title)[source]

Check if title present in pagelist.

is_wikisource_author_page(title)[source]

Initialise author_ns if the site family is ‘wikisource’; otherwise do nothing.

load_whitelist()[source]

Load the most recent whitelist page for further processing.

parse_page_tuples(wikitext, user=None)[source]

Parse page details apart from ‘user:’ for use.

run(feed)[source]

Process ‘whitelist’ page absent in generator.

treat(page)[source]

It loads the given page, does some changes, and saves it.

whitelist_subpage_name = {'en': 'patrol_whitelist'}
scripts.patrol.api_feed_repeater(gen, delay=0, repeat=False, namespaces=None, user=None, recent_new_gen=True)[source]

Generator which loads pages details to be processed.

scripts.patrol.main(*args)[source]

Process command line arguments and invoke PatrolBot.

scripts.patrol.verbose_output(string)[source]

Verbose output.

scripts.piper script

This bot uses external filtering programs for munging text.

For example:

python pwb.py piper -filter:"tr A-Z a-z" -page:Wikipedia:Sandbox

Would lower case the article with tr(1).

Multiple -filter commands can be specified:

python pwb.py piper -filter:cat -filter:"tr A-Z a-z" -filter:"tr a-z A-Z" \
    -page:Wikipedia:Sandbox

Would pipe the article text through cat(1) (NOOP), then lower case it with tr(1) and upper case it again with tr(1).

The following parameters are supported:

-always        Always commit changes without asking you to accept them

-filter:       Filter the article text through this program, can be
               given multiple times to filter through multiple programs in
               the order which they are given

The following generators and filters are supported:

This script supports use of pywikibot.pagegenerators arguments.

class scripts.piper.PiperBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot, pywikibot.bot.AutomaticTWSummaryBot

Bot for munging text using external filtering programs.

__init__(generator, **kwargs)[source]

Initializer.

Parameters:generator (generator) – The page generator that determines on which pages to work on.
__module__ = 'scripts.piper'
pipe(program, text)[source]

Pipe a given text through a given program.

Returns:processed text after piping
Return type:str
summary_key = 'piper-edit-summary'
summary_parameters

Return the filter parameter.

treat_page()[source]

Load the given page, do some changes, and save it.

scripts.piper.main(*args)[source]

Create and run a PiperBot instance from the given command arguments.

scripts.protect script

This script can be used to protect and unprotect pages en masse.

Of course, you will need an admin account on the relevant wiki. These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-always           Don't prompt to protect pages, just do it.

-summary:         Supply a custom edit summary. Tries to generate a summary
                  from the page selector. If no summary is supplied, or none
                  can be determined from the selector, it will ask for one.

-expiry:          Supply a custom protection expiry, which defaults to
                  indefinite. Any string understandable by MediaWiki, including
                  relative and absolute, is acceptable. See:
                  https://www.mediawiki.org/wiki/API:Protect#Parameters

-unprotect        Acts like "default:all"

-default:         Sets the default protection level (default 'sysop'). If no
                  level is defined it doesn't change unspecified levels.

-[type]:[level]   Set [type] protection level to [level]

Usual values for [level] are: sysop, autoconfirmed, all; further levels may be provided by some wikis.

For all protection types (edit, move, etc.) it chooses the default protection level. This is “sysop” or “all” if -unprotect was selected. If multiple parameters -unprotect or -default are used, only the last occurrence is applied.

Usage:

python pwb.py protect <OPTIONS>

Examples

Protect everything in the category ‘To protect’ prompting:

python pwb.py protect -cat:"To protect"

Unprotect all pages listed in text file ‘unprotect.txt’ without prompting:

python pwb.py protect -file:unprotect.txt -unprotect -always
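
Set the edit protection of a single page to sysop and its move protection to autoconfirmed, with a one-week expiry (an illustrative combination of the -[type]:[level] and -expiry options described above; -page is a pagegenerators argument and the title is a placeholder):

python pwb.py protect -page:"Example page" -edit:sysop -move:autoconfirmed -expiry:"1 week" -always
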
class scripts.protect.ProtectionRobot(generator, protections, site=None, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot

This bot allows protection of pages en masse.

__init__(generator, protections, site=None, **kwargs)[source]

Create a new ProtectionRobot.

Parameters:
  • generator (generator) – the page generator
  • protections (dict) – protections as a dict with “type”: “level”
  • site (None, True or Site) – The site to which the protections apply. By default it’s using the site of the first page returned from the generator. If True it’s using the configured site.
  • kwargs – additional arguments directly feed to Bot.__init__()
__module__ = 'scripts.protect'
treat(page)[source]

Run the bot’s action on each page.

Bot.run() loops through everything in the page generator and applies the protections using this function.

scripts.protect.check_protection_level(operation, level, levels, default=None)[source]

Check if the protection level is valid or ask if necessary.

Returns:a valid protection level
Return type:str
scripts.protect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.redirect script

Script to resolve double redirects, and to delete broken redirects.

Requires access to MediaWiki’s maintenance pages or to an XML dump file. The delete function requires adminship.

Syntax:

python pwb.py redirect action [-arguments ...]

where action can be one of these

double:Shortcut: do. Fix redirects which point to other redirects.
broken:Shortcut: br. Tries to fix redirects which point to nowhere by using the last moved target of the destination page. If this fails and the -delete option is set, it either deletes the page or marks it for deletion, depending on whether the account has admin rights. It will not mark the redirect for deletion if there is no speedy deletion template available.
both:Both of the above. Retrieves redirect pages from the live wiki, not from a special page.

and arguments can be:

-xml           Retrieve information from a local XML dump
               (https://dumps.wikimedia.org). Argument can also be given as
               "-xml:filename.xml". Cannot be used with -fullscan or -moves.

-fullscan      Retrieve redirect pages from live wiki, not from a special page
               Cannot be used with -xml.

-moves         Use the page move log to find double-redirect candidates. Only
               works with action "double", does not work with -xml.

               NOTE: You may use only one of these options above.
               If neither of -xml -fullscan -moves is given, info will be
               loaded from a special page of the live wiki.

-page:title    Work on a single page

-namespace:n   Namespace to process. Can be given multiple times, for several
               namespaces. If omitted, only the main (article) namespace is
               treated.

-offset:n      With -moves, the number of hours ago to start scanning moved
               pages. With -xml, the number of the redirect to restart with
               (see progress). Otherwise, ignored.

-start:title   The starting page title in each namespace. Page need not exist.

-until:title   The possible last page title in each namespace. The page need
               not exist.

-limit:n       The maximum count of redirects to work upon. If omitted, there
               is no limit.

-delete        Prompt the user whether broken redirects should be deleted (or
               marked for deletion if the account has no admin rights) instead
               of just skipping them.

-sdtemplate:x  Add the speedy deletion template string including brackets.
               This overrides the default template from i18n and enables
               speedy deletion for projects other than Wikipedias.

-always        Don't prompt you for each replacement.
class scripts.redirect.RedirectGenerator(action, **kwargs)[source]

Bases: pywikibot.bot.OptionHandler

Redirect generator.

__init__(action, **kwargs)[source]

Initializer.

__module__ = 'scripts.redirect'
availableOptions = {'fullscan': False, 'limit': None, 'moves': False, 'namespaces': {0}, 'offset': -1, 'page': None, 'start': None, 'until': None, 'xml': None}
get_moved_pages_redirects()[source]

Generate redirects to recently-moved pages.

get_redirect_pages_via_api()[source]

Yield Pages that are redirects.

get_redirects_from_dump(alsoGetPageTitles=False)[source]

Extract redirects from dump.

Load a local XML dump file, look at all pages which have the redirect flag set, and find out where they’re pointing at. Return a dictionary where the redirect names are the keys and the redirect targets are the values.

get_redirects_via_api(maxlen=8)[source]

Return a generator that yields tuples of data about redirect Pages.

The description of the returned tuple items is as follows (a usage sketch follows this description):

[0]:          page title of a redirect page

[1]:          type of redirect:

              None:        start of a redirect chain of unknown length,
                           or loop
              [0]:         broken redirect, target page title missing
              [1]:         normal redirect, target page exists and is not
                           a redirect
              [2:maxlen]:  start of a redirect chain of that many redirects
                           (currently, the API seems not to return sufficient
                           data to make these return values possible, but
                           that may change)
              [maxlen+1]:  start of an even longer chain, or a loop
                           (currently, the API seems not to return sufficient
                           data to allow these return values, but that may
                           change)

[2]:          target page title of the redirect, or chain (may not exist)

[3]:          target page of the redirect, or end of chain, or page title
              where chain or loop detection was halted, or None if unknown
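
A minimal sketch that iterates over these tuples, assuming the default site is configured in user-config.py:

from scripts.redirect import RedirectGenerator

gen = RedirectGenerator('double', limit=10)
for title, redir_type, target, final in gen.get_redirects_via_api(maxlen=8):
    if redir_type == 0:    # broken redirect: the target page is missing
        print(title, '->', target, '(broken)')
    elif redir_type == 1:  # normal redirect to an existing page
        print(title, '->', final)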

retrieve_broken_redirects()[source]

Retrieve broken redirects.

retrieve_double_redirects()[source]

Retrieve double redirects.

class scripts.redirect.RedirectRobot(action, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.RedirectPageBot

Redirect bot.

__init__(action, **kwargs)[source]

Initializer.

__module__ = 'scripts.redirect'
delete_1_broken_redirect()[source]

Treat one broken redirect.

delete_redirect(page, summary_key)[source]

Delete the redirect page.

fix_1_double_redirect()[source]

Treat one double redirect.

fix_double_or_delete_broken_redirect()[source]

Treat one broken or double redirect.

get_sd_template()[source]

Look for speedy deletion template and return it.

Returns:A valid speedy deletion template.
Return type:str or None
init_page(item)[source]

Ensure that we process page objects.

treat(page)[source]

Treat a page.

scripts.redirect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.redirect.space_to_underscore(link)[source]

Convert spaces to underscore.

scripts.replace script

This bot will make direct text replacements.

It will retrieve information on which pages might need changes either from an XML dump or a text file, or only change a single page.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-mysqlquery       Retrieve information from a local database mirror.
                  If no query is specified, the bot searches for pages
                  with the given replacements.

-xml              Retrieve information from a local XML dump
                  (pages-articles or pages-meta-current, see
                  https://dumps.wikimedia.org). Argument can also
                  be given as "-xml:filename".

-regex            Make replacements using regular expressions. If this argument
                  isn't given, the bot will make simple text replacements.

-nocase           Use case insensitive regular expressions.

-dotall           Make the dot match any character at all, including a newline.
                  Without this flag, '.' will match anything except a newline.

-multiline        '^' and '$' will now match begin and end of each line.

-xmlstart         (Only works with -xml) Skip all articles in the XML dump
                  before the one specified (may also be given as
                  -xmlstart:Article).

-addcat:cat_name  Adds "cat_name" category to every altered page.

-excepttitle:XYZ  Skip pages with titles that contain XYZ. If the -regex
                  argument is given, XYZ will be regarded as a regular
                  expression.

-requiretitle:XYZ Only do pages with titles that contain XYZ. If the -regex
                  argument is given, XYZ will be regarded as a regular
                  expression.

-excepttext:XYZ   Skip pages which contain the text XYZ. If the -regex
                  argument is given, XYZ will be regarded as a regular
                  expression.

-exceptinside:XYZ Skip occurrences of the to-be-replaced text which lie
                  within XYZ. If the -regex argument is given, XYZ will be
                  regarded as a regular expression.

-exceptinsidetag:XYZ Skip occurrences of the to-be-replaced text which lie
                 within an XYZ tag.

-summary:XYZ      Set the summary message text for the edit to XYZ, bypassing
                  the predefined message texts with original and replacements
                  inserted. Can't be used with -automaticsummary.

-automaticsummary Uses an automatic summary for all replacements which don't
                  have a summary defined. Can't be used with -summary.

-sleep:123        If you use -fix, multiple regexes may be checked on every
                  page. This can consume a lot of CPU because the bot checks
                  every regex without pausing. This option inserts a delay
                  between one regex and the next to reduce the CPU load.

-fix:XYZ          Perform one of the predefined replacements tasks, which are
                  given in the dictionary 'fixes' defined inside the files
                  fixes.py and user-fixes.py.

                  The available fixes are listed in the pywikibot.fixes
                  module.

-manualinput      Request manual replacements via the command line input even
                  if replacements are already defined. If this option is set
                  (or no replacements are defined via -fix or the arguments)
                  it'll ask for additional replacements at start.

-pairsfile        Lines from the given file name(s) will be read as replacement
                  arguments, e.g. a file containing lines "a" and "b", used as:

                      python pwb.py replace -page:X -pairsfile:file c d

                  will replace 'a' with 'b' and 'c' with 'd'.

-always           Don't prompt you for each replacement

-recursive        Recurse replacement as long as possible. Be careful, this
                  might lead to an infinite loop.

-allowoverlap     When occurrences of the pattern overlap, replace all of them.
                  Be careful, this might lead to an infinite loop.

-fullsummary      Use one large summary for all command line replacements.
other:            First argument is the old text, second argument is the new
                  text. If the -regex argument is given, the first argument
                  will be regarded as a regular expression, and the second
                  argument might contain expressions like \1 or \g<name>. It
                  is possible to introduce more than one pair of old text and
                  replacement.

Examples

If you want to change templates from the old syntax, e.g. {{msg:Stub}}, to the new syntax, e.g. {{Stub}}, download an XML dump file (pages-articles) from https://dumps.wikimedia.org, then use this command:

python pwb.py replace -xml -regex "{{msg:(.*?)}}" "{{\1}}"

If you have a dump called foobar.xml and want to fix typos in articles, e.g. Errror -> Error, use this:

python pwb.py replace -xml:foobar.xml "Errror" "Error" -namespace:0

If you want to do more than one replacement at a time, use this:

python pwb.py replace -xml:foobar.xml "Errror" "Error" "Faail" "Fail" \
    -namespace:0

If you have a page called ‘John Doe’ and want to fix the format of ISBNs, use:

python pwb.py replace -page:John_Doe -fix:isbn

This command will change ‘referer’ to ‘referrer’, but not in pages which talk about HTTP, where the typo has become part of the standard:

python pwb.py replace referer referrer -file:typos.txt -excepttext:HTTP

Please type “python pwb.py replace -help | more” if you can’t read the top of the help.

class scripts.replace.ReplaceRobot(generator, replacements, exceptions={}, allowoverlap=False, recursive=False, addedCat=None, sleep=None, summary='', acceptall='[deprecated name of always]', **kwargs)[source]

Bases: pywikibot.bot.Bot

A bot that can do text replacements.

Parameters:
  • generator (generator) – generator that yields Page objects
  • replacements (list) – a list of Replacement instances or sequences of length 2 with the original text (as a compiled regular expression) and replacement text (as a string).
  • exceptions (dict) –

    a dictionary which defines when not to change an occurrence. This dictionary can have these keys:

    title
    A list of regular expressions. All pages with titles that are matched by one of these regular expressions are skipped.
    text-contains
    A list of regular expressions. All pages with text that contains a part which is matched by one of these regular expressions are skipped.
    inside
    A list of regular expressions. All occurrences are skipped which lie within a text region which is matched by one of these regular expressions.
    inside-tags
    A list of strings. These strings must be keys from the dictionary in textlib._create_default_regexes() or must be accepted by textlib._get_regexes().
  • allowoverlap (bool) – when matches overlap, all of them are replaced.
  • recursive – Recurse replacement as long as possible.
  • addedCat (pywikibot.Category or str or None) – category to be added to every page touched
  • sleep (int) – slow down between processing multiple regexes
  • summary (str) – Set the summary message text bypassing the default
Warning:

Be careful, this might lead to an infinite loop.

Keyword Arguments:
 
  • always – the user won’t be prompted before changes are made
  • site – Site the bot is working on.
Warning:

site parameter should be passed to constructor. Otherwise the bot takes the current site and warns the operator about the missing site
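
The replacements and exceptions parameters described above might be built like this minimal sketch; the typo fix is illustrative, and 'comment' and 'nowiki' are assumed to be accepted textlib tag names:

import re

import pywikibot
from pywikibot import pagegenerators
from scripts.replace import ReplaceRobot

site = pywikibot.Site()
gen = pagegenerators.PreloadingGenerator(
    pagegenerators.AllpagesPageGenerator(site=site, total=10))
# Each plain replacement is a (compiled regex, replacement string) pair.
replacements = [(re.compile('Errror'), 'Error')]
# Skip matches inside HTML comments and nowiki sections.
exceptions = {'inside-tags': ['comment', 'nowiki']}
bot = ReplaceRobot(gen, replacements, exceptions=exceptions,
                   summary='Bot: fixing a typo', site=site)
bot.run()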

__init__(generator, replacements, exceptions={}, allowoverlap=False, recursive=False, addedCat=None, sleep=None, summary='', acceptall='[deprecated name of always]', **kwargs)[source]

Initializer.

__module__ = 'scripts.replace'
apply_replacements(original_text, applied, page=None)[source]

Apply all replacements to the given text.

Return type:str, set
generate_summary(applied_replacements)[source]

Generate a summary message for the replacements.

isTextExcepted(original_text)[source]

Return True iff one of the exceptions applies for the given text.

Return type:bool
isTitleExcepted(title, exceptions=None)[source]

Return True iff one of the exceptions applies for the given title.

Return type:bool
run()[source]

Start the bot.

class scripts.replace.Replacement(old, new, use_regex=None, exceptions=None, case_insensitive=None, edit_summary=None, default_summary=True)[source]

Bases: scripts.replace.ReplacementBase

A single replacement with its own data.

__init__(old, new, use_regex=None, exceptions=None, case_insensitive=None, edit_summary=None, default_summary=True)[source]

Create a single replacement entry unrelated to a fix.

__module__ = 'scripts.replace'
case_insensitive

Return whether the search text is case insensitive.

classmethod from_compiled(old_regex, new, **kwargs)[source]

Create instance from already compiled regex.

get_inside_exceptions()[source]

Get exceptions on text (inside exceptions).

use_regex

Return whether the search text is using regex.

class scripts.replace.ReplacementBase(old, new, edit_summary=None, default_summary=True)[source]

Bases: object

The replacement instructions.

__init__(old, new, edit_summary=None, default_summary=True)[source]

Create a basic replacement instance.

__module__ = 'scripts.replace'
compile(use_regex, flags)[source]

Compile the search text.

container

Container object which contains this replacement.

A container object is an object that groups one or more replacements together and provides some properties that are common to all of them. For example, containers may define a common name for a group of replacements, or a common edit summary.

Container objects must have a “name” attribute.

description

Description of the changes that this replacement applies.

This description is used as the default summary of the replacement. If you do not specify an edit summary on the command line or in some other way, whenever you apply this replacement to a page and submit the changes to the MediaWiki server, the edit summary includes the descriptions of each replacement that you applied to the page.

edit_summary

Return the edit summary for this fix.

class scripts.replace.ReplacementList(use_regex, exceptions, case_insensitive, edit_summary, name)[source]

Bases: list

A list of replacements which all share some properties.

The shared properties are: use_regex, exceptions and case_insensitive.

Each entry in this list should be a ReplacementListEntry. The exceptions are compiled only once.

__init__(use_regex, exceptions, case_insensitive, edit_summary, name)[source]

Create a fix list which can contain multiple replacements.

__module__ = 'scripts.replace'
class scripts.replace.ReplacementListEntry(old, new, fix_set, edit_summary=None, default_summary=True)[source]

Bases: scripts.replace.ReplacementBase

A replacement entry for ReplacementList.

__init__(old, new, fix_set, edit_summary=None, default_summary=True)[source]

Create a replacement entry inside a fix set.

__module__ = 'scripts.replace'
case_insensitive

Return whether the fix set is case insensitive.

container

Container object which contains this replacement.

A container object is an object that groups one or more replacements together and provides some properties that are common to all of them. For example, containers may define a common name for a group of replacements, or a common edit summary.

Container objects must have a “name” attribute.

edit_summary

Return this entry’s edit summary or the fix’s summary.

exceptions

Return the exceptions of the fix set.

get_inside_exceptions()[source]

Get exceptions on text (inside exceptions).

use_regex

Return whether the fix set is using regex.

class scripts.replace.XmlDumpReplacePageGenerator(xmlFilename, xmlStart, replacements, exceptions, site)[source]

Bases: object

Iterator that will yield Pages that might contain text to replace.

These pages will be retrieved from a local XML dump file.

Parameters:
  • xmlFilename (str) – The dump’s path, either absolute or relative
  • xmlStart (str) – Skip all articles in the dump before this one
  • replacements (list of 2-tuples) – A list of 2-tuples of original text (as a compiled regular expression) and replacement text (as a string).
  • exceptions (dict) – A dictionary which defines when to ignore an occurrence. See the documentation of the ReplaceRobot initializer above.
__init__(xmlFilename, xmlStart, replacements, exceptions, site)[source]

Initializer.

__iter__()[source]

Iterator method.

__module__ = 'scripts.replace'
isTextExcepted(text)[source]

Return True iff one of the exceptions applies for the given text.

Return type:bool
isTitleExcepted(title)[source]

Return True iff one of the exceptions applies for the given title.

Return type:bool
scripts.replace.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.replace.precompile_exceptions(exceptions, use_regex, flags)[source]

Compile the exceptions with the given flags.

scripts.replace.prepareRegexForMySQL(pattern)[source]

Convert regex to MySQL syntax.

scripts.replicate_wiki script

This bot replicates pages in a wiki to a second wiki within one family.

Example

python pwb.py replicate_wiki [-r] -ns 10 -family:wikipedia -o nl li fy

or:

python pwb.py replicate_wiki [-r] -ns 10 -family:wikipedia -lang:nl li fy

to copy all templates from nlwiki to liwiki and fywiki. It will show which pages have to be changed if -r is not present, and will only actually write pages if -r is present.

You can add replicate_replace to your user-config.py, which has the following format:

replicate_replace = {
    'wikipedia:li': {'Hoofdpagina': 'Veurblaad'}
}

to replace all occurrences of ‘Hoofdpagina’ with ‘Veurblaad’ when writing to liwiki. Note that this does not take the origin wiki into account.

The following parameters are supported:

-r, --replace           actually replace pages (without this option
                        you will only get an overview page)

-o, --original          original wiki (you may use -lang:<code> option
                        instead)
-ns, --namespace        specify namespace

-dns, --dest-namespace  destination namespace (if different)

destination_wiki       destination wiki(s)
class scripts.replicate_wiki.SyncSites(options)[source]

Bases: object

Work is done in here.

__init__(options)[source]

Initializer.

__module__ = 'scripts.replicate_wiki'
check_namespace(namespace)[source]

Check an entire namespace.

check_namespaces()[source]

Check all namespaces, to be ditched for clarity.

check_page(pagename)[source]

Check one page.

check_sysops()[source]

Check if sysops are the same on all wikis.

generate_overviews()[source]

Create page on wikis with overview of bot results.

put_message(site)[source]

Return synchronization message.

scripts.replicate_wiki.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.replicate_wiki.multiple_replace(text, word_dict)[source]

Replace all occurrences in text of key value pairs in word_dict.
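
For example, with a mapping like the replicate_replace entry shown earlier, a minimal sketch:

from scripts.replicate_wiki import multiple_replace

text = multiple_replace('[[Hoofdpagina]]', {'Hoofdpagina': 'Veurblaad'})
print(text)  # prints: [[Veurblaad]]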

scripts.revertbot script

This script can be used for reverting certain edits.

The following command line parameters are supported:

-username         Edits of which user need to be reverted.
                  Default is bot's username (site.username())

-rollback         Rollback edits instead of reverting them.
                  Note that in rollback, no diff would be shown.

-limit:num        Use the last num contributions to be checked for revert.
                  Default is 500.

Users who want to customize the behaviour should subclass the BaseRevertBot and override its callback method. Here is a sample:

import re

import pywikibot
from scripts.revertbot import BaseRevertBot


class myRevertBot(BaseRevertBot):

    '''Example revert bot.'''

    def callback(self, item):
        '''Sample callback function for 'private' revert bot.

        :param item: an item from user contributions
        :type item: dict
        :rtype: bool
        '''
        if 'top' in item:
            page = pywikibot.Page(self.site, item['title'])
            text = page.get(get_redirect=True)
            pattern = re.compile(r'\[\[.+?:.+?\..+?\]\]', re.UNICODE)
            return bool(pattern.search(text))
        return False
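
The subclass can then be run against the account's recent contributions; a minimal sketch, assuming a configured default site:

import pywikibot

bot = myRevertBot(site=pywikibot.Site())
bot.revert_contribs()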
class scripts.revertbot.BaseRevertBot(site=None, **kwargs)[source]

Bases: pywikibot.bot.OptionHandler

Base revert bot.

Subclass this bot and override callback to get it to do something useful.

__init__(site=None, **kwargs)[source]

Initializer.

__module__ = 'scripts.revertbot'
availableOptions = {'comment': '', 'limit': 500, 'rollback': False}
callback(item)[source]

Callback function.

get_contributions(total=500, ns=None, max='[deprecated name of total]')[source]

Get contributions.

log(msg)[source]

Log the message msg.

revert(item)[source]

Revert a single item.

revert_contribs(callback=None)[source]

Revert contributions.

scripts.revertbot.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.revertbot.myRevertBot

alias of scripts.revertbot.BaseRevertBot

scripts.shell script

Spawns an interactive Python shell and imports the pywikibot library.

The following local option is supported:

-noimport Do not import the pywikibot library. All other arguments are
          ignored in this case.

Usage:

python pwb.py shell [args]

If no arguments are given, the pywikibot library will not be loaded.

scripts.shell.main(*args)[source]

Script entry point.

scripts.solve_disambiguation script

Script to help a human solve disambiguations by presenting a set of options.

Specify the disambiguation page on the command line.

The program will pick up the page, and look for all alternative links, and show them with a number adjacent to them. It will then automatically loop over all pages referring to the disambiguation page, and show 30 characters of context on each side of the reference to help you make the decision between the alternatives. It will ask you to type the number of the appropriate replacement, and perform the change.

It is possible to choose to replace only the link (just type the number) or replace both link and link-text (type ‘r’ followed by the number).

Multiple references in one page will be scanned in order, but typing ‘n’ (next) on any one of them will leave the complete page unchanged. To leave only some reference unchanged, use the ‘s’ (skip) option.

Command line options:

-pos:XXXX   adds XXXX as an alternative disambiguation

-just       only use the alternatives given on the command line, do not
            read the page for other possibilities

-dnskip     Skip links already marked with a disambiguation-needed
            template (e.g., {{dn}})

-primary    "primary topic" disambiguation (Begriffsklärung nach Modell 2).
            That's titles where one topic is much more important, the
            disambiguation page is saved somewhere else, and the important
            topic gets the nice name.

-primary:XY like the above, but use XY as the only alternative, instead of
            searching for alternatives in [[Keyword (disambiguation)]].
            Note: this is the same as -primary -just -pos:XY

-file:XYZ   reads a list of pages from a text file. XYZ is the name of the
            file from which the list is taken. If XYZ is not given, the
            user is asked for a filename. Page titles should be inside
            [[double brackets]]. The -pos parameter won't work if -file
            is used.

-always:XY  instead of asking the user what to do, always perform the same
            action. For example, XY can be "r0", "u" or "2". Be careful with
            this option, and check the changes made by the bot. Note that
            some choices for XY don't make sense and will result in a loop,
            e.g. "l" or "m".

-main       only check pages in the main namespace, not in the talk,
            wikipedia, user, etc. namespaces.

-first      Uses only the first link of every line on the disambiguation
            page that begins with an asterisk. Useful if the page is full
            of irrelevant links that are not subject to disambiguation.
            You won't get all of them as options, just the first on each
            line. For a moderated example see
            http://en.wikipedia.org/wiki/Szerdahely
            A really exotic one is
            http://hu.wikipedia.org/wiki/Brabant_(egyértelműsítő lap)

-start:XY   goes through all disambiguation pages in the category on your
            wiki that is defined (to the bot) as the category containing
            disambiguation pages, starting at XY. If only '-start' or
            '-start:' is given, it starts at the beginning.

-min:XX     (XX being a number) only work on disambiguation pages for which
            at least XX links are to be worked on.

To complete a move of a page, one can use:

python pwb.py solve_disambiguation -just -pos:New_Name Old_Name
class scripts.solve_disambiguation.AddAlternativeOption(option, shortcut, output, **kwargs)[source]

Bases: pywikibot.bot_choice.OutputProxyOption

Add a new alternative.

__module__ = 'scripts.solve_disambiguation'
result(value)[source]

Add the alternative and then list them.

class scripts.solve_disambiguation.AliasOption(option, shortcuts, stop=True)[source]

Bases: pywikibot.bot_choice.StandardOption

An option allowing multiple aliases which also select it.

__init__(option, shortcuts, stop=True)[source]

Initializer.

__module__ = 'scripts.solve_disambiguation'
test(value)[source]

Test aliases and combine it with the original test.

class scripts.solve_disambiguation.DisambiguationRobot(always, alternatives, getAlternatives, dnSkip, generator, primary, main_only, first_only=False, minimum=0)[source]

Bases: pywikibot.bot.SingleSiteBot

Disambiguation bot.

__init__(always, alternatives, getAlternatives, dnSkip, generator, primary, main_only, first_only=False, minimum=0)[source]

Initializer.

__module__ = 'scripts.solve_disambiguation'
checkContents(text)[source]

Check if the text matches any of the ignore regexes.

Parameters:text (str) – wikitext of a page
Returns:None if none of the regular expressions given in the dictionary at the top of this class matches a substring of the text, otherwise the matched substring
Return type:str or None
findAlternatives(disambPage)[source]

Extend self.alternatives using correctcap of disambPage.linkedPages.

Parameters:disambPage (pywikibot.Page) – the disambiguation page
Returns:True if everything goes fine, False otherwise
Return type:bool
firstize(page, links)[source]

Call firstlinks and remove extra links.

This will remove a lot of silly redundant links from overdecorated disambiguation pages and leave the first link of each asterisked line only. This must be done if -first is used in command line.

Return a list of first links of every line beginning with *.

When a disambiguation page is full of unnecessary links, this may be useful to sort out the relevant ones. E.g. from the line * [[Jim Smith (smith)|Jim Smith]] ([[1832]]-[[1932]]) [[English]] it returns only ‘Jim Smith (smith)’. Lines without an asterisk at the beginning are disregarded. Page existence is not checked; that has already been done.

ignore_contents = {'de': ('{{[Ii]nuse}}', '{{[Ll]öschen}}'), 'fi': ('{{[Tt]yöstetään}}',), 'kk': ('{{[Ii]nuse}}', '{{[Pp]rocessing}}'), 'nl': ('{{wiu2}}', '{{nuweg}}'), 'ru': ('{{[Ii]nuse}}', '{{[Pp]rocessing}}')}
makeAlternativesUnique()[source]

Remove duplicate items from self.alternatives.

Preserve the order of alternatives. :rtype: None

primary_redir_template = {'hu': 'Egyért-redir'}
setSummaryMessage(disambPage, new_targets=[], unlink_counter=0, dn=False)[source]

Setup i18n summary message.

setupRegexes()[source]

Compile regular expressions.

treat(page)[source]

Work on a single disambiguation page.

treat_disamb_only(refPage, disambPage)[source]

Resolve the links to disambPage but don’t look for its redirects.

Parameters:
  • disambPage (pywikibot.Page) – the disambiguation page or redirect we don’t want anything to link to
  • refPage (pywikibot.Page) – a page linking to disambPage
Returns:

“nextpage” if the user enters “n” to skip this page, “nochange” if the page needs no change, and “done” if the page is processed successfully

Return type:

str

treat_links(refPage, disambPage)[source]

Resolve the links to disambPage or its redirects.

Parameters:
  • disambPage (pywikibot.Page) – the disambiguation page or redirect we don’t want anything to link to
  • refPage (pywikibot.Page) – a page linking to disambPage
Return type:

None

class scripts.solve_disambiguation.EditOption(option, shortcut, text, start, title)[source]

Bases: pywikibot.bot_choice.StandardOption

Edit the text.

__init__(option, shortcut, text, start, title)[source]

Initializer.

Return type:None
__module__ = 'scripts.solve_disambiguation'
result(value)[source]

Open a text editor and let the user change it.

stop

Return whether the user didn’t press cancel and changed the text.

Return type:bool
class scripts.solve_disambiguation.PrimaryIgnoreManager(disambPage, enabled=False)[source]

Bases: object

Primary ignore manager.

If run with the -primary argument, reads from a file which pages should not be worked on; these are the ones where the user pressed n last time. If run without the -primary argument, doesn’t ignore any pages.

__init__(disambPage, enabled=False)[source]

Initializer.

Return type:None
__module__ = 'scripts.solve_disambiguation'
ignore(refPage)[source]

Write page to ignorelist.

Return type:None
isIgnored(refPage)[source]

Return if refPage is to be ignored.

Return type:bool
class scripts.solve_disambiguation.ReferringPageGeneratorWithIgnore(disambPage, primary=False, minimum=0, main_only=False)[source]

Bases: object

Referring Page generator, with an ignore manager.

__init__(disambPage, primary=False, minimum=0, main_only=False)[source]

Initializer.

Return type:None
__iter__()[source]

Yield pages.

__module__ = 'scripts.solve_disambiguation'
class scripts.solve_disambiguation.ShowPageOption(option, shortcut, start, page)[source]

Bases: pywikibot.bot_choice.StandardOption

Show the page’s contents in an editor.

__init__(option, shortcut, start, page)[source]

Initializer.

__module__ = 'scripts.solve_disambiguation'
result(value)[source]

Open a text editor and show the text.

scripts.solve_disambiguation.correctcap(link, text)[source]

Return the link capitalized/uncapitalized according to the text.

Parameters:
  • link (pywikibot.Page) – link page
  • text (str) – the wikitext that is supposed to refer to the link
Returns:

uncapitalized title of the link if the text links to the link with an uncapitalized title, else capitalized

Return type:

str

scripts.solve_disambiguation.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.spamremove script

Script to remove links that are being or have been spammed.

Usage:

python pwb.py spamremove www.spammedsite.com

It will use Special:Linksearch to find the pages on the wiki that link to that site, then for each page make a proposed change consisting of removing all the lines where that url occurs. You can choose to

  • accept the changes as proposed
  • edit the page yourself to remove the offending link
  • not change the page in question

Command line options:

-always           Do not ask, but remove the lines automatically. Be very
                  careful in using this option!

-protocol:        The protocol prefix (default: "http")

-summary:         A string to be used instead of the default summary

In addition, these arguments can be used to restrict changes to some pages:

This script supports use of pywikibot.pagegenerators arguments.

class scripts.spamremove.SpamRemoveBot(generator, spam_external_url, **kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot, pywikibot.bot.AutomaticTWSummaryBot

Bot to remove links that are being or have been spammed.

Parameters:
  • generator (generator) – page generator with preloaded pages.
  • spam_external_url (str) – an external url
Keyword Arguments:
 
  • summary – summary message when given. Otherwise the default summary will be used
  • always – Don’t ask for text replacements
__init__(generator, spam_external_url, **kwargs)[source]

Initializer.

__module__ = 'scripts.spamremove'
summary_key = 'spamremove-remove'
summary_parameters

A dictionary of all parameters for i18n.

treat_page()[source]

Process a single page.

scripts.spamremove.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.standardize_interwiki script

Loop over all pages in the home wiki, standardizing the interwiki links.

Parameters:

-start:     - Set from what page you want to start
scripts.standardize_interwiki.main(*args)[source]

Process command line arguments and run the script.

scripts.states_redirect script

Create country sub-division redirect pages.

Check if they are in the form Something, State, and if so, create a redirect from Something, ST.

Specific arguments:

-start:xxx Specify the place in the alphabet to start searching
-force: Don't ask whether to create pages, just create them.

PREREQUISITE: the python-pycountry library needs to be installed.

class scripts.states_redirect.StatesRedirectBot(start, force)[source]

Bases: pywikibot.bot.SingleSiteBot

Bot class used for implementation of re-direction norms.

__init__(start, force)[source]

Initializer.

Parameters:
  • start (str) – the place in the alphabet to start searching
  • force (bool) – don’t ask whether to create pages, just create them

__module__ = 'scripts.states_redirect'
generator

Generator used by run() method.

setup()[source]

Create abbrev from pycountry data base.

treat(page)[source]

Re-directing process.

Check if pages are in the given form Something, State, and if so, create a redirect from Something, ST.

scripts.states_redirect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.surnames_redirects script

Bot to create redirects based on name order.

By default it creates a “Surnames, Given Names” redirect version of a given page where title consists of 2 or 3 titlecased words.

Command-line arguments:

-surnames_last    Creates a "Given Names Surnames" redirect version of a
                  given page where title is "Surnames, Given Names".

This script supports use of pywikibot.pagegenerators arguments.

Example

python pwb.py surnames_redirects -start:B

class scripts.surnames_redirects.SurnamesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.ExistingPageBot, pywikibot.bot.FollowRedirectPageBot

Surnames Bot.

__init__(generator, **kwargs)[source]

Initializer.

Parameters:
  • generator (generator) – the page generator that determines on which pages to work
Keyword Arguments:
  • surnames-last – redirect “Surnames, Given Names” to “Given Names Surnames”
__module__ = 'scripts.surnames_redirects'
treat_page()[source]

Suggest redirects by reordering names in titles.

scripts.surnames_redirects.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.table2wiki script

Nifty script to convert HTML-tables to MediaWiki’s own syntax.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

The following parameters are supported:

-always           The bot won't ask for confirmation when putting
                  a page.

-skipwarning      Skip processing a page when a warning occurred.
                  Only used when -always is or becomes True.

-quiet            Don't show diffs in -always mode.

-mysqlquery       Retrieve information from a local database mirror.
                  If no query is specified, the bot searches for pages
                  with HTML tables, and tries to convert them on the live
                  wiki.

-xml              Retrieve information from a local XML dump
                  (pages_current, see https://dumps.wikimedia.org).
                  Argument can also be given as "-xml:filename".
                  Searches for pages with HTML tables, and tries
                  to convert them on the live wiki.

Example

python pwb.py table2wiki -xml:20050713_pages_current.xml -lang:de

FEATURES

Saves against missing </td>
Corrects attributes of tags

KNOWN BUGS

Broken HTML tables will most likely result in broken wiki tables! Please check every article you change.

class scripts.table2wiki.Table2WikiRobot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot

Bot to convert HTML tables to wiki syntax.

Parameters:generator (generator) – the page generator that determines on which pages to work
__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.table2wiki'
convertAllHTMLTables(text)[source]

Convert all HTML tables in text to wiki syntax.

Returns the converted text, the number of converted tables and the number of warnings that occurred.

convertTable(table)[source]

Convert an HTML table to wiki syntax.

If the table already is a wiki table or contains a nested wiki table, tries to beautify it. Returns the converted table, the number of warnings that occurred and a list containing these warnings. Hint: if you give an entire page text as a parameter instead of a table only, this function will convert all HTML tables and will also try to beautify all wiki tables already contained in the text.

findTable(text)[source]

Find the first HTML table (which can contain nested tables).

Returns the table and the start and end position inside the text.

markActiveTables(text)[source]

Mark all hidden table start and end tags.

Mark all table start and end tags that are not disabled by nowiki tags, comments etc. We will then later only work on these marked tags.

treat_page()[source]

Convert all HTML tables in text to wiki syntax and save it.

class scripts.table2wiki.TableXmlDumpPageGenerator(xmlfilename)[source]

Bases: object

Generator to yield all pages that seem to contain an HTML table.

__init__(xmlfilename)[source]

Initializer.

__iter__()[source]
__module__ = 'scripts.table2wiki'
scripts.table2wiki.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.template script

Very simple script to replace a template with another one.

It also converts the old MediaWiki boilerplate format to the new format.

Syntax:

python pwb.py template [-remove] [xml[:filename]] oldTemplate \
    [newTemplate]

Specify the template on the command line. The program will pick up the template page, and look for all pages using it. It will then automatically loop over them, and replace the template.

Command line options:

-remove      Remove every occurrence of the template from every article

-subst       Resolves the template by putting its text directly into the
             article. This is done by changing {{...}} or {{msg:...}} into
             {{subst:...}}.
             Substitution is not available inside <ref>...</ref>,
             <gallery>...</gallery>, <poem>...</poem> and <pagelist ... />
             tags.

-assubst     Replaces the first argument as old template with the second
             argument as new template but substitutes it like -subst does.
             Using both options -remove and -subst in the same command line has
             the same effect.

-xml         retrieve information from a local dump
             (https://dumps.wikimedia.org). If this argument isn't given,
             info will be loaded from the maintenance page of the live wiki.
             argument can also be given as "-xml:filename.xml".

-onlyuser:   Only process pages edited by a given user

-skipuser:   Only process pages not edited by a given user

-timestamp:  (With -onlyuser or -skipuser.) Only check for edits by the user
             that are not older than the given timestamp. The timestamp must
             be written in the MediaWiki timestamp format "%Y%m%d%H%M%S".
             If this parameter is omitted, all edits are checked, but this
             is restricted to the last 100 edits.

-summary:    Lets you pick a custom edit summary. Use quotes if edit summary
             contains spaces.

-always      Don't bother asking to confirm any of the changes, Just Do It.

-addcat:     Appends the given category to every page that is edited. This is
             useful when a category is being broken out from a template
             parameter or when templates are being upmerged but more
             information must be preserved.
other:       First argument is the old template name, second one is the new
             name. If you want to address a template which has spaces, put
             quotation marks around it, or use underscores.

Examples

If you have a template called [[Template:Cities in Washington]] and want to change it to [[Template:Cities in Washington state]], start:

python pwb.py template "Cities in Washington" "Cities in Washington state"

Move the page [[Template:Cities in Washington]] manually afterwards.

If you have a template called [[Template:test]] and want to substitute it only on pages in the User: and User talk: namespaces, do:

python pwb.py template test -subst -namespace:2 -namespace:3

Note that -namespace: is a global Pywikibot parameter

This next example substitutes the template lived with a supplied edit summary. It only performs substitutions in main article namespace and doesn’t prompt to start replacing. Note that -putthrottle: is a global Pywikibot parameter:

python pwb.py template -putthrottle:30 -namespace:0 lived -subst -always \
    -summary:"BOT: Substituting {{lived}}, see [[WP:SUBST]]."

This next example removes the templates {{cfr}}, {{cfru}}, and {{cfr-speedy}} from five category pages as given:

python pwb.py template cfr cfru cfr-speedy -remove -always \
    -page:"Category:Mountain monuments and memorials" \
    -page:"Category:Indian family names" \
    -page:"Category:Tennis tournaments in Belgium" \
    -page:"Category:Tennis tournaments in Germany" \
    -page:"Category:Episcopal cathedrals in the United States" \
    -summary:"Removing Cfd templates from category pages that survived."

This next example substitutes templates test1, test2, and space test on all user talk pages (namespace #3):

python pwb.py template test1 test2 "space test" -subst -ns:3 -always
class scripts.template.TemplateRobot(generator, templates, **kwargs)[source]

Bases: scripts.replace.ReplaceRobot

This bot will replace, remove or subst all occurrences of a template.

__init__(generator, templates, **kwargs)[source]

Initializer.

Parameters:
  • generator (iterable) – the pages to work on
  • templates (dict) – a dictionary which maps old template names to their replacements. If remove or subst is True, it maps the names of the templates that should be removed/resolved to None.
__module__ = 'scripts.template'
scripts.template.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.templatecount script

Display the list of pages transcluding a given list of templates.

It can also be used to simply count the number of pages (rather than listing each individually).

Syntax:

python pwb.py templatecount options templates

Command line options:

-count        Counts the number of times each template (passed in as an
              argument) is transcluded.

-list         Gives the list of all of the pages transcluding the templates
              (rather than just counting them).

-namespace:   Filters the search to a given namespace. If this is specified
              multiple times it will search all given namespaces

Examples

Counts how many times {{ref}} and {{note}} are transcluded in articles:

python pwb.py templatecount -count -namespace:0 ref note

Lists all the category pages that transclude {{cfd}} and {{cfdu}}:

python pwb.py templatecount -list -namespace:14 cfd cfdu
class scripts.templatecount.TemplateCountRobot[source]

Bases: object

Template count bot.

__module__ = 'scripts.templatecount'
classmethod countTemplates(templates, namespaces)[source]

Display number of transclusions for a list of templates.

Displays the number of transcluded page in the given ‘namespaces’ for each template given by ‘templates’ list.

Parameters:
  • templates (list) – list of template names
  • namespaces (list) – list of namespace numbers
classmethod listTemplates(templates, namespaces)[source]

Display transcluded pages for a list of templates.

Displays each transcluded page in the given ‘namespaces’ for each template given by ‘templates’ list.

Parameters:
  • templates (list) – list of template names
  • namespaces (list) – list of namespace numbers
classmethod template_dict(templates, namespaces)[source]

Create a dict of templates and its transcluded pages.

The names of the templates are the keys, and lists of pages transcluding templates in the given namespaces are the values.

Parameters:
  • templates (list) – list of template names
  • namespaces (list) – list of namespace numbers
Return type:

dict

static template_dict_generator(templates, namespaces)[source]

Yield transclusions of each template in ‘templates’.

For each template in ‘templates’, yield a tuple (template, transclusions), where ‘transclusions’ is a list of all pages in ‘namespaces’ where the template has been transcluded.

Parameters:
  • templates (list) – list of template names
  • namespaces (list) – list of namespace numbers
Return type:

generator
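
A minimal sketch using the generator, assuming the default site is configured in user-config.py:

from scripts.templatecount import TemplateCountRobot

pairs = TemplateCountRobot.template_dict_generator(['ref', 'note'], [0])
for template, transclusions in pairs:
    print(template, 'is transcluded on', len(transclusions), 'pages')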

scripts.templatecount.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.touch script

This bot goes over multiple pages of a wiki, and edits them without changes.

This is for example used to get category links in templates working.

This script understands various command-line arguments:

-purge                    Do not touch but purge the page

This script supports use of pywikibot.pagegenerators arguments.

Touch arguments:

-botflag                  Force botflag in case of edits with changes.

Purge arguments:

-converttitles            Convert titles to other variants if necessary
-forcelinkupdate          Update the links tables
-forcerecursivelinkupdate Update the links table, and update the links tables
                          for any page that uses this page as a template
-redirects                Automatically resolve redirects
class scripts.touch.PurgeBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot

Purge each page on the generator.

__init__(generator, **kwargs)[source]

Initialize a PurgeBot instance with the options and generator.

__module__ = 'scripts.touch'
treat(page)[source]

Purge the given page.

class scripts.touch.TouchBot(generator, **kwargs)[source]

Bases: pywikibot.bot.MultipleSitesBot

Page touch bot.

__init__(generator, **kwargs)[source]

Initialize a TouchBot instance with the options and generator.

__module__ = 'scripts.touch'
treat(page)[source]

Touch the given page.

scripts.touch.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.transferbot script

This script transfers pages from a source wiki to a target wiki.

It also copies edit history to a subpage.

The following parameters are supported:

-tolang:          The target site code.

-tofamily:        The target site family.

-prefix:          Page prefix on the new site.

-overwrite:       Existing pages are skipped by default. Use this option to
                  overwrite pages.

Internal links are not repaired!

Pages to work on can be specified using any of:

This script supports use of pywikibot.pagegenerators arguments.

Examples

Transfer all pages in category “Query service” from the English Wikipedia to the Arabic Wiktionary, adding “Wiktionary:Import enwp/” as prefix:

python pwb.py transferbot -family:wikipedia -lang:en -cat:"Query service" \
    -tofamily:wiktionary -tolang:ar -prefix:"Wiktionary:Import enwp/"

Copy the template “Query service” from the English Wikipedia to the Arabic Wiktionary:

python pwb.py transferbot -family:wikipedia -lang:en \
    -tofamily:wiktionary -tolang:ar -page:"Template:Query service"
scripts.transferbot.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.unusedfiles script

This bot appends some text to all unused images and notifies uploaders.

Parameters:

-always         Don't be asked every time.
-nouserwarning  Do not warn uploader about orphaned file.
-limit          Specify number of pages to work on with "-limit:n" where
                n is the maximum number of articles to work on.
                If not used, all pages are used.
class scripts.unusedfiles.UnusedFilesBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.AutomaticTWSummaryBot, pywikibot.bot.ExistingPageBot

Unused files bot.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.unusedfiles'
append_text(page, apptext)[source]

Append apptext to the page.

summary_key = 'unusedfiles-comment'
treat(image)[source]

Process one image page.

scripts.unusedfiles.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.upload script

Script to upload images to wikipedia.

The following parameters are supported:

-keep         Keep the filename as is
-filename:    Target filename without the namespace prefix
-prefix:      Add specified prefix to every filename.
-noverify     Do not ask for verification of the upload description if one
              is given
-abortonwarn: Abort upload on the specified warning type. If no warning type
              is specified, aborts on any warning.
-ignorewarn:  Ignores specified upload warnings. If no warning type is
              specified, ignores all warnings. Use with caution
-chunked:     Upload the file in chunks (more overhead, but restartable). If
              no value is specified the chunk size is 1 MiB. The value must
              be a number which can be followed by a suffix. The units are:

                  No suffix: Bytes
                  'k': Kilobytes (1000 B)
                  'M': Megabytes (1000000 B)
                  'Ki': Kibibytes (1024 B)
                  'Mi': Mebibytes (1024x1024 B)

              The suffixes are case insensitive.
-always       Don't ask the user anything. This will imply -keep and
              -noverify and require that either -abortonwarn or -ignorewarn
              is defined for all. It will also require a valid file name and
              description. It'll only overwrite files if -ignorewarn includes
              the 'exists' warning.
-recursive    When the filename is a directory it also uploads the files from
              the subdirectories.
-summary:     Pick a custom edit summary for the bot.
-descfile:    Specify a filename where the description is stored

-abortonwarn and -ignorewarn can be combined: when a specific warning type is listed, the more specific rule takes precedence over the general one. So to ignore specific warnings and abort on all others, give -abortonwarn without a warning type and list the specific warnings with -ignorewarn. The order of the options does not matter. If both are unspecific, or the same warning is specified by both, aborting takes precedence.
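
For example, the following invocation (file name and description are illustrative) ignores only the ‘exists’ warning while aborting on any other warning:

python pwb.py upload -abortonwarn -ignorewarn:exists Example.jpg "An example file"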

If any other arguments are given, the first is either a URL, a filename or a directory to upload, and the rest is a proposed description to go with the upload. If none of these are given, the user is asked for the directory, file or URL to upload. The bot will then upload the image to the wiki.

The script will ask for the location of the image(s), if not given as a parameter, and for a description.

scripts.upload.get_chunk_size(match)[source]

Get chunk size.

scripts.upload.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments

scripts.version script

Script to determine the Pywikibot version (tag, revision and date).

scripts.version.check_environ(environ_name)[source]

Print environment variable.

scripts.version.main(*args)[source]

Print pywikibot version and important settings.

scripts.watchlist script

Allows access to the bot account’s watchlist.

The watchlist can be updated manually by running this script.

Syntax:

python pwb.py watchlist [-all | -new]

Command line options:

-all         Reloads watchlists for all wikis where a watchlist is already
             present
-new         Load watchlists for all wikis where accounts are set in
             user-config.py
scripts.watchlist.get(site=None)[source]

Load the watchlist, fetching it if necessary.

scripts.watchlist.isWatched(pageName, site=None)[source]

Check whether a page is being watched.
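
A minimal sketch, assuming the default site is configured (the page title is illustrative):

import pywikibot
from scripts import watchlist

site = pywikibot.Site()
if watchlist.isWatched('Project:Sandbox', site=site):
    print('The sandbox is on the watchlist')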

scripts.watchlist.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.watchlist.refresh(site, sysop=False)[source]

Fetch the watchlist.

scripts.watchlist.refresh_all(sysop=False)[source]

Reload watchlists for all wikis where a watchlist is already present.

scripts.watchlist.refresh_new(sysop=False)[source]

Load watchlists of all wikis for accounts set in user-config.py.

scripts.weblinkchecker script

This bot is used for checking external links found at the wiki.

It checks several pages at once, with a limit set by the config variable max_external_links, which defaults to 50.

The bot won’t change any wiki pages, it will only report dead links such that people can fix or remove the links themselves.

The bot will store all links found dead in a .dat file in the deadlinks subdirectory. To avoid removing links which are only temporarily unavailable, the bot ONLY reports links which were found dead at least two times, with a time lag of at least one week. Such links will be logged to a .txt file in the deadlinks subdirectory.

The .txt file uses wiki markup and so it may be useful to post it on the wiki and then exclude that page from subsequent runs. For example if the page is named Broken Links, exclude it with ‘-titleregexnot:^Broken Links$’

After running the bot and waiting for at least one week, you can re-check those pages where dead links were found, using the -repeat parameter.

In addition to the logging step, it is possible to automatically report dead links to the talk page of the article where the link was found. To use this feature, set report_dead_links_on_talk = True in your user-config.py, or specify “-talk” on the command line. Adding “-notalk” switches this off irrespective of the configuration variable.

When a link is found alive, it will be removed from the .dat file.

These command line parameters can be used to specify which pages to work on:

-repeat      Work on all pages where dead links were found before. This is
             useful to confirm that the links are dead after some time (at
             least one week), which is required before the script will report
             the problem.

-namespace   Only process templates in the namespace with the given number or
             name. This parameter may be used multiple times.

-xml         Should be used instead of a simple page fetching method from
             pagegenerators.py for performance and load issues

-xmlstart    Page to start with when using an XML dump

-ignore      HTTP return codes to ignore. Can be provided several times:
                -ignore:401 -ignore:500

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-talk        Overrides the report_dead_links_on_talk config variable, enabling
             the feature.

-notalk      Overrides the report_dead_links_on_talk config variable, disabling
             the feature.

-day         Do not report a broken link if the link has only been present
             for x days or less. If not set, the default is 7 days.

The following config variables are supported:

max_external_links         The maximum number of web pages that should be
                           loaded simultaneously. You should change this
                           according to your Internet connection speed.
                           Be careful: if it is set too high, the script
                           might get socket errors because your network
                           is congested, and will then think that the page
                           is offline.

report_dead_links_on_talk  If set to true, causes the script to report dead
                           links on the article's talk page if (and ONLY if)
                           the linked page has been unavailable at least two
                           times during a timespan of at least one week.

weblink_dead_days          Sets the timespan (default: one week) after which
                           a dead link will be reported
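
For example, these variables can be placed in user-config.py. The values shown are the documented defaults; report_dead_links_on_talk is assumed to default to False, since the documentation says it must be enabled explicitly:

# user-config.py (excerpt)
max_external_links = 50            # pages loaded simultaneously
report_dead_links_on_talk = False  # set to True (or pass -talk) to report
weblink_dead_days = 7              # timespan before a dead link is reported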

Examples

Loads all wiki pages in alphabetical order using the Special:Allpages feature:

python pwb.py weblinkchecker -start:!

Loads all wiki pages using the Special:Allpages feature, starting at “Example page”:

python pwb.py weblinkchecker -start:Example_page

Loads all wiki pages that link to www.example.org:

python pwb.py weblinkchecker -weblink:www.example.org

Only checks links found in the wiki page “Example page”:

python pwb.py weblinkchecker Example page

Loads all wiki pages where dead links were found during a prior run:

python pwb.py weblinkchecker -repeat
class scripts.weblinkchecker.DeadLinkReportThread[source]

Bases: threading.Thread

A Thread that is responsible for posting error reports on talk pages.

There is only one DeadLinkReportThread; it uses a semaphore to make sure that two LinkCheckThreads cannot access the queue at the same time.
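
A rough sketch of that coordination pattern, using hypothetical names rather than the bot's actual code:

import threading

queue_semaphore = threading.Semaphore()  # serialises access to the queue
report_queue = []

def enqueue_report(report):
    # Called from a LinkCheckThread; only one thread touches the queue at once.
    with queue_semaphore:
        report_queue.append(report)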

__init__()[source]

Initializer.

__module__ = 'scripts.weblinkchecker'
kill()[source]

Kill thread.

report(url, errorReport, containingPage, archiveURL)[source]

Report error on talk page of the page containing the dead link.

run()[source]

Run thread.

shutdown()[source]

Finish thread.

class scripts.weblinkchecker.History(reportThread, site=None)[source]

Bases: object

Store previously found dead links.

The URLs are dictionary keys, and values are lists of tuples where each tuple represents one time the URL was found dead. Tuples have the form (title, date, error) where title is the wiki page where the URL was found, date is an instance of time, and error is a string with error code and message.

We assume that the first element in the list represents the first time we found this dead link, and the last element represents the last time.

Example:

historyDict = {
    'https://www.example.org/page': [
        ('WikiPageTitle', DATE, '404: File not found'),
        ('WikiPageName2', DATE, '404: File not found'),
    ]
}
__init__(reportThread, site=None)[source]

Initializer.

__module__ = 'scripts.weblinkchecker'
log(url, error, containingPage, archiveURL)[source]

Log an error report to a text file in the deadlinks subdirectory.

save()[source]

Save the .dat file to disk.

setLinkAlive(url)[source]

Record that the link is now alive.

If link was previously found dead, remove it from the .dat file.

Returns:True if previously found dead, else returns False.
setLinkDead(url, error, page, weblink_dead_days)[source]

Add the fact that the link was found dead to the .dat file.
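
A minimal usage sketch of the History store; passing reportThread=None is assumed to disable talk-page reporting, and the page title and URL are placeholders:

import pywikibot
from scripts.weblinkchecker import History

site = pywikibot.Site()
history = History(None, site=site)
page = pywikibot.Page(site, 'Example page')

history.setLinkDead('https://www.example.org/gone', '404: File not found',
                    page, 7)
history.save()  # write the .dat file to disk

if history.setLinkAlive('https://www.example.org/gone'):
    print('link recovered and removed from the .dat file')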

class scripts.weblinkchecker.LinkCheckThread(page, url, history, HTTPignore, day)[source]

Bases: threading.Thread

A thread responsible for checking one URL.

After checking the page, it will die.

__init__(page, url, history, HTTPignore, day)[source]

Initializer.

__module__ = 'scripts.weblinkchecker'
run()[source]

Run the bot.

exception scripts.weblinkchecker.NotAnURLError[source]

Bases: BaseException

The link is not a URL.

__module__ = 'scripts.weblinkchecker'
scripts.weblinkchecker.RepeatPageGenerator()[source]

Generator for pages in History.

class scripts.weblinkchecker.WeblinkCheckerRobot(generator, HTTPignore=None, day=7, site=True)[source]

Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot

Bot which will search for dead weblinks.

It uses several LinkCheckThreads at once to process pages from generator.

__init__(generator, HTTPignore=None, day=7, site=True)[source]

Initializer.

__module__ = 'scripts.weblinkchecker'
treat_page()[source]

Process one page.

scripts.weblinkchecker.countLinkCheckThreads()[source]

Count LinkCheckThread threads.

Returns:number of LinkCheckThread threads
Return type:int
scripts.weblinkchecker.get_archive_url(url)[source]

Get archive URL.

scripts.weblinkchecker.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (str) – command line arguments
scripts.weblinkchecker.weblinksIn(text, withoutBracketed=False, onlyBracketed=False)[source]

Yield web links from text.

TODO: move to textlib
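
A small usage sketch, extracting the external links from a chunk of wikitext:

from scripts.weblinkchecker import weblinksIn

text = 'See [https://www.example.org Example] and https://www.example.com/.'
for url in weblinksIn(text):
    print(url)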

scripts.welcome script

Script to welcome new users.

This script works out of the box for wikis that have been defined in the script.

Ensure you have community support before running this bot!

Everything that needs customisation to support additional projects is indicated by comments.

Description of basic functionality

  • Request a list of new users every period (default: 3600 seconds). You can choose to stop the script after the first check (see arguments).
  • Check whether each new user has passed a threshold of edits (default: 1 edit).
  • Optional: check the username for bad words, or whether it consists solely of numbers, and log this somewhere on the wiki (default: False). A whitelist is also supported (explanation below).
  • If the user has made enough edits (this can also be 0), check whether the user has an empty talk page.
  • If the talk page is empty, add a welcome message.
  • Optional: once the set number of users have been welcomed, add them to the configured log page, one page per day (default: True).
  • If no log page exists, create a header for the log page first.

This script uses two templates that need to be on the local wiki:

  • {{WLE}}: contains markup code for log entries (just copy it from Commons)
  • {{welcome}}: contains the information for new users

This script understands the following command-line arguments:

-edit[:#]       Define how many edits a new user needs to be welcomed
                (default: 1, max: 50)

-time[:#]       Define how many seconds the bot sleeps before restart
                (default: 3600)

-break          Use if you don't want the bot to restart at the end of the
                run (default: False)

-nlog           Use this parameter if you do not want the bot to log all
                welcomed users (default: False)

-limit[:#]      Use this parameter to define how many users should be
                checked (default: 50)

-offset[:TIME]  Skip the latest new users (those newer than TIME)
                to give interactive users a chance to welcome the
                new users (default: now)
                Timezone is the server timezone, GMT for Wikimedia
                TIME format: yyyymmddhhmmss or yyyymmdd

-timeoffset[:#] Skip the latest new users, accounts newer than
                # minutes

-numberlog[:#]  The number of users to welcome before refreshing the
                welcome log (default: 4)

-filter         Enable the username checks for bad names (default: False)

-ask            Use this parameter if you want to confirm each possible
                bad username (default: False)

-random         Use a random signature, taking the signatures from a wiki
                page (for instructions, see below).

-file[:#]       Use a file instead of a wiki page to take the random
                signatures from. If you use this parameter, you don't need
                to use -random.

-sign           Use one signature from the command line instead of the default

-savedata       Save the random signature index so that the bot can
                continue welcoming with the last signature used.

-sul            Welcome the auto-created users (default: False)

-quiet          Prevent users without contributions from being displayed
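
Example invocation using only the flags documented above: check up to 40 of the newest users, skip accounts created in the last 120 minutes, require at least two edits, and stop after one pass:

python pwb.py welcome -limit:40 -timeoffset:120 -edit:2 -break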

***************************** GUIDE *******************************

* Report, badword and whitelist guide: *

  1. Set in the code which pages the bot will use to load the badwords, the whitelist and the report.

  2. On these pages you have to add a “tuple” with the names that you want to add to the two lists, for example: (‘cat’, ‘mouse’, ‘dog’). You can also write other text on the page; it will work without problems.

  3. What do the two pages do? The bot checks whether a badword is in the username and, if so, sets “warning” to True. Then the bot checks whether a word from the whitelist is in the username; if so, it removes that word and rechecks the badword list to see whether other badwords remain in the username (see the sketch after this list). Example:

    • dio is a badword
    • Claudio is a normal name
    • The username is “Claudio90 fuck!”
    • The bot finds dio and sets “warning”
    • The bot finds Claudio and sets “ok”
    • The bot finds fuck at the end and sets “warning”
    • Result: the username is reported.
  4. When a user is reported, you have to check the user and act:

    • If the user is ok, add the {{welcome}} template.
    • If not, block the user.
    • You can decide whether or not to add a “you are blocked, please change your username” template.
    • Delete the username from the report page.

    IMPORTANT: The bot checks each user in this order:

    • Check whether the user has a talk page (if yes, skip).
    • Check whether the user is blocked (if yes, skip).
    • Check whether the user is already on the report page (if yes, skip).
    • Otherwise, report the user.
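
A rough sketch of the check described in step 3, with hypothetical word lists (the real bot loads them from the configured wiki pages). The net effect is to strip whitelisted words first and then look for any remaining badwords:

BADWORDS = ('dio', 'fuck')
WHITELIST = ('claudio',)

def is_bad_name(username):
    name = username.lower()
    for good in WHITELIST:  # strip whitelisted words first
        name = name.replace(good, '')
    return any(bad in name for bad in BADWORDS)

print(is_bad_name('Claudio90'))        # False: 'dio' occurs only in 'Claudio'
print(is_bad_name('Claudio90 fuck!'))  # True: 'fuck' remains after stripping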

* Random signature guide: *

Some welcomed users will reply to the user who signed the welcome message. When you welcome many new users, you might be overwhelmed by such replies. Therefore, you can define usernames of other users who are willing to receive some of these messages from newbies.

  1. Set the page that the bot will load

  2. Add the signatures in this way:

    *<SPACE>SIGNATURE
    <NEW LINE>
    

Example of signatures:

<pre>
* [[User:Filnik\|Filnik]]
* [[User:Rock\|Rock]]
</pre>
NOTE: The white space and <pre></pre> aren’t required, but I suggest using
them.
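
A small sketch of reading such a signature list from a wiki page; the page title is only a placeholder:

import pywikibot

site = pywikibot.Site()
page = pywikibot.Page(site, 'Project:Welcome/Signatures')
signatures = [line[2:].strip() for line in page.text.splitlines()
              if line.startswith('* ')]
print(signatures)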

**************************** Badwords ******************************

The list of badwords in the code is open. If you think that a word is international and should be blocked in all projects, feel free to add it. Likewise, if you think that a word isn’t really international, feel free to delete it.

However, there is a dynamic wiki page from which the badwords of your project can be loaded, or you can add them directly to the source code you are using without adding to or deleting from the shared list.

Some words, like “Administrator”, “Dio” (God in Italian) or “Jimbo”, aren’t badwords at all but can be used in bad nicknames.

exception scripts.welcome.FilenameNotSet(arg)[source]

Bases: pywikibot.exceptions.Error

An exception indicating that a signature filename was not specified.

__module__ = 'scripts.welcome'
class scripts.welcome.Global[source]

Bases: object

Container class for global settings.

__module__ = 'scripts.welcome'
attachEditCount = 1
confirm = False
defaultSign = '--~~~~'
dumpToLog = 15
filtBadName = False
makeWelcomeLog = True
offset = None
queryLimit = 50
quiet = False
randomSign = False
recursive = True
saveSignIndex = False
signFileName = None
timeRecur = 3600
timeoffset = 0
welcomeAuto = False
class scripts.welcome.WelcomeBot(**kwargs)[source]

Bases: pywikibot.bot.SingleSiteBot

Bot to add welcome messages on User pages.

__init__(**kwargs)[source]

Initializer.

__module__ = 'scripts.welcome'
badNameFilter(name,