scripts.maintenance utility scripts

Maintenance scripts.

Submodules

scripts.maintenance.cache script

This script runs commands on each entry in the API caches.

Syntax:

python pwb.py cache [-password] [-delete] [-c "..."] [-o "..."] [dir ...]

If no directories are specified, the script detects the API caches automatically.

If no command is specified, it will print the filename of all entries. If only -delete is specified, it will delete all entries.

The following parameters are supported:

-delete           Delete each entry matched by the filter. If this option is
                  set, the default output is suppressed.

-c                Filter command, in Python syntax. It must evaluate to True
                  for an entry to be output.

-o                Output command, evaluated when the filter evaluates to
                  True. If it returns None, nothing is output.

Examples

Print the filename of any entry with ‘wikidata’ in the key:

-c "wikidata" in entry._uniquedescriptionstr()

Customised output if the site code is ‘ar’:

-c entry.site.code == "ar"
-o uniquedesc(entry)

Or the state of the login:

-c entry.site._loginstatus == LoginStatus.NOT_ATTEMPTED
-o uniquedesc(entry)

If the function only uses one parameter for the entry it can be omitted:

-c has_password
-o uniquedesc

Available filter commands:

  has_password(entry)
  is_logout(entry)
  empty_response(entry)
  not_accessed(entry)
  incorrect_hash(entry)
  older_than_one_day(entry)
  recent(entry)

There are helper functions which can be part of a command::

  older_than(entry, interval)
  newer_than(entry, interval)
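
These helpers compare an entry's cache timestamp against an interval. As a rough illustration of the equivalent logic in plain Python, assuming the timestamp is available as a datetime (the real helpers take the cache entry object itself):

```python
from datetime import datetime, timedelta

def older_than(cache_time, interval):
    # True if the entry was cached more than `interval` ago.
    return datetime.utcnow() - cache_time > interval

def newer_than(cache_time, interval):
    # True if the entry was cached within the last `interval`.
    return datetime.utcnow() - cache_time <= interval

# An entry cached two days ago is older than one day:
two_days_ago = datetime.utcnow() - timedelta(days=2)
print(older_than(two_days_ago, timedelta(days=1)))  # True
```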

Available output commands:

uniquedesc(entry)

class scripts.maintenance.cache.User(source, title='', site='[deprecated name of source]', name='[deprecated name of title]')[source]

Bases: pywikibot.page.Page

A class that represents a Wiki user.

This class also represents the Wiki page User:<username>

__init__(source, title='', site='[deprecated name of source]', name='[deprecated name of title]')[source]

Initializer for a User object.

All parameters are the same as for the Page() initializer.

__module__ = 'pywikibot.page'

block(*args, **kwargs)[source]

Block user.

Refer to the APISite.blockuser method for parameters.

Returns

None

contributions(total=500, **kwargs, limit='[deprecated name of total]', namespace='[deprecated name of namespaces]')[source]

Yield tuples describing this user's edits.

Each tuple is composed of a pywikibot.Page object, the revision id (int), the edit timestamp (as a pywikibot.Timestamp object), and the comment (unicode). Pages returned are not guaranteed to be unique.

Parameters

total (int) – limit result to this number of pages

Keyword Arguments
  • start – Iterate contributions starting at this Timestamp

  • end – Iterate contributions ending at this Timestamp

  • reverse – Iterate oldest contributions first (default: newest)

  • namespaces – only iterate pages in these namespaces

  • showMinor – if True, iterate only minor edits; if False and not None, iterate only non-minor edits (default: iterate both)

  • top_only – if True, iterate only edits which are the latest revision (default: False)

Returns

tuple of pywikibot.Page, revid, pywikibot.Timestamp, comment

Return type

tuple
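
The yielded 4-tuples unpack naturally in a for loop. A sketch with stand-in data (plain strings and ints replace the pywikibot.Page and Timestamp objects, so no wiki connection is needed; real code would call user.contributions()):

```python
from datetime import datetime

# Stand-ins for the (Page, revid, Timestamp, comment) tuples
# that contributions() yields.
fake_contribs = [
    ('Main Page', 1001, datetime(2020, 1, 1), 'typo fix'),
    ('Main Page', 1002, datetime(2020, 1, 2), 'revert'),
    ('User talk:Example', 1003, datetime(2020, 1, 3), 'reply'),
]

def edits_per_page(contribs):
    # Pages are not guaranteed unique, so count edits per title.
    counts = {}
    for page, revid, timestamp, comment in contribs:
        counts[page] = counts.get(page, 0) + 1
    return counts

print(edits_per_page(fake_contribs))  # {'Main Page': 2, 'User talk:Example': 1}
```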

editCount(force=False)[source]

Return edit count for a registered user.

Always returns 0 for ‘anonymous’ users.

Parameters

force (bool) – if True, forces reloading the data from API

Return type

int

editedPages(total=500, limit='[deprecated name of total]')[source]

DEPRECATED. Use contributions().

Yields pywikibot.Page objects that this user has edited, with an upper bound of ‘total’. Pages returned are not guaranteed to be unique.

Parameters

total (int) – limit result to this number of pages.

property first_edit

Return first user contribution.

Returns

first user contribution entry: a tuple of pywikibot.Page, revid, pywikibot.Timestamp, comment

Return type

tuple or None

gender(force=False)[source]

Return the gender of the user.

Parameters

force (bool) – if True, forces reloading the data from API

Returns

‘male’, ‘female’, or ‘unknown’

Return type

str

getUserPage(subpage='')[source]

Return a Page object relative to this user’s main page.

Parameters

subpage (str) – subpage part to be appended to the main page title (optional)

Returns

Page object of user page or user subpage

Return type

pywikibot.Page

getUserTalkPage(subpage='')[source]

Return a Page object relative to this user’s main talk page.

Parameters

subpage (str) – subpage part to be appended to the main talk page title (optional)

Returns

Page object of user talk page or user talk subpage

Return type

pywikibot.Page
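
Both methods resolve to titles of the form sketched below. The helper here is hypothetical, and it hard-codes the English namespace prefixes; the real methods use the site's localized namespace names:

```python
def user_page_title(username, subpage='', talk=False):
    # Hypothetical sketch of the titles getUserPage() and
    # getUserTalkPage() resolve to (English namespaces only).
    ns = 'User talk' if talk else 'User'
    title = '{}:{}'.format(ns, username)
    if subpage:
        title += '/' + subpage
    return title

print(user_page_title('Example', 'sandbox'))  # User:Example/sandbox
print(user_page_title('Example', talk=True))  # User talk:Example
```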

getprops(force=False)[source]

Return a dict of properties about the user.

Parameters

force (bool) – if True, forces reloading the data from API

Return type

dict

groups(force=False)[source]

Return a list of groups to which this user belongs.

The list of groups may be empty.

Parameters

force (bool) – if True, forces reloading the data from API

Returns

groups property

Return type

list

isAnonymous()[source]

Determine if the user is editing as an IP address.

Return type

bool

isBlocked(force=False)[source]

Determine whether the user is currently blocked.

Parameters

force (bool) – if True, forces reloading the data from API

Return type

bool

isEmailable(force=False)[source]

Determine whether emails may be sent to this user through MediaWiki.

Parameters

force (bool) – if True, forces reloading the data from API

Return type

bool

isRegistered(force=False)[source]

Determine if the user is registered on the site.

It is possible to have a page named User:xyz and not have a corresponding user with username xyz.

The page does not need to exist for this method to return True.

Parameters

force (bool) – if True, forces reloading the data from API

Return type

bool

property is_thankable

Determine if the user has thanks notifications enabled.

NOTE: This does not accurately determine whether thanks notifications are enabled for the user.

Privacy of thanks preferences is under discussion, please see https://phabricator.wikimedia.org/T57401#2216861, and https://phabricator.wikimedia.org/T120753#1863894

Return type

bool

property last_edit

Return last user contribution.

Returns

last user contribution entry: a tuple of pywikibot.Page, revid, pywikibot.Timestamp, comment

Return type

tuple or None

property last_event

Return last user activity.

Returns

last user log entry

Return type

LogEntry or None

logevents(**kwargs)[source]

Yield user activities.

Keyword Arguments
  • logtype – only iterate entries of this type (see mediawiki api documentation for available types)

  • page – only iterate entries affecting this page

  • namespace – namespace to retrieve logevents from

  • start – only iterate entries from and after this Timestamp

  • end – only iterate entries up to and through this Timestamp

  • reverse – if True, iterate oldest entries first (default: newest)

  • tag – only iterate entries tagged with this tag

  • total – maximum number of events to iterate

Return type

iterable

name()[source]

The username.

DEPRECATED: use username instead.

Return type

str

registration(force=False)[source]

Fetch registration date for this user.

Parameters

force (bool) – if True, forces reloading the data from API

Return type

pywikibot.Timestamp or None

registrationTime(force=False)[source]

DEPRECATED. Fetch registration date for this user.

Parameters

force (bool) – if True, forces reloading the data from API

Returns

int (MediaWiki’s internal timestamp format) or 0

Return type

int

rights(force=False)[source]

Return user rights.

Parameters

force (bool) – if True, forces reloading the data from API

Returns

user rights

Return type

list

send_email(subject, text, ccme=False)[source]

Send an email to this user via MediaWiki’s email interface.

Parameters
  • subject (str) – the subject header of the mail

  • text (str) – mail body

  • ccme (bool) – if True, sends a copy of this email to the bot

Returns

operation successful indicator

Return type

bool

unblock(reason=None)[source]

Remove the block for the user.

Parameters

reason (basestring) – Reason for the unblock.

uploadedImages(total=10, number='[deprecated name of total]')[source]

Yield tuples describing files uploaded by this user.

Each tuple is composed of a pywikibot.Page, the timestamp (str in ISO8601 format), comment (unicode) and a bool for pageid > 0. Pages returned are not guaranteed to be unique.

Parameters

total (int) – limit result to this number of pages

property username

The username.

Convenience method that returns the title of the page with namespace prefix omitted, which is the username.

Return type

str

class scripts.maintenance.cache.APISite(code, fam=None, user=None, sysop=None)[source]

Bases: pywikibot.site.BaseSite

API interface to MediaWiki site.

Do not instantiate directly; use pywikibot.Site function.

class OnErrorExc(exception, on_new_page)

Bases: tuple

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__module__ = 'pywikibot.site'

static __new__(_cls, exception, on_new_page)

Create new instance of OnErrorExc(exception, on_new_page)

__repr__()

Return a nicely formatted representation string

__slots__ = ()

property exception

Alias for field number 0

property on_new_page

Alias for field number 1

__getstate__()[source]

Remove TokenWallet before pickling, for security reasons.

__init__(code, fam=None, user=None, sysop=None)[source]

Initializer.

__module__ = 'pywikibot.site'

__setstate__(attrs)[source]

Restore things removed in __getstate__.

allcategories(start='!', prefix='', total=None, reverse=False, content=False, step=NotImplemented)[source]

Iterate categories used (which need not have a Category page).

Iterator yields Category objects. Note that, in practice, links that were found on pages that have been deleted may not have been removed from the database table, so this method can return false positives.

See

https://www.mediawiki.org/wiki/API:Allcategories

Parameters
  • start – Start at this category title (category need not exist).

  • prefix – Only yield categories starting with this string.

  • reverse – if True, iterate in reverse Unicode lexicographic order (default: iterate in forward order)

  • content – if True, load the current content of each iterated page (default False); note that this means the contents of the category description page, not the pages that are members of the category

allimages(start='!', prefix='', minsize=None, maxsize=None, reverse=False, sha1=None, sha1base36=None, total=None, content=False, step=NotImplemented)[source]

Iterate all images, ordered by image title.

Yields FilePages, but these pages need not exist on the wiki.

See

https://www.mediawiki.org/wiki/API:Allimages

Parameters
  • start – start at this title (name need not exist)

  • prefix – only iterate titles starting with this substring

  • minsize – only iterate images of at least this many bytes

  • maxsize – only iterate images of no more than this many bytes

  • reverse – if True, iterate in reverse lexicographic order

  • sha1 – only iterate image (it is theoretically possible there could be more than one) with this sha1 hash

  • sha1base36 – same as sha1 but in base 36

  • content – if True, load the current content of each iterated page (default False); note that this means the content of the image description page, not the image itself

alllinks(start='!', prefix='', namespace=0, unique=False, fromids=False, total=None, step=NotImplemented)[source]

Iterate all links to pages (which need not exist) in one namespace.

Note that, in practice, links that were found on pages that have been deleted may not have been removed from the links table, so this method can return false positives.

See

https://www.mediawiki.org/wiki/API:Alllinks

Parameters
  • start – Start at this title (page need not exist).

  • prefix – Only yield pages starting with this string.

  • namespace (int or Namespace) – Iterate pages from this (single) namespace

  • unique – If True, only iterate each link title once (default: iterate once for each linking page)

  • fromids – if True, include the pageid of the page containing each link (default: False) as the ‘_fromid’ attribute of the Page; cannot be combined with unique

Raises
  • KeyError – the namespace identifier was not resolved

  • TypeError – the namespace identifier has an inappropriate type such as bool, or an iterable with more than one namespace

allpages(start='!', prefix='', namespace=0, filterredir=None, filterlanglinks=None, minsize=None, maxsize=None, protect_type=None, protect_level=None, reverse=False, total=None, content=False, step=NotImplemented, includeredirects='[deprecated name of filterredir]', throttle=NotImplemented, limit='[deprecated name of total]')[source]

Iterate pages in a single namespace.

See

https://www.mediawiki.org/wiki/API:Allpages

Parameters
  • start – Start at this title (page need not exist).

  • prefix – Only yield pages starting with this string.

  • namespace (int or Namespace.) – Iterate pages from this (single) namespace

  • filterredir – if True, only yield redirects; if False (and not None), only yield non-redirects (default: yield both)

  • filterlanglinks – if True, only yield pages with language links; if False (and not None), only yield pages without language links (default: yield both)

  • minsize – if present, only yield pages at least this many bytes in size

  • maxsize – if present, only yield pages at most this many bytes in size

  • protect_type (str) – only yield pages that have a protection of the specified type

  • protect_level – only yield pages that have protection at this level; can only be used if protect_type is specified

  • reverse – if True, iterate in reverse Unicode lexicographic order (default: iterate in forward order)

  • content – if True, load the current content of each iterated page (default False)

Raises
  • KeyError – the namespace identifier was not resolved

  • TypeError – the namespace identifier has an inappropriate type such as bool, or an iterable with more than one namespace

allusers(start='!', prefix='', group=None, total=None, step=NotImplemented)[source]

Iterate registered users, ordered by username.

Iterated values are dicts containing ‘name’, ‘editcount’, ‘registration’, and (sometimes) ‘groups’ keys. ‘groups’ will be present only if the user is a member of at least 1 group, and will be a list of unicodes; all the other values are unicodes and should always be present.

See

https://www.mediawiki.org/wiki/API:Allusers

Parameters
  • start – start at this username (name need not exist)

  • prefix – only iterate usernames starting with this substring

  • group (str) – only iterate users that are members of this group

ancientpages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages, datestamps from Special:Ancientpages.

Parameters

total – number of pages to return

property article_path

Get the nice article path without $1.

assert_valid_iter_params(msg_prefix, start, end, reverse, is_ts=True)[source]

Validate iterating API parameters.

Parameters
  • msg_prefix (str) – The calling method name

  • start – The start value to compare

  • end – The end value to compare

  • reverse (bool) – The reverse option

  • is_ts (bool) – When comparing timestamps (with is_ts=True) the start is usually greater than the end; when comparing titles it is the other way round.

Raises

AssertionError – start/end values are in wrong order
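
The check described above can be sketched compactly (this is an assumption about the implementation, not the actual source): timestamps iterate newest-first by default, so start must not precede end unless reverse is set, while title iteration is ordered the opposite way.

```python
def check_iter_params(msg_prefix, start, end, reverse, is_ts=True):
    # Hypothetical re-implementation of the described validation.
    # Timestamps: forward iteration goes newest -> oldest, so
    # start >= end; titles: forward iteration is ascending order.
    if is_ts != reverse:
        ok = start >= end
    else:
        ok = start <= end
    assert ok, msg_prefix + ': start/end values are in wrong order'

# Newest-first timestamp iteration: start (newer) >= end (older).
check_iter_params('demo', 20200102, 20200101, reverse=False, is_ts=True)
```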

blocks(starttime=None, endtime=None, reverse=False, blockids=None, users=None, iprange=None, total=None, step=NotImplemented)[source]

Iterate all current blocks, in order of creation.

The iterator yields dicts containing keys corresponding to the block properties.

See

https://www.mediawiki.org/wiki/API:Blocks

Note

logevents only logs user blocks, while this method iterates all blocks including IP ranges.

Note

userid key will be given for mw 1.18+ only

Note

iprange parameter cannot be used together with users.

Parameters
  • starttime (pywikibot.Timestamp) – start iterating at this Timestamp

  • endtime (pywikibot.Timestamp) – stop iterating at this Timestamp

  • reverse (bool) – if True, iterate oldest blocks first (default: newest)

  • blockids (basestring, tuple or list) – only iterate blocks with these id numbers. Numbers must be separated by ‘|’ if given by a basestring.

  • users (basestring, tuple or list) – only iterate blocks affecting these usernames or IPs

  • iprange (str) – a single IP or an IP range. Ranges broader than IPv4/16 or IPv6/19 are not accepted.

  • total (int) – total amount of block entries

blockuser(user, expiry, reason, anononly=True, nocreate=True, autoblock=True, noemail=False, reblock=False, allowusertalk=False)[source]

Block a user for certain amount of time and for a certain reason.

See

https://www.mediawiki.org/wiki/API:Block

Parameters
  • user (pywikibot.User) – The username/IP to be blocked without a namespace.

  • expiry (Timestamp/datetime (absolute), basestring (relative/absolute) or False ('never')) –

    The length or date/time when the block expires. If ‘never’, ‘infinite’ or ‘indefinite’, the block never expires. If the value is given as a basestring it is parsed by PHP’s strtotime function, whose documentation describes the relative format.

    It is recommended not to use a basestring if possible, to remain independent of the API.

  • reason (basestring) – The reason for the block.

  • anononly (boolean) – Disable anonymous edits for this IP.

  • nocreate (boolean) – Prevent account creation.

  • autoblock (boolean) – Automatically block the last used IP address and all subsequent IP addresses from which this account logs in.

  • noemail (boolean) – Prevent user from sending email through the wiki.

  • reblock (boolean) – If the user is already blocked, overwrite the existing block.

  • allowusertalk (boolean) – Whether the user can edit their talk page while blocked.

Returns

The data retrieved from the API request.

Return type

dict

botusers(total=None, step=NotImplemented)[source]

Iterate bot users.

Iterated values are dicts containing ‘name’, ‘userid’, ‘editcount’, ‘registration’, and ‘groups’ keys. ‘groups’ will be present only if the user is a member of at least 1 group, and will be a list of unicodes; all the other values are unicodes and should always be present.

broken_redirects(total=None, step=NotImplemented)[source]

Yield Pages with broken redirects from Special:BrokenRedirects.

Parameters

total – number of pages to return

categoryinfo(category)[source]

Retrieve data on contents of category.

categorymembers(category, namespaces=None, sortby=None, reverse=False, starttime=None, endtime=None, startsort=None, endsort=None, total=None, content=False, member_type=None, startprefix=None, endprefix=None, step=NotImplemented)[source]

Iterate members of specified category.

See

https://www.mediawiki.org/wiki/API:Categorymembers

Parameters
  • category – The Category to iterate.

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – If present, only return category members from these namespaces. To yield subcategories or files, use parameter member_type instead.

  • sortby (str) – determines the order in which results are generated, valid values are “sortkey” (default, results ordered by category sort key) or “timestamp” (results ordered by time page was added to the category)

  • reverse – if True, generate results in reverse order (default False)

  • starttime (pywikibot.Timestamp) – if provided, only generate pages added after this time; not valid unless sortby=”timestamp”

  • endtime – if provided, only generate pages added before this time; not valid unless sortby=”timestamp”

  • startsort (str) – if provided, only generate pages that have a sortkey >= startsort; not valid if sortby=”timestamp” (Deprecated in MW 1.24)

  • endsort (str) – if provided, only generate pages that have a sortkey <= endsort; not valid if sortby=”timestamp” (Deprecated in MW 1.24)

  • startprefix (str) – if provided, only generate pages >= this title lexically; not valid if sortby=”timestamp”; overrides “startsort” (requires MW 1.18+)

  • endprefix (str) – if provided, only generate pages < this title lexically; not valid if sortby=”timestamp”; overrides “endsort” (requires MW 1.18+)

  • content (bool) – if True, load the current content of each iterated page (default False)

  • member_type (str or iterable of str; values: page, subcat, file) – member type; if member_type includes ‘page’ and is used in conjunction with sortby=”timestamp”, the API may limit results to only pages in the first 50 namespaces.

Return type

typing.Iterable[pywikibot.Page]

Raises
  • KeyError – a namespace identifier was not resolved

  • NotImplementedError – startprefix or endprefix parameters are given but site.version is less than 1.18.

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

checkBlocks(sysop=False)[source]

Raise an exception when the user is blocked. DEPRECATED.

Parameters

sysop (bool) – If true, log in to sysop account (if available)

Raises

pywikibot.exceptions.UserBlocked – The logged in user/sysop account is blocked.

compare(old, diff)[source]

Corresponding method to the ‘action=compare’ API action.

See

https://www.mediawiki.org/wiki/API:Compare

See: https://en.wikipedia.org/w/api.php?action=help&modules=compare

Use pywikibot.diff’s html_comparator() method to parse the result.

Parameters
  • old (int, str, pywikibot.Page, or pywikibot.Page.Revision) – starting revision ID, title, Page, or Revision

  • diff (int, str, pywikibot.Page, or pywikibot.Page.Revision) – ending revision ID, title, Page, or Revision

Returns

an HTML string of a diff between two revisions

Return type

str

create_new_topic(page, title, content, format)[source]

Create a new topic on a Flow board.

Parameters
  • page (Board) – A Flow board

  • title (str) – The title of the new topic (must be in plaintext)

  • content (str) – The content of the topic’s initial post

  • format (str (either 'wikitext' or 'html')) – The content format of the value supplied for content

Returns

The metadata of the new topic

Return type

dict

create_short_link(url)[source]

Return a shortened link.

Note that on Wikimedia wikis only metawiki supports this action, and this wiki can process links to all WM domains.

Parameters

url (str) – The link to reduce, with protocol prefix.

Returns

The reduced link, without protocol prefix.

Return type

str

data_repository()[source]

Return the data repository connected to this site.

Returns

The data repository if one is connected or None otherwise.

Return type

pywikibot.site.DataSite or None

dbName()[source]

Return this site’s internal id.

deadendpages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Page objects retrieved from Special:Deadendpages.

Parameters

total – number of pages to return

delete_post(post, reason)[source]

Delete a Flow post.

Parameters
  • post (Post) – A Flow post

  • reason (str) – The reason to delete the post

Returns

Metadata returned by the API

Return type

dict

delete_topic(page, reason)[source]

Delete a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • reason (str) – The reason to delete the topic

Returns

Metadata returned by the API

Return type

dict

deletedrevs(titles=None, start=None, end=None, reverse=False, content=False, total=None, **kwargs, page='[deprecated name of titles]', step=NotImplemented, get_text='[deprecated name of content]', limit='[deprecated name of total]')[source]

Iterate deleted revisions.

Each value returned by the iterator will be a dict containing the ‘title’ and ‘ns’ keys for a particular Page and a ‘revisions’ key whose value is a list of revisions in the same format as recentchanges plus a ‘content’ element with key ‘*’ if requested when ‘content’ parameter is set. For older wikis a ‘token’ key is also given with the content request.

See

https://www.mediawiki.org/wiki/API:Deletedrevisions

Parameters

titles (str (multiple titles delimited with '|') or pywikibot.Page or typing.Iterable[pywikibot.Page] or typing.Iterable[str]) – The page titles to check for deleted revisions

Keyword Arguments

revids – Get revisions by their ID

Note

Either titles or revids must be set, but not both.

Parameters
  • start – Iterate revisions starting at this Timestamp

  • end – Iterate revisions ending at this Timestamp

  • reverse (bool) – Iterate oldest revisions first (default: newest)

  • content – If True, retrieve the content of each revision

  • total – number of revisions to retrieve

Keyword Arguments
  • user – List revisions by this user

  • excludeuser – Exclude revisions by this user

  • tag – Only list revisions tagged with this tag

  • prop – Which properties to get. Defaults are ids, user, comment, flags and timestamp

deletepage(page, reason, summary='[deprecated name of reason]')[source]

Delete page from the wiki. Requires appropriate privilege level.

See

https://www.mediawiki.org/wiki/API:Delete

Parameters
  • page – Page to be deleted.

  • reason (str) – Deletion reason.

double_redirects(total=None, step=NotImplemented)[source]

Yield Pages with double redirects from Special:DoubleRedirects.

Parameters

total – number of pages to return

editpage(page, summary=None, minor=True, notminor=False, bot=True, recreate=True, createonly=False, nocreate=False, watch=None, **kwargs)[source]

Submit an edit to be saved to the wiki.

See

https://www.mediawiki.org/wiki/API:Edit

Parameters
  • page – The Page to be saved. By default its .text property will be used as the new text to be saved to the wiki

  • summary – the edit summary

  • minor – if True (default), mark edit as minor

  • notminor – if True, override account preferences to mark edit as non-minor

  • recreate – if True (default), create new page even if this title has previously been deleted

  • createonly – if True, raise an error if this title already exists on the wiki

  • nocreate – if True, raise an error if the page does not exist

  • watch – Specify how the watchlist is affected by this edit; set to one of “watch”, “unwatch”, “preferences”, “nochange”:

    * watch: add the page to the watchlist

    * unwatch: remove the page from the watchlist

    * preferences: use the preference settings (default; mw >= 1.16 only)

    * nochange: don’t change the watchlist (mw >= 1.16 only)

  • bot – if True, mark edit with bot flag

Keyword Arguments
  • text – Overrides Page.text

  • section – Edit an existing numbered section or a new section (‘new’)

  • prependtext – Prepend text. Overrides Page.text

  • appendtext – Append text. Overrides Page.text.

  • undo – Revision id to undo. Overrides Page.text

Returns

True if edit succeeded, False if it failed

Return type

bool

expand_text(text, title=None, includecomments=None, string='[deprecated name of text]')[source]

Parse the given text for preprocessing and rendering.

e.g. expand templates and strip comments if the includecomments parameter is not True. Keeps text inside <nowiki></nowiki> tags unchanged, etc. Can be used to parse magic parser words like {{CURRENTTIMESTAMP}}.

Parameters
  • text (str) – text to be expanded

  • title (str) – page title without section

  • includecomments (bool) – if True do not strip comments

Return type

str

exturlusage(url=None, protocol='http', namespaces=None, total=None, content=False, step=NotImplemented)[source]

Iterate Pages that contain links to the given URL.

See

https://www.mediawiki.org/wiki/API:Exturlusage

Parameters
  • url – The URL to search for (without the protocol prefix); this may include a ‘*’ as a wildcard, only at the start of the hostname

  • protocol – The protocol prefix (default: “http”)

filearchive(start=None, end=None, reverse=False, total=None, **kwargs, limit='[deprecated name of total]')[source]

Iterate archived files.

Yields dicts of file archive information.

See

https://www.mediawiki.org/wiki/API:filearchive

Parameters
  • start – start at this title (name need not exist)

  • end – end at this title (name need not exist)

  • reverse – if True, iterate in reverse lexicographic order

  • total – maximum number of pages to retrieve in total

Keyword Arguments
  • prefix – only iterate titles starting with this substring

  • sha1 – only iterate image with this sha1 hash

  • sha1base36 – same as sha1 but in base 36

  • prop – Image information to get. Default is timestamp

forceLogin(**kw)

classmethod fromDBName(dbname, site=None)[source]

Create a site from a database name using the sitematrix.

Parameters
  • dbname (str) – database name

  • site (pywikibot.site.APISite) – Site to load sitematrix from. (Default meta.wikimedia.org)

Returns

site object for the database name

Return type

pywikibot.site.APISite

getExpandedString(**kw)

getFilesFromAnHash(hash_found=None)[source]

Return all files that have the same hash.

DEPRECATED: Use APISite.allimages instead using ‘sha1’.

getImagesFromAnHash(hash_found=None)[source]

Return all images that have the same hash.

DEPRECATED: Use APISite.allimages instead using ‘sha1’.

get_parsed_page(page)[source]

Retrieve parsed text of the page using action=parse.

See

https://www.mediawiki.org/wiki/API:Parse

get_property_names(force=False)[source]

Get property names for pages_with_property().

See

https://www.mediawiki.org/wiki/API:Pagepropnames

Parameters

force (bool) – force to retrieve userinfo ignoring cache

get_searched_namespaces(force=False)[source]

Retrieve the default searched namespaces for the user.

If no user is logged in, it returns the namespaces used by default. Otherwise it returns the user preferences. It caches the last result and returns it, if the username or login status hasn’t changed.

Parameters

force – Whether the cache should be discarded.

Returns

The namespaces which are searched by default.

Return type

set of Namespace

get_tokens(types, all=False)[source]

Preload one or multiple tokens.

For all MediaWiki versions prior to 1.20, only one token can be retrieved at once. For MediaWiki versions since 1.24wmfXXX a new token system was introduced which reduced the number of tokens available. Most of them were merged into the ‘csrf’ token. If the token type in the parameter is not known it will default to the ‘csrf’ token.

The other token types available are:
  • deleteglobalaccount

  • patrol (*)

  • rollback

  • setglobalaccountstatus

  • userrights

  • watch

(*) Patrol was added in v1.14.

Until v1.16, the patrol token is the same as the edit token. For v1.17-19, the patrol token must be obtained from the query list recentchanges.

See

https://www.mediawiki.org/wiki/API:Tokens

Parameters
  • types (iterable) – the types of token (e.g., “edit”, “move”, “delete”); see API documentation for full list of types

  • all (bool) – load all available tokens; if None, load them only if it can be done in one request.

Returns

a dict with retrieved valid tokens

Return type

dict
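
The fallback described above (unknown or merged token types collapse to ‘csrf’ on MW 1.24+) can be sketched as follows; the helper name is hypothetical, and the set of surviving types is taken from the list above:

```python
# Token types that survived the MW 1.24 merge, per the list above.
SPECIAL_TOKENS = {'deleteglobalaccount', 'patrol', 'rollback',
                  'setglobalaccountstatus', 'userrights', 'watch'}

def normalize_token_type(token_type):
    # Hypothetical helper: unknown/merged types fall back to 'csrf'.
    return token_type if token_type in SPECIAL_TOKENS else 'csrf'

print(normalize_token_type('edit'))   # csrf
print(normalize_token_type('watch'))  # watch
```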

getcategoryinfo(category)[source]

Retrieve data on contents of category.

See

https://www.mediawiki.org/wiki/API:Categoryinfo

getcurrenttime(**kw)

getcurrenttimestamp()[source]

Return the server time as a MediaWiki timestamp string.

It calls server_time first so it queries the server to get the current server time.

Returns

the server time

Return type

str (as ‘yyyymmddhhmmss’)
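
The ‘yyyymmddhhmmss’ format maps directly onto strftime. A sketch of formatting an arbitrary datetime the same way (to_mw_timestamp is a hypothetical helper; the real method returns the server's time, not local time):

```python
from datetime import datetime

def to_mw_timestamp(dt):
    # MediaWiki's internal timestamp format: 'yyyymmddhhmmss'.
    return dt.strftime('%Y%m%d%H%M%S')

print(to_mw_timestamp(datetime(2020, 5, 17, 13, 45, 9)))  # 20200517134509
```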

getglobaluserinfo()[source]

Retrieve globaluserinfo from site and cache it.

self._globaluserinfo will be a dict with the following keys and values:

  • id: user id (numeric str)

  • home: dbname of home wiki

  • registration: registration date as Timestamp

  • groups: list of groups (could be empty)

  • rights: list of rights (could be empty)

  • editcount: global editcount

getmagicwords(word)[source]

Return list of localized “word” magic words for the site.

getredirtarget(page)[source]

Return page object for the redirect target of page.

Parameters

page (pywikibot.page.BasePage) – page to search redirects for

Returns

redirect target of page

Return type

pywikibot.Page

Raises
getuserinfo(force=False)[source]

Retrieve userinfo from site and store in _userinfo attribute.

self._userinfo will be a dict with the following keys and values:

  • id: user id (numeric str)

  • name: username (if user is logged in)

  • anon: present if user is not logged in

  • groups: list of groups (could be empty)

  • rights: list of rights (could be empty)

  • message: present if user has a new message on talk page

  • blockinfo: present if user is blocked (dict)

https://www.mediawiki.org/wiki/API:Userinfo

Parameters

force (bool) – force to retrieve userinfo ignoring cache

globalusage(page, total=None)[source]

Iterate global image usage for a given FilePage.

Parameters
  • page – the page to return global image usage for.

  • total – iterate no more than this number of pages in total.

Raises
property globaluserinfo

Retrieve globaluserinfo from site and cache it.

self._globaluserinfo will be a dict with the following keys and values:

  • id: user id (numeric str)

  • home: dbname of home wiki

  • registration: registration date as Timestamp

  • groups: list of groups (could be empty)

  • rights: list of rights (could be empty)

  • editcount: global editcount

has_all_mediawiki_messages(keys)[source]

Confirm that the site defines a set of MediaWiki messages.

Parameters

keys (set of str) – names of MediaWiki messages

Return type

bool

property has_data_repository

Return True if site has a shared data repository like Wikidata.

has_extension(name)[source]

Determine whether extension name is loaded.

Parameters

name (str) – The extension to check for, case sensitive

Returns

If the extension is loaded

Return type

bool

has_group(group, sysop=False)[source]

Return true if and only if the user is a member of the specified group.

Possible values of ‘group’ may vary depending on wiki settings, but will usually include bot. https://www.mediawiki.org/wiki/API:Userinfo

property has_image_repository

Return True if site has a shared image repository like Commons.

has_mediawiki_message(key)[source]

Determine if the site defines a MediaWiki message.

Parameters

key (str) – name of MediaWiki message

Return type

bool

has_right(right, sysop=False)[source]

Return true if and only if the user has a specific right.

Possible values of ‘right’ may vary depending on wiki settings, but will usually include:

  • Actions: edit, move, delete, protect, upload

  • User levels: autoconfirmed, sysop, bot

https://www.mediawiki.org/wiki/API:Userinfo
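A rights check of this kind amounts to a membership test against the ‘rights’ list that getuserinfo returns. A minimal sketch, assuming a userinfo dict of that shape (this is illustrative, not pywikibot's implementation):

```python
def user_has_right(userinfo: dict, right: str) -> bool:
    """True if and only if 'right' appears in the user's rights list."""
    return right in userinfo.get('rights', [])
```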

hide_post(post, reason)[source]

Hide a Flow post.

Parameters
  • post (Post) – A Flow post

  • reason (str) – The reason to hide the post

Returns

Metadata returned by the API

Return type

dict

hide_topic(page, reason)[source]

Hide a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • reason (str) – The reason to hide the topic

Returns

Metadata returned by the API

Return type

dict

image_repository()[source]

Return Site object for image repository e.g. commons.

imageusage(image, namespaces=None, filterredir=None, total=None, content=False, step=NotImplemented)[source]

Iterate Pages that contain links to the given FilePage.

See

https://www.mediawiki.org/wiki/API:Imageusage

Parameters
  • image (pywikibot.FilePage) – the image to search for (FilePage need not exist on the wiki)

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – If present, only iterate pages in these namespaces

  • filterredir – if True, only yield redirects; if False (and not None), only yield non-redirects (default: yield both)

  • content – if True, load the current content of each iterated page (default False)

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool
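The tri-state filterredir parameter above can be sketched as a client-side filter over hypothetical page records carrying a boolean ‘redirect’ flag (illustrative only; the real filtering happens in the API query):

```python
def apply_filterredir(pages, filterredir):
    """True -> only redirects; False -> only non-redirects; None -> both."""
    if filterredir is None:
        return list(pages)
    return [p for p in pages if p['redirect'] is filterredir]
```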

isAllowed(**kw)
isBlocked(**kw)
isBot(username)[source]

Return True is username is a bot user.

is_blocked(sysop=False)[source]

Return True when logged in user is blocked.

To check whether a user can perform an action, the method has_right should be used. https://www.mediawiki.org/wiki/API:Userinfo

Parameters

sysop (bool) – If true, log in to sysop account (if available)

Return type

bool

is_data_repository()[source]

Return True if its data repository is itself.

is_image_repository()[source]

Return True if Site object is the image repository.

is_oauth_token_available()[source]

Check whether OAuth token is set for this site.

Return type

bool

is_uploaddisabled()[source]

Return True if upload is disabled on site.

When the version is at least 1.27wmf9, uses general siteinfo. If not called directly, it is cached by the first attempted upload action.

property lang

Return the code for the language of this Site.

linter_pages(lint_categories=None, total=None, namespaces=None, pageids=None, lint_from=None)[source]

Return a generator to pages containing linter errors.

Parameters
  • lint_categories – categories of lint errors

  • total (int) – if not None, yielding this many items in total

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – only iterate pages in these namespaces

  • pageids (an iterable that returns pageids (str or int), or a comma- or pipe-separated string of pageids (e.g. '945097,1483753, 956608' or '945097|483753|956608')) – only include lint errors from the specified pageids

  • lint_from (str representing digit or integer) – Lint ID to start querying from

Returns

pages with Linter errors.

Return type

typing.Iterable[pywikibot.Page]

list_to_text(args)[source]

Convert a list of strings into human-readable text.

The MediaWiki messages ‘and’ and ‘word-separator’ are used as separator between the last two arguments. If more than two arguments are given, other arguments are joined using MediaWiki message ‘comma-separator’.

Parameters

args (typing.Iterable[unicode]) – text to be expanded

Return type

str
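The joining rule can be sketched with the MediaWiki messages replaced by hard-coded English separators (the real method fetches ‘and’, ‘word-separator’, and ‘comma-separator’ from the site):

```python
def join_like_list_to_text(args, comma=', ', and_word=' and '):
    """Join strings with a comma separator and 'and' before the last item."""
    args = list(args)
    if not args:
        return ''
    if len(args) == 1:
        return args[0]
    return comma.join(args[:-1]) + and_word + args[-1]
```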

load_board(page)[source]

Retrieve the data for a Flow board.

Parameters

page (Board) – A Flow board

Returns

A dict representing the board’s metadata.

Return type

dict

load_pages_from_pageids(pageids)[source]

Return a page generator from pageids.

Pages are iterated in the same order as in the underlying pageids.

Pageids are filtered so that only one page is returned for duplicate pageids.

Parameters

pageids – an iterable that returns pageids (str or int), or a comma- or pipe-separated string of pageids (e.g. ‘945097,1483753, 956608’ or ‘945097|483753|956608’)
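The accepted input formats and the duplicate filtering can be sketched as below (illustrative only, not pywikibot's parser):

```python
def parse_pageids(pageids):
    """Accept an iterable of ids or a comma-/pipe-separated string;
    drop duplicates while preserving order."""
    if isinstance(pageids, str):
        sep = '|' if '|' in pageids else ','
        pageids = (p.strip() for p in pageids.split(sep))
    seen, result = set(), []
    for pid in pageids:
        pid = int(pid)
        if pid not in seen:  # keep only the first occurrence
            seen.add(pid)
            result.append(pid)
    return result
```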

load_post_current_revision(page, post_id, format)[source]

Retrieve the data for a post to a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • post_id (str) – The UUID of the Post

  • format (str (either 'wikitext', 'html', or 'fixed-html')) – The content format used for the returned content

Returns

A dict representing the post data for the given UUID.

Return type

dict

load_topic(page, format)[source]

Retrieve the data for a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • format (str (either 'wikitext', 'html', or 'fixed-html')) – The content format to request the data in.

Returns

A dict representing the topic’s data.

Return type

dict

load_topiclist(page, format='wikitext', limit=100, sortby='newest', toconly=False, offset=None, offset_id=None, reverse=False, include_offset=False)[source]

Retrieve the topiclist of a Flow board.

Parameters
  • page (Board) – A Flow board

  • format (str (either 'wikitext', 'html', or 'fixed-html')) – The content format to request the data in.

  • limit (int) – The number of topics to fetch in each request.

  • sortby (str (either 'newest' or 'updated')) – Algorithm to sort topics by.

  • toconly (bool) – Whether to only include information for the TOC.

  • offset (Timestamp or equivalent str) – The timestamp to start at (when sortby is ‘updated’).

  • offset_id (str (in the form of a UUID)) – The topic UUID to start at (when sortby is ‘newest’).

  • reverse (bool) – Whether to reverse the topic ordering.

  • include_offset (bool) – Whether to include the offset topic.

Returns

A dict representing the board’s topiclist.

Return type

dict

loadcoordinfo(page)[source]

Load [[mw:Extension:GeoData]] info.

loadimageinfo(page, history=False, url_width=None, url_height=None, url_param=None)[source]

Load image info from api and save in page attributes.

Parameters correspond to iiprops in: [1] https://www.mediawiki.org/wiki/API:Imageinfo

Parameter validation and error handling are left to the API call.

Parameters
  • history – if true, return the image’s version history

  • url_width – see iiurlwidth in [1]

  • url_height – see iiurlheight in [1]

  • url_param – see iiurlparam in [1]

loadpageimage(page)[source]

Load [[mw:Extension:PageImages]] info.

Parameters

page (pywikibot.Page) – The page for which to obtain the image

Raises

APIError – PageImages extension is not installed

loadpageinfo(page, preload=False)[source]

Load page info from api and store in page attributes.

See

https://www.mediawiki.org/wiki/API:Info

loadpageprops(page)[source]

Load page props for the given page.

loadrevisions(page, content=False, revids=None, startid=None, endid=None, starttime=None, endtime=None, rvdir=None, user=None, excludeuser=None, section=None, sysop=False, step=None, total=None, rollback=False, getText='[deprecated name of content]')[source]

Retrieve revision information and store it in page object.

By default, retrieves the last (current) revision of the page, unless any of the optional parameters revids, startid, endid, starttime, endtime, rvdir, user, excludeuser, or total are specified. Unless noted below, all parameters not specified default to False.

If rvdir is False or not specified, startid must be greater than endid if both are specified; likewise, starttime must be greater than endtime. If rvdir is True, these relationships are reversed.

See

https://www.mediawiki.org/wiki/API:Revisions

Parameters
  • page (pywikibot.Page) – retrieve revisions of this Page and hold the data.

  • content (bool) – if True, retrieve the wiki-text of each revision; otherwise, only retrieve the revision metadata (default)

  • section (int) – if specified, retrieve only this section of the text (content must be True); section must be given by number (top of the article is section 0), not name

  • revids (an int, a str or a list of ints or strings) – retrieve only the specified revision ids (raise Exception if any of revids does not correspond to page)

  • startid – retrieve revisions starting with this revid

  • endid – stop upon retrieving this revid

  • starttime – retrieve revisions starting at this Timestamp

  • endtime – stop upon reaching this Timestamp

  • rvdir – if false, retrieve newest revisions first (default); if true, retrieve earliest first

  • user – retrieve only revisions authored by this user

  • excludeuser – retrieve all revisions not authored by this user

  • sysop – if True, switch to sysop account (if available) to retrieve this page

Raises
  • ValueError – invalid startid/endid or starttime/endtime values

  • pywikibot.Error – revids belonging to a different page
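The startid/endid ordering rule stated above can be expressed as a small check: with rvdir false (newest first) startid must not be less than endid, and the relationship reverses when rvdir is true. This is a hypothetical validation helper, not pywikibot's own.

```python
def revision_range_ok(rvdir, startid=None, endid=None):
    """Check the startid/endid ordering constraint for loadrevisions."""
    if startid is None or endid is None:
        return True  # nothing to validate
    return startid <= endid if rvdir else startid >= endid
```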

lock_topic(page, lock, reason)[source]

Lock or unlock a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • lock (bool (True corresponds to locking the topic.)) – Whether to lock or unlock the topic

  • reason (str) – The reason to lock or unlock the topic

Returns

Metadata returned by the API

Return type

dict

logevents(logtype=None, user=None, page=None, namespace=None, start=None, end=None, reverse=False, tag=None, total=None, step=NotImplemented)[source]

Iterate all log entries.

See

https://www.mediawiki.org/wiki/API:Logevents

Note

logevents with logtype=’block’ only logs user blocks whereas site.blocks iterates all blocks including IP ranges.

Parameters
  • logtype (basestring) – only iterate entries of this type (see mediawiki api documentation for available types)

  • user (basestring) – only iterate entries that match this user name

  • page (pywikibot.page.Page or basestring) – only iterate entries affecting this page

  • namespace (int or Namespace or an iterable of them) – namespace(s) to retrieve logevents from

  • start (Timestamp or ISO date string) – only iterate entries from and after this Timestamp

  • end (Timestamp or ISO date string) – only iterate entries up to and through this Timestamp

  • reverse (bool) – if True, iterate oldest entries first (default: newest)

  • tag (basestring) – only iterate entries tagged with this tag

  • total (int) – maximum number of events to iterate

Note

Due to an API limitation, if the namespace parameter contains multiple namespaces, log entries from all namespaces will be fetched from the API and filtered later during iteration.

Return type

iterable

Raises
  • KeyError – the namespace identifier was not resolved

  • TypeError – the namespace identifier has an inappropriate type such as bool, or an iterable with more than one namespace
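Per the note above, with multiple namespaces the filtering happens client-side during iteration; a sketch over hypothetical log-entry dicts with an ‘ns’ field:

```python
def filter_entries_by_namespace(entries, namespaces):
    """Keep only entries whose 'ns' field is in the wanted namespaces."""
    wanted = set(namespaces)
    return [e for e in entries if e['ns'] in wanted]
```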

logged_in(sysop=False)[source]

Verify the bot is logged into the site as the expected user.

The expected usernames are those provided as either the user or sysop parameter at instantiation.

Parameters

sysop (bool) – if True, test if user is logged in as the sysop user instead of the normal user.

Return type

bool

login(sysop=False, autocreate=False)[source]

Log the user in if not already logged in.

Parameters
  • sysop (bool) – if true, log in with the sysop account.

  • autocreate (bool) – if true, allow auto-creation of the account using unified login

Raises

pywikibot.exceptions.NoUsername – Username is not recognised by the site.

See

https://www.mediawiki.org/wiki/API:Login

logout()[source]

Log out of the site and load details for the logged-out user.

Also logs out of the global account if linked to the user. https://www.mediawiki.org/wiki/API:Logout

Raises

APIError – Logout is not available when OAuth enabled.

logpages(number=50, mode=None, title=None, user=None, namespace=None, start=None, end=None, tag=None, newer=False, dump=False, offset=None, repeat=NotImplemented)[source]

Iterate log pages. DEPRECATED.

When dump is enabled, the raw API dict is returned.

Return type

tuple of Page, str, int, str

property logtypes

Return a set of log types available on current site.

lonelypages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages retrieved from Special:Lonelypages.

Parameters

total – number of pages to return

longpages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages and lengths from Special:Longpages.

Yields a tuple of Page object, length(int).

Parameters

total – number of pages to return

mediawiki_message(key, forceReload=NotImplemented)[source]

Fetch the text for a MediaWiki message.

Parameters

key (str) – name of MediaWiki message

Return type

str

mediawiki_messages(keys)[source]

Fetch the text of a set of MediaWiki messages.

If keys is ‘*’ or [‘*’], all messages will be fetched. (deprecated)

The returned dict uses each key to store the associated message.

See

https://www.mediawiki.org/wiki/API:Allmessages

Parameters

keys (set of str, '*' or ['*']) – MediaWiki messages to fetch

Return type

dict

merge_history(source, dest, timestamp=None, reason=None)[source]

Merge revisions from one page into another.

See

https://www.mediawiki.org/wiki/API:Mergehistory

Revisions dating up to the given timestamp in the source will be moved into the destination page history. History merge fails if the timestamps of source and dest revisions overlap (all source revisions must be dated before the earliest dest revision).

Parameters
  • source (pywikibot.Page) – Source page from which revisions will be merged

  • dest (pywikibot.Page) – Destination page to which revisions will be merged

  • timestamp (pywikibot.Timestamp) – Revisions from this page dating up to this timestamp will be merged into the destination page (if not given or False, all revisions will be merged)

  • reason (str) – Optional reason for the history merge
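The non-overlap precondition above reduces to: every source revision must predate the earliest destination revision. A sketch with timestamps as comparable values (hypothetical helper, not pywikibot's):

```python
def histories_mergeable(source_timestamps, dest_timestamps):
    """True if all source revisions predate the earliest dest revision."""
    return max(source_timestamps) < min(dest_timestamps)
```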

messages(sysop=False)[source]

Return true if the user has new messages, and false otherwise.

moderate_post(post, state, reason)[source]

Moderate a Flow post.

Parameters
  • post (Post) – A Flow post

  • state (str) – The new moderation state

  • reason (str) – The reason to moderate the topic

Returns

Metadata returned by the API

Return type

dict

moderate_topic(page, state, reason)[source]

Moderate a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • state (str) – The new moderation state

  • reason (str) – The reason to moderate the topic

Returns

Metadata returned by the API

Return type

dict

property months_names

Obtain month names from the site messages.

The list is zero-indexed, ordered by month in calendar, and should be in the original site language.

Returns

list of tuples (month name, abbreviation)

Return type

list

movepage(page, newtitle, summary, movetalk=True, noredirect=False)[source]

Move a Page to a new title.

See

https://www.mediawiki.org/wiki/API:Move

Parameters
  • page – the Page to be moved (must exist)

  • newtitle (str) – the new title for the Page

  • summary – edit summary (required!)

  • movetalk – if True (default), also move the talk page if possible

  • noredirect – if True, suppress creation of a redirect from the old title to the new one

Returns

Page object with the new title

Return type

pywikibot.Page

property mw_version

Return self.version() as a MediaWikiVersion object.

Cache the result for 24 hours.

Return type

MediaWikiVersion
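The 24-hour caching can be sketched generically (this is illustrative, not pywikibot's implementation):

```python
import time


def cached_for(fn, ttl=24 * 3600, clock=time.monotonic):
    """Wrap a zero-argument callable; recompute only after ttl seconds."""
    state = {}

    def wrapper():
        now = clock()
        if 'value' not in state or now - state['at'] > ttl:
            state['value'], state['at'] = fn(), now
        return state['value']

    return wrapper
```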

namespace(num, all=False)[source]

Return string containing local name of namespace ‘num’.

If optional argument ‘all’ is true, return all recognized values for this namespace.

Parameters
  • num (int) – Namespace constant.

  • all – If True return a Namespace object. Otherwise return the namespace name.

Returns

local name or Namespace object

Return type

str or Namespace

newfiles(user=None, start=None, end=None, reverse=False, total=None, lestart='[deprecated name of start]', number='[deprecated name of total]', leuser='[deprecated name of user]', step=NotImplemented, leend='[deprecated name of end]', repeat=NotImplemented, letitle=NotImplemented)[source]

Yield information about newly uploaded files.

DEPRECATED: Use logevents(logtype=’upload’) instead.

Yields a tuple of FilePage, Timestamp, user(unicode), comment(unicode).

N.B. the API does not provide direct access to Special:Newimages, so this is derived from the “upload” log events instead.

newimages(*args, **kwargs, number='[deprecated name of total]', repeat=NotImplemented)[source]

Yield information about newly uploaded files.

DEPRECATED: Use logevents(logtype=’upload’) instead.

newpages(user=None, returndict=False, start=None, end=None, reverse=False, bot=False, redirect=False, excludeuser=None, patrolled=None, namespaces=None, total=None, showRedirects='[deprecated name of redirect]', rc_show=NotImplemented, showPatrolled='[deprecated name of patrolled]', get_redirect=NotImplemented, repeat=NotImplemented, showBot='[deprecated name of bot]', number='[deprecated name of total]', step=NotImplemented, rcshow=NotImplemented, namespace='[deprecated name of namespaces]')[source]

Yield new articles (as Page objects) from recent changes.

Starts with the newest article and fetches up to the number of articles specified by total.

The objects yielded are dependent on parameter returndict. When true, it yields a tuple composed of a Page object and a dict of attributes. When false, it yields a tuple composed of the Page object, timestamp (unicode), length (int), an empty unicode string, username or IP address (str), comment (unicode).

Parameters

namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – only iterate pages in these namespaces

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

nice_get_address(title)[source]

Return shorter URL path to retrieve page titled ‘title’.

notifications(**kwargs)[source]

Yield Notification objects from the Echo extension.

Keyword Arguments

format – If specified, notifications will be returned formatted this way. Its value is either ‘model’, ‘special’ or None. Default is ‘special’.

Refer API reference for other keywords.

notifications_mark_read(**kwargs)[source]

Mark selected notifications as read.

Returns

whether the action was successful

Return type

bool

page_can_be_edited(page)[source]

Determine if the page can be edited.

Return True if and only if:
  • page is unprotected, and bot has an account for this site, or

  • page is protected, and bot has a sysop account for this site.

Return type

bool

page_embeddedin(page, filter_redirects=None, namespaces=None, total=None, content=False, step=NotImplemented, filterRedirects='[deprecated name of filter_redirects]')[source]

Iterate all pages that embedded the given page as a template.

See

https://www.mediawiki.org/wiki/API:Embeddedin

Parameters
  • page – The Page to get inclusions for.

  • filter_redirects – If True, only return redirects that embed the given page. If False, only return non-redirect links. If None, return both (no filtering).

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – If present, only return links from the namespaces in this list.

  • content – if True, load the current content of each iterated page (default False)

Return type

typing.Iterable[pywikibot.Page]

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

page_extlinks(page)[source]

Iterate all external links on page, yielding URL strings.

See

https://www.mediawiki.org/wiki/API:Extlinks

page_from_repository(item)[source]

Return a Page for this site object specified by wikibase item.

Parameters

item (str) – id number of item, “Q###”.

Returns

Page, or Category object given by wikibase item number for this site object.

Return type

pywikibot.Page or None

Raises
page_isredirect(page)[source]

Return True if and only if page is a redirect.

page_restrictions(page)[source]

Return a dictionary reflecting page protections.

pagebacklinks(page, follow_redirects=False, filter_redirects=None, namespaces=None, total=None, content=False)[source]

Iterate all pages that link to the given page.

See

https://www.mediawiki.org/wiki/API:Backlinks

Parameters
  • page – The Page to get links to.

  • follow_redirects – Also return links to redirects pointing to the given page.

  • filter_redirects – If True, only return redirects to the given page. If False, only return non-redirect links. If None, return both (no filtering).

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – If present, only return links from the namespaces in this list.

  • total – Maximum number of pages to retrieve in total.

  • content – if True, load the current content of each iterated page (default False)

Return type

typing.Iterable[pywikibot.Page]

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

pagecategories(page, total=None, content=False, withSortKey=NotImplemented, step=NotImplemented)[source]

Iterate categories to which page belongs.

See

https://www.mediawiki.org/wiki/API:Categories

Parameters

content – if True, load the current content of each iterated page (default False); note that this means the contents of the category description page, not the pages contained in the category

pageimages(page, total=None, content=False, step=NotImplemented)[source]

Iterate images used (not just linked) on the page.

See

https://www.mediawiki.org/wiki/API:Images

Parameters

content – if True, load the current content of each iterated page (default False); note that this means the content of the image description page, not the image itself

pagelanglinks(page, total=None, include_obsolete=False)[source]

Iterate all interlanguage links on page, yielding Link objects.

See

https://www.mediawiki.org/wiki/API:Langlinks

Parameters

include_obsolete – if true, yield even Link objects whose site is obsolete

pagelinks(page, namespaces=None, follow_redirects=False, total=None, content=False)[source]

Iterate internal wikilinks contained (or transcluded) on page.

See

https://www.mediawiki.org/wiki/API:Links

Parameters
  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – Only iterate pages in these namespaces (default: all)

  • follow_redirects – if True, yields the target of any redirects, rather than the redirect page

  • content – if True, load the current content of each iterated page (default False)

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

pagename2codes()[source]

Return list of localized PAGENAMEE tags for the site.

pagenamecodes()[source]

Return list of localized PAGENAME tags for the site.

pagereferences(page, follow_redirects=False, filter_redirects=None, with_template_inclusion=True, only_template_inclusion=False, namespaces=None, total=None, content=False, withTemplateInclusion='[deprecated name of with_template_inclusion]', onlyTemplateInclusion='[deprecated name of only_template_inclusion]', step=NotImplemented, followRedirects='[deprecated name of follow_redirects]', filterRedirects='[deprecated name of filter_redirects]')[source]

Convenience method combining pagebacklinks and page_embeddedin.

Parameters

namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – If present, only return links from the namespaces in this list.

Return type

typing.Iterable[pywikibot.Page]

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

pages_with_property(propname, total=None)[source]

Yield Page objects from Special:PagesWithProp.

See

https://www.mediawiki.org/wiki/API:Pageswithprop

Parameters
  • propname (str) – must be a valid property.

  • total (int or None) – number of pages to return

Returns

return a generator of Page objects

Return type

iterator

pagetemplates(page, namespaces=None, total=None, content=False, step=NotImplemented)[source]

Iterate templates transcluded (not just linked) on the page.

See

https://www.mediawiki.org/wiki/API:Templates

Parameters
  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – Only iterate pages in these namespaces

  • content – if True, load the current content of each iterated page (default False)

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

patrol(rcid=None, revid=None, revision=None, token=NotImplemented)[source]

Return a generator of patrolled pages.

See

https://www.mediawiki.org/wiki/API:Patrol

Pages to be patrolled are identified by rcid, revid or revision. At least one of these parameters is mandatory.

Parameters
  • rcid (iterable/iterator which returns a number or string which contains only digits; it also supports a string (as above) or int) – an int/string/iterable/iterator providing rcid of pages to be patrolled.

  • revid (iterable/iterator which returns a number or string which contains only digits; it also supports a string (as above) or int.) – an int/string/iterable/iterator providing revid of pages to be patrolled.

  • revision (iterable/iterator which returns a Revision object; it also supports a single Revision.) – a Revision/iterable/iterator providing Revision objects of pages to be patrolled.

Return type

iterator of dict with ‘rcid’, ‘ns’ and ‘title’ of the patrolled page.

prefixindex(prefix, namespace=0, includeredirects=True)[source]

Yield all pages with a given prefix. Deprecated.

Use allpages() with the prefix= parameter instead of this method.

preloadpages(pagelist, groupsize=50, templates=False, langlinks=False, pageprops=False)[source]

Return a generator to a list of preloaded pages.

Pages are iterated in the same order as in the underlying pagelist. In case of duplicates within a groupsize batch, only the first entry is returned.

Parameters
  • pagelist – an iterable that returns Page objects

  • groupsize (int) – how many Pages to query at a time

  • templates (bool) – preload pages (typically templates) transcluded in the provided pages

  • langlinks (bool) – preload all language links from the provided pages to other languages

  • pageprops (bool) – preload various properties defined in page content
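The groupsize batching can be sketched as a generic chunking generator (illustrative only):

```python
def batched(iterable, groupsize=50):
    """Yield lists of at most groupsize items from the iterable."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == groupsize:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch
```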

property proofread_index_ns

Return Index namespace for the ProofreadPage extension.

property proofread_levels

Return Quality Levels for the ProofreadPage extension.

property proofread_page_ns

Return Page namespace for the ProofreadPage extension.

protect(page, protections, reason, expiry=None, **kwargs, summary='[deprecated name of reason]')[source]

(Un)protect a wiki page. Requires administrator status.

See

https://www.mediawiki.org/wiki/API:Protect

Parameters
  • protections (dict) – A dict mapping type of protection to protection level of that type. Valid types of protection are ‘edit’, ‘move’, ‘create’, and ‘upload’. Valid protection levels (in MediaWiki 1.12) are ‘’ (equivalent to ‘none’), ‘autoconfirmed’, and ‘sysop’. If None is given, however, that protection will be skipped.

  • reason (basestring) – Reason for the action

  • expiry (pywikibot.Timestamp, string in GNU timestamp format (including ISO 8601)) – When the protection should expire. This expiry will be applied to all protections. If None, ‘infinite’, ‘indefinite’, ‘never’, or ‘’ is given, there is no expiry.
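The expiry handling above treats several sentinel values as meaning ‘no expiry’; a hypothetical normalization helper (not pywikibot's code):

```python
# Values that all mean 'no expiry' for a protection.
NO_EXPIRY = (None, 'infinite', 'indefinite', 'never', '')


def normalize_expiry(expiry):
    """Collapse the no-expiry sentinels to 'infinite'; pass others through."""
    return 'infinite' if expiry in NO_EXPIRY else expiry
```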

protectedpages(namespace=0, type='edit', level=False, total=None, lvl='[deprecated name of level]')[source]

Return protected pages depending on protection level and type.

For protection types which aren’t ‘create’ it uses APISite.allpages, while it uses for ‘create’ the ‘query+protectedtitles’ module.

See

https://www.mediawiki.org/wiki/API:Protectedtitles

Parameters
  • namespace (int or Namespace or str) – The searched namespace.

  • type (str) – The protection type to search for (default ‘edit’).

  • level (str or False) – The protection level (like ‘autoconfirmed’). If False it shows all protection levels.

Returns

The pages which are protected.

Return type

typing.Iterable[pywikibot.Page]

protection_levels()[source]

Return the protection levels available on this site.

Returns

protection types available

Return type

set of unicode instances

See

Siteinfo._get_default()

protection_types()[source]

Return the protection types available on this site.

Returns

protection types available

Return type

set of unicode instances

See

Siteinfo._get_default()

purgepages(pages, forcelinkupdate=False, forcerecursivelinkupdate=False, converttitles=False, redirects=False)[source]

Purge the server’s cache for one or multiple pages.

Parameters
  • pages – list of Page objects

  • redirects (bool) – Automatically resolve redirects.

  • converttitles (bool) – Convert titles to other variants if necessary. Only works if the wiki’s content language supports variant conversion.

  • forcelinkupdate (bool) – Update the links tables.

  • forcerecursivelinkupdate (bool) – Update the links table, and update the links tables for any page that uses this page as a template.

Returns

True if API returned expected response; False otherwise

Return type

bool

querypage(special_page, total=None)[source]

Yield Page objects retrieved from Special:{special_page}.

See

https://www.mediawiki.org/wiki/API:Querypage

Generic function for all special pages supported by the site MW API.

Parameters
  • special_page – Special page to query

  • total – number of pages to return

Raises

AssertionError – special_page is not supported in SpecialPages.

randompage(redirect=False)[source]

DEPRECATED.

Parameters

redirect – Return a random redirect page

Return type

pywikibot.Page

randompages(total=None, namespaces=None, redirects=False, content=False, step=NotImplemented)[source]

Iterate a number of random pages.

See

https://www.mediawiki.org/wiki/API:Random

Pages are listed in a fixed sequence, only the starting point is random.

Parameters
  • total – the maximum number of pages to iterate

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – only iterate pages in these namespaces.

  • redirects (bool or None) – if True, include only redirect pages in results; if False, do not include redirects; if None (MW 1.26+), include both types. (default: False)

  • content – if True, load the current content of each iterated page (default False)

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

  • AssertionError – unsupported redirects parameter

randomredirectpage()[source]

DEPRECATED: Use Site.randompages() instead.

Returns

Return a random redirect page

recentchanges(start=None, end=None, reverse=False, namespaces=None, pagelist=None, changetype=None, minor=None, bot=None, anon=None, redirect=None, patrolled=None, top_only=False, total=None, user=None, excludeuser=None, tag=None, revision=NotImplemented, showBot='[deprecated name of bot]', showAnon='[deprecated name of anon]', repeat=NotImplemented, showPatrolled='[deprecated name of patrolled]', returndict=NotImplemented, rcnamespace='[deprecated name of namespaces]', rcstart='[deprecated name of start]', includeredirects='[deprecated name of redirect]', step=NotImplemented, rclimit='[deprecated name of total]', showRedirects='[deprecated name of redirect]', rcprop=NotImplemented, showMinor='[deprecated name of minor]', number='[deprecated name of total]', topOnly='[deprecated name of top_only]', nobots=NotImplemented, rctype='[deprecated name of changetype]', rcend='[deprecated name of end]', namespace='[deprecated name of namespaces]', rcshow=NotImplemented, rcdir=NotImplemented)[source]

Iterate recent changes.

See

https://www.mediawiki.org/wiki/API:RecentChanges

Parameters
  • start (pywikibot.Timestamp) – Timestamp to start listing from

  • end (pywikibot.Timestamp) – Timestamp to end listing at

  • reverse (bool) – if True, start with oldest changes (default: newest)

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – only iterate pages in these namespaces

  • pagelist (list of Pages) – iterate changes to pages in this list only

  • changetype (basestring) – only iterate changes of this type (“edit” for edits to existing pages, “new” for new pages, “log” for log entries)

  • minor (bool or None) – if True, only list minor edits; if False, only list non-minor edits; if None, list all

  • bot (bool or None) – if True, only list bot edits; if False, only list non-bot edits; if None, list all

  • anon (bool or None) – if True, only list anon edits; if False, only list non-anon edits; if None, list all

  • redirect (bool or None) – if True, only list edits to redirect pages; if False, only list edits to non-redirect pages; if None, list all

  • patrolled (bool or None) – if True, only list patrolled edits; if False, only list non-patrolled edits; if None, list all

  • top_only (bool) – if True, only list changes that are the latest revision (default False)

  • user (basestring|list) – if not None, only list edits by this user or users

  • excludeuser (basestring|list) – if not None, exclude edits by this user or users

  • tag (str) – a recent changes tag

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

redirect()[source]

Return the localized #REDIRECT keyword.

redirectRegex()[source]

Return a compiled regular expression matching on redirect pages.

Group 1 in the regex match object will be the target title.
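As an illustration of what such a pattern matches, here is a simplified sketch; the real compiled regex also covers localized and aliased redirect keywords, so this version is only an approximation:

```python
import re

# Simplified sketch of a redirect-matching pattern. The actual regex
# returned by redirectRegex() is built from the site's localized
# redirect keywords; this covers only the canonical '#REDIRECT'.
REDIRECT_RE = re.compile(r'#REDIRECT\s*:?\s*\[\[(.*?)(?:\]|\|)', re.IGNORECASE)

def redirect_target(wikitext):
    """Return the redirect target title, or None if not a redirect."""
    match = REDIRECT_RE.match(wikitext.strip())
    return match.group(1) if match else None
```

For example, `redirect_target('#REDIRECT [[Main Page]]')` yields `'Main Page'`; group 1 stops at either `]]` or a `|` section link separator, matching the documented behavior above.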

redirectpages(total=None, step=NotImplemented)[source]

Yield redirect pages from Special:ListRedirects.

Parameters

total – number of pages to return

reply_to_post(page, reply_to_uuid, content, format)[source]

Reply to a post on a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • reply_to_uuid (str) – The UUID of the Post to create a reply to

  • content (str) – The content of the reply

  • format (str (either 'wikitext' or 'html')) – The content format used for the supplied content

Returns

Metadata returned by the API

Return type

dict

resolvemagicwords(wikitext)[source]

Replace the {{ns:xx}} marks in a wikitext with the namespace names.

DEPRECATED.

restore_post(post, reason)[source]

Restore a Flow post.

Parameters
  • post (Post) – A Flow post

  • reason (str) – The reason to restore the post

Returns

Metadata returned by the API

Return type

dict

restore_topic(page, reason)[source]

Restore a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • reason (str) – The reason to restore the topic

Returns

Metadata returned by the API

Return type

dict

rollbackpage(page, **kwargs)[source]

Roll back page to version before last user’s edits.

See

https://www.mediawiki.org/wiki/API:Rollback

The keyword arguments are those supported by the rollback API.

As a precaution against errors, this method will fail unless the page history contains at least two revisions, and at least one that is not by the same user who made the last edit.

Parameters

page – the Page to be rolled back (must exist)

search(searchstring, namespaces=None, where='text', get_redirects=False, total=None, content=False, getredirects='[deprecated name of get_redirects]', number='[deprecated name of total]', step=NotImplemented, key='[deprecated name of searchstring]')[source]

Iterate Pages that contain the searchstring.

Note that this may include non-existing Pages if the wiki’s database table contains outdated entries.

See

https://www.mediawiki.org/wiki/API:Search

Parameters
  • searchstring (str) – the text to search for

  • where – Where to search; value must be “text”, “title” or “nearmatch” (many wikis do not support title or nearmatch search)

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – search only in these namespaces (defaults to all)

  • get_redirects – if True, include redirects in results. Since MediaWiki 1.23 redirects are always returned.

  • content – if True, load the current content of each iterated page (default False)

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

server_time()[source]

Return a Timestamp object representing the current server time.

For wikis newer than version 1.16 it uses the ‘time’ property of the siteinfo ‘general’ query, forcing a reload before returning the time. For older wikis it expands the text ‘{{CURRENTTIMESTAMP}}’.

Returns

the current server time

Return type

Timestamp
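For the older-wiki fallback mentioned above, `{{CURRENTTIMESTAMP}}` expands to a 14-digit `YYYYMMDDHHMMSS` string. A minimal sketch of parsing that expansion (assuming the standard MediaWiki format; pywikibot's own Timestamp class handles more variants):

```python
from datetime import datetime

def parse_mw_timestamp(ts):
    """Parse the 14-digit YYYYMMDDHHMMSS string that
    {{CURRENTTIMESTAMP}} expands to into a datetime."""
    return datetime.strptime(ts, '%Y%m%d%H%M%S')
```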

shortpages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages and lengths from Special:Shortpages.

Yields a tuple of Page object, length(int).

Parameters

total – number of pages to return

property siteinfo

Site information dict.

stash_info(file_key, props=False)[source]

Get the stash info for a given file key.

See

https://www.mediawiki.org/wiki/API:Stashimageinfo

suppress_post(post, reason)[source]

Suppress a Flow post.

Parameters
  • post (Post) – A Flow post

  • reason (str) – The reason to suppress the post

Returns

Metadata returned by the API

Return type

dict

suppress_topic(page, reason)[source]

Suppress a Flow topic.

Parameters
  • page (Topic) – A Flow topic

  • reason (str) – The reason to suppress the topic

Returns

Metadata returned by the API

Return type

dict

thank_post(post)[source]

Corresponding method to the ‘action=flowthank’ API action.

Parameters

post (Post) – The post to be thanked for.

Raises

APIError – On thanking oneself or other API errors.

Returns

The API response.

thank_revision(revid, source=None)[source]

Corresponding method to the ‘action=thank’ API action.

Parameters
  • revid (int) – Revision ID for the revision to be thanked.

  • source (str) – A source for the thanking operation.

Raises

APIError – On thanking oneself or other API errors.

Returns

The API response.

unblockuser(user, reason=None)[source]

Remove the block for the user.

See

https://www.mediawiki.org/wiki/API:Block

Parameters
  • user (pywikibot.User) – The username/IP without a namespace.

  • reason (basestring) – Reason for the unblock.

uncategorizedcategories(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Categories from Special:Uncategorizedcategories.

Parameters

total – number of pages to return

uncategorizedfiles(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)

Yield FilePages from Special:Uncategorizedimages.

Parameters

total – number of pages to return

uncategorizedimages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield FilePages from Special:Uncategorizedimages.

Parameters

total – number of pages to return

uncategorizedpages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages from Special:Uncategorizedpages.

Parameters

total – number of pages to return

uncategorizedtemplates(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages from Special:Uncategorizedtemplates.

Parameters

total – number of pages to return

unconnected_pages(total=None, step=NotImplemented)[source]

Yield Page objects from Special:UnconnectedPages.

Parameters

total – number of pages to return

undelete_page(page, reason, revisions=None, summary='[deprecated name of reason]')[source]

Undelete page from the wiki. Requires appropriate privilege level.

See

https://www.mediawiki.org/wiki/API:Undelete

Parameters
  • page (pywikibot.BasePage) – Page to be deleted.

  • revisions (list) – List of timestamps to restore. If None, restores all revisions.

  • reason (basestring) – Undeletion reason.

unusedcategories(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Category objects from Special:Unusedcategories.

Parameters

total – number of pages to return

unusedfiles(total=None, step=NotImplemented, number='[deprecated name of total]', extension=NotImplemented, repeat=NotImplemented)[source]

Yield FilePage objects from Special:Unusedimages.

Parameters

total – number of pages to return

unusedimages(total=None, step=NotImplemented, number='[deprecated name of total]', extension=NotImplemented, repeat=NotImplemented)[source]

Yield FilePage objects from Special:Unusedimages.

DEPRECATED: Use APISite.unusedfiles instead.

unwatchedpages(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages from Special:Unwatchedpages (requires Admin privileges).

Parameters

total – number of pages to return

upload(filepage, source_filename=None, source_url=None, comment=None, text=None, watch=False, ignore_warnings=False, chunk_size=0, _file_key=None, _offset=0, _verify_stash=None, report_success=None, imagepage='[deprecated name of filepage]')[source]

Upload a file to the wiki.

See

https://www.mediawiki.org/wiki/API:Upload

Either source_filename or source_url, but not both, must be provided.

Parameters
  • filepage – a FilePage object from which the wiki-name of the file will be obtained.

  • source_filename – path to the file to be uploaded

  • source_url – URL of the file to be uploaded

  • comment – Edit summary; if this is not provided, then filepage.text will be used. An empty summary is not permitted. This may also serve as the initial page text (see below).

  • text – Initial page text; if this is not set, then filepage.text will be used, or comment.

  • watch – If true, add filepage to the bot user’s watchlist

  • ignore_warnings (bool or callable or iterable of str) – It may be a static boolean, a callable returning a boolean, or an iterable. The callable gets a list of UploadWarning instances; an iterable behaves like a callable that returns True only if all raised UploadWarning codes are in that list. If the result is False the upload is not continued; otherwise all warnings are disabled and the upload is reattempted. NOTE: If report_success is True or None, an UploadWarning exception is raised if the static boolean is False.

  • chunk_size (int) – The chunk size in bytes for chunked uploading (see https://www.mediawiki.org/wiki/API:Upload#Chunked_uploading). The upload is chunked only if the MediaWiki version is 1.20 or higher and the chunk size is positive but lower than the file size.

  • _file_key (str or None) – Reuses an already uploaded file using the filekey. If None (default) it will upload the file.

  • _offset (int or bool) – When _file_key is not None, this can be an integer to continue a previously cancelled chunked upload, False to treat it as a finished upload, or True to request the stash info from the server to determine the offset. By default it starts at 0.

  • _verify_stash (bool or None) – Requests the uploaded SHA1 and file size and compares them to the local file. Also verifies that _offset matches the uploaded size if _offset is an int, or that it matches the full file size if _offset is False. If None, the stash is verified whenever a file key and offset are given.

  • report_success – If the upload was successful, a success message is printed; if ignore_warnings is False, an UploadWarning is raised when a warning occurred. If None (default), it is True if ignore_warnings is a bool and False otherwise. If it is True or None, ignore_warnings must be a bool.

Returns

It returns True if the upload was successful and False otherwise.

Return type

bool
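The chunked-uploading behavior described for chunk_size can be sketched as a plain generator that splits a binary stream; this is an illustration of the splitting step only, not the upload protocol itself:

```python
import io

def iter_chunks(stream, chunk_size):
    """Yield successive chunks of at most chunk_size bytes from a
    binary stream, mirroring how a chunked upload splits the file."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # EOF reached
            break
        yield chunk
```

With a 3-byte chunk size, an 8-byte file is sent as three requests of 3, 3, and 2 bytes; the server returns a file key after the first chunk and an offset that `_offset` can resume from.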

usercontribs(user=None, userprefix=None, start=None, end=None, reverse=False, namespaces=None, minor=None, total=None, top_only=False, step=NotImplemented, showMinor='[deprecated name of minor]')[source]

Iterate contributions by a particular user.

Iterated values are in the same format as recentchanges.

See

https://www.mediawiki.org/wiki/API:Usercontribs

Parameters
  • user – Iterate contributions by this user (name or IP)

  • userprefix – Iterate contributions by all users whose names or IPs start with this substring

  • start – Iterate contributions starting at this Timestamp

  • end – Iterate contributions ending at this Timestamp

  • reverse – Iterate oldest contributions first (default: newest)

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – only iterate pages in these namespaces

  • minor – if True, iterate only minor edits; if False and not None, iterate only non-minor edits (default: iterate both)

  • total (int) – limit result to this number of pages

  • top_only – if True, iterate only edits which are the latest revision (default: False)

Raises
  • pywikibot.exceptions.Error – either user or userprefix must be non-empty

  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

property userinfo

Retrieve userinfo from site and store in _userinfo attribute.

self._userinfo will be a dict with the following keys and values:

  • id: user id (numeric str)

  • name: username (if user is logged in)

  • anon: present if user is not logged in

  • groups: list of groups (could be empty)

  • rights: list of rights (could be empty)

  • message: present if user has a new message on talk page

  • blockinfo: present if user is blocked (dict)

https://www.mediawiki.org/wiki/API:Userinfo

Parameters

force (bool) – force to retrieve userinfo ignoring cache

users(usernames)[source]

Iterate info about a list of users by name or IP.

See

https://www.mediawiki.org/wiki/API:Users

Parameters

usernames (list, or other iterable, of unicodes) – a list of user names

validate_tokens(types)[source]

Validate if requested tokens are acceptable.

Valid tokens depend on mw version.

version()[source]

Return live project version number as a string.

This overwrites the corresponding family method for APISite class. Use pywikibot.site.mw_version to compare MediaWiki versions.

wantedcategories(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages from Special:Wantedcategories.

Parameters

total – number of pages to return

wantedfiles(total=None)[source]

Yield Pages from Special:Wantedfiles.

Parameters

total – number of pages to return

wantedpages(total=None, step=NotImplemented)[source]

Yield Pages from Special:Wantedpages.

Parameters

total – number of pages to return

wantedtemplates(total=None)[source]

Yield Pages from Special:Wantedtemplates.

Parameters

total – number of pages to return

watch(pages, unwatch=False)[source]

Add or remove pages from watchlist.

See

https://www.mediawiki.org/wiki/API:Watch

Parameters
  • pages (A page object, a page-title string, or sequence of them. Also accepts a single pipe-separated string like 'title1|title2'.) – A single page or a sequence of pages.

  • unwatch – If True, remove pages from watchlist; if False add them (default).

Returns

True if API returned expected response; False otherwise

Return type

bool
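The accepted forms of the pages argument (a single title, a pipe-separated string, or a sequence) can be sketched with a hypothetical normalization helper; the name and internals here are illustrative, not pywikibot's actual implementation:

```python
def normalize_watch_titles(pages):
    """Hypothetical helper: flatten the accepted 'pages' forms
    (a single title, a pipe-separated string like 'title1|title2',
    or a sequence of pages/titles) into a list of title strings."""
    if isinstance(pages, str):
        # A single title passes through split() unchanged.
        return pages.split('|')
    return [str(page) for page in pages]
```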

watched_pages(sysop=False, force=False, total=None, step=NotImplemented)[source]

Return watchlist.

See

https://www.mediawiki.org/wiki/API:Watchlistraw

Parameters
  • sysop (bool) – Returns watchlist of sysop user if true

  • force (bool) – Reload watchlist

  • total (int) – if not None, limit the generator to yielding this many items in total

Returns

list of pages in watchlist

Return type

list of pywikibot.Page objects

watchlist_revs(start=None, end=None, reverse=False, namespaces=None, minor=None, bot=None, anon=None, total=None, showAnon='[deprecated name of anon]', showBot='[deprecated name of bot]', step=NotImplemented, showMinor='[deprecated name of minor]')[source]

Iterate revisions to pages on the bot user’s watchlist.

Iterated values will be in same format as recentchanges.

See

https://www.mediawiki.org/wiki/API:Watchlist

Parameters
  • start – Iterate revisions starting at this Timestamp

  • end – Iterate revisions ending at this Timestamp

  • reverse – Iterate oldest revisions first (default: newest)

  • namespaces (iterable of basestring or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers.) – only iterate pages in these namespaces

  • minor – if True, only list minor edits; if False (and not None), only list non-minor edits

  • bot – if True, only list bot edits; if False (and not None), only list non-bot edits

  • anon – if True, only list anon edits; if False (and not None), only list non-anon edits

Raises
  • KeyError – a namespace identifier was not resolved

  • TypeError – a namespace identifier has an inappropriate type such as NoneType or bool

watchpage(page, unwatch=False)[source]

Add or remove page from watchlist.

DEPRECATED: Use Site().watch() instead.

Parameters
  • page (a page object or a page-title string) – A single page.

  • unwatch – If True, remove page from watchlist; if False (default), add it.

Returns

True if API returned expected response; False otherwise

Return type

bool

withoutinterwiki(total=None, number='[deprecated name of total]', step=NotImplemented, repeat=NotImplemented)[source]

Yield Pages without language links from Special:Withoutinterwiki.

Parameters

total – number of pages to return

class scripts.maintenance.cache.ClosedSite(code, fam, user=None, sysop=None)[source]

Bases: pywikibot.site.APISite

Site closed to read-only mode.

__init__(code, fam, user=None, sysop=None)[source]

Initializer.

__module__ = 'pywikibot.site'
is_uploaddisabled()[source]

Return True if upload is disabled on site.

newfiles(**kwargs)[source]

Raise an error instead of making a pointless API call.

newimages(*args, **kwargs)[source]

Raise an error instead of making a pointless API call.

newpages(**kwargs)[source]

Raise an error instead of making a pointless API call.

page_can_be_edited(page)[source]

Determine if the page can be edited.

page_restrictions(page)[source]

Return a dictionary reflecting page protections.

recentchanges(**kwargs)[source]

Raise an error instead of making a pointless API call.

class scripts.maintenance.cache.DataSite(*args, **kwargs)[source]

Bases: pywikibot.site.APISite

Wikibase data capable site.

__getattr__(attr)[source]

Provide data access methods.

Methods provided are get_info, get_sitelinks, get_aliases, get_labels, get_descriptions, and get_urls.

__init__(*args, **kwargs)[source]

Initializer.

__module__ = 'pywikibot.site'
addClaim(entity, claim, bot=True, summary=None)[source]

Add a claim.

Parameters
  • entity (WikibasePage) – Entity to modify

  • claim (pywikibot.Claim) – Claim to be added

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

changeClaimTarget(claim, snaktype='value', bot=True, summary=None)[source]

Set the claim target to the value of the provided claim target.

Parameters
  • claim (pywikibot.Claim) – The source of the claim target value

  • snaktype (str ('value', 'novalue' or 'somevalue')) – An optional snaktype. Default: ‘value’

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

property concept_base_uri

Return the base uri for concepts/entities.

Returns

concept base uri

Return type

str

createNewItemFromPage(page, bot=True, **kwargs)[source]

Create a new Wikibase item for a provided page.

Parameters
  • page (pywikibot.Page) – page to fetch links from

  • bot (bool) – Whether to mark the edit as a bot edit

Returns

pywikibot.ItemPage of newly created item

Return type

pywikibot.ItemPage

data_repository()[source]

Override parent method.

This avoids pointless API queries since the data repository is this site by definition.

Returns

this Site object

Return type

pywikibot.site.DataSite

editEntity(entity, data, bot=True, identification='[deprecated name of entity]', **kwargs)[source]

Edit entity.

Note: This method cannot create entities other than ‘item’ if a dict with API parameters is passed to the ‘entity’ parameter.

Parameters
  • entity (WikibasePage or dict) – Page to edit, or dict with API parameters to use for entity identification

  • data (dict) – data updates

  • bot (bool) – Whether to mark the edit as a bot edit

Returns

New entity data

Return type

dict

editQualifier(claim, qualifier, new=False, bot=True, summary=None, baserevid=None)[source]

Create/Edit a qualifier.

Parameters
  • claim (pywikibot.Claim) – A Claim object to add the qualifier to

  • qualifier (pywikibot.Claim) – A Claim object to be used as a qualifier

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

  • baserevid (long) – Base revision id override, used to detect conflicts. When omitted, revision of claim.on_item is used. DEPRECATED.

editSource(claim, source, new=False, bot=True, summary=None, baserevid=None)[source]

Create/Edit a source.

Parameters
  • claim (pywikibot.Claim) – A Claim object to add the source to

  • source (pywikibot.Claim) – A Claim object to be used as a source

  • new (bool) – Whether to create a new one if the “source” already exists

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

  • baserevid (long) – Base revision id override, used to detect conflicts. When omitted, revision of claim.on_item is used. DEPRECATED.

geo_shape_repository()[source]

Return Site object for the geo-shapes repository, e.g. Commons.

getPropertyType(prop)[source]

Obtain the type of a property.

This is used specifically because we can cache the value for a much longer time (near infinite).

property item_namespace

Return namespace for items.

Returns

item namespace

Return type

Namespace

linkTitles(page1, page2, bot=True)[source]

Link two pages together.

Parameters
  • page1 (pywikibot.Page) – First page to link

  • page2 (pywikibot.Page) – Second page to link

  • bot (bool) – Whether to mark the edit as a bot edit

Returns

dict API output

Return type

dict

loadcontent(identification, *props)[source]

Fetch the current content of a Wikibase item.

This is called loadcontent since wbgetentities does not support fetching old revisions. Eventually this will get replaced by an actual loadrevisions.

Parameters
  • identification (dict) – Parameters used to identify the page(s)

  • props – the optional properties to fetch.

mergeItems(from_item, to_item, ignore_conflicts=None, summary=None, bot=True, ignoreconflicts='[deprecated name of ignore_conflicts]', fromItem='[deprecated name of from_item]', toItem='[deprecated name of to_item]')[source]

Merge two items together.

Parameters
  • from_item (pywikibot.ItemPage) – Item to merge from

  • to_item (pywikibot.ItemPage) – Item to merge into

  • ignore_conflicts (list of str) – Which type of conflicts (‘description’, ‘sitelink’, and ‘statement’) should be ignored

  • summary (str) – Edit summary

  • bot (bool) – Whether to mark the edit as a bot edit

Returns

dict API output

Return type

dict

preload_entities(pagelist, groupsize=50)[source]

Yield instances of WikibasePage subclasses with content prefilled.

Note that pages will be iterated in a different order than in the underlying pagelist.

Parameters
  • pagelist – an iterable that yields either WikibasePage objects, or Page objects linked to an ItemPage.

  • groupsize (int) – how many pages to query at a time
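The groupsize batching described above can be sketched as a generic grouping generator; this shows the batching pattern only, under the assumption that each batch maps to one API request:

```python
def grouped(iterable, groupsize=50):
    """Yield lists of at most groupsize items, the way preload_entities
    batches its page list into single wbgetentities requests."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == groupsize:
            yield batch
            batch = []
    if batch:  # emit the final, possibly short, batch
        yield batch
```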

property property_namespace

Return namespace for properties.

Returns

property namespace

Return type

Namespace

removeClaims(claims, bot=True, summary=None, baserevid=None)[source]

Remove claims.

Parameters
  • claims (List[pywikibot.Claim]) – Claims to be removed

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

  • baserevid (long) – Base revision id override, used to detect conflicts. When omitted, revision of claim.on_item is used. DEPRECATED.

removeSources(claim, sources, bot=True, summary=None, baserevid=None)[source]

Remove sources.

Parameters
  • claim (pywikibot.Claim) – A Claim object to remove the sources from

  • sources (list of pywikibot.Claim) – A list of Claim objects that are sources

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

  • baserevid (long) – Base revision id override, used to detect conflicts. When omitted, revision of claim.on_item is used. DEPRECATED.

remove_qualifiers(claim, qualifiers, bot=True, summary=None, baserevid=None)[source]

Remove qualifiers.

Parameters
  • claim (pywikibot.Claim) – A Claim object to remove the qualifier from

  • qualifiers (List[pywikibot.Claim]) – Claim objects currently used as qualifiers

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

  • baserevid (long) – Base revision id override, used to detect conflicts. When omitted, revision of claim.on_item is used. DEPRECATED.

save_claim(claim, summary=None, bot=True)[source]

Save the whole claim to the wikibase site.

Parameters
  • claim (pywikibot.Claim) – The claim to save

  • bot (bool) – Whether to mark the edit as a bot edit

  • summary (str) – Edit summary

search_entities(search, language, total=None, limit='[deprecated name of total]', **kwargs)[source]

Search for pages or properties that contain the given text.

Parameters
  • search (str) – Text to find.

  • language (str) – Language to search in.

  • total – Maximum number of pages to retrieve in total, or None in case of no limit.

Returns

‘search’ list from API output.

Return type

api.APIGenerator

set_redirect_target(from_item, to_item, bot=True)[source]

Make a redirect to another item.

Parameters
  • from_item (pywikibot.ItemPage) – Item to be redirected.

  • to_item (pywikibot.ItemPage) – Target item of the redirect.

  • bot (bool) – Whether to mark the edit as a bot edit

property sparql_endpoint

Return the sparql endpoint url, if any has been set.

Returns

sparql endpoint url

Return type

str|None

tabular_data_repository()[source]

Return Site object for the tabular-data repository, e.g. Commons.

class scripts.maintenance.cache.LoginStatus[source]

Bases: enum.IntEnum

Enum for Login statuses.

>>> LoginStatus.NOT_ATTEMPTED
LoginStatus(-3)
>>> LoginStatus.IN_PROGRESS.value
-2
>>> LoginStatus.NOT_LOGGED_IN.name
'NOT_LOGGED_IN'
>>> int(LoginStatus.AS_USER)
0
>>> LoginStatus(-3).name
'NOT_ATTEMPTED'
>>> LoginStatus(0).name
'AS_USER'
AS_SYSOP = 1
AS_USER = 0
IN_PROGRESS = -2
NOT_ATTEMPTED = -3
NOT_LOGGED_IN = -1
__module__ = 'pywikibot.site'
exception scripts.maintenance.cache.ParseError[source]

Bases: Exception

Error parsing.

__module__ = 'scripts.maintenance.cache'
class scripts.maintenance.cache.CacheEntry(directory, filename)[source]

Bases: pywikibot.data.api.CachedRequest

A Request cache entry.

__abstractmethods__ = frozenset({})
__init__(directory, filename)[source]

Initializer.

__module__ = 'scripts.maintenance.cache'
__repr__()[source]

Representation of object.

__str__()[source]

Return string equivalent of object.

parse_key()[source]

Parse the key loaded from the cache entry.

scripts.maintenance.cache.process_entries(cache_path, func, use_accesstime=None, output_func=None, action_func=None)[source]

Check the contents of the cache.

This program tries to use file access times to determine whether cache files are being used. However, file access times are not always usable: on many modern filesystems they have been disabled. On Unix, check the filesystem mount options; you may need to remount with ‘strictatime’.

Parameters

use_accesstime (bool tristate: - None = detect - False = don't use - True = always use) – Whether access times should be used.

scripts.maintenance.cache.main()[source]

Process command line arguments and invoke bot.

scripts.maintenance.cache.has_password(entry)[source]

Check whether the entry contains a password.

scripts.maintenance.cache.is_logout(entry)[source]

Entry is a logout entry.

scripts.maintenance.cache.empty_response(entry)[source]

Entry has no data.

scripts.maintenance.cache.not_accessed(entry)[source]

Entry has never been accessed.

scripts.maintenance.cache.incorrect_hash(entry)[source]

Entry has an incorrect hash.

scripts.maintenance.cache.older_than(entry, interval)[source]

Find older entries.

scripts.maintenance.cache.newer_than(entry, interval)[source]

Find newer entries.

scripts.maintenance.cache.older_than_one_day(entry)[source]

Find entries more than one day old.

scripts.maintenance.cache.recent(entry)[source]

Find entries newer than one hour.
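The time-based filters above can be sketched as follows. This sketch takes the cache timestamp directly as a `datetime`, whereas the real helpers receive a `CacheEntry` and read the timestamp from it (that access path is omitted here):

```python
from datetime import datetime, timedelta


def older_than(cached_at, interval):
    """Entry is older than `interval` (a timedelta) relative to now."""
    return cached_at < datetime.now() - interval


def newer_than(cached_at, interval):
    """Entry is newer than `interval` relative to now."""
    return cached_at >= datetime.now() - interval


def older_than_one_day(cached_at):
    # Convenience filter built on the generic helper.
    return older_than(cached_at, timedelta(days=1))


def recent(cached_at):
    # "Recent" means newer than one hour.
    return newer_than(cached_at, timedelta(hours=1))
```

The convenience filters are what `-c older_than_one_day` or `-c recent` evaluate per entry; the generic helpers take an explicit interval for ad-hoc filters.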

scripts.maintenance.cache.uniquedesc(entry)[source]

Return the unique description string.

scripts.maintenance.cache.parameters(entry)[source]

Return a pretty formatted parameters list.

scripts.maintenance.colors script

Utility to show pywikibot colors.

scripts.maintenance.colors.main()[source]

Main function.

scripts.maintenance.compat2core script

A helper script to convert compat 1.0 scripts to the new core 3.0 framework.

NOTE: Please be aware that this script is not able to convert your code completely. It supports you with some automatic replacements and gives warnings and hints for converting. Please refer to the conversion guide README-conversion.txt in the core framework folder and check your code afterwards.

The script asks for the .py file and converts it to <scriptname>-core.py in the same directory. The following option is supported:

-warnonly  Do not convert the source but show warning messages. This is good
           to check already merged scripts.

usage

to convert a script and show warnings about deprecated methods:

python pwb.py compat2core <scriptname>

to show warnings about deprecated methods:

python pwb.py compat2core <scriptname> -warnonly
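The automatic replacements mentioned above are essentially a table of regex substitutions applied to the source text. The following is an illustrative sketch; the two patterns shown (renaming the compat `wikipedia` module to `pywikibot`) are assumptions chosen for the example, and the script's real table is far more extensive:

```python
import re

# Illustrative subset of compat-to-core replacements (assumed patterns).
REPLACEMENTS = (
    (r'^import wikipedia\b', 'import pywikibot'),
    (r'\bwikipedia\.', 'pywikibot.'),
)


def convert(text):
    """Apply each regex replacement to the whole source text."""
    for old, new in REPLACEMENTS:
        text = re.sub(old, new, text, flags=re.MULTILINE)
    return text
```

A warn-only mode (the `-warnonly` option) would run the same pattern matching but report hits instead of rewriting the file.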
class scripts.maintenance.compat2core.ConvertBot(filename=None, warnonly=False)[source]

Bases: object

Script conversion bot.

__init__(filename=None, warnonly=False)[source]

Initializer.

__module__ = 'scripts.maintenance.compat2core'
convert()[source]

Convert script.

get_dest()[source]

Ask for destination script name.

get_source()[source]

Get source script.

run()[source]

Run the bot.

warning()[source]

Show warnings and hints.

scripts.maintenance.compat2core.main()[source]

Process command line arguments and invoke bot.

scripts.maintenance.download_dump script

This bot downloads dump from dumps.wikimedia.org.

This script supports the following command line parameters:

-filename:#     The name of the file (e.g. abstract.xml)

-storepath:#    The stored file's path.

-dumpdate:#     The date of the dump (defaults to `latest`),
                formatted as YYYYMMDD.
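The download URL follows the directory layout of dumps.wikimedia.org, which can be sketched as a simple string composition; the helper name and parameter names are assumptions for illustration:

```python
def get_dump_url(wikiname, dumpdate, filename):
    """Compose a dumps.wikimedia.org URL.

    Layout (assumed for this sketch):
    https://dumps.wikimedia.org/<wiki>/<date>/<file>
    """
    return 'https://dumps.wikimedia.org/{}/{}/{}'.format(
        wikiname, dumpdate, filename)
```

For example, `get_dump_url('enwiki', 'latest', 'enwiki-latest-abstract.xml')` yields the URL of the latest English Wikipedia abstract dump.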
class scripts.maintenance.download_dump.DownloadDumpBot(site=None, **kwargs)[source]

Bases: pywikibot.bot.Bot

Download dump bot.

__module__ = 'scripts.maintenance.download_dump'
availableOptions = {'dumpdate': 'latest', 'filename': '', 'storepath': './', 'wikiname': ''}
get_dump_name(db_name, typ)[source]

Check if the dump file already exists locally on a Toolforge server.

run()[source]

Run bot.

scripts.maintenance.download_dump.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters

args (str) – command line arguments

scripts.maintenance.make_i18n_dict script

Generate a i18n file from a given script.

Run IDLE at the topmost level:

>>> import pwb
>>> from scripts.maintenance.make_i18n_dict import i18nBot
>>> bot = i18nBot('<scriptname>', '<msg dict>')
>>> bot.run()

If you have more than one message dictionary, give all their names to the bot:

>>> bot = i18nBot('<scriptname>', '<msg dict1>', '<msg dict2>', '<msg dict3>')

If you want to rename the message index, use keyword arguments. These may be mixed with preceding positional arguments:

>>> bot = i18nBot('<scriptname>', '<msg dict1>', the_other_msg='<msg dict2>')

If you have the messages as instance constants, you may call the bot as follows:

>>> bot = i18nBot('<scriptname>.<class instance>', '<msg dict1>', '<msg dict2>')

It is also possible to create JSON files by using the to_json method after instantiating the bot. It also calls bot.run() to create the dictionaries:

>>> bot.to_json()
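The JSON output step can be sketched as a transformation from per-message translation dicts to one JSON document per language; the function name, input shape, and the `'<script>-<key>'` message-id convention shown here are assumptions for illustration:

```python
import json


def to_json_files(script, messages):
    """Group translations by language and serialize each as JSON.

    `messages` maps message keys to per-language texts, e.g.
    {'msg': {'en': 'Bot: ...', 'de': 'Bot: ...'}} (assumed shape).
    Message ids are keyed '<script>-<key>' in each language file.
    """
    by_lang = {}
    for key, translations in messages.items():
        for lang, text in translations.items():
            by_lang.setdefault(lang, {})['{}-{}'.format(script, key)] = text
    # One JSON document per language code.
    return {lang: json.dumps(data, ensure_ascii=False, sort_keys=True)
            for lang, data in by_lang.items()}
```

Each resulting document would typically be written to `<lang>.json` in the script's i18n directory.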

class scripts.maintenance.make_i18n_dict.i18nBot(script, *args, **kwargs)[source]

Bases: object

I18n bot.

__init__(script, *args, **kwargs)[source]

Initializer.

__module__ = 'scripts.maintenance.make_i18n_dict'
print_all()[source]

Pretty print the dict as a file content to screen.

read(oldmsg, newmsg=None)[source]

Read a single message from source script.

run(quiet=False)[source]

Run the bot, read the messages from source and print the dict.

Parameters

quiet (bool) – print the result if False

to_json(quiet=True)[source]

Run the bot and create json files.

Parameters

quiet (bool) – Print the result if False

scripts.maintenance.wikimedia_sites script

Script that updates the language lists in Wikimedia family files.

Usage:

python pwb.py wikimedia_sites [ {<family>} ]
scripts.maintenance.wikimedia_sites.update_family(families)[source]

Update family files.