pywikibot.data package¶
Module providing several layers of data access to the wiki.
Submodules¶
pywikibot.data.api module¶
Interface to MediaWiki’s api.php.
- class pywikibot.data.api.APIGenerator(action, continue_name='continue', limit_name='limit', data_name='data', **kwargs)[source]¶
Bases:
pywikibot.data.api._generators._RequestWrapper
Iterator that handles API responses containing lists.
The iterator will iterate each item in the query response and use the continue request parameter to retrieve the next portion of items automatically. If the limit attribute is set, the iterator will stop after iterating that many values.
Initialize an APIGenerator object.
kwargs are used to create a Request object; see that object’s documentation for values.
- Parameters
action (str) – API action name.
continue_name (str) – Name of the continue API parameter.
limit_name (str) – Name of the limit API parameter.
data_name (str) – Name of the data in API response.
- Return type
None
- set_maximum_items(value)[source]¶
Set the maximum number of items to be retrieved from the wiki.
If not called, most queries will continue as long as there is more data to be retrieved from the API.
- Parameters
value (Optional[Union[int, str]]) – The value of maximum number of items to be retrieved in total to set. Ignores None value.
- Return type
None
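The continuation behaviour described for APIGenerator can be pictured with a small stand-alone sketch. It does not use Pywikibot: `fake_api` and `iterate_with_continue` are hypothetical names, and the response layout (a `data` list plus an optional `continue` value) is an assumption chosen to mirror the default `data_name` and `continue_name` arguments.

```python
from typing import Any, Dict, Iterator, Optional

ITEMS = list(range(10))  # pretend server-side data


def fake_api(params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for one API round trip."""
    start = int(params.get('continue', 0))
    size = int(params.get('limit', 3))
    response: Dict[str, Any] = {'data': ITEMS[start:start + size]}
    if start + size < len(ITEMS):
        # More data is available: tell the client where to resume.
        response['continue'] = start + size
    return response


def iterate_with_continue(maximum: Optional[int] = None) -> Iterator[int]:
    """Yield items batch by batch; stop after ``maximum`` items if given.

    ``maximum`` is expected to be a positive int or None in this sketch.
    """
    params: Dict[str, Any] = {'limit': 3}
    count = 0
    while True:
        response = fake_api(params)
        for item in response['data']:
            yield item
            count += 1
            if maximum is not None and count >= maximum:
                return
        if 'continue' not in response:
            return
        # Feed the continue value back into the next request.
        params['continue'] = response['continue']
```

Without a maximum the loop runs until the response no longer carries a continue value; with one, it stops early and makes no further request.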
- class pywikibot.data.api.CachedRequest(expiry, *args, **kwargs)[source]¶
Bases:
pywikibot.data.api._requests.Request
Cached request.
Initialize a CachedRequest object.
- Parameters
expiry – either a number of days or a datetime.timedelta object
- Return type
None
- class pywikibot.data.api.ListGenerator(listaction, **kwargs)[source]¶
Bases:
pywikibot.data.api._generators.QueryGenerator
Iterator for queries of type action=query&list=foo.
See the API documentation for types of lists that can be queried. Lists include both site-wide information (such as ‘allpages’) and page-specific information (such as ‘backlinks’).
This iterator yields a dict object for each member of the list returned by the API, with the format of the dict depending on the particular list command used. For those lists that contain page information, it may be easier to use the PageGenerator class instead, as that will convert the returned information into a Page object.
Required and optional parameters are as for Request, except that action=query is assumed and listaction is required.
- Parameters
listaction (str) – the “list=” type from api.php
- Return type
None
- class pywikibot.data.api.LogEntryListGenerator(logtype=None, **kwargs)[source]¶
Bases:
pywikibot.data.api._generators.ListGenerator
Iterator for queries of list ‘logevents’.
Yields LogEntry objects instead of dicts.
- Return type
None
- class pywikibot.data.api.LoginManager(password=None, site=None, user=None)[source]¶
Bases:
pywikibot.login.LoginManager
Supply login_to_site method to use API interface.
All parameters default to defaults in user-config.
- Parameters
site (Optional[pywikibot.site.BaseSite]) – Site object to log into
user (Optional[str]) – username to use. If user is None, the username is loaded from config.usernames.
password (Optional[str]) – password to use
- Raises
pywikibot.exceptions.NoUsernameError – No username is configured for the requested site.
- Return type
None
- get_login_token()[source]¶
Fetch login token for MediaWiki 1.27+.
- Returns
login token
- Return type
Optional[str]
- login_to_site()[source]¶
Login to the site.
Note, this doesn’t do anything with cookies. The http module takes care of all the cookie stuff. Throws exception on failure.
- Return type
None
- mapping = {'fail': ('Failed', 'FAIL'), 'ldap': ('lgdomain', 'domain'), 'password': ('lgpassword', 'password'), 'reason': ('reason', 'message'), 'result': ('result', 'status'), 'success': ('Success', 'PASS'), 'token': ('lgtoken', 'logintoken'), 'user': ('lgname', 'username')}¶
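Each entry of `mapping` pairs two spellings of the same concept. The sketch below assumes that the first element is the name used by `action=login` and the second the name used by `action=clientlogin`; the helper `keyword` is a hypothetical name for illustration, not part of the documented interface.

```python
MAPPING = {
    'fail': ('Failed', 'FAIL'),
    'ldap': ('lgdomain', 'domain'),
    'password': ('lgpassword', 'password'),
    'reason': ('reason', 'message'),
    'result': ('result', 'status'),
    'success': ('Success', 'PASS'),
    'token': ('lgtoken', 'logintoken'),
    'user': ('lgname', 'username'),
}


def keyword(key: str, action: str) -> str:
    """Return the spelling of ``key`` for the given login action."""
    # Assumption: index 0 belongs to action=login, index 1 to clientlogin.
    index = 0 if action == 'login' else 1
    return MAPPING[key][index]
```

For example, `keyword('user', 'login')` gives the `lgname` parameter name, while the same key under `clientlogin` gives `username`.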
- class pywikibot.data.api.OptionSet(site=None, module=None, param=None, dict=None)[source]¶
Bases:
collections.abc.MutableMapping
A class to store a set of options which can be either enabled or not.
If it is instantiated with the associated site, module and parameter it will only allow valid names as options. If instantiated ‘lazy loaded’ it won’t check whether the names are valid until the site has been set (which isn’t required, but recommended). The site can only be set once if it’s not None; after it has been set, any further attempt to set a site (even None) will fail.
If a site is given, the module and param must be given too.
- Parameters
site (pywikibot.site.APISite or None) – The associated site
module (Optional[str]) – The module name which is used by paraminfo. (Ignored when site is None)
param (Optional[str]) – The parameter name inside the module. That parameter must have a ‘type’ entry. (Ignored when site is None)
dict (Optional[dict]) – The initializing dict which is used for from_dict
- Return type
None
- from_dict(dictionary)[source]¶
Load options from the dict.
Existing options are not cleared beforehand. If changes have been made previously and only the dict values should apply, the options need to be cleared first.
- Parameters
dictionary (dict (keys are strings, values are bool/None)) – a dictionary containing for each entry either the value False, True or None. The names must be valid depending on whether they enable or disable the option. Names with the value None can be in either of the lists.
- class pywikibot.data.api.PageGenerator(generator, g_content=False, **kwargs)[source]¶
Bases:
pywikibot.data.api._generators.QueryGenerator
Iterator for response to a request of type action=query&generator=foo.
This class can be used for any of the query types that are listed in the API documentation as being able to be used as a generator. Instances of this class iterate Page objects.
Required and optional parameters are as for Request, except that action=query is assumed and generator is required.
- Parameters
generator (str) – the “generator=” type from api.php
g_content (bool) – if True, retrieve the contents of the current version of each Page (default False)
- Return type
None
- class pywikibot.data.api.ParamInfo(site, preloaded_modules=None, modules_only_mode=None)[source]¶
Bases:
collections.abc.Sized, collections.abc.Container
API parameter information data object.
Provides cache aware fetching of parameter information.
It does not support the format modules.
- Parameters
preloaded_modules (set of string) – API modules to preload
modules_only_mode (bool or None to only use default, which is True if the site is 1.25wmf4+) – use the ‘modules’ only syntax for API request
- Return type
None
- property action_modules¶
Set of all action modules.
- attributes(attribute, modules=None)[source]¶
Mapping of modules with an attribute to the attribute value.
It will include all modules which have that attribute set, even if that attribute is empty or set to False.
- Parameters
attribute (str) – attribute name
modules (Optional[set]) – modules to include. If None (default), it’ll load all modules including all submodules using the paths.
- Return type
dict using modules as keys
- fetch(modules)[source]¶
Fetch paraminfo for multiple modules.
No exception is raised when paraminfo for a module does not exist. Use __getitem__ to cause an exception if a module does not exist.
- Parameters
modules (iterable or str) – API modules to load
- Return type
None
- init_modules = frozenset({'main', 'paraminfo'})¶
- property module_paths¶
Set of all modules using their paths.
- normalize_modules(modules)[source]¶
Convert the modules into module paths.
Add query+ to any query module name not also in action modules.
- Returns
The modules converted into module paths
- Return type
set
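The rule "add query+ to any query module name not also in action modules" can be sketched as follows. The two module sets are small illustrative subsets rather than what a real site reports, `normalize_modules_sketch` is a hypothetical name, and accepting a `'|'`-separated string is an assumption.

```python
from typing import Iterable, Set, Union

# Illustrative subsets only; a real site reports many more modules.
ACTION_MODULES = {'query', 'edit', 'login', 'paraminfo'}
QUERY_MODULES = {'allpages', 'backlinks', 'info', 'revisions'}


def normalize_modules_sketch(modules: Union[str, Iterable[str]]) -> Set[str]:
    """Convert bare module names into module paths."""
    if isinstance(modules, str):
        # Assumption: a string holds '|'-separated module names.
        modules = modules.split('|')
    paths = set()
    for name in modules:
        if name in QUERY_MODULES and name not in ACTION_MODULES:
            paths.add('query+' + name)  # query submodule: prefix it
        else:
            paths.add(name)  # action module or already a path
    return paths
```

Action modules keep their bare name, so `edit` stays `edit` while `info` becomes `query+info`.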
- classmethod normalize_paraminfo(data)[source]¶
Convert both old and new API JSON into a new-ish data structure.
For duplicate paths, the value will be False.
- parameter(module, param_name)[source]¶
Get details about one module’s parameter.
Returns None if the parameter does not exist.
- Parameters
module (str) – API module name
param_name (str) – parameter name in the module
- Returns
metadata that describes how the parameter may be used
- Return type
Optional[Dict[str, Any]]
- paraminfo_keys = frozenset({'formatmodules', 'mainmodule', 'modules', 'pagesetmodule', 'querymodules'})¶
- property prefix_map¶
Mapping of module to its prefix for all modules with a prefix.
This loads paraminfo for all modules.
- property query_modules¶
Set of all query module names without query+ path prefix.
- root_modules = frozenset({'main', 'pageset'})¶
- class pywikibot.data.api.PropertyGenerator(prop, **kwargs)[source]¶
Bases:
pywikibot.data.api._generators.QueryGenerator
Iterator for queries of type action=query&prop=foo.
See the API documentation for types of page properties that can be queried.
This iterator yields one or more dict object(s) corresponding to each “page” item(s) from the API response; the calling module has to decide what to do with the contents of the dict. There will be one dict for each page queried via a titles= or ids= parameter (which must be supplied when instantiating this class).
Required and optional parameters are as for Request, except that action=query is assumed and prop is required.
- Parameters
prop (str) – the “prop=” type from api.php
- Return type
None
- property props¶
The requested property names.
- class pywikibot.data.api.QueryGenerator(**kwargs)[source]¶
Bases:
pywikibot.data.api._generators._RequestWrapper
Base class for iterators that handle responses to API action=query.
By default, the iterator will iterate each item in the query response, and use the (query-)continue element, if present, to continue iterating as long as the wiki returns additional values. However, if the iterator’s limit attribute is set to a positive int, the iterator will stop after iterating that many values. If limit is negative, the limit parameter will not be passed to the API at all.
Most common query types are more efficiently handled by subclasses, but this class can be used directly for custom queries and miscellaneous types (such as “meta=…”) that don’t return the usual list of pages or links. See the API documentation for specific query options.
Initialize a QueryGenerator object.
kwargs are used to create a Request object; see that object’s documentation for values. ‘action’=’query’ is assumed.
- Return type
None
- set_maximum_items(value)[source]¶
Set the maximum number of items to be retrieved from the wiki.
If not called, most queries will continue as long as there is more data to be retrieved from the API.
If set to -1 (or any negative value), the “limit” parameter will be omitted from the request. For some request types (such as prop=revisions), this is necessary to signal that only current revision is to be returned.
- Parameters
value (Optional[Union[int, str]]) – The value of maximum number of items to be retrieved in total to set. Ignores None value.
- Return type
None
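As a rough illustration of the negative-limit rule, the sketch below builds the parameters for a single request. `build_query_params` is a hypothetical helper, and the handling of None (asking for "max") is a simplifying assumption that ignores the per-request increment.

```python
from typing import Dict, Optional


def build_query_params(base: Dict[str, str], limit_name: str,
                       limit: Optional[int]) -> Dict[str, str]:
    """Return request parameters with the limit applied or omitted."""
    params = dict(base)  # do not modify the caller's dict
    if limit is None:
        params[limit_name] = 'max'  # assumption: let the API decide
    elif limit >= 0:
        params[limit_name] = str(limit)
    # Negative limit: leave the limit parameter out of the request.
    return params
```

With `prop=revisions`, where the limit parameter is `rvlimit`, a value of -1 leaves `rvlimit` out of the request entirely.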
- set_namespace(namespaces)[source]¶
Set a namespace filter on this query.
- Parameters
namespaces (iterable of str or Namespace key, or a single instance of those types. May be a '|' separated list of namespace identifiers. An empty iterator clears any namespace restriction.) – namespace identifiers to limit query results
- Raises
KeyError – a namespace identifier was not resolved
- set_query_increment(value)[source]¶
Set the maximum number of items to be retrieved per API query.
If not called, the default is to ask for “max” items and let the API decide how many to send.
- Return type
None
- support_namespace()[source]¶
Check if namespace is a supported parameter on this query.
Note
this function will be removed once set_namespace() throws TypeError() instead of just giving a warning. See T196619.
- Returns
True if yes, False otherwise
- Return type
bool
- class pywikibot.data.api.Request(site=None, mime=None, throttle=True, max_retries=None, retry_wait=None, use_get=None, parameters=<object object>, **kwargs)[source]¶
Bases:
collections.abc.MutableMapping
A request to a Site’s api.php interface.
Attributes of this object (except for the special parameters listed below) get passed as commands to api.php, and can be read or set using the dict interface. All attributes must be strings. Use an empty string for parameters that don’t require a value. For example,
Request(action="query", titles="Foo bar", prop="info", redirects="")
corresponds to the API request
api.php?action=query&titles=Foo%20bar&prop=info&redirects
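The correspondence between keyword arguments and the request URL can be reproduced with the standard library alone. This is only a sketch of the mapping described above, not how Pywikibot serializes requests; `to_query_string` is a hypothetical name, and leaving `|` unescaped is a readability choice.

```python
from typing import Dict, List, Union
from urllib.parse import quote


def to_query_string(params: Dict[str, Union[str, List[str]]]) -> str:
    """Render API parameters as an api.php query string."""
    parts = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            value = '|'.join(value)  # multiple values are pipe-separated
        if value == '':
            parts.append(quote(name))  # valueless flag such as redirects
        else:
            parts.append(quote(name) + '=' + quote(value, safe='|'))
    return 'api.php?' + '&'.join(parts)
```

Calling it with `action="query", titles="Foo bar", prop="info", redirects=""` yields the URL from the example, with the space encoded as `%20` and the empty-valued `redirects` sent as a bare flag.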
This is the lowest-level interface to the API, and can be used for any request that a particular site’s API supports. See the API documentation (https://www.mediawiki.org/wiki/API) and site-specific settings for details on what parameters are accepted for each request type.
Uploading files is a special case: to upload, the parameter mime must contain a dict, and the parameter file must be set equal to a valid filename on the local computer, not to the content of the file.
Returns a dict containing the JSON data returned by the wiki. Normally, one of the dict keys will be equal to the value of the ‘action’ parameter. Errors are caught and raise an APIError exception.
Example:
>>> r = Request(parameters={'action': 'query', 'meta': 'userinfo'})
>>> # This is equivalent to
>>> # https://{path}/api.php?action=query&meta=userinfo&format=json
>>> # change a parameter
>>> r['meta'] = "userinfo|siteinfo"
>>> # add a new parameter
>>> r['siprop'] = "namespaces"
>>> # note that "uiprop" param gets added automatically
>>> r.action
'query'
>>> sorted(r._params)
['action', 'meta', 'siprop']
>>> r._params['action']
['query']
>>> r._params['meta']
['userinfo', 'siteinfo']
>>> r._params['siprop']
['namespaces']
>>> data = r.submit()
>>> isinstance(data, dict)
True
>>> set(['query', 'batchcomplete', 'warnings']).issuperset(data.keys())
True
>>> 'query' in data
True
>>> sorted(data['query'])
['namespaces', 'userinfo']
Create a new Request instance with the given parameters.
The parameters for the request can be defined via either the ‘parameters’ parameter or the keyword arguments. The keyword arguments were the previous implementation but could cause problems when there are arguments to the API named the same as normal arguments to this class. So the second parameter ‘parameters’ was added which just contains all parameters. When a Request instance is created it must use either one of them and not both at the same time. To have backwards compatibility it adds a parameter named ‘parameters’ to kwargs when both parameters are set as that indicates an old call and ‘parameters’ was originally supplied as a keyword parameter.
If undefined keyword arguments were given AND the ‘parameters’ parameter was supplied as a positional parameter it still assumes ‘parameters’ were part of the keyword arguments.
If a class is using Request and is directly forwarding the parameters, Request.clean_kwargs can be used to automatically convert the old kwargs mode into the new parameter mode. This normalizes the arguments so that when the API parameters are modified the changes can always be applied to the ‘parameters’ parameter.
- Parameters
site – The Site to which the request will be submitted. If not supplied, uses the user’s configured default Site.
mime (Optional[dict]) – If not None, send in “multipart/form-data” format (default None). Parameters which should only be transferred via mime mode are defined via this parameter (even an empty dict means mime shall be used).
max_retries (Optional[int]) – Maximum number of times to retry after errors, defaults to config.max_retries.
retry_wait (Optional[int]) – Minimum time in seconds to wait after an error, defaults to config.retry_wait seconds (doubles each retry until config.retry_max seconds is reached).
use_get (Optional[bool]) – Use HTTP GET request if possible. If False it uses a POST request. If None, it’ll try to determine via action=paraminfo if the action requires a POST.
parameters (dict) – The parameters used for the request to the API.
kwargs – The parameters used for the request to the API.
throttle (bool) –
- Return type
None
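The doubling rule for retry_wait can be made concrete with a short calculation. The numbers passed in the usage note are illustrative and are not claimed to be the configured defaults; `retry_schedule` is a hypothetical helper.

```python
from typing import List


def retry_schedule(retry_wait: float, retry_max: float,
                   max_retries: int) -> List[float]:
    """Return the wait before each retry, doubling up to ``retry_max``."""
    waits = []
    wait = retry_wait
    for _ in range(max_retries):
        waits.append(wait)
        wait = min(wait * 2, retry_max)  # double, but never exceed the cap
    return waits
```

With a starting wait of 5 seconds, a cap of 120 seconds and 6 retries, the waits are 5, 10, 20, 40, 80 and 120 seconds.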
- classmethod clean_kwargs(kwargs)[source]¶
Convert keyword arguments into new parameters mode.
If kwargs contains no arguments other than those used by the class’s initializer, it is returned unchanged (as a copy). Otherwise the arguments which aren’t in the initializer are removed and put in a dict which is added as a ‘parameters’ keyword. A shallow copy is always created.
- Parameters
kwargs (dict) – The original keyword arguments, which are not modified.
- Returns
The normalized keyword arguments.
- Return type
dict
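The normalization performed by clean_kwargs can be approximated in a few lines. This is a sketch based only on the description above, not the library's code: `clean_kwargs_sketch` is a hypothetical name and `INIT_ARGS` is taken from the initializer signature of Request shown earlier.

```python
from typing import Any, Dict

# Argument names from the Request initializer signature.
INIT_ARGS = {'site', 'mime', 'throttle', 'max_retries', 'retry_wait',
             'use_get', 'parameters'}


def clean_kwargs_sketch(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Move API parameters out of kwargs into a 'parameters' dict."""
    cleaned = dict(kwargs)  # shallow copy; the input is left untouched
    api_params = {name: cleaned.pop(name)
                  for name in list(cleaned) if name not in INIT_ARGS}
    if not api_params:
        return cleaned  # already in the new mode
    if 'parameters' in cleaned:
        # Old-style call: 'parameters' was itself an API parameter.
        api_params['parameters'] = cleaned.pop('parameters')
    cleaned['parameters'] = api_params
    return cleaned
```

An old-style call such as `{'site': s, 'action': 'query'}` therefore becomes `{'site': s, 'parameters': {'action': 'query'}}`, while a dict already in the new mode is only copied.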
- classmethod create_simple(req_site, **kwargs)[source]¶
Create a new instance using all args except site for the API.
- pywikibot.data.api.encode_url(query)[source]¶
Encode parameters to pass with a url.
Reorder parameters so that token parameters go last, then call urlencode. Return an HTTP URL query fragment which complies with API:Edit#Parameters (see the ‘token’ bullet).
- Parameters
query (mapping object or a sequence of two-element tuples) – keys and values to be encoded for passing with a url
- Returns
encoded parameters with token parameters at the end
- Return type
str
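A minimal sketch of the reordering, assuming that a "token parameter" is any parameter whose name ends in "token" (case-insensitive). `encode_url_sketch` is a hypothetical name; only `urllib.parse.urlencode` from the standard library is used.

```python
from typing import Mapping, Sequence, Tuple, Union
from urllib.parse import urlencode

Query = Union[Mapping[str, str], Sequence[Tuple[str, str]]]


def encode_url_sketch(query: Query) -> str:
    """Encode parameters with token parameters moved to the end."""
    pairs = list(query.items()) if hasattr(query, 'items') else list(query)
    # sorted() is stable: False sorts before True, so non-token
    # parameters keep their relative order and tokens move last.
    pairs = sorted(pairs, key=lambda pair: pair[0].lower().endswith('token'))
    return urlencode(pairs)
```

Because the sort is stable, only the token parameters change position; everything else is encoded in the order given.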
- pywikibot.data.api.update_page(page, pagedict, props=None)[source]¶
Update attributes of Page object page, based on query data in pagedict.
- Parameters
page (pywikibot.page.Page) – object to be updated
pagedict (dict) – the contents of a “page” element of a query response
props (iterable of string) – the property names which resulted in pagedict. If a missing value in pagedict can indicate both ‘false’ and ‘not present’ the property which would make the value present must be in the props parameter.
- Raises
pywikibot.exceptions.InvalidTitleError – Page title is invalid
pywikibot.exceptions.UnsupportedPageError – Page with namespace < 0 is not supported yet
pywikibot.data.memento module¶
Fix ups for memento-client package version 0.6.1.
New in version 7.4.
- class pywikibot.data.memento.MementoClient(*args, **kwargs)[source]¶
Bases:
memento_client.memento_client.MementoClient
A Memento Client.
It makes it as straightforward to access the Web of the past as it is to access the current Web.
Changed in version 7.4: timeout is used in several methods.
Basic usage:
>>> mc = MementoClient()
>>> dt = mc.convert_to_datetime("Sun, 01 Apr 2010 12:00:00 GMT")
>>> mc = mc.get_memento_info("http://www.bbc.com/", dt)
>>> print(mc['original_uri'])
http://www.bbc.com/
>>> print(mc['timegate_uri'])
http://timetravel.mementoweb.org/timegate/http://www.bbc.com/
>>> print(sorted(mc['mementos']))
['closest', 'first', 'last', 'next', 'prev']
>>> del mc['mementos']['last']
>>> from pprint import pprint
>>> pprint(mc['mementos'])
{'closest': {'datetime': datetime.datetime(2010, 2, 28, ...),
             'http_status_code': 200,
             'uri': ['https://swap.stanford.edu/.../']},
 'first': {'datetime': datetime.datetime(1998, 12, 2, 21, 26, 10),
           'uri': ['http://wayback.nli.org.il:8080/19981202212610/http://bbc.com/']},
 'next': {'datetime': datetime.datetime(2010, 5, 23, 13, 47, 38),
          'uri': ['https://web.archive.org/web/20100523134738/http://www.bbc.com/']},
 'prev': {'datetime': datetime.datetime(1998, 12, 2, 21, 26, 10),
          'uri': ['http://wayback.nli.org.il:8080/19981202212610/http://bbc.com/']}}
The output conforms to the Memento API format explained here: http://timetravel.mementoweb.org/guide/api/#memento-json
By default, MementoClient uses the Memento Aggregator: http://mementoweb.org/depot/
It is also possible to use a different TimeGate: simply initialize with a preferred timegate base uri. Toggle check_native_timegate to see if the original uri has its own timegate. The native timegate, if found, will be used instead of the preferred timegate_uri. If no native timegate is found, the preferred timegate_uri will be used.
- Parameters
timegate_uri (str) – A valid HTTP base uri for a timegate. Must start with http(s):// and end with a /.
max_redirects (int) – the maximum number of redirects allowed for all HTTP requests to be made.
- Returns
A MementoClient object.
- static convert_to_http_datetime(dt)[source]¶
Converts a datetime object to a date string in HTTP format.
- Parameters
dt (Optional[datetime.datetime]) – A datetime object.
- Returns
The date in HTTP format.
- Raises
TypeError – Expecting dt parameter to be of type datetime.
- Return type
str
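For reference, an HTTP-format date string can be produced with the standard library alone. The sketch below is not the method's implementation: `to_http_date` is a hypothetical name, and treating naive datetimes as UTC is an assumption.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_http_date(dt: datetime) -> str:
    """Format a datetime as an HTTP date string in GMT."""
    if not isinstance(dt, datetime):
        raise TypeError('Expecting dt parameter to be of type datetime.')
    if dt.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # usegmt=True requires an aware datetime with a zero UTC offset.
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)
```

`format_datetime` uses fixed English day and month names, so the result does not depend on the process locale.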
- get_memento_info(request_uri, accept_datetime=None, timeout=None, **kwargs)[source]¶
Query the preferred timegate and return the closest memento uri.
Given an original uri and an accept datetime, this method queries the preferred timegate and returns the closest memento uri, along with prev/next/first/last if available.
See also
http://timetravel.mementoweb.org/guide/api/#memento-json for the response format.
- Parameters
request_uri (str) – The input http uri.
accept_datetime (Optional[datetime.datetime]) – The datetime object of the accept datetime. The current datetime is used if none is provided.
timeout (Optional[int]) – the timeout value for the HTTP connection.
- Returns
A map of uri and datetime for the closest/prev/next/first/last mementos.
- Return type
dict
- get_native_timegate_uri(original_uri, accept_datetime, timeout=None, **kwargs)[source]¶
Check whether the original uri provides a timegate uri.
Given an original URL and an accept datetime, check the original uri to see if the timegate uri is provided in the Link header.
- Parameters
original_uri (str) – An HTTP uri of the original resource.
accept_datetime (Optional[datetime.datetime]) – The datetime object of the accept datetime
timeout (Optional[int]) – the timeout value for the HTTP connection.
- Returns
The timegate uri of the original resource, if provided, else None.
- Return type
Optional[str]
- static is_memento(uri, response=None, session=None, timeout=None)[source]¶
Determines if the URI given is indeed a Memento.
The simple case is to look for a Memento-Datetime header in the request, but not all archives are Memento-compliant yet.
- Parameters
uri (str) – an HTTP URI for testing
response (Optional[requests.models.Response]) – the response object of the uri.
session (Optional[requests.sessions.Session]) – the requests session object.
timeout (Optional[int]) – the timeout value for the HTTP connection.
- Returns
True if a Memento, False otherwise
- Return type
bool
- static is_timegate(uri, accept_datetime=None, response=None, session=None, timeout=None)[source]¶
Checks if the given uri is a valid timegate according to the RFC.
- Parameters
uri (str) – the http uri to check.
accept_datetime (Optional[str]) – the accept datetime string in http date format.
response (Optional[requests.models.Response]) – the response object of the uri.
session (Optional[requests.sessions.Session]) – the requests session object.
timeout (Optional[int]) – the timeout value for the HTTP connection.
- Returns
True if a valid timegate, else False.
- Return type
bool
- static request_head(uri, accept_datetime=None, follow_redirects=False, session=None, timeout=None)[source]¶
Makes HEAD requests.
- Parameters
uri (str) – the uri for the request.
accept_datetime (Optional[str]) – the accept-datetime in the http format.
follow_redirects (bool) – Toggle to follow redirects. False by default, so does not follow any redirects.
session (Optional[requests.sessions.Session]) – the request session object to avoid opening new connections for every request.
timeout (Optional[int]) – the timeout for the HTTP requests.
- Returns
the response object.
- Raises
ValueError – Only HTTP URIs are supported
- Return type
requests.models.Response
pywikibot.data.mysql module¶
Miscellaneous helper functions for mysql queries.
- pywikibot.data.mysql.mysql_query(query, params=None, dbname=None, verbose=None)[source]¶
Yield rows from a MySQL query.
An example query that yields all ns0 pages might look like:
SELECT page_namespace, page_title FROM page WHERE page_namespace = 0;
Supported MediaWiki projects use Unicode (UTF-8) character encoding. Cursor charset is utf8.
- Parameters
query (str) – MySQL query to execute
params (tuple, list or dict of str) – input parameters for the query, if needed. If a list or tuple, %s shall be used as placeholder in the query string. If a dict, %(key)s shall be used as placeholder in the query string.
dbname (Optional[str]) – db name
verbose (Optional[bool]) – if True, print query to be executed; if None, config.verbose_output will be used.
- Returns
generator which yields tuples
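The two placeholder styles can be written side by side. The queries and parameter values below are illustrative; `fetch_rows` is a hypothetical wrapper that is defined but not executed here, since running it needs Pywikibot and configured database access.

```python
from typing import Any, Dict, Iterator, Sequence, Union

# Positional style: a tuple or list of values fills the %s placeholders.
POSITIONAL_QUERY = ('SELECT page_namespace, page_title FROM page '
                    'WHERE page_namespace = %s;')
POSITIONAL_PARAMS = (0,)

# Named style: dict keys fill the %(key)s placeholders.
NAMED_QUERY = ('SELECT page_namespace, page_title FROM page '
               'WHERE page_namespace = %(ns)s;')
NAMED_PARAMS = {'ns': 0}


def fetch_rows(query: str,
               params: Union[Sequence[Any], Dict[str, Any]]) -> Iterator[tuple]:
    """Yield rows for ``query``; requires a configured database."""
    # Imported lazily so this sketch loads without Pywikibot installed.
    from pywikibot.data.mysql import mysql_query
    yield from mysql_query(query, params=params)
```

Passing values through `params` rather than formatting them into the query string leaves quoting and escaping to the database driver.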
pywikibot.data.sparql module¶
SPARQL Query interface.
- class pywikibot.data.sparql.Bnode(data, **kwargs)[source]¶
Bases:
pywikibot.data.sparql.SparqlNode
Representation of blank node.
Create Bnode.
- Parameters
data (dict) –
- Return type
None
- class pywikibot.data.sparql.Literal(data, **kwargs)[source]¶
Bases:
pywikibot.data.sparql.SparqlNode
Representation of RDF literal result type.
Create Literal object.
- Parameters
data (dict) –
- Return type
None
- class pywikibot.data.sparql.SparqlNode(value)[source]¶
Bases:
object
Base class for SPARQL nodes.
Create a SparqlNode.
- Return type
None
- class pywikibot.data.sparql.SparqlQuery(endpoint=None, entity_url=None, repo=None, max_retries=None, retry_wait=None)[source]¶
Bases:
object
SPARQL Query class.
This class allows running SPARQL queries against any SPARQL endpoint.
Create endpoint.
- Parameters
endpoint (Optional[str]) – SPARQL endpoint URL
entity_url (Optional[str]) – URL prefix for any entities returned in a query.
repo (pywikibot.site.DataSite) – The Wikibase site which we want to run queries on. If provided this overrides any value in endpoint and entity_url. Defaults to Wikidata.
max_retries (Optional[int]) – (optional) Maximum number of times to retry after errors, defaults to config.max_retries.
retry_wait (Optional[float]) – (optional) Minimum time in seconds to wait after an error, defaults to config.retry_wait seconds (doubles each retry until config.retry_max is reached).
- Return type
None
- ask(query, headers=None)[source]¶
Run SPARQL ASK query and return boolean result.
- Parameters
query (str) – Query text
headers (Optional[Dict[str, str]]) –
- Return type
bool
- get_items(query, item_name='item', result_type=<class 'set'>)[source]¶
Retrieve items which satisfy given query.
Items are returned as Wikibase IDs.
- Parameters
query – Query string. Must contain ?{item_name} as one of the projected values.
item_name (str) – Name of the value to extract
result_type (iterable) – type of the iterable in which SPARQL results are stored (default set)
- Returns
item ids, e.g. Q1234
- Return type
same as result_type
- get_last_response()[source]¶
Return last received response.
- Returns
Response object from last request or None
- query(query, headers=None)[source]¶
Run SPARQL query and return parsed JSON result.
- Parameters
query (str) – Query text
headers (Optional[Dict[str, str]]) –
- select(query, full_data=False, headers=None)[source]¶
Run SPARQL query and return the result.
The response is assumed to be in format defined by: https://www.w3.org/TR/2013/REC-sparql11-results-json-20130321/
- Parameters
query (str) – Query text
full_data (bool) – Whether to return full data objects or only values
headers (Optional[Dict[str, str]]) –
- Return type
Optional[List[Dict[str, str]]]
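The SPARQL 1.1 JSON results format referenced above nests each row under `results.bindings`, with one object per bound variable. The sketch below walks a hand-written sample response. `select_values` and `item_ids` are hypothetical helpers that only mirror the described behaviour of select() and get_items(), and the default `entity_url` is an assumption.

```python
from typing import Any, Dict, List, Optional, Set

SAMPLE_RESPONSE: Dict[str, Any] = {
    'head': {'vars': ['item', 'itemLabel']},
    'results': {'bindings': [
        {'item': {'type': 'uri',
                  'value': 'http://www.wikidata.org/entity/Q42'},
         'itemLabel': {'type': 'literal', 'xml:lang': 'en',
                       'value': 'Douglas Adams'}},
        # A variable may be unbound in a row; here itemLabel is missing.
        {'item': {'type': 'uri',
                  'value': 'http://www.wikidata.org/entity/Q1'}},
    ]},
}


def select_values(response: Dict[str, Any]) -> List[Dict[str, Optional[str]]]:
    """Reduce each binding to its plain value (None when unbound)."""
    variables = response['head']['vars']
    return [{var: row[var]['value'] if var in row else None
             for var in variables}
            for row in response['results']['bindings']]


def item_ids(response: Dict[str, Any], item_name: str = 'item',
             entity_url: str = 'http://www.wikidata.org/entity/') -> Set[str]:
    """Strip the entity URL prefix to obtain IDs such as Q42."""
    ids = set()
    for row in response['results']['bindings']:
        value = row.get(item_name, {}).get('value', '')
        if value.startswith(entity_url):
            ids.add(value[len(entity_url):])
    return ids
```

The first helper corresponds to returning only values rather than full data objects; the second shows why the query must project `?item` (or whatever `item_name` is set to).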
- class pywikibot.data.sparql.URI(data, entity_url, **kwargs)[source]¶
Bases:
pywikibot.data.sparql.SparqlNode
Representation of URI result type.
Create URI object.
- Parameters
data (dict) –
- Return type
None
pywikibot.data.wikistats module¶
Objects representing WikiStats API.
- class pywikibot.data.wikistats.WikiStats(url='https://wikistats.wmcloud.org/')[source]¶
Bases:
object
Light wrapper around WikiStats data, caching responses and data.
The methods accept a Pywikibot family name as the WikiStats table name, mapping the names before calling the WikiStats API.
- Parameters
url (str) –
- Return type
None
- ALL_KEYS = {'editthis', 'gamepedias', 'gentoo', 'lxde', 'mediawikis', 'metapedias', 'neoseeker', 'opensuse', 'orain', 'pardus', 'referata', 'rodovid', 'scoutwiki', 'shoutwiki', 'sourceforge', 'uncyclomedia', 'w3cwikis', 'wikia', 'wikibooks', 'wikifur', 'wikinews', 'wikipedia', 'wikipedias', 'wikiquote', 'wikiquotes', 'wikisite', 'wikisource', 'wikisources', 'wikitravel', 'wikiversity', 'wikivoyage', 'wikkii', 'wiktionaries', 'wiktionary', 'wmspecials'}¶
- ALL_TABLES = {'editthis', 'gamepedias', 'gentoo', 'lxde', 'mediawikis', 'metapedias', 'neoseeker', 'opensuse', 'orain', 'pardus', 'referata', 'rodovid', 'scoutwiki', 'shoutwiki', 'sourceforge', 'uncyclomedia', 'w3cwikis', 'wikia', 'wikibooks', 'wikifur', 'wikinews', 'wikipedias', 'wikiquotes', 'wikisite', 'wikisources', 'wikitravel', 'wikiversity', 'wikivoyage', 'wikkii', 'wiktionaries', 'wmspecials'}¶
- FAMILY_MAPPING = {'wikipedia': 'wikipedias', 'wikiquote': 'wikiquotes', 'wikisource': 'wikisources', 'wiktionary': 'wiktionaries'}¶
- MISC_SITES_TABLE = 'mediawikis'¶
- OTHER_MULTILANG_TABLES = {'gentoo', 'lxde', 'metapedias', 'opensuse', 'pardus', 'rodovid', 'scoutwiki', 'uncyclomedia', 'wikifur', 'wikitravel'}¶
- OTHER_TABLES = {'editthis', 'gamepedias', 'neoseeker', 'orain', 'referata', 'shoutwiki', 'sourceforge', 'w3cwikis', 'wikia', 'wikisite', 'wikkii', 'wmspecials'}¶
- WMF_MULTILANG_TABLES = {'wikibooks', 'wikinews', 'wikipedias', 'wikiquotes', 'wikisources', 'wikiversity', 'wikivoyage', 'wiktionaries'}¶
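The name mapping mentioned in the class description amounts to a dictionary lookup with a fallback, as this sketch shows. `table_for` is a hypothetical helper; the mapping is copied from FAMILY_MAPPING above.

```python
FAMILY_MAPPING = {
    'wikipedia': 'wikipedias',
    'wikiquote': 'wikiquotes',
    'wikisource': 'wikisources',
    'wiktionary': 'wiktionaries',
}


def table_for(family: str) -> str:
    """Map a Pywikibot family name to its WikiStats table name."""
    # Names without a mapping entry are used as the table name unchanged.
    return FAMILY_MAPPING.get(family, family)
```

This is also why ALL_KEYS is larger than ALL_TABLES: it contains both the table names and the family names that map onto them.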
- get(table)[source]¶
Get a list of a table of data.
- Parameters
table (str) – table of data to fetch
- Return type
list
- get_dict(table)[source]¶
Get dictionary of a table of data.
- Parameters
table (str) – table of data to fetch
- Return type
dict
- languages_by_size(table)[source]¶
Return ordered list of languages by size from WikiStats.
- Parameters
table (str) –
- sorted(table, key, reverse=None)[source]¶
Reverse numerical sort of data.
- Parameters
table (str) – name of table of data
key (str) – data table key
reverse (Optional[bool]) – If set to True the sorting order is reversed. If None the sorting order for numeric keys is reversed whereas alphanumeric keys are sorted in the normal way.
- Returns
The sorted table
- Return type
list
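The effect of leaving reverse as None can be sketched with plain dictionaries. The rows, the column names `prefix` and `total`, and the helper `sort_table` are all made up for illustration; deciding that a column is numeric when every value consists of digits is an assumption.

```python
from typing import Dict, List, Optional

ROWS = [
    {'prefix': 'en', 'total': '1000'},
    {'prefix': 'de', 'total': '300'},
    {'prefix': 'fr', 'total': '20'},
]


def sort_table(rows: List[Dict[str, str]], key: str,
               reverse: Optional[bool] = None) -> List[Dict[str, str]]:
    """Sort rows by ``key``; numeric columns default to descending."""
    # Assumption: a column is numeric when every value is all digits.
    numeric = all(str(row[key]).isdigit() for row in rows)
    if reverse is None:
        reverse = numeric  # largest first for numbers, A-Z for text
    if numeric:
        return sorted(rows, key=lambda row: int(row[key]), reverse=reverse)
    return sorted(rows, key=lambda row: str(row[key]), reverse=reverse)
```

Sorting by `total` without a reverse argument puts the largest value first, while sorting by `prefix` gives ordinary alphabetical order.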