Source code for scripts.create_isbn_edition

#!/usr/bin/env python3
r"""Pywikibot script to load ISBN related data into Wikidata.

Pywikibot script to get ISBN data from a digital library,
and create or amend the related Wikidata item for edition
(with the P212=ISBN number as unique external ID).

Use digital libraries to get ISBN data in JSON format, and integrate the
results into Wikidata.

Then the resulting item number can be used e.g. to generate Wikipedia
references using template Cite_Q.

Parameters:

    All parameters are optional:

        P1:         digital library (default goob "-")

            bnf     Catalogue General (France)
            bol     Bol.com
            dnb     Deutsche National Library
            goob    Google Books
            kb      National Library of the Netherlands
            loc     Library of Congress US
            mcues   Ministerio de Cultura (Spain)
            openl   OpenLibrary.org
            porbase urn.porbase.org Portugal
            sbn     Servizio Bibliotecario Nazionale
            wiki    wikipedia.org
            worldcat    WorldCat

        P2:         ISO 639-1 language code
                    Default LANG; e.g. en, nl, fr, de, es, it, etc.

        P3 P4...:   P/Q pairs to add additional claims (repeated)
                    e.g. P921 Q107643461 (main subject: database
                    management linked to P2163 Fast ID)

    stdin: ISBN numbers (International standard book number)

        Free text (e.g. Wikipedia references list, or publication list)
        is accepted. Identification is done via an ISBN regex expression.

**Functionality:**
    * The ISBN number is used as a primary key (P212 where no duplicates
      are allowed. The item update is not performed when there is no
      unique match
    * Statements are added or merged incrementally; existing data is not
      overwritten.
    * Authors and publishers are searched to get their item number
      (ambiguous items are skipped)
    * Book title and subtitle are separated with '.', ':', or '-'
    * This script can be run incrementally with the same parameters
      Caveat: Take into account the Wikidata Query database
      replication delay. Wait for minimum 5 minutes to avoid creating
      duplicate objects.

**Data quality:**
    * Use https://query.wikidata.org/querybuilder/ to identify P212
      duplicates. Merge duplicate items before running the script
      again.
    * The following properties should only be used for written works
      P5331:  OCLC work ID (editions should only have P243)
      P8383:  Goodreads-identificatiecode for work (editions should
      only have P2969)

Examples:

    Default library (Google Books), language (LANG), no additional
    statements:

        pwb create_isbn_edition.py 9789042925564

    Wikimedia, language Dutch, main subject: database management:

        pwb create_isbn_edition.py wiki en P921 Q107643461 978-0-596-10089-6

Standard ISBN properties:

    P31:Q3331189:   instance of edition
    P50:    author
    P123:   publisher
    P212:   canonical ISBN number (lookup via Wikidata Query)
    P407:   language of work (Qnumber linked to ISO 639-1 language code)
    P577:   date of publication (year)
    P1476:  book title
    P1680:  subtitle

Other ISBN properties:

    P291:   place of publication
    P921:   main subject (inverse lookup from external Fast ID P2163)
    P629:   work for edition
    P747:   edition of work
    P1104:  number of pages

Qualifiers:

    P1545:  (author) sequence number

External identifiers:

    P213:   ISNI ID
    P243:   OCLC ID
    P496:   ORCID iD
    P675:   Google Books-identificatiecode
    P1036:  Dewey Decimal Classification
    P2163:  Fast ID (inverse lookup via Wikidata Query) -> P921: main subject
    P2969:  Goodreads-identificatiecode

    (only for written works)
    P5331:  OCLC work ID (editions should only have P243)
    P8383:  Goodreads-identificatiecode for work (editions should only
            have P2969)

**Author:**
   Geert Van Pamel, 2022-08-04,
   GNU General Public License v3.0, User:Geertivp

**Documentation:**
    * https://en.wikipedia.org/wiki/ISBN
    * https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
    * https://www.geeksforgeeks.org/searching-books-with-python/
    * https://www.freecodecamp.org/news/python-json-how-to-convert-a-string-to-json/
    * https://pypi.org/project/isbnlib/
    * https://buildmedia.readthedocs.org/media/pdf/isbnlib/v3.4.5/isbnlib.pdf
    * https://isbntools.readthedocs.io/en/latest/info.html
    * https://www.wikidata.org/wiki/Property:P212
    * https://www.wikidata.org/wiki/Wikidata:WikiProject_Books
    * WikiProject Books:  https://www.wikidata.org/wiki/Q21831105
    * https://www.wikidata.org/wiki/Wikidata:List_of_properties/work
    * https://www.wikidata.org/wiki/Template:Book_properties
    * https://www.wikidata.org/wiki/Template:Bibliographic_properties
    * http://classify.oclc.org/classify2/ClassifyDemo
    * https://www.wikidata.org/wiki/Wikidata:WikiProject_Source_MetaData
    * https://www.wikidata.org/wiki/Help:Sources
    * https://www.wikidata.org/wiki/Q22696135
    * https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2021/Wikidata/Bibliographical_references/sources_for_wikidataitems
    * https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.html
    * https://doc.wikimedia.org/pywikibot/master/
    * https://docs.python.org/3/howto/logging.html
    * https://wikitech.wikimedia.org/wiki/Portal:Toolforge
    * http://www.isbn.org/standards/home/isbn/international/hyphenation-instructions.asp
    * https://www.wikidata.org/wiki/Wikidata:Pywikibot\_-_Python_3_Tutorial/Setting_qualifiers
    * https://www.wikidata.org/wiki/Wikidata:Pywikibot\_-_Python_3_Tutorial/Setting_statements

**Prerequisites:**
    pywikibot

    Install the following ISBN lib packages:
    https://pypi.org/search/?q=isbnlib_

        pip install isbnlib (mandatory)

        (optional)
        pip install isbnlib-bol
        pip install isbnlib-bnf
        pip install isbnlib-dnb
        pip install isbnlib-kb
        pip install isbnlib-loc
        pip install isbnlib-worldcat2
        etc.

**Restrictions:**
    * Better use the ISO 639-1 language code parameter as a default
        The language code is not always available from the digital library.
    * SPARQL queries run on a replicated database
        Possible important replication delay; wait 5 minutes before retry
        -- otherwise risk for creating duplicates.

**Algorithm:**
    # Get parameters
    # Validate parameters
    # Get ISBN data
    # Convert ISBN data
    # Get additional data
    # Register ISBN data into Wikidata (create or amend items or claims)

Environment:

    The python script can run on the following platforms:

        Linux client
        Google Chromebook (Linux container)
        Toolforge Portal
        PAWS

    LANG: ISO 639-1 language code


Applications:

    Generate a book reference
        Example: {{Cite Q|Q63413107}} (wp.en)
        See also:
            https://meta.wikimedia.org/wiki/WikiCite
            https://www.wikidata.org/wiki/Q21831105 (WikiCite)
            https://www.wikidata.org/wiki/Q22321052 (Cite_Q)
            https://www.mediawiki.org/wiki/Global_templates
            https://www.wikidata.org/wiki/Wikidata:WikiProject_Source_MetaData
            https://phabricator.wikimedia.org/tag/wikicite/
            https://meta.wikimedia.org/wiki/WikiCite/Shared_Citations

**Wikidata Query:**
    * List of editions about musicians: https://w.wiki/5aaz
    * List of editions having ISBN number: https://w.wiki/5akq

**Related projects:**
    * :phab:`T314942` (this script)
    * :phab:`T282719`
    * :phab:`T214802`
    * :phab:`T208134`
    * :phab:`T138911`
    * :phab:`T20814`
    * https://en.wikipedia.org/wiki/User:Citation_bot
    * https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2021/Wikidata/Bibliographical_references/sources_for_wikidataitems
    * https://zenodo.org/record/55004#.YvwO4hTP1D8

**Other systems:**
    * https://en.wikipedia.org/wiki/bibliographic_database
    * https://www.titelbank.nl/pls/ttb/f?p=103:4012:::NO::P4012_TTEL_ID:3496019&cs=19BB8084860E3314502A1F777F875FE61

.. versionadded:: 7.7
"""  # noqa: E501, W605
#
# (C) Pywikibot team, 2022-2023
#
# Distributed under the terms of the MIT license.
#
from __future__ import annotations

import os  # Operating system
import re  # Regular expressions (very handy!)
from itertools import islice
from pprint import pformat
from typing import Any

import pywikibot  # API interface to Wikidata
from pywikibot import pagegenerators as pg  # Wikidata Query interface
from pywikibot.config import verbose_output as verbose
from pywikibot.data import api


try:
    import isbnlib
except ImportError as e:
    isbnlib = e

try:
    from unidecode import unidecode
except ImportError as e:
    unidecode = e

# Initialisation
booklib = 'goob'  # Default digital library

# ISBN number: 10 or 13 digits with optional dashes (-)
ISBNRE = re.compile(r'[0-9-]{10,17}')
PROPRE = re.compile(r'P[0-9]+')  # Wikidata P-number
QSUFFRE = re.compile(r'Q[0-9]+')  # Wikidata Q-number

# Other statements are added via command line parameters
target = {
    'P31': 'Q3331189',  # Is an instance of an edition
}

# Statement property and instance validation rules
propreqinst = {
    'P50': 'Q5',  # Author requires human
    # Publisher requires publisher
    'P123': {'Q2085381', 'Q1114515', 'Q1320047'},
    # Edition language requires at least one of (living, natural) language
    'P407': {'Q34770', 'Q33742', 'Q1288568'},
}

mainlang = os.getenv('LANG', 'en')[:2]  # Default description language

# Connect to database
transcmt = '#pwb Create ISBN edition'  # Wikidata transaction comment



[docs]
def is_in_list(statement_list, checklist: list[str]) -> bool:
    """Verify if statement list contains at least one item from the checklist.

    :param statement_list: Statement list
    :param checklist: List of values
    :Returns: True when match
    """
    return any(seq.getTarget().getID() in checklist for seq in statement_list)




[docs]
def get_item_list(item_name: str, instance_id):
    """Get list of items by name, belonging to an instance (list).

    :param item_name: Item name (case sensitive)
    :param instance_id: Instance ID (string, set, or list)
    :Returns: Set of items (Q-numbers)
    """
    item_list = set()  # Empty set
    params = {
        'action': 'wbsearchentities',
        'format': 'json',
        'type': 'item',
        'strictlanguage': False,
        # All languages are searched, but labels are in native language
        'language': mainlang,
        'search': item_name,  # Get item list from label
    }
    request = api.Request(site=repo, parameters=params)
    result = request.submit()

    if 'search' in result:
        for res in result['search']:
            item = pywikibot.ItemPage(repo, res['id'])
            item.get(get_redirect=True)
            if 'P31' in item.claims:
                for seq in item.claims['P31']:  # Loop through instances
                    # Matching instance
                    if seq.getTarget().getID() in instance_id:
                        for lang in item.labels:  # Search all languages
                            # Ignore label case and accents
                            if (unidecode(item_name.lower())
                                    == unidecode(item.labels[lang].lower())):
                                item_list.add(item.getID())  # Label math
                        for lang in item.aliases:
                            # Case sensitive for aliases
                            if item_name in item.aliases[lang]:
                                item_list.add(item.getID())  # Alias match
    return item_list




[docs]
def amend_isbn_edition(isbn_number: str) -> None:
    """Amend ISBN registration.

    Amend Wikidata, by registering the ISBN-13 data via P212,
    depending on the data obtained from the digital library.

    :param isbn_number:  ISBN number (10 or 13 digits with optional hyphens)
    """
    isbn_number = isbn_number.strip()
    if not isbn_number:
        return  # Do nothing when the ISBN number is missing

    try:
        isbn_data = isbnlib.meta(isbn_number, service=booklib)
        # {'ISBN-13': '9789042925564',
        #  'Title': 'De Leuvense Vaart - Van De Vaartkom Tot Wijgmaal. '
        #           'Aspecten Uit De Industriele Geschiedenis Van Leuven',
        #  'Authors': ['A. Cresens'],
        #  'Publisher': 'Peeters Pub & Booksellers',
        #  'Year': '2012',
        #  'Language': 'nl'}
    except Exception as error:
        # When the book is unknown the function returns
        pywikibot.error(error)
        return

    if len(isbn_data) < 6:
        pywikibot.error(f'Unknown or incomplete digital library registration '
                        f'for {isbn_number}')
        return

    # Show the raw results
    if verbose:
        pywikibot.info('\n' + pformat(isbn_data))

    add_claims(isbn_data)




[docs]
def add_claims(isbn_data: dict[str, Any]) -> None:  # noqa: C901
    """Inspect isbn_data and add claims if possible."""
    # Get the book language from the ISBN book reference
    booklang = mainlang  # Default language
    if isbn_data['Language']:
        booklang = isbn_data['Language'].strip()
        if booklang == 'iw':  # Obsolete codes
            booklang = 'he'
        lang_list = list(get_item_list(booklang, propreqinst['P407']))

        if not lang_list:
            pywikibot.warning('Unknown language ' + booklang)
            return

        if len(lang_list) != 1:
            pywikibot.warning('Ambiguous language ' + booklang)
            return

        target['P407'] = lang_list[0]

    # Get formatted ISBN number
    isbn_number = isbn_data['ISBN-13']  # Numeric format
    isbn_fmtd = isbnlib.mask(isbn_number)  # Canonical format
    pywikibot.info(isbn_fmtd)  # First one

    # Get (sub)title when there is a dot
    titles = isbn_data['Title'].split('. ')  # goob is using a '.'
    if len(titles) == 1:
        titles = isbn_data['Title'].split(': ')  # Extract subtitle
    if len(titles) == 1:
        titles = isbn_data['Title'].split(' - ')  # Extract subtitle
    objectname = titles[0].strip()
    subtitle = ''
    if len(titles) > 1:
        subtitle = titles[1].strip()

    # pywikibot.info book titles
    pywikibot.debug(objectname)
    pywikibot.debug(subtitle)  # Optional

    # print subsequent subtitles, when available
    for title in islice(titles, 2, None):
        # Not stored in Wikidata...
        pywikibot.debug(title.strip())

    # Search the ISBN number in Wikidata both canonical and numeric
    # P212 should have canonical hyphenated format
    isbn_query = f"""# Get ISBN number
SELECT ?item WHERE {{
  VALUES ?isbn_number {{
    "{isbn_fmtd}"
    "{isbn_number}"
  }}
  ?item wdt:P212 ?isbn_number.
}}
"""

    pywikibot.info(isbn_query)
    generator = pg.WikidataSPARQLPageGenerator(isbn_query, site=repo)

    # Main loop for all DISTINCT items
    rescnt = 0
    for rescnt, item in enumerate(generator, start=1):
        qnumber = item.getID()
        pywikibot.warning(f'Found item: {qnumber}')

    # Create or amend the item
    if rescnt == 1:
        item.get(get_redirect=True)  # Update item
    elif not rescnt:
        label = {booklang: objectname}
        item = pywikibot.ItemPage(repo)  # Create item
        item.editEntity({'labels': label}, summary=transcmt)
        qnumber = item.getID()
        pywikibot.warning(f'Creating item: {qnumber}')
    else:
        pywikibot.critical(f'Ambiguous ISBN number {isbn_fmtd}')
        return

    # Add all P/Q values
    # Make sure that labels are known in the native language
    pywikibot.debug(target)

    # Register statements
    for propty in target:
        if propty not in item.claims:
            if propty not in proptyx:
                proptyx[propty] = pywikibot.PropertyPage(repo, propty)
            targetx[propty] = pywikibot.ItemPage(repo, target[propty])

            try:
                pywikibot.warning('Add {} ({}): {} ({})'.format(
                    proptyx[propty].labels[booklang], propty,
                    targetx[propty].labels[booklang], target[propty]))
            except:  # noqa: B001, E722, H201
                pywikibot.warning(f'Add {propty}:{target[propty]}')

            claim = pywikibot.Claim(repo, propty)
            claim.setTarget(targetx[propty])
            item.addClaim(claim, bot=True, summary=transcmt)

    # Set formatted ISBN number
    if 'P212' not in item.claims:
        pywikibot.warning(f'Add ISBN number (P212): {isbn_fmtd}')
        claim = pywikibot.Claim(repo, 'P212')
        claim.setTarget(isbn_fmtd)
        item.addClaim(claim, bot=True, summary=transcmt)

    # Title
    if 'P1476' not in item.claims:
        pywikibot.warning(f'Add Title (P1476): {objectname}')
        claim = pywikibot.Claim(repo, 'P1476')
        claim.setTarget(
            pywikibot.WbMonolingualText(text=objectname, language=booklang))
        item.addClaim(claim, bot=True, summary=transcmt)

    # Subtitle
    if subtitle and 'P1680' not in item.claims:
        pywikibot.warning(f'Add Subtitle (P1680): {subtitle}')
        claim = pywikibot.Claim(repo, 'P1680')
        claim.setTarget(
            pywikibot.WbMonolingualText(text=subtitle, language=booklang))
        item.addClaim(claim, bot=True, summary=transcmt)

    # Date of publication
    pub_year = isbn_data['Year']
    if pub_year and 'P577' not in item.claims:
        pywikibot.warning(
            f"Add Year of publication (P577): {isbn_data['Year']}")
        claim = pywikibot.Claim(repo, 'P577')
        claim.setTarget(pywikibot.WbTime(year=int(pub_year), precision='year'))
        item.addClaim(claim, bot=True, summary=transcmt)

    # Get the author list
    author_cnt = 0
    for author_name in isbn_data['Authors']:
        author_name = author_name.strip()
        if not author_name:
            continue

        author_cnt += 1
        author_list = list(get_item_list(author_name, propreqinst['P50']))

        if len(author_list) == 1:
            add_author = True
            if 'P50' in item.claims:
                for seq in item.claims['P50']:
                    if seq.getTarget().getID() in author_list:
                        add_author = False
                        break

            if add_author:
                pywikibot.warning(f'Add author {author_cnt} (P50): '
                                  f'{author_name} ({author_list[0]})')
                claim = pywikibot.Claim(repo, 'P50')
                claim.setTarget(pywikibot.ItemPage(repo, author_list[0]))
                item.addClaim(claim, bot=True, summary=transcmt)

                qualifier = pywikibot.Claim(repo, 'P1545')
                qualifier.setTarget(str(author_cnt))
                claim.addQualifier(qualifier, summary=transcmt)
        elif not author_list:
            pywikibot.warning(f'Unknown author: {author_name}')
        else:
            pywikibot.warning(f'Ambiguous author: {author_name}')

    # Get the publisher
    publisher_name = isbn_data['Publisher'].strip()
    if publisher_name:
        publisher_list = list(
            get_item_list(publisher_name, propreqinst['P123']))

        if len(publisher_list) == 1:
            if 'P123' not in item.claims:
                pywikibot.warning(f'Add publisher (P123): {publisher_name} '
                                  f'({publisher_list[0]})')
                claim = pywikibot.Claim(repo, 'P123')
                claim.setTarget(pywikibot.ItemPage(repo, publisher_list[0]))
                item.addClaim(claim, bot=True, summary=transcmt)
        elif not publisher_list:
            pywikibot.warning('Unknown publisher: ' + publisher_name)
        else:
            pywikibot.warning('Ambiguous publisher: ' + publisher_name)

    # Get addional data from the digital library
    isbn_cover = isbnlib.cover(isbn_number)
    isbn_editions = isbnlib.editions(isbn_number, service='merge')
    isbn_doi = isbnlib.doi(isbn_number)
    isbn_info = isbnlib.info(isbn_number)

    if verbose:
        pywikibot.info()
        pywikibot.info(isbn_info)
        pywikibot.info(isbn_doi)
        pywikibot.info(isbn_editions)

    # Book cover images
    pywikibot.info(pformat(isbn_cover))

    # Handle ISBN classification
    isbn_classify = isbnlib.classify(isbn_number)
    pywikibot.debug(pformat(isbn_classify))

    # ./create_isbn_edition.py '978-3-8376-5645-9' - de P407 Q188
    # Q113460204
    # {'owi': '11103651812', 'oclc': '1260160983', 'lcc': 'TK5105.8882',
    #  'ddc': '300', 'fast': {'1175035': 'Wikis (Computer science)',
    #                         '1795979': 'Wikipedia',
    #                         '1122877': 'Social sciences'}}

    # Set the OCLC ID
    if 'oclc' in isbn_classify and 'P243' not in item.claims:
        pywikibot.warning(f"Add OCLC ID (P243): {isbn_classify['oclc']}")
        claim = pywikibot.Claim(repo, 'P243')
        claim.setTarget(isbn_classify['oclc'])
        item.addClaim(claim, bot=True, summary=transcmt)

    # OCLC ID and OCLC work ID should not be both assigned
    if 'P243' in item.claims and 'P5331' in item.claims:
        if 'P629' in item.claims:
            oclcwork = item.claims['P5331'][0]  # OCLC Work should be unique
            # Get the OCLC Work ID from the edition
            oclcworkid = oclcwork.getTarget()
            # Edition should belong to only one single work
            work = item.claims['P629'][0].getTarget()
            # There doesn't exist a moveClaim method?
            pywikibot.warning(
                f'Move OCLC Work ID {oclcworkid} to work {work.getID()}')
            # Keep current OCLC Work ID if present
            if 'P5331' not in work.claims:
                claim = pywikibot.Claim(repo, 'P5331')
                claim.setTarget(oclcworkid)
                work.addClaim(claim, bot=True, summary=transcmt)
            # OCLC Work ID does not belong to edition
            item.removeClaims(oclcwork, bot=True, summary=transcmt)
        else:
            pywikibot.error('OCLC Work ID {} conflicts with OCLC ID {} and no '
                            'work available'
                            .format(item.claims['P5331'][0].getTarget(),
                                    item.claims['P243'][0].getTarget()))

    # OCLC work ID should not be registered for editions, only for works
    if 'owi' not in isbn_classify:
        pass
    elif 'P629' in item.claims:  # Get the work related to the edition
        # Edition should only have one single work
        work = item.claims['P629'][0].getTarget()
        if 'P5331' not in work.claims:  # Assign the OCLC work ID if missing
            pywikibot.warning(
                f"Add OCLC work ID (P5331): {isbn_classify['owi']} to work "
                f'{work.getID()}')
            claim = pywikibot.Claim(repo, 'P5331')
            claim.setTarget(isbn_classify['owi'])
            work.addClaim(claim, bot=True, summary=transcmt)
    elif 'P243' in item.claims:
        pywikibot.warning('OCLC Work ID {} ignored because of OCLC ID {}'
                          .format(isbn_classify['owi'],
                                  item.claims['P243'][0].getTarget()))
    # Assign the OCLC work ID only if there is no work, and no OCLC ID
    # for edition
    elif 'P5331' not in item.claims:
        pywikibot.warning(
            f"Add OCLC work ID (P5331): {isbn_classify['owi']} to edition")
        claim = pywikibot.Claim(repo, 'P5331')
        claim.setTarget(isbn_classify['owi'])
        item.addClaim(claim, bot=True, summary=transcmt)

    # Reverse logic for moving OCLC ID and P212 (ISBN) from work to
    # edition is more difficult because of 1:M relationship...

    # Same logic as for OCLC (work) ID

    # Goodreads-identificatiecode (P2969)

    # Goodreads-identificatiecode for work (P8383) should not be
    # registered for editions; should rather use P2969

    # Library of Congress Classification (works and editions)
    if 'lcc' in isbn_classify and 'P8360' not in item.claims:
        pywikibot.warning(
            'Add Library of Congress Classification for edition (P8360): '
            f"{isbn_classify['lcc']}")
        claim = pywikibot.Claim(repo, 'P8360')
        claim.setTarget(isbn_classify['lcc'])
        item.addClaim(claim, bot=True, summary=transcmt)

    # Dewey Decimale Classificatie
    if 'ddc' in isbn_classify and 'P1036' not in item.claims:
        pywikibot.warning(
            f"Add Dewey Decimale Classificatie (P1036): {isbn_classify['ddc']}"
        )
        claim = pywikibot.Claim(repo, 'P1036')
        claim.setTarget(isbn_classify['ddc'])
        item.addClaim(claim, bot=True, summary=transcmt)

    # Register Fast ID using P921 (main subject) through P2163 (Fast ID)
    # https://www.wikidata.org/wiki/Q3294867
    # https://nl.wikipedia.org/wiki/Faceted_Application_of_Subject_Terminology
    # https://www.oclc.org/research/areas/data-science/fast.html
    # https://www.oclc.org/content/dam/oclc/fast/FAST-quick-start-guide-2022.pdf

    # Authority control identifier from WorldCat's “FAST Linked Data”
    # authority file (external ID P2163)
    # Corresponding to P921 (Wikidata main subject)
    if 'fast' in isbn_classify:
        for fast_id in isbn_classify['fast']:

            # Get the main subject
            main_subject_query = f"""# Search the main subject
SELECT ?item WHERE {{
  ?item wdt:P2163 "{fast_id}".
}}
"""

            pywikibot.info(main_subject_query)
            generator = pg.WikidataSPARQLPageGenerator(main_subject_query,
                                                       site=repo)

            # Main loop for all DISTINCT items
            rescnt = 0
            for rescnt, main_subject in enumerate(generator, start=1):
                qmain_subject = main_subject.getID()
                try:
                    main_subject_label = main_subject.labels[booklang]
                    pywikibot.info(f'Found main subject {main_subject_label} '
                                   f'({qmain_subject}) for Fast ID {fast_id}')
                except:  # noqa: B001, E722, H201
                    main_subject_label = ''
                    pywikibot.info(f'Found main subject ({qmain_subject}) for '
                                   f'Fast ID {fast_id}')
                    pywikibot.error(f'Missing label for item {qmain_subject}')

            # Create or amend P921 statement
            if not rescnt:
                pywikibot.error(
                    f'Main subject not found for Fast ID {fast_id}')
            elif rescnt == 1:
                add_main_subject = True
                if 'P921' in item.claims:  # Check for duplicates
                    for seq in item.claims['P921']:
                        if seq.getTarget().getID() == qmain_subject:
                            add_main_subject = False
                            break

                if add_main_subject:
                    pywikibot.warning(
                        f'Add main subject (P921) {main_subject_label} '
                        f'({qmain_subject})')
                    claim = pywikibot.Claim(repo, 'P921')
                    claim.setTarget(main_subject)
                    item.addClaim(claim, bot=True, summary=transcmt)
                else:
                    pywikibot.info(f'Skipping main subject '
                                   f'{main_subject_label} ({qmain_subject})')
            else:
                pywikibot.error(
                    f'Ambiguous main subject for Fast ID {fast_id}')
    show_final_information(isbn_number, isbn_doi)




[docs]
def show_final_information(number, doi):
    """Print additional information."""
    # Book description
    description = isbnlib.desc(number)
    if description:
        pywikibot.info()
        pywikibot.info(description)

    try:
        bibtex_metadata = isbnlib.doi2tex(doi)
    except Exception as error:
        # Currently does not work (service not available)
        pywikibot.error(error)  # Data not available
        pywikibot.warning('BibTex unavailable')
    else:
        pywikibot.info(bibtex_metadata)




[docs]
def main(*args: str) -> None:
    """
    Process command line arguments and invoke bot.

    If args is an empty list, sys.argv is used.

    :param args: command line arguments
    """
    global booklib
    global mainlang
    global repo
    global proptyx
    global targetx

    # Get optional parameters
    local_args = pywikibot.handle_args(*args)

    # Login to Wikibase instance
    # Required for wikidata object access (item, property, statement)
    repo = pywikibot.Site('wikidata')

    # Get the digital library
    if local_args:
        booklib = local_args.pop(0)
        if booklib == '-':
            booklib = 'goob'

    # Get the native language
    # The language code is only required when P/Q parameters are added,
    # or different from the LANG code
    if local_args:
        mainlang = local_args.pop(0)

    # Get additional P/Q parameters
    while local_args:
        inpar = PROPRE.findall(local_args.pop(0).upper())[0]
        target[inpar] = QSUFFRE.findall(local_args.pop(0).upper())[0]

    # Validate P/Q list
    proptyx = {}
    targetx = {}

    # Validate the propery/instance pair
    for propty in target:
        if propty not in proptyx:
            proptyx[propty] = pywikibot.PropertyPage(repo, propty)
        targetx[propty] = pywikibot.ItemPage(repo, target[propty])
        targetx[propty].get(get_redirect=True)
        if propty in propreqinst and (
                'P31' not in targetx[propty].claims or not is_in_list(
                    targetx[propty].claims['P31'], propreqinst[propty])):
            pywikibot.critical('{} ({}) is not a language'.format(
                targetx[propty].labels[mainlang], target[propty]))
            return

    # check dependencies
    for module in (isbnlib, unidecode):
        if isinstance(module, ImportError):
            raise module

    # Get list of item numbers
    # Typically the Appendix list of references of e.g. a Wikipedia page
    # containing ISBN numbers
    inputfile = pywikibot.input('Get list of item numbers')
    # Extract all ISBN numbers
    itemlist = sorted(set(ISBNRE.findall(inputfile)))

    for isbn_number in itemlist:  # Process the next edition
        amend_isbn_edition(isbn_number)



if __name__ == '__main__':
    main()