User:MargaretRDonald/sandbox/Using queries + OpenRefine to improve biota Wikidata
Enhancing Australian biodiversity data using OpenRefine
Abstract/description
- The online database IRMNG (Interim Register of Marine and Nonmarine Genera) is used to download a partial list of the taxon names published by one author, here William F. Humphreys.
- The resulting CSV file is uploaded to OpenRefine, where we learn how to:
- facet
- split columns to give author names
- reconcile columns
- create a schema
- upload some properties to Wikidata
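The column-splitting step can be pictured outside OpenRefine with a minimal Python sketch. It assumes the authority column holds strings such as "Watts & Humphreys, 2004" (an illustrative input, not a row taken from the IRMNG download), and the function name `split_authority` is hypothetical. In the session itself the same result comes from OpenRefine's own column-splitting tools.

```python
import re
from typing import List, Optional, Tuple


def split_authority(authority: str) -> Tuple[List[str], Optional[str]]:
    """Split an authority string into a list of author names and a year."""
    text = authority.strip()
    # Parentheses around an authority mark a later change of genus; drop them.
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1].strip()
    # Look for a trailing four-digit year, optionally preceded by a comma.
    match = re.search(r",?\s*(\d{4})$", text)
    year = match.group(1) if match else None
    names = text[: match.start()] if match else text
    # Authors are assumed to be separated by commas, "&" or the word "and".
    parts = re.split(r"\s*(?:,|&|\band\b)\s*", names)
    authors = [part.strip() for part in parts if part.strip()]
    return authors, year


authors, year = split_authority("Watts & Humphreys, 2004")
print(authors, year)
```

Each author name then sits in its own cell, which is what makes the later reconciliation against Wikidata possible.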
The examples used are:
- the OpenRefine spreadsheet IRMNG taxlist 20220410WattsV2 csv, for the species of Chris H.S. Watts
- the start of a new project for his colleague and collaborator William F. Humphreys, based on a query we will run on IRMNG, whose result is uploaded to OpenRefine
An alternative approach
Use the following queries for APNI (Australian Plant Name Index) and AFD (Australian Faunal Directory) taxa:
- For genera with APNI IDs (and no authority) plus taxon author citation
- For species with APNI IDs (and no authority)
- For genera with AFD IDs (and no authority) plus taxon author citation
- For AFD arachnid genera (limiting a query)
- For species with AFD IDs (and no authority)
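The queries themselves are linked from the list above and are not reproduced here. As a rough guide to their shape, the following Python sketch assembles a comparable SPARQL query for the Wikidata Query Service. It rests on several assumptions: that "no authority" means the taxon name statement (P225) has no taxon author qualifier (P405), and that APNI ID is P5984 and Australian Faunal Directory ID is P6039. The property numbers are quoted from memory and should be checked on Wikidata before use. The function `build_query` and its parameters are hypothetical names.

```python
from typing import Optional

GENUS = "Q34740"    # Wikidata item for the rank "genus"
SPECIES = "Q7432"   # Wikidata item for the rank "species"
APNI_ID = "P5984"   # assumed property number for APNI ID; verify on Wikidata
AFD_ID = "P6039"    # assumed property number for AFD ID; verify on Wikidata


def build_query(id_property: str, rank_item: str,
                ancestor_item: Optional[str] = None, limit: int = 500) -> str:
    """Build SPARQL for taxa of one rank that have an external ID but no taxon author."""
    lines = [
        "SELECT ?taxon ?name ?externalId ?citation WHERE {",
        f"  ?taxon wdt:{id_property} ?externalId ;",   # has the external identifier
        f"         wdt:P105 wd:{rank_item} ;",         # taxon rank
        "         wdt:P225 ?name .",                   # taxon name
    ]
    if ancestor_item:
        # Limit the query to descendants of one family, order, class, etc.
        lines.append(f"  ?taxon wdt:P171* wd:{ancestor_item} .")
    lines += [
        # Taxon author citation (a plain string), kept when present.
        "  OPTIONAL { ?taxon wdt:P6507 ?citation . }",
        # Keep only taxa whose taxon name has no taxon author qualifier.
        "  FILTER NOT EXISTS { ?taxon p:P225/pq:P405 [] . }",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)


print(build_query(AFD_ID, GENUS, limit=100))
```

The printed text can be pasted into the Wikidata Query Service, and the result downloaded as a CSV file. Passing the item for a family or order as `ancestor_item` is one way of limiting a query to a single group.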
Modify these queries
- to pick a family, genus or order
and download the query result as a CSV file.
The subsequent tasks closely match those described above and include:
- forming links to the APNI and AFD pages for the taxon
- grabbing the authority and the publication from these links
These give lists of authors, year of taxon publication, and publication name and page. As before, a schema is then created to upload the reconciled authors and publications to Wikidata.
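Forming the links amounts to substituting each identifier into a URL pattern. The Python sketch below shows the idea. The two URL patterns are assumptions based on the formatter URLs I believe Wikidata records for these identifiers, so they should be checked against the property pages. The function name `taxon_url` and the identifier in the example are hypothetical.

```python
from urllib.parse import quote

# Assumed URL patterns; verify against the formatter URLs on Wikidata.
URL_TEMPLATES = {
    "APNI": "https://id.biodiversity.org.au/name/apni/{id}",
    "AFD": "https://biodiversity.org.au/afd/taxa/{id}",
}


def taxon_url(source: str, external_id: str) -> str:
    """Return the page URL for one taxon, given the source name and its identifier."""
    # Trim stray spaces from the CSV cell and percent-encode anything unsafe.
    cleaned = quote(external_id.strip(), safe="")
    return URL_TEMPLATES[source].format(id=cleaned)


# "12345" is a made-up identifier used only to show the output shape.
print(taxon_url("APNI", " 12345 "))
```

In OpenRefine the equivalent is a new column built from the identifier column by string concatenation, after which the page behind each link can be fetched and the authority and publication extracted.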
What I am hoping to achieve
At the end of the session, participants will have learned:
- how to create a project in OpenRefine
- why and how to facet
- how to split a column (and how to undo an action)
- how to reconcile a column against Wikidata
- some useful GREL functions
- how to create a schema for uploading data to Wikidata
The ultimate aim is to create Wikidata entries like the one for Illawarra wisharti.
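For anyone who has not met faceting before, a text facet is essentially a count of the distinct values in a column, which makes inconsistencies and blanks easy to spot. A minimal Python sketch of that idea follows. The rows and the column name "rank" are invented for illustration, and `text_facet` is a hypothetical helper, not an OpenRefine function.

```python
from collections import Counter
from typing import Dict, List, Tuple


def text_facet(rows: List[Dict[str, str]], column: str) -> List[Tuple[str, int]]:
    """Count the distinct values in one column, most frequent first."""
    counts = Counter(
        # Treat missing or empty cells as one "(blank)" group.
        (row.get(column) or "").strip() or "(blank)"
        for row in rows
    )
    return counts.most_common()


# Invented rows, only to show what the facet reports.
rows = [
    {"name": "a", "rank": "Species"},
    {"name": "b", "rank": "Species"},
    {"name": "c", "rank": "Genus"},
    {"name": "d", "rank": ""},
]
for value, count in text_facet(rows, "rank"):
    print(value, count)
```

In OpenRefine, selecting one of the facet values then restricts later operations to the matching rows.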
Relationship to Wiki skills or to the theme
This approach is a useful way to upload data to Wikidata in bulk, and the session should enhance participants' Wikidata knowledge and skills.
Username/s
- MargaretRDonald (talk) 21:02, 2 August 2024 (UTC)
Session type
Depending on the participants, this would be a short series of one-hour online sessions on Zoom, with interaction between participants and presenters.