Wikimedia and GLAM-Wiki volunteer mainly active on Wikidata and Wikimedia Commons; OpenGLAM advocate; part of the OpenRefine team
OpenRefine (previously Google Refine) is a powerful tool for working with messy data: cleaning it, transforming it from one format into another, and extending it with web services and external data.
This tutorial introduces the basic functionalities of OpenRefine, with a focus on Wikidata reconciliation and (batch) editing.
A panel discussion to brainstorm a sustainable and resilient future for Wikidata's key volunteer-built tools.
Many everyday edits and contributions to Wikidata are powered not by Wikidata’s own editing interface, but by an ecosystem of very diverse software tools which are designed and maintained by external parties.
Quite often, individual Wikimedia volunteer developers create and maintain such software in their free time (examples include QuickStatements and the Wikidata reconciliation service). In some cases, tools are part of larger projects that receive occasional funding (examples include OpenRefine, GLAMpipe, the Wikimedia Commons mobile app, the ISA Tool, and many others).
The resulting Wikidata tool ecosystem is extremely rich and interesting, but also notoriously vulnerable. Development of such tools can easily stall when maintainers no longer have (the privilege of) free time, when they move on, or when temporary funding runs out.
This panel discussion and group brainstorm looks at this situation with a practical, solution-oriented mindset. What actions can we take as a community to make this tool ecosystem more resilient? Does the Movement Strategy process offer tools for taking up this challenge? Which tools should be developed centrally to sustain the core practices of the content communities, and how can we simultaneously encourage lightweight experiments in the community?
- Sandra Fauconnier (moderator; ISA Tool; OpenRefine)
- Susanna Ånäs (co-moderator; Wikimaps; GLAMpipe; Wikidocumentaries)
- Kat Thornton (Science Stories)
- Lucie-Aimée Kaffee (ArticlePlaceholder extension; Scribe)
- Antonin Delpeuch (OpenRefine; Wikidata reconciliation service)
- Alicia Fagerving (Wikimedia Sverige)
- Birgit Müller (Wikimedia Foundation)
- Quim Gil (Wikimedia Foundation)
Image credit: Jan Maszkowski (1794-1865) - The Artist's Children (1844). National Museum in Wrocław, Public Domain
OpenRefine is a power tool for cleaning messy data, popular in a diverse range of communities. It has served the needs of journalists, librarians, Wikimedians and scientists for more than 10 years, and is taught in many curricula and workshops around the world.
OpenRefine is actively used on Wikidata. In addition, thanks to a Project Grant from the Wikimedia Foundation, OpenRefine is being extended with structured data functionality for Wikimedia Commons between September 2021 and August 2022. This extension will make it possible to batch edit the structured data of existing files on Wikimedia Commons, and to batch upload new Wikimedia Commons files with structured data from the start. In this lightning talk, we explain what we are (and will be) working on.
Discussion of SDC-related talks