Welcome to the WikidataCon 2021! The organizing team will introduce you to the event and its main themes, and provide practical information about how you can follow along and get involved.
During this keynote, long-time Wikimedians Anasuya Sengupta and Adele Godoy Vrana will share frames and practices from Whose Knowledge? on knowledge justice and digital decolonizing. As they acknowledge and celebrate what Wikidata has achieved so far, they will call us in for a thought-provoking conversation in the spirit of tough love, inviting us all to reflect on how the Wikidata community needs to practice knowledge justice, beyond knowledge representation.
A lot of great things have happened around Wikidata over the past year and the Wikibase Ecosystem has made great strides. We will take a look at the most important developments of the past year and great things people have done in and around Wikidata. We will also give a sneak peek at what’s coming next.
Reimagining Wikidata from the margins is an effort to reflect upon and to address the inequalities in knowledge representation and in contributors from the Global South in the Wikidata ecosystem. The project started with the organization of WikidataCon 2021, and in this session, you’ll find out more about the vision of the project, its process and how to collaborate.
Wikidata has grown in all dimensions since its start 9 years ago. In this talk we will have a look at one of the biggest technical scaling challenges that Wikidata currently faces: the Wikidata Query Service. We provide a high level overview of the technical challenges of scaling Wikidata Query Service (which were also outlined in the August 2021 update) and the current results of the WDQS user survey. Based on this, we introduce some of our current strategies/plans for scaling and how you can help make them reality.
The Wikidata Community Awards celebrate the work of people and groups involved in Wikidata, and highlight some projects nominated by the community. Join us for the ceremony to discover some great projects!
Let's connect with other participants! During this session, we will gather in the Social Room and form groups to have informal discussions with other participants.
This session is taking place on the online venue of the conference (Venueless), in the Social Room, accessible from the menu. You will be able to enable your microphone and camera, walk in the virtual room and connect with participants, forming groups of up to 15 people.
To access this session, you need to be registered for the conference; registration gives you an access link to Venueless.
The Wikimedia Foundation’s vision is to “imagine a world in which every single human being can freely share in the sum of all knowledge”. Abstract Wikipedia imagines a world that closes the gap of the expensive linguistic diversity through abstract representations of knowledge that can get “translated into one of [Wikipedia’s] 300 languages whenever someone wants to read part of it”. In this talk Tochi Precious, bilingual educator and cross-cultural communications expert, and Silvia Gutiérrez, data scientist and computational linguist, will converse with Denny Vrandečić, Product Manager of Abstract Wikipedia, in order to unravel the challenges that lie at the intersection of automation and diversity.
In this session, you will get an overview of the technological maturity in Brazilian museums and understand how Wiki Movimento Brasil developed a process for putting Wikidata at the core of digital dissemination strategies for GLAM partnerships. We’ll dive into the possibilities the Wikimedia ecosystem offers for GLAM institutions in different socioeconomic contexts - from the development of open source technologies to establishing their relevance online. Luciana Conrado, network articulation coordinator of the Tainacan project - a free software for the social construction of digital repositories - and Solange Ferraz de Lima, the former director of the Paulista Museum, an institution closed for almost ten years for renovation that used Wikidata to share and innovate with their collections online, will talk to Marília Carrera, projects manager at Wiki Movimento Brasil.
Brazil and Germany together face global challenges such as preserving biodiversity and combating climate change. In Brazil, the Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) operates mainly in two thematic areas: protection and sustainable use of tropical forests, as well as renewable energy and energy efficiency.
The forms of cooperation are already profoundly impacted by digitization, an axis that GIZ has placed at the center of its corporate strategy. Data-based processes multiply in the company and different projects can serve as a point of contact with the international community that produces free knowledge, enabling a desired exchange in favor of sustainable development.
Tibetan heritage is important, and its volume of data is extremely rich. It comes from more than 1,000 years of study of Buddhism and other sciences, preserving ancient Indian academic traditions. Universities and institutes are working to digitize some of the most important content. However, this information is not easily discovered by the general public.
It is difficult to find even very basic information through search engines, even about very famous people and places (such as Buddhist masters and monasteries). Through collaborating with the largest Tibetan digital archive, BDRC (Buddhist Digital Resource Center), we used Python + OpenRefine to import people and place entities with names and aliases, along with entity relations, and improved search result pages and the discoverability of those entities.
We also summarize some ideas on where to go next to improve the usability of Wikidata for preserving Tibetan treasures and other valuable heritage of mankind.
Let’s celebrate Wikidata with Brazilian culture! We’ll have music performed by a female samba group, a tour of the cities of Salvador and São Paulo through an afrotourism lens, and a traditional Brazilian birthday recipe that you can make at home with a few ingredients. Join us!
Samba de Dandara: Samba de Dandara is a women's samba circle whose lyrics deal with the female sociopolitical struggle and the valorization of black Brazilian traditions. "Dandara" is a symbol of black resistance in the country: the group is named in honor of the warrior leader who stood out in the struggle for the freedom of the enslaved black people in colonial Brazil.
Guia Negro: Guia Negro produces independent content about travel, black culture, afrotourism and black business. The platform was created in 2017 to tell stories, inspire and guide you through the most diverse experiences in tourism in Brazil.
In this space, the Latin America and Caribbean grantees will gather to share experiences on how they organized capacity building events on Wikidata before the conference and to discuss possible ways to strengthen their agency in Wikidata’s ecosystem from their local contexts. This is a moment to share learnings and build connections! The session is open to everyone.
As Wikidata’s data is used in more and more technology we use every day, the vast majority of the people who are benefiting from our data never come to Wikidata.org and join the project.
 We will need to do what we can to enable them to contribute to Wikidata regardless, for example, to correct mistakes or update outdated data. The alternative would be an ever-increasing amount of work and expectation resting on the shoulders of too few people on Wikidata.org, which is neither healthy nor sustainable. This means we need to prepare for more and more contributions being made by people through special-purpose apps, 3rd-party websites, the Wikidata Bridge and more without ever having visited Wikidata.org. That brings with it a number of fundamental questions we’d like to discuss in this session. Here are some of them:
 * How can we ensure equity in making some of the fundamental decisions that shape Wikidata and that are being made on Wikidata.org?
 * What does it mean for Wikidata to shift more towards being the place of final decision making and arbitration instead of everyone coming there to edit?
 * What processes, tools, guidelines, etc do we need to have in place to make this work?
Templates and modules are the most important tools that the community of editors on all wikis has to enhance wiki pages with efficient, uniform, structured and nicely-presented data. They are enormously versatile, and they are used on nearly every wiki page, but because their code is stored on every wiki separately, collaboration across wikis and sharing of content and structured data are harder than they should be. Since 2004, there have been several proposals to allow cross-wiki transclusion or a "global templates repository", and this finally seems to be gaining wide support from the editor community and the leadership. However, once the infrastructure is in place, how will the wikis actually collaborate with each other on developing templates? This panel will present the problem, and attempt to brainstorm on this, with a famous example relevant to Wikidata: the auto-filled Infobox. Several wikis developed Wikidata-driven infoboxes with similar functionality, but different internal implementation. Will they be able to merge at least some of the code and improve the collaboration and the sharing? We'll try to answer.
In this session, you will discover some useful gadgets and scripts to enhance the interface of Wikidata and help you edit more efficiently. You will also be able to share your favorite gadgets, so other people can discover them!
 The first part of the session will be run by Vera who will show a few useful tools, then the floor will be yours. You will be able to share your screen to show how your favorite gadgets work.
This is a discussion of our journey towards building a Wikidata community in Nigeria: the challenges we faced and how we have been able to overcome them.
OpenRefine (previously Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
 This tutorial introduces the basic functionalities of OpenRefine, with a focus on Wikidata reconciliation and (batch) editing.
This will be an exploration of how the Irish community has used Wikidata for Wiki Loves Monuments and what this meant for the representation of Irish built heritage on a number of projects. Both in English and Irish, Wikidata presents a number of interesting opportunities and challenges for building on this work. The incorporation of both Logainm for Irish placenames and a large dataset of Irish lexemes have hugely expanded the Irish language's presence on Wikidata, but where to go from here?
Structured Data on Commons has enabled us to better describe the contents of the millions of files that have been uploaded, harnessing the power of Wikidata. Wikimedia Sverige has been working on adding structured data statements to Wiki Loves Monuments photos from all around the world, which both makes them easier to discover and enables the community to better understand the impact and diversity of the competition.
I will give a short overview of this project and the recent developments, explaining our workflow, our plans for the future and how the community can get involved. Our common goal is for every WLM photo to be enriched with relevant structured data, so that the photographers' work can become even more visible, useful and interesting!
Wiki Loves Monuments India has linked its monuments data with the Wikidocumentaries project in the last two contests, in 2020 and 2021. Having structured data about the monument sites as well as the contest allows creating photo submission banners on the page of each monument site.
Many large companies and organizations are relying heavily on Wikidata and the data treasure we offer as a commons. As they are exploring and using our data they encounter mistakes, inconsistencies, outdated information, vandalism and more. In addition, they get reports from their users about issues in the data. They are willing to give back, even on a large scale. Making this work is crucial for the long-term health and sustainability of Wikidata. In this session, we’d like to discuss how this can work. Which feedback do we want? How do we want it? What tools, processes, guidelines, etc do we need to have in place?
We would like to discuss the usage of Wikidata in the maintenance and structuring of Wiki Loves Monuments, specifically the Brazilian experience since the adoption of it in 2019.
Let's improve the documentation on Wikidata’s data quality tools. We’ll begin this hour with a short presentation on writing documentation pages using a few tools as examples. After that, participants will have the opportunity to edit documentation hands-on and suggest pages that might need editing.
Participants will be presented with a personal narrative of field photography of monuments in the state of Bahia, Brazil that began in 2017. I used a print architectural inventory, the "Inventário de proteção ao acervo cultural", published in the 1970s; and a Google Map produced by the Instituto do Patrimônio Artístico e Cultural da Bahia (IPAC). My experience fundamentally changed in 2019 with the adoption of Wikidata to manage Wiki Loves Monuments in Brazil. In practice, the lists generated by Wikidata led me not only to unphotographed sites in the state; at present the WLM Brazil team's enhancement of Wikidata items is leading us from simply "an image" to a more comprehensive representation--photographic and otherwise--of these sites.
A panel discussion to brainstorm about a sustainable and resilient future of Wikidata's key volunteer-built tools.
Many everyday edits and contributions to Wikidata are powered not by Wikidata’s own editing interface, but by an ecosystem of very diverse software tools which are designed and maintained by external parties.
Quite often, individual Wikimedia volunteer developers create and maintain such software in their free time (examples include tools like QuickStatements and the Wikidata reconciliation service). In some cases, tools are part of larger projects that receive occasional funding (examples include OpenRefine, GLAMpipe, the Wikimedia Commons mobile app, the ISA Tool, and many others).
The resulting Wikidata tool ecosystem is extremely rich and interesting, but also notoriously vulnerable. Development on such tools can easily stall when maintainers don’t have (the privilege of) free time (anymore), move on, and/or when temporary funding runs out.
This panel discussion and group brainstorm wants to look at this situation with a practical and solution-oriented mindset. What actions can we take as a community to make this tool ecosystem more resilient? Does the Movement Strategy process offer tools for taking up this challenge? Which tools should be developed centrally to ensure the core practices of the content communities, and how can we simultaneously encourage lightweight experiments in the community?
Panelists:
- Sandra Fauconnier (moderator; ISA Tool; OpenRefine)
- Susanna Ånäs (co-moderator; Wikimaps; GLAMpipe; Wikidocumentaries)
- Kat Thornton (Science Stories)
- Lucie-Aimée Kaffee (ArticlePlaceholder extension; Scribe)
- Antonin Delpeuch (OpenRefine; Wikidata reconciliation service)
- Alicia Fagerving (Wikimedia Sverige)
- Birgit Müller (Wikimedia Foundation)
- Quim Gil (Wikimedia Foundation)
Image credit: Jan Maszkowski (1794-1865) - The Artist's Children (1844). National Museum in Wrocław, Public Domain
Discussion about Wiki Loves Monuments presentations
Although Wikidata has a lot of cross-wiki influence by providing data for sister projects, it has a comparatively small number of active users monitoring vandalism {1}. Some vandalism, such as changing numbers used in article infoboxes, is usually innocuous and can go unnoticed for a long time, whereas description changes are more problematic: their effects are especially visible on mobile devices and can go unnoticed by vandalism patrols, which act mostly on desktop. Although drastic measures like the abolition of the Wikidata description by the English Wikipedia {2} can be considered, comprehensive measures such as deeper interwiki integration with the possibility of simultaneous editing {3} and history synchronization {4}, or even simple measures like displaying the Wikidata description of articles on desktop, could be implemented {5}.
Wikimedia Commons is using Wikidata and SDC-based infoboxes for millions of files. This presentation will explain how this works and discuss some of the design choices and future plans.
The Wikidata pink pony session is a meetup where you can share your wishes and feature requests about Wikidata with the development team. It's not only about technical things: some great ideas, like the WikidataCon, started during a pink pony session! This is also the moment where you can discover tools that answer your needs, scripts to enhance your editing experience, and other great things related to Wikidata.
OpenRefine is a power tool to clean messy data, popular in a diverse range of communities. It has been serving the needs of journalists, librarians, Wikimedians and scientists for more than 10 years, and is taught in many curricula and workshops around the world.
OpenRefine is quite actively used on Wikidata. In addition, thanks to a Project Grant from the Wikimedia Foundation, OpenRefine is, between September 2021 and August 2022, being extended with structured data functionalities for Wikimedia Commons. This code extension will make it possible to batch edit structured data of existing files on Wikimedia Commons, and to batch upload new Wikimedia Commons files with structured data from the start. In this short lightning talk we explain what we are (and will be) working on.
Depictor is a new tool for quickly adding structured data statements to images on Wikimedia Commons using a game-like interface. The goal of the tool is to be able to do this on mobile devices as well, so that you can quickly add statements even when you're on a train or waiting at the dentist. Even though the tool has a basic mode there are lots of other powerful options that we'll explore in this session.
ISA is a mobile-first 'microcontributions' tool, that makes it easy for (groups of inexperienced) people to add structured data to images on Wikimedia Commons.
With ISA, you can choose a pre-defined set of images on Commons and then ask contributors to 'tag' these with multilingual structured metadata. Points are counted for each contribution, and therefore it is possible to organize 'tagging' or microcontributions competitions or challenges with ISA.
ISA received the coolest tool WikidataCon 2019 Award in the Multimedia category.
Wikimedia Deutschland wants to take a step towards creating a collaboration with a movement partner for a common software development infrastructure for Wikidata. With this we want to breathe life into the movement recommendations of increasing the sustainability of our movement, investing in skill and leadership development, and empowering local communities.
Since language and its use are highly localized, we believe that capacity building, both for the communities and the partner, and growing the contributor base to our software products are best placed with partners in those local contexts. For us this means that the lexicographical part of Wikidata, which was introduced in 2018 and which we see as an important means to digital language equality for underrepresented languages, is a good place to start finding new pathways of software collaboration.
Questions we want to discuss with the Wikidata community: 
 * What do we have to consider when building sustainable software development partnerships/ software hubs for lexicographical data from your perspective? What criteria do you think should drive such a decision?
 * Mapping suitable groups or affiliates for the partnership: How to find and reach out to groups interested in partnering in this project? What does it take to be a partner in a shared software development environment?
One of the hurdles to re-using our data more is issues in Wikidata's ontology. We looked into different types of ontology issues in our data and tried to come up with a classification. We'll present the current state and would love your feedback and input. In the future we will use the insights here to build better tools to prevent and fix ontology issues.
With both structured copyright data in Commons and the copyright status of creators in Wikidata, we could assist users and GLAM partners with uploading, and improve and verify public domain works in Commons. Use cases are to assist users in the wizard when uploading a work, to detect errors or set warnings in mass uploads from GLAM partners, and to assist automatic public domain uploads with Wikidata as a hub for artworks, as currently done for the Sum of all paintings project. One of the difficulties is that Commons uses the copyright determination of a work in both the US and the source country of the work, and that some works, like books, have multiple contributors. Neither is implemented yet in Wikidata.
In this session we would like to discuss the modelling of structured copyright data in Wikidata and Commons and discuss what we could automate with bots and wizards.
Discussion of SDC-related talks
"European Language Equality" (ELE) is a European Commission project consisting of 52 partner organizations that envisions a future where all European languages can achieve full digital language equality by 2030. In order to get there, the partners are working to create a convincing roadmap to hand in to the European Commission that will make sure that under-resourced and minority languages in Europe have the technological support to exist and prosper in the digital age.
The Wikimedia movement - and more specifically Wikidata editors, Wiktionary editors, editors of Lexemes, Lingua Libre volunteers, edit-a-thon organizers, editors of small language Wikipedias, editors of European language Wikipedias - has been part of the consultation process for this project. We wanted to make sure the pains, challenges, wishes and needs of the volunteers and communities keeping the multi-language environment of Europe alive every day are heard at an EU policy level.
In this session, we will present the results of the project so far as well as take the opportunity to discuss the pains of under-resourced language communities that are active in the Wikidata community.
Questions we want to discuss with the Wikidata community: 
 * What are our challenges, needs and expectations for the future of Language Technology for under-resourced languages?
 * What do we need from policymakers in order to preserve European languages with the Wikimedia projects?
In order to really use reusable data, one needs flexible tools that are easy to use. This demonstration introduces GLAMpipe, a generic data tool that can be used for combining, viewing and exporting data from Wikidata and other sources.
This session provides a forum to discuss the scaling challenges of the Wikidata Query Service (WDQS), including short-term disaster mitigation and long-term strategies, such as migrating to an alternative to Blazegraph.
Featured as the new project of Wikimedia Foundation, Wikifunctions aims to create an open database of scientific, logical and linguistic functions. These functions can be exploited later to create new information systems and projects, including Abstract Wikipedia, which combines semantic information in Wikidata with open linguistic functions to automatically generate Wikipedia articles about all Wikidata items in all supported languages. In this presentation, I will present both projects and describe the stages of the development of the two initiatives and their modus operandi, in addition to the importance of the two projects for the Wikimedia Community. After that, I will highlight the critical concerns about the two projects that were raised by the Wikimedia Community in WikiArabia 2021 and that should be efficiently solved to ensure the success of the initiatives.
What is WikiProject Govdirectory, and why should you care? We'll talk about how data from Wikidata can be reused for high-impact projects, while at the same time improving the data in Wikidata. How do you go about modelling and adding data for all the public agencies of an entire country?
Swedish is one of the largest languages in Wikidata's lexicographical namespace, with over 35 thousand lexemes so far and a small but very active community. How did a language with fewer speakers than German, Spanish or Bengali manage to reach this threshold?
In my presentation, I'm going to give an insight into how I work to create and edit large numbers of Swedish lexemes, without relying on bot imports from external sources. The presentation is going to have a very practical focus, demonstrating the various tools the Wikidata community has created and the workflows I've developed.
The quality of Wikidata’s data matters. To ensure that we serve the world with reliable and verifiable data, we need to really understand how high or low the quality of our data is and how it develops over time and in different areas of Wikidata. In this session, we want to look at the ways we already measure data quality and what that does and does not tell us about our data. We’d then love to discuss which aspects of data quality we are still missing and how we could get a better understanding of those aspects.
This session will show participants how indigenous language communities can leverage the power of Wikidata to increase the visibility of their language on the internet.
In the lead-up to Abstract Wikipedia's launch, a sufficient body of linguistic information must be in place so that different sets of functions can work together to produce natural-sounding text. This requires more thorough consideration of certain linguistic aspects sooner rather than later.
This session introduces Ninai and Udiron, two related tools with which functions can be built to generate text based on the linguistic information for a given language. In doing so it will discuss the compositionality and manipulability of lexical units, the breadth and interconnectedness of meaning units, and the treatment of variation among a language’s lects broadly construed, and how they can be dealt with in those tools.
Special reference to the handling of these aspects for Bengali and a number of other languages will be presented.
Discussion of the languages sessions
Wikidata aims to provide an identifier and associated data for every concrete or abstract concept. This ambitious goal will facilitate many new use cases, but also poses challenges in terms of data completeness and quality.
 The Wolfram Language (WL) is an easy to learn programming language with built-in support for computation, visualisation, machine learning, access to databases and the semantic web and, last but not least, a dedicated Wikidata function.
 In this talk I'll show some of the cool things you can do with Wikidata using the WL, including retrieval, querying, analysis, visualisation, comparison and curation of data.
 Whether you are new to Wikidata or an experienced contributor, whether you know the WL or no programming at all, you'll learn something new.
This is the follow up session of the keynote "Decolonizing Wikidata: why does knowledge justice matter for structured data". Bring your questions and your thoughts to this conversation with Anasuya Sengupta and Adele Godoy Vrana, from Whose Knowledge?
Conversation with Maryana Iskander, incoming CEO of the Wikimedia Foundation, and João Alexandre Peschanski, president of Wiki Movimento Brasil, which is co-organizing the WikidataCon 2021.
As part of her listening tour, Maryana is interested in listening to different views from across our movement. She would love to hear your ideas, vision and proposals for Wikidata, how to ensure a sustainable future for the project, and how to reimagine Wikidata to make it more accessible to and inclusive of content and communities currently underrepresented.
During this session, you are invited to engage with the incoming CEO of the Wikimedia Foundation in a Wikidata-focused conversation. Particularly, Maryana is interested in understanding the perspective of Wikidata and its vibrant community:
 * What motivates you to contribute to our work and take part in Wikidata?
 * What makes the Wikidata community special? 
 * How can we increase participation from underrepresented groups in Wikidata? How can we reimagine Wikidata from the margins?
 * What are your questions for Maryana?
The session will be recorded but some time will be left at the end of the session for unrecorded open discussions.
The Organized Crime and Corruption Reporting Project (OCCRP) is using Wikidata to find synonyms for people's names. In this short talk we will present how we use Wikidata's data to support reporting on crime and corruption.
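As an illustration of the general idea only (not OCCRP's actual pipeline), the sketch below collects name variants from an entity in Wikidata's JSON format, where labels and aliases are keyed by language code. The function name `name_variants` and the sample entity are made up for this example.

```python
def name_variants(entity):
    """Collect the distinct labels and aliases of a Wikidata entity, across all languages."""
    variants = set()
    # "labels" maps a language code to a single {"language", "value"} object.
    for label in entity.get("labels", {}).values():
        variants.add(label["value"])
    # "aliases" maps a language code to a list of {"language", "value"} objects.
    for alias_list in entity.get("aliases", {}).values():
        for alias in alias_list:
            variants.add(alias["value"])
    return variants


# Hypothetical entity, shaped like the JSON Wikidata returns for an item.
sample_entity = {
    "labels": {
        "en": {"language": "en", "value": "Jane Example"},
        "de": {"language": "de", "value": "Jane Example"},
    },
    "aliases": {
        "en": [
            {"language": "en", "value": "J. Example"},
            {"language": "en", "value": "Jane Q. Example"},
        ],
    },
}

for name in sorted(name_variants(sample_entity)):
    print(name)
```

Returning a set means that a spelling shared by several languages is only reported once.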
This presentation will provide an overview of how a team of librarians at Texas A&M University used OpenRefine to upload information to Wikidata for a selected sample of mechanical engineering students, their doctoral dissertations and faculty advisors as part of the PCC Wikidata Pilot initiative. The data came from our institutional repository as well as our VIVO database of faculty member profiles. It will also describe our ongoing work to enhance the items that were created both manually and with the help of Mix’n’Match. Finally, it will cover issues encountered and next steps, as well as possible implications that Wikidata may have for some of our more traditional processes to manage personal and organizational entities in our catalog.
Wikimedia Sverige is working in partnership with the Nationalmuseum and the National Historical Museums to explore and use the power of Wikidata together with the museums' authority data files. As Wikidata is positioning itself as the web's central authority hub, this is an exciting example of how GLAMs can do practical work with it.
 The aim of our project is to develop and evaluate methods of linking museums' authority data to Wikidata, with the ultimate goal of making it easier for researchers and other users to find, understand and analyze relevant information distributed across different museum collections.
 The project includes visualisations of relations between historical persons to show the potential of otherwise abstract large amounts of data. Another aim is to equip the museums' staff with the skills and tools to actively engage with the Wikimedia projects, and in the long run increase the knowledge about Wikidata and its possibilities among GLAMs in Sweden.
In the performing arts sector, it is frequent for items automatically created from Wikipedia articles to conflate conceptually distinct P31 values. The most common cases are items stated to be instances of both a building and an organization. As these items are enriched with more statements and identifiers, it becomes very difficult to distinguish which statements or identifiers refer to which entity. For example, an inception (P571) statement could refer either to the date when an organization was founded or to the date when construction for a building began.
In order to reduce and to prevent the occurrence of items conflating two distinct concepts, we would like to discuss solutions such as introducing a new constraint for conflicting P31 values.
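A minimal sketch of what such a check could look like, assuming the P31 values of each item have already been retrieved. This is not an existing Wikidata constraint: the function name, the input shape and the sample item IDs are hypothetical, and only the two top-level classes building (Q41176) and organization (Q43229) are considered.

```python
# Simplification: a real check would also need to follow subclass of (P279),
# since most items use more specific classes than these two.
BUILDING_CLASSES = {"Q41176"}      # building
ORGANIZATION_CLASSES = {"Q43229"}  # organization


def find_conflated_items(p31_by_item):
    """Return the IDs of items whose P31 values include both a building and an organization class."""
    conflated = []
    for item_id, classes in p31_by_item.items():
        values = set(classes)
        if values & BUILDING_CLASSES and values & ORGANIZATION_CLASSES:
            conflated.append(item_id)
    return sorted(conflated)


# Placeholder item IDs, made up for this example.
sample = {
    "Q_EXAMPLE_1": ["Q41176", "Q43229"],  # stated to be both: flagged
    "Q_EXAMPLE_2": ["Q41176"],            # building only
    "Q_EXAMPLE_3": ["Q43229"],            # organization only
}
print(find_conflated_items(sample))
```

Items flagged this way would still need human review to decide which statements belong to the building and which to the organization.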
Currently, the QRPedia platform is used to generate QR codes that, when scanned, link to Wikipedia articles in the language of the phone's settings.
Nevertheless, on some occasions, listening to an article can be better than reading it, especially for someone on the move. That is why, instead of linking to a Wikipedia article, we can use Wikidata and the existing QRPedia to generate a QR code that links to an audio file corresponding to the Wikipedia article.
In this presentation we show the added value of putting images and metadata of digitised collection highlights of the KB, national library of the Netherlands, into the Wikimedia infrastructure. By putting our collection highlights into Wikidata, Wikimedia Commons and Wikipedia, dozens of new functionalities have been added. As a result of Wikifying this collection, you can now do things with these highlights that were not possible before.
Every year, national authorities release data on crimes in their jurisdiction. These reports show the number of cases per crime. They also give us data on cases disposed of by the courts. What do these numbers mean? Are they a real reflection of justice? What about the experience of marginalised communities? How can Wikidata use this information? These and many more questions will be discussed. The goal is not to find answers but to begin asking the difficult questions.
Wikidocumentaries is a platform that connects information from across Wikimedia projects and other openly available repositories in the web and displays the information in visually engaging pages across all the languages of Wikimedia projects. The goal is to become a maker space for citizen historians, enriching existing and importing new information to Wikimedia projects and making them again available for the public and GLAMs.
This is an invitation for projects, repositories and creators to bring Wikidocumentaries to the next phase together. I will present a selection of possible integrations and I invite anyone to propose new ones before and after the presentation.
In this presentation, we will explore the representation of gender in Wikidata, focusing in particular on non-binary gender identities. We will look at the way gender is currently modeled, at how it has been modeled in the past, and also at the user discussions that shaped the current status of gender representation. We will discuss the inclusion of non-binary gender identities in Wikidata, how it has evolved over time, and the issues that still persist.
Join us for a collaborative schema writing workshop. Members of the ShEx Community Group will write a schema live in this workshop. We'll go over the basics of contributing schemas to Wikidata's E namespace.
The presentation will show how we can use Wikidata as a starting point to generate mobile applications for Museums.
It's possible via a SPARQL query to get all the Wikipedia articles, photos and audio files through Wikidata and use them to compile an offline and multi-language mobile application with the Kiwix technology. The result will be a mobile application that can be used as a guide/voice guide for a museum without the need of an internet connection.
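As an illustration of the kind of SPARQL query described above, here is a minimal Python sketch that builds a query for the items in one museum's collection, together with their Wikipedia article, image and audio file. The function name is hypothetical, and the choice of properties (P195 "collection", P18 "image", P51 "audio") is an assumption about how the museum's items are modeled, not the presenters' actual query.

```python
import re


def build_museum_query(museum_qid, lang="en"):
    """Build a SPARQL query for items in a museum's collection (sketch)."""
    if not re.fullmatch(r"Q[1-9]\d*", museum_qid):
        raise ValueError(f"not a QID: {museum_qid!r}")
    if not re.fullmatch(r"[a-z][a-z0-9-]*", lang):
        raise ValueError(f"not a language code: {lang!r}")
    return f"""SELECT ?item ?itemLabel ?article ?image ?audio WHERE {{
  ?item wdt:P195 wd:{museum_qid} .             # assumed: item is in the museum's collection
  OPTIONAL {{ ?item wdt:P18 ?image . }}        # photo on Wikimedia Commons
  OPTIONAL {{ ?item wdt:P51 ?audio . }}        # audio file on Wikimedia Commons
  OPTIONAL {{
    ?article schema:about ?item ;
             schema:isPartOf <https://{lang}.wikipedia.org/> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}" . }}
}}"""


if __name__ == "__main__":
    # Q12345 is a placeholder, not a real museum.
    print(build_museum_query("Q12345", lang="fr"))
```

The resulting string can be pasted into the Wikidata Query Service; running it once per language would give the material for a multi-language offline package.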
What would structured data with multiple epistemic frames look like? And how do we get there? In early October, Wikimedia Deutschland, Wiki Movimento Brasil and Whose Knowledge? invited thinkers and practitioners from different locations and backgrounds to reflect on how to decolonize the Internet's structured data, in a conversation that must continue! Join this discussion about the challenges and opportunities of structured data from the perspectives of different systems of knowledge and identities.
The Joan Jonas Knowledge Base (JJKB) is an open source digital resource housing information about the New York-based multimedia and performance artist Joan Jonas (b. 1936) who has been at the vanguard of interdisciplinary art forms such as performance, video art, and new media installations for over five decades. This academic research project (NYU, UQAM, UCLA) is part of the Artist Archive Initiative dedicated to providing useful information to conservators, curators, and other researchers who seek to learn more about the artist’s work.
In this presentation we will highlight the collaboration between the curatorial and conservation teams and the technology team in selecting and preparing a unique research resource drawing on materials from the artist’s personal archive, as well as museum archives, archives of photographers, galleries, university libraries, and other public and private archives and foundations where we found museum and performance documentation of the case studies, photographs, videos, publications, and exhibition ephemera included in the JJKB. Throughout this process it became clear that performance artworks are associated with many components, including objects (such as drawings, audio, video, and other media); performance art happens over time as new iterations are associated with the original work; and performance art may involve collaboration among artists and others. Hence, the materials and research data we worked with do not fit neatly into a traditional hierarchical schema, such as an SQL relational database. We therefore developed a flexible, RDF-compliant data model to capture a curated selection of research data and to upload this to the linked open data cloud via Wikidata, which allows us to support cross-cultural and cross-institutional research and collaboration.
This presentation will further highlight how we use a variety of data visualization techniques, via Wikidata’s SPARQL endpoint, as an additional approach to present our findings to researchers. As we continue to add data into Wikidata about Joan Jonas and her work over time, the resulting data tables and data visualizations will become more complex and more robust. We hope they will lead to new and interesting ways to study this artist’s exciting works.
The Bern University of Applied Sciences and several other institutions around the world have been collaborating over the past three years to bootstrap an international Linked Open Data Ecosystem for the Performing Arts. The community has coalesced around working groups that coordinate a wide range of activities to make their shared vision a reality. One of these working groups is dedicated to Wikidata and Wikipedia. During the last year, this working group has been particularly active, delivering several training activities and documenting modeling best practices in two Wikidata WikiProjects.
Wiki API Connector aims to simplify the extract-transform-load (ETL) process of metadata to Wikimedia projects without requiring complicated coding or software development. It was originally created as a tool to facilitate the import of Smithsonian Institution images and metadata to Commons and Wikidata (as a Wikibase-aware analog of GLAM Wiki Toolset or Pattypan). With the main core written in Python, the use of a familiar YAML configuration file to map an API's JSON fields to Wikidata properties and items might be a general solution useful for other GLAM entities or partner organizations. This session describes the early work done so far and seeks feedback on how it might be useful for other users and applications.
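The mapping idea described above can be sketched in a few lines of Python. This is not Wiki API Connector's actual configuration format or code: the mapping keys, the `label:` convention, and the sample record are hypothetical, and the mapping is written as a plain dict standing in for a loaded YAML file.

```python
# Hypothetical mapping, as it might look after loading a YAML file:
# dotted path into the API's JSON record -> Wikidata target.
MAPPING = {
    "title": "label:en",
    "content.date": "P571",       # inception
    "content.accession": "P217",  # inventory number
}


def get_path(record, dotted):
    """Follow a dotted path through nested dicts; return None if absent."""
    value = record
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def transform(record, mapping=MAPPING):
    """Turn one JSON record into labels and claims keyed by property ID."""
    out = {"labels": {}, "claims": {}}
    for path, target in mapping.items():
        value = get_path(record, path)
        if value is None:
            continue  # field missing in this record: skip it
        if target.startswith("label:"):
            out["labels"][target.split(":", 1)[1]] = value
        else:
            out["claims"].setdefault(target, []).append(value)
    return out


if __name__ == "__main__":
    record = {"title": "Example object", "content": {"date": "1901"}}
    print(transform(record))
```

Keeping the mapping as data rather than code is what lets a non-programmer adapt such an import to a new API by editing only the configuration file.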
This session aims to reimagine the practices of data structuring in historical and political perspective. Our presentation starts with some discussions made in the Laboratório de Estudos Sobre os Usos Políticos do Passado (Laboratory for Studies on the Political Uses of the Past), hosted by the Federal University of Rio Grande do Sul, and then we invite you to reflect on the possibilities and limits of a database that gives protagonism to the anti-hegemonic struggle by structuring the data of the interventions and protests in monuments that pay tributes to the ones who held positions of power in a past marked by colonialism, slavery and dictatorships.
Statistical agencies are in need of reliable information with which they can assign industry classifications to businesses. Wikidata could provide this kind of information on a silver platter, if only industry classifications were implemented and used in a consistent manner. But that’s not the case presently. The International Standard Industrial Classification of All Economic Activities (ISIC) and various other regional industry classifications are implemented in Wikidata, but their respective properties are modelled differently and used inconsistently. This session will examine current models and will attempt to build consensus on the most efficient way to implement industry classifications in Wikidata, with consideration to research and statistical use cases.
Since the inception of the readymade, any and every material started to be used as a vehicle of the artistic discourse. With the adoption of ephemeral materials, such as chocolate, water, wind, sound, among many others, along with the increasing dependence on digital technology and performative events, new challenges came to the fore considering the future of works created especially within the domains of conceptual art, installation art, performance art, time-based media, new media art, net art, etc. Given those works' performance-based nature, it is the quality of the documentation produced (often by museum professionals) that will determine their long-term sustainability. Much has been said about the importance of producing proper documentation (with several projects developed on the subject, starting in the 1990s), as well as on the relevance of getting access to that documentation in order to implement moral casuistry, based on a comparative assessment of previous decisions and case studies. Most documentation has, however, not been made available in open access (as would be preferable), primarily due to copyright issues, and part of what is freely available is starting to get lost, since digital (online) repositories are ceasing to be maintained after the end of their funding project(s). With this comes the question: why devote so many resources to getting artworks documented for their future preservation if that documentation is not shared and hence not known and available to the overall community of stakeholders?
The aim of this paper is then to reflect on whether museum professionals are prepared to think more seriously about copyright issues so as to propose new policy recommendations or to find new pathways into open access, in order to reach a much more sustainable preservation practice of contemporary art.
Studying networks of political elites in a comparative way across a large number of countries—especially involving multiple sectors like finance, politics, business, military, religious communities and so on—is challenging. Wikidata covers all types of elites—military, finance, clergy, etc.—which is a more realistic representation of who influences politics than simply studying politicians. Another advantage of Wikidata for the study of networks is its inherent graph structure. As a demonstration of Wikidata's use in a political science context, an application to kinship networks with data that covers 219,332 elite actors across 193 UN member countries shows that kinship ties among the elite are more prevalent in more authoritarian countries, in confirmation of theories of coup-proofing.
As part of the Program for Cooperative Cataloging (PCC) Wikidata Pilot, Wheaton College created and enhanced items related to Christian hymns, using one hymnal as the basis for a test data set for the substantial collection of hymnals in our library's special collections. Work has included modeling data for hymn texts, tunes, composers, and hymn writers; using Open Refine to reconcile data from Hymnary.org with Wikidata; and creating/enhancing Wikidata items that link to Hymnary and Library of Congress identifiers. Some data modeling difficulties have yet to be worked out.
The Penn Libraries' participation in the PCC Pilot Project for Identity Management in Wikidata (https://www.wikidata.org/wiki/Wikidata:WikiProject_PCC_Wikidata_Pilot) established over 5,000 Wikidata items as serials from the Penn Libraries Deep Backfiles (https://onlinebooks.library.upenn.edu/webbin/backfile/penn-serials). Our project has made sure that serial issues in Penn Libraries Deep Backfiles have Wikidata entries that clearly identify them and distinguish them from other serials. We are now in the process of round-tripping the Wikidata entities back into our source metadata storage, Alma, using AlmaRefine, a version of OpenRefine incorporated into Alma as a cloud app.
This presentation will demonstrate the AlmaRefine process and showcase how the round-tripped data allow us to seed BibCard/Knowledge Panel functionality in our discovery systems.
Digital Humanities projects about racialised chattel slavery and the Transatlantic Slave Trade often make claims of being "decolonial" and "reparative" while modelling data about enslaved people that continues to replicate and encode the dehumanisation and commodification of enslaved persons, while excluding and restricting Afrodescendents from contributing to and accessing these projects, their significant resources and potential benefits.
This talk will expand on Black digital practice (identified by Jessica Marie Johnson in her important paper "MarkUp Bodies") in the Wikimedia communities and introduce the possibilities of 'WikiProject Chattel Enslavement and Freedom' that aims to:
 • improve how data about enslaved people, self-liberated people, and the histories of racialised chattel slavery is modelled and structured
 • address the knowledge debt around the knowledge production and intellectual histories of enslaved persons and their communities 
 • support the development of ethical approaches to data modelling about enslaved persons, enslavement and freedom on Wikidata
In many use cases, projects want to reuse a subset of Wikidata focused on project-specific topics, combining the Wikidata data with project-specific data. Extracting a subgraph of Wikidata is difficult because Wikidata is very large, and it is difficult to specify what to keep and what to discard. This talk presents Knowledge Graph Toolkit (KGTK, https://github.com/usc-isi-i2/kgtk), a toolkit that can process the full Wikidata on a laptop, and provides a rich suite of commands for query and path following that can be used to flexibly extract topic specific subgraphs.
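To illustrate what "extracting a topic-specific subgraph by path following" means, here is a small, library-free Python sketch. It does not use KGTK or reproduce its commands; the function name, the edge format and the hop limit are assumptions chosen for illustration.

```python
from collections import deque


def extract_subgraph(edges, seeds, follow, max_hops):
    """Keep every edge whose subject is reachable from the seeds.

    edges:    iterable of (subject, property, object) triples
    seeds:    starting nodes for the topic
    follow:   properties along which the traversal may continue
    max_hops: maximum number of followed edges from a seed
    """
    by_subject = {}
    for s, p, o in edges:
        by_subject.setdefault(s, []).append((p, o))

    kept = []
    seen = set(seeds)
    queue = deque((s, 0) for s in dict.fromkeys(seeds))
    while queue:
        node, depth = queue.popleft()
        for p, o in by_subject.get(node, []):
            kept.append((node, p, o))  # keep all statements of a visited node
            if p in follow and depth < max_hops and o not in seen:
                seen.add(o)
                queue.append((o, depth + 1))
    return kept


if __name__ == "__main__":
    edges = [
        ("A", "P279", "B"),
        ("B", "P279", "C"),
        ("A", "P18", "img"),
        ("X", "P279", "A"),
    ]
    print(extract_subgraph(edges, ["A"], follow={"P279"}, max_hops=1))
```

The sketch holds all edges in memory, which would not work for the full Wikidata dump; handling that scale on a laptop is the problem the toolkit presented in this talk addresses.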
[pre-recorded] This session covers how The Met Museum has contributed object metadata and depiction information to Wikimedia projects and in return, how Wikidata content is brought back into The Met's database and made available via its open access API. We will discuss our recent work with Structured Data on Commons, including the tools, processes, modeling challenges, and the complexities of using references for SDC. We welcome dialogue and discussion on how to improve these practices.
The Museu da Pessoa is a virtual museum dedicated to recording, preserving and disseminating life stories. We assume that each person's narrative ultimately expresses their uniqueness. Each interviewee is understood not as a mere source of information about a subject, but as a person who, in some way, lived through a piece of a historical moment and made that experience their own.
Our stories are our greatest legacy. They are our masterpieces. All life stories are the heritage of humanity.
Data for energies of the future: how the projects of the international cooperation between Brazil and Germany, through GIZ, can integrate into the open data community and contribute their results to it.
We will share some project initiatives in the renewable energy and energy efficiency sector, as well as experiences and challenges in the area of digitalization.
CAPACOA has been exploring how Indigenous knowledge can be adequately and respectfully represented over the Web of data. Even though this consultative process isn’t over yet, the project team has observed a few areas of tension emerging between Indigenous modes of communicating information and modes of representing information in Wikidata. Among other things, the use of colonial exonyms for labels of Indigenous nations presents a real problem for many Indigenous artists. This session will take the form of a series of very short presentations on preliminary findings with related questions inviting input from the Wikidata community.
The Israeli Film Archive (IFA), in collaboration with Wikimedia Israel, plans to release the data of their newsreels collection to Wikidata. The IFA holds a treasure trove of historical newsreels from the early 20th century until the mid-1970s, filmed mostly (but not only) in Mandatory Palestine, and later in the State of Israel. These 1,200 film reels comprise some of the earliest, rarest and most complete audiovisual documentation of the events and people who are a core part of Israel’s rich and challenging history.
 The newsreels have been digitized, tagged and enriched with information such as geographical coordinates, people and events depicted, date or year, and so on. Most of the newsreels have English subtitles too! 
 As this project is the first of its kind – or at least, I couldn’t find a similar Wikidata project – we would very much like to consult Wikidata experts and other Information Scientists on how to model the data and carry out the project, in the hope that it will be helpful in the future for similar projects.
The Vanderbilt University Divinity School maintains a database of over 6000 images of art that represent the practice of Christianity over the period of its existence. This database provides images for the popular Revised Common Lectionary website, but also is a resource for scholars, students, and religious educators. Our team of librarians is working towards making these images more discoverable by eventually linking to or creating Wikidata items for all of the artworks in the database as well as linking those items to openly licensed images uploaded to Wikimedia Commons. We will discuss the challenges we have faced so far in our work and our plans for future stages of the project.
Wikidata is increasingly becoming a part of the infrastructure for the OpenStreetMap project. Find out how Wikidata helps OSM maintain consistency, enriches geodata beyond what OSM considers mappable, and helps tell stories that neither project could tell on its own.
What does "reimagining Wikidata from the margins" mean to you? Bring your thoughts and let's discuss some reflection points from the project!
- How can Wikidata help you solve real problems in your community?
- Where does the data you add to Wikidata come from? How is the access to other local databases?
- What is the profile of Wikidata contributors in your community?
- Are there main themes in your activities on Wikidata?
- How is your relationship with other language communities?
- How do you feel navigating the broad Wikidata community? Is it different from other Wikimedia projects?
- Have you ever faced systemic biases in both contributors and content on Wikidata?
- What community and technical resources would be needed to make Wikidata be more equitable in reflecting the full breadth of human knowledge?
- What kind of local knowledge aren’t you able (yet) to model on Wikidata?
- How do you think adding oral history or unpublished knowledge to Wikidata would support the representation of your culture on Wikidata?
- Are there any notability or sourcing policies that do not support adding your local community's knowledge on Wikidata?
- What prevents people in your community who edit other Wikimedia projects from contributing to Wikidata?
- What major knowledge gaps regarding your culture do you encounter on Wikidata?
- Is your community working on covering specific knowledge gaps on Wikidata?
- How can we ensure that Wikidata doesn’t replicate solely colonial knowledge?
- How do local institutions contribute to Wikidata? And what are the challenges they face doing so?
Information about craft is almost nonexistent in Wikipedia and Wikidata. With a view to stitching up that gap, Wikimedia New York City members have started Wikipedia:WikiProject Craft, hosting Craft+Wikipedia Roundtable sessions with the Textile Society of America. Putting together a new WikiProject involves piecing together many different contributions—much like piecing together a quilt. We hope you'll contribute and explore the world of craft!
Craft is the creation of objects using human hands. It is practiced by professional artists, tradespeople, amateurs and enthusiasts with a spectrum of skills and vision. Craft artists work with traditional craft materials and practices in fields such as glassblowing, pottery, jewelry, textile arts, woodworking and metalworking. The studio movement is part of a broader world of craft where boundaries blur between hobbyists, makers, specialists, and artists.
WikiProject sum of all paintings has done an excellent job on a traditionally-valued fine art, but much of craft remains undefined and underdeveloped. Relevant properties for craft on Wikidata include product or material produced (P1056), practiced by (P3095) and intangible cultural heritage status (P3259). Craft information on Wikidata can be used to generate lists of topics, to find craft artists, and to improve or create new Wikipedia articles.
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2021-09-26/Community_view
Vandalism and data quality issues affect community trust. Amazon Alexa’s use of knowledge from Wikidata has yielded several insights that could help the community identify and address data quality issues more efficiently. This talk will showcase our findings and highlight the work Amazon is continuing to do with Wikidata to make a better community-driven knowledge base.
Exploring the ubiquitous cultural institutions beneath our feet. A look at cemeteries and the historical lives they commemorate, as interpreted through GIS and Wikidata, as well as narrative sister projects.
https://meta.wikimedia.org/wiki/Wiki_Cemeteries_User_Group
 http://www.cemeteries.wiki/
 https://wikispore.wmflabs.org/wiki/Bio_Spore
The 2016 joint white paper by the International Federation of Library Associations and Institutions (IFLA) and Wikimedia Foundation, “Opportunities for Academic and Research Libraries and Wikipedia,” states: “The potential of Wikidata to draw linked data and linked data authorities together across the world’s languages and many different ontologies and taxonomies has enormous potential to support researchers around the world.” IFLA’s place as a global organisation for library and information services, with ongoing relationships with organizations like the United Nations, shares strong affinities with the aims and scope of the Wikimedia projects.
The IFLA Wikidata Working Group was formed in late 2019 for the purpose of aiding the international library community in organizing around the Wikidata platform to leverage advocacy work and to encourage best practices. This session will discuss the activities of the Working Group, including a series of videos produced through a WikiCite grant, support for #1lib1ref, and ongoing work to support both Wikidata and Wikibase use in the context of library metadata.
Contribution of Supreme Audit Institutions (SAIs) to the publication, exchange and use of open data related to government oversight.
Are all knowledges equal(ly representable in Wikidata)? What are the mechanics of (de)colonizing digital knowledge? What even is knowledge, and how and where does it fit on the map? I argue that these are questions of context: the contexts knowledge is extracted from, the contexts it is stored in, the contexts it is (re)presented in. In short, they are questions of de-contextualization and re-contextualization.
Starting from a notion of epistemic agency (the ability to influence one’s knowing and being-known), Wikidata knowledge (in)justice can arguably be understood as the opportunities and obstacles along those de-/re-contextualization paths that conduct knowledge into and out of Wikidata. To do so, this session will introduce a framework and graphical notation for discussing the processes of knowledge transformation involved in building knowledge bases like Wikidata, and it will become clear that we should ask not only what is represented (“Whose knowledge?”) but also how and by whom that ‘what’ is represented (“Whose way of knowing?”).
slides CC BY–SA 4.0
This session examines and discusses some key Wikidata-driven tools that support women's biography work for the Women in Red project and the Smithsonian Institution American Women's History Initiative, including:
 * Listeria – the tool for generating worklists from Wikidata, including Women in Red’s Redlists and the Smithsonian's Funk List of women scientists
 * Infoboxes – how might Wikidata-derived infoboxes be improved on and more widely adopted
 * Mbabel – tool for one-click creation of draft articles based on Wikidata content
 * Translation – how might biographies in other language Wikipedia editions be used in translation tools and accessed in Listeria listings, and what is the current status of machine translation
 * Cradle – forms-based interface for generating new Wikidata items
 * WEF-Framework (and the challenge of women’s names)
 * Humaniki – Wikidata-driven statistics for tracking gender gap progress
Have you ever wanted to share a multimedia biography of a scientist from a diverse background? Meet sciencestories.io, powered by Wikidata! We will demo sciencestories.io and explain how contributions to Wikidata, Wikimedia Commons and English Wikipedia help extend and improve these stories. We will go over the other types of resources that can be added to stories like Vimeo videos, YouTube videos, IIIF images, works from HathiTrust, Internet Archive and more. Join this session to celebrate stories of amazing people.
Everyone can embed Wikidata Query results, including maps, graphs, timelines or... whatever. Everyone except us. Some complex modules and templates now allow building graphs, there is a limited option to add automatically generated maps, and we can do lists. But there's no option for timelines. Let's talk about what we can do and what is impossible.
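For readers unfamiliar with how external sites embed query results, here is a minimal Python sketch. It assumes the Wikidata Query Service embed page at `https://query.wikidata.org/embed.html`, which takes the URL-encoded SPARQL query after the `#`; the function names and the iframe dimensions are illustrative choices.

```python
from html import escape
from urllib.parse import quote

EMBED_BASE = "https://query.wikidata.org/embed.html#"  # assumed embed endpoint


def embed_url(sparql):
    """Return an embed URL with the query percent-encoded after the '#'."""
    return EMBED_BASE + quote(sparql, safe="")


def iframe_html(sparql, width=800, height=500):
    """Return an <iframe> snippet showing the query result on a web page."""
    return (
        f'<iframe src="{escape(embed_url(sparql), quote=True)}" '
        f'width="{width}" height="{height}"></iframe>'
    )


if __name__ == "__main__":
    query = "#defaultView:Timeline\nSELECT ?item ?date WHERE { ?item wdt:P571 ?date } LIMIT 10"
    print(iframe_html(query))
```

The `#defaultView:Timeline` comment in the usage example is the query service's way of selecting a result view; an iframe like this works on an ordinary web page but, as the abstract notes, not inside the wikis themselves.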
Australia has a small but passionate Wikidata community, and in this session several of the projects undertaken by Wikidatans in Australia and New Zealand will be presented, including comprehensive geographic coverage to enable census updates, the development of tools such as Entity Explosion and the WikidataR package, biota projects documenting Australia's unique natural heritage, and the creation of over 270 properties.
Even though Taiwan is a small island, its rivers can be quite fierce in summer, when water flow surges, and even worse when typhoons pass through Taiwan with heavy rain. The Taiwanese government has rolled out a river code system to keep track of rivers in Taiwan, and is also working with the OpenStreetMap Taiwan and Wikidata Taiwan communities. In this talk we will discuss the challenge of keeping track of every river in Taiwan, and what the communities and the government are doing to tackle the issue.
The open source programming language ‘R’ is a statistical computing language and environment, which is widely used for data manipulation, analysis and visualisation. It’s also a great match for Wikidata, the massive database of everything.
R is organised into packages that cover different capabilities. This session will be a live demo of the recently expanded package WikidataR that can read from and write to Wikidata, disambiguate terms and more!
This package allows R's data handling power to be applied to Wikidata.
In this presentation, we go through a case study from Kerala, an Indian state. The Wikipedia community started creating Wikipedia pages and maps related to local self-government around 2010. In 2018, Kerala faced a major flood, and at that time some of our community members started populating data about local self-government bodies, such as schools, hospitals, and other services like police stations and fire stations. When Covid-19 hit, OpenStreetMap Kerala volunteers started creating a geospatial attribute layer in OSM for dashboard building, and we linked QIDs to OSM and OSM relations to Wikidata. This covered around 1200 entities and took months to complete. Afterwards we created 21000 ward-level entities under these, and the mapping activities are in progress. This is a game changer because the data can be used for creating dashboards and analyses, including geospatial ones, and more data attributes are now being added as open data. This effort was carried out collectively by the Wikidata Kerala community, OpenStreetMap Kerala, OpenDataKerala, CODD-K and Swathanthra Malayalam Computing.
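Linking in both directions, as described above, can be checked mechanically. The following Python sketch assumes that OSM relations carry a `wikidata` tag and that Wikidata items carry an OpenStreetMap relation ID (P402); the function name and the input format are hypothetical, and the data would have to be exported from both projects first.

```python
def link_mismatches(osm_relations, wikidata_items):
    """Report OSM relations whose Wikidata link is missing or not reciprocated.

    osm_relations:  {relation_id: QID from the 'wikidata' tag, or None}
    wikidata_items: {QID: relation id stored in P402, or None}
    """
    problems = []
    for rel_id, qid in sorted(osm_relations.items()):
        if qid is None:
            problems.append((rel_id, None, "missing wikidata tag"))
        elif wikidata_items.get(qid) != rel_id:
            problems.append((rel_id, qid, "P402 does not point back"))
    return problems


if __name__ == "__main__":
    osm = {100: "Q1", 200: "Q2", 300: None}  # placeholder IDs
    wd = {"Q1": 100, "Q2": 999}
    for problem in link_mismatches(osm, wd):
        print(problem)
```

With thousands of local government units, a report like this helps volunteers see which links still need attention.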
The beginning of the Indonesian Wikidata Community can be traced back to 2012, and the community has expanded a lot since then. In this meetup session, we will discuss improvements that can be made to our community, the challenges that need to be faced, and the future of the Indonesian Wikidata Community. This meetup will also serve as a follow-up session to the Wikidata Birthday Meetup that we held previously.
This is a meetup between the Wikidata editors and OSM mappers in the Philippines who are interested in an interlinked open dataset of local government units (LGUs) of the Philippines using Wikidata and OpenStreetMap as the data platforms. The meeting will discuss goals, current status and activities, ideas, and issues that need to be tackled in order for Wikidata and OSM to have the best open-licensed data on Philippine regions, provinces, cities, municipalities, and barangays.
What aspects of Wikidata do you research? Which ones do you find challenging? In this condensed session each researcher will have the opportunity to introduce themselves and their work to other colleagues in just around 5 minutes. They will also be able to ask for help, offer collaboration and find out what other colleagues are working on.
Due to time constraints, please sign up for the session on Wikidata as soon as possible if you want to speak.
Knowledge graphs are being deployed in many enterprises and institutions. An easy-to-use, well-designed infrastructure for such knowledge graphs is not obvious. After the success of Wikidata, many institutions are looking at the software infrastructure behind it, namely Wikibase.
In this paper we detail how Wikibase is used as the infrastructure behind the EU Knowledge Graph. This graph, which is deployed at the European Commission, integrates different structured information about the European Union.
We will show: the current content of the graph, how the knowledge is ingested, how it is maintained up to date, how the data is connected with Wikidata and what services are constructed around.
I present ongoing work on two Wikibase instances hosted on wbstack.com.
In the framework of the Elexis project, we are working on LexBib (http://lexbib.elex.is), a digital bibliography for the domain of Lexicography and Dictionary Research. The LexBib Wikibase brings together bibliographical data from the LexBib Zotero group and LexVoc, a controlled vocabulary of subject headings used for content-describing indexing of research articles. Zotero literal values are reconciled against ontology items. Highlighting author disambiguation and the Wikidata alignment of LexBib entities and entity data, we explain our workflow, which could well be applicable to other domains.
Funded by Wikimedia Basque Country, we have started to build http://datuak.ahotsak.eus. Our goal is to link dialectal lexical forms from a large Basque oral corpus with standard Basque forms as documented in the largest Standard Basque reference corpus available today, and with Basque lexemes on Wikidata, at the level of lemma and form. Each form is linked to its reference on the respective corpus web portal, and each lemma on Wikidata is described at sense and form level. We will summarize solved and unsolved problems, including those concerning the Wikidata model for lexical data, with Basque as a use case: an agglutinative language with up to 4,000 documented inflected forms per lemma. We will show different options for dealing with that, namely the path taken by the Basque Wikidata community and our own approach.
As the digital transformation of the GLAM sector advances, knowledge graphs are increasingly in demand. Authority files like the GND provide handy reference nodes that enhance both retrieval and visibility of the digitized data. As the GND has until now mainly served the needs of librarians, its modernization demands changes to governance, data modelling and, last but not least, the technical infrastructure. To address the latter, the German National Library is running two pilot instances in Wikibase. One will contain about 9 million GND data items and their cross-references. The second will provide, as structured data, the ruleset on which the editing of GND files is based. The session will give an insight into both the conceptual work and the creation of a convincing infrastructure combining Wikibase and other applications. The pilot started in 2018 with a proof of concept and was presented at WikidataCon 2019. This year we would like to focus on our roadmap and on the challenges of deploying Wikibase. Which features favour our goals, and which limitations do we have to cope with using additional software?
See also the WikiLibrary Manifesto: https://www.wikimedia.de/the-wikilibrary-manifesto/
In this presentation, I will review developments since the WikiCite 2020 conference from three main angles by considering its community, content and platform aspects, in particular in terms of how they relate to the sustainable future scenarios discussed on Saturday.
Narratives are a fundamental way in which humans make sense of reality. A large general-purpose knowledge base such as Wikidata provides the means to build digital narratives about historical events, which can be useful for both research and educational purposes. In this presentation we will discuss the work that we have done, and the challenges that we have encountered, in extracting narratives from Wikidata in the context of the Narratives in Digital Libraries project (https://dlnarratives.eu).
NFDI4Culture is the consortium within the Nationale Forschungsdateninfrastruktur (NFDI) that addresses research data on tangible and intangible cultural assets. We aim to establish a needs-based infrastructure for research data that serves our community of interest, ranging from architecture, art history and musicology to theatre, dance, film and media studies.
Wikibase plays an important role within the consortium as RDM (Research Data Management) software infrastructure. In this session, we will hear from representatives of Task Areas 1 and 5 within the consortium. TA1 focuses on data capture and enrichment of digital cultural assets, and within this task area we are developing a test use case of Wikibase to structure data around 3D models and reconstructions of cultural assets. The TA1 team from the Open Science Lab at TIB Hannover will present the ongoing work to connect a Wikibase instance with the 3D-viewing and annotation software Kompakkt. This presentation will showcase the potential to extend the capabilities of Wikibase for cultural heritage preservation with additional open source software tools.
In the second part of the presentation, the team from FIZ Karlsruhe will present ongoing work within TA5, which focuses on building a Knowledge graph of research data within the 4Culture consortium. The TA5 team hosted two Wikibase workshops this year, and they will share some results of the workshops - particularly focusing on key requirements for further development. This presentation will also give an outlook on the planned use of knowledge graphs and Wikibase instances in further NFDI consortia.
At India's National Institute for Plant Genome Research we are developing Wikidata as a central tool to mine the plant chemistry literature. As part of our regular student intern program, especially during the pandemic, students from all over India join us in short internships to do research in a collaborative Open Notebook-based project (CEVOpen). We create mini-ontologies from Wikidata as search and annotation tools, which then link back to Wikimedia resources.
Coming from a non-technical background, the interns:
 - have learnt how to use Wikidata
 - are creating dictionaries from scoping the literature
 - are using Text Data-Mining (TDM) tools to make multidisciplinary inferences.
In this short presentation, you will hear each of them speak about various aspects of dictionary creation and TDM, their challenges, learnings and experience, and see them demo some CEVOpen tools.
Because Wikidata supports many languages, we have developed our code to support discovery and annotation in non-English languages.
In 2018, a new course opened at Tel Aviv University -- the first for-credit course to feature Wikidata worldwide, called "From Web 2.0 to Web 3.0, from Wikipedia to Wikidata". The course is available to all undergraduates at TAU, from all disciplines, and was approved by the University's Rector Office. 
 Its structure is based on a course model previously designed, developed and implemented at TAU since 2013, in the hopes of scaling up, not only addressing issues such as Data Literacy, but also focusing on more social impact. 
 After 3 iterations of the course, this presentation will focus on: 
 * The course design and structure
 * The course outcomes, including students' learning experience
 * Lessons learned so far, including challenges and opportunities for both learners and faculty
After the ArtBase re-launch earlier this year, this lightning talk presents an update on the latest achievements and challenges of running your own federated Wikibase.
'Shared Citations' is a proposal for the Wikimedia Foundation to create a database of Wikimedia citation records, along with associated improvements to cross-wiki monitoring and editing. The 'spiritual successor' to the WikiCite project, this proposal has been available on Meta for a year and has received many endorsements.
The first portion of the session will outline the proposal; the second will respond to specific issues and questions from the audience. The third portion will be a discussion of the status and potential roadmap for moving forward with the idea itself (and/or its underlying architectural requirements).
The proposed new database aims to centralise the hosting and metadata management of individual references used in any Wikimedia project as structured data "records". Each Wikimedia project could then call upon these records and, according to its own citation style preferences, display them to its readers.
By centralising, structuring, and sharing the content, the following can be achieved:
 - much duplication of content curation effort can be reduced
 - many workflows which improve knowledge integrity can benefit all sister projects simultaneously
 - new processes and research can be undertaken to improve our understanding of Wikimedia references; and
 - the architecture of how we store and update reference information can be made much more efficient.
Described on Meta at: https://meta.wikimedia.org/wiki/WikiCite/Shared_Citations
The De Jonge Wiki is a scientific research database on the architectural history of the Arenberg Castle in Heverlee (Belgium). The castle belongs to the KU Leuven and is regularly studied by students as part of their training.
The database consists of a MediaWiki front end and a Wikibase database in the background. The info boxes for the individual items in the front end are automatically fed from the associated data sheets in the background.
This combination of descriptive text and semantic data is intended to serve as a prototype for easy-to-set-up, inexpensive and easy-to-use scientific research databases.
The way Wikidata bridges across many domains of knowledge makes it interesting as a component of education. Here, I will look at this from two main use cases: (i) basic digital and data literacy and domain-general knowledge, (ii) domain-specific literacy and knowledge. Use case i will be mainly discussed at about high school level, and use case ii at undergraduate university level. Zooming in on the latter, I will further distinguish between use cases that are closely related to data science (ii-a) or not (ii-b).
The Digital Archive of Artists’ Publishing (DAAP) is an interactive, user-driven, searchable database of artists’ books and publications, that acts as a hub to engage with others, built by artists, publishers and a community of creative practitioners in contemporary artists’ publishing. It is developed via an ethically-driven design process, with support by Wikimedia UK and Arts Council England.
In this talk, we will highlight how we have drawn upon the working knowledge of users and archivists alike, to develop a database with sufficient complexity, that affords multiple histories to develop, confronting issues of authorship and representation, whilst addressing the challenges of cataloguing often deliberately difficult to categorise materials. DAAP is committed to challenging the politics of traditional archives regarding inclusion and accessibility, from a post-colonial, critical gender and LGBTQI perspective. With an emphasis on inclusivity from the start, we aim to privilege anecdotal histories and multiple perspectives alongside factual data, whilst the wiki style approach means that users can upload and describe their own materials, choose how to describe themselves in relation to these materials, and select appropriate sharing permissions at time of upload.
Utilising Wikibase, DAAP brings to the surface new and unexpected data connections across diverse collection artefacts, providing a resource to link to other archives, and communities. In the talk, we will also show how the DAAP implements a custom frontend interface on top of the Wikibase database, which follows familiar user interface metaphors, increasing accessibility across a broader audience.
Wikidata currently documents 37 million scholarly articles, and the number keeps increasing. It is hard to understand and analyze the main subjects and domains of these articles. Though Wikidata has a property P921 (main subject) which can help find relevant scientific articles of different domains on diverse topics, its current usage is limited to around 17 million (Scholia statistics). This talk focuses on the ongoing work on improving the links between scholarly articles and existing Wikidata items using P921, its advantages, and limitations.
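As an illustration of how P921 links can be explored, the following minimal sketch builds a SPARQL query for the Wikidata Query Service that lists scholarly articles (items that are an instance of Q13442814) with a given main subject. The function name and the example subject are assumptions made for illustration and are not part of the work presented in the talk.

```python
import re

# Q13442814 is the Wikidata item for "scholarly article";
# P31 is "instance of" and P921 is "main subject".
SCHOLARLY_ARTICLE = "Q13442814"


def build_main_subject_query(subject_qid: str, limit: int = 10) -> str:
    """Return a SPARQL query listing scholarly articles about one subject."""
    # Reject anything that is not a plain item ID such as "Q42".
    if not re.fullmatch(r"Q[1-9]\d*", subject_qid):
        raise ValueError(f"not a valid item ID: {subject_qid!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?article ?articleLabel WHERE {\n"
        f"  ?article wdt:P31 wd:{SCHOLARLY_ARTICLE} ;\n"
        f"           wdt:P921 wd:{subject_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Q42 is only a placeholder subject chosen for illustration.
    print(build_main_subject_query("Q42", limit=5))
```

The resulting text can be pasted into the Wikidata Query Service; a small LIMIT keeps the query light, since the scholarly article subgraph is very large.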
Join the Semantic Lab at Pratt for a Lightning Talk journey through the evolution of their wikibase over time highlighting their recent Digital Humanities projects (Linked Jazz, E.A.T. + LOD - a partnership with the Robert Rauschenberg Foundation, and Linking Lost Jazz Shrines - a METRO Equity in Action Grant-funded project in partnership with Weeksville). Learn about the advantages, lessons learned, and ultimately how one wikibase has fostered symbiotic relationships with data across project topics and allowed flexibility to model data collaboratively.
Systematic reviews have become a major source of scientific information, providing a snapshot of the multiple insights into a given research topic. Creating and revising such scholarly publications requires the analysis and synthesis of various research papers according to predefined guidelines. Doing this manually is an exhausting and time-consuming task. That is why computer systems that automate systematic reviews can be very useful in making such outputs easier to produce. Here, Wikidata as a semantic resource can enhance many tasks, ranging from the formulation of search queries that identify relevant research evidence to the recognition of patterns for representing research findings. In this short presentation, I explain how I developed several Python scripts that efficiently automate systematic review creation tasks based on Wikidata statements and common libraries, and I show how this can bring scholarly publishing to the next stage.
This talk will address some of the shortcomings of Wikibase and discuss potential solutions found in synergy with Semantic MediaWiki, using a real-world example taken from the DataTrek Wikibase site.
We would like to present the process we developed and followed to bring the scientific and academic publications from the RIDC NeuroMat Google Scholar profile into Wikidata, along with the challenges we have overcome and those still in the way.
The Wikibase Stakeholder Group (WBSG) commissions production and maintenance of open source extensions to Wikibase, and documentation for institutions that want to operate and maintain a fully-fledged instance of Wikibase.
Representative members of the WBSG will hold an open forum session.
This session will introduce the group to the broader Wikibase and Wikidata community by presenting the group’s members, its governance structure, and goals. In addition, we will hold an open discussion of the group’s roadmap for technical developments.
Attendees of the session will have a chance to learn more about the workings of the group, and enquire about membership if interested.
Following this introduction, existing members will have a chance to discuss development projects planned by the Group and get feedback from the broader Wikidata and Wikibase community. The interactive part of the session will involve publicly-shared notetaking and brainstorming tools which will be open to all attendees to view and comment on.
This presentation will provide an overview of recent developments at and around Scholia, a tool to visualize Wikidata-based information about WikiCite-related data. It will pay special attention to how Scholia integrates with WikiCite workflows.
The presentation looks at the use of Wikidata, OpenRefine and related tools in the Data Science major of the Information Science bachelor's programme at Hannover University of Applied Sciences and Arts.
 It gives insight into the tasks and contents of the modules, as well as the use of data from projects such as Coding da Vinci or NFDI4Culture. Examples of students' results from the previous semester and their further use in other contexts are introduced.
 Furthermore, it will be discussed to what extent the practical involvement of students in OpenGLAM communities influences motivation and learning success.
Since 2016 the WikiCite community has been building a database of citation networks in academic works, largely within Wikidata. In 2021 the Internet Archive is building a dedicated Wikibase to further this work, including citations between Wikipedia articles and their sources. Combining these two graphs we can look into the "deep provenance" of Wikipedia articles, going from article to cited sources to the sources' sources.
Spontaneous lightning talks (i.e. not submitted in advance of the conference) related to WikiCite. Lightning talk slots are 5 minutes. Sign up on the etherpad for this session: https://etherpad.wikimedia.org/p/WikidataCon2021-WikiCiteLightningtalks
 There will also be time for some lightning talks at the end of the day.
The objective of this session is to allow members of the community to share their recent work, thoughts and reflections on other sessions, coordinate activity in the hack space, etc.
When chemistry students draw their own molecules, they want to know what they've made. This talk will draw together open source web tools to achieve this: Molview and PubChem working in synchrony with Wikidata, via Entity Explosion and the Wikidata Query Service. The presentation will also show how to easily set up connections and methodologies like this whenever sites use common identifiers.
At FOMU, the photographic museum of Antwerp, we are using Wikibase within The Gevaert Paper Project. This project aims to unlock the photographic paper and documentation, photo packaging and sample books in the collection of Agfa-Gevaert. We aim to make these materials and the related information accessible as open data.
In this demo I will show how we upload our data with QuickStatements using its CSV file syntax. We do it this way for our own Wikibase instance because it is easier to convert the data into this format than into the classic QuickStatements format.
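To give an idea of what such a conversion can look like, here is a minimal sketch that serialises a few records into QuickStatements CSV syntax with Python's standard csv module. The column set, the property IDs P1 and P2, the item IDs and the sample values are all hypothetical; they do not come from the Gevaert Paper Project, and a real Wikibase instance has its own IDs.

```python
import csv
import io


def to_quickstatements_csv(rows, columns):
    """Serialise dict rows into QuickStatements CSV text.

    Keys missing from a row are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({col: row.get(col, "") for col in columns})
    return buffer.getvalue()


# Hypothetical columns: "qid" comes first, "Len" is the English label,
# "Den" the English description; P1 and P2 stand in for properties
# defined on a local Wikibase.
COLUMNS = ["qid", "Len", "Den", "P1", "P2"]

records = [
    # An empty qid requests a new item. Item-valued cells hold a bare ID;
    # string values carry their own double quotes, which csv then escapes.
    {"qid": "", "Len": "Sample book A", "Den": "sample book",
     "P1": "Q10", "P2": '"box 12"'},
    {"qid": "Q25", "P2": '"box 7"'},
]

if __name__ == "__main__":
    print(to_quickstatements_csv(records, COLUMNS))
```

Writing through the csv module rather than joining strings by hand means that quotes and commas inside values are escaped consistently.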
Wikidata is one of the most important sources of structured data on the web, built by a worldwide community of volunteers. As a secondary source, its contents must be backed by credible references; this is particularly important as Wikidata explicitly encourages editors to add claims for which there is no broad consensus, as long as they are corroborated by references. Nevertheless, despite this essential link between content and references, Wikidata’s ability to systematically assess and assure the quality of its references remains limited. To this end, we carry out a mixed-methods study to determine the relevance, ease of access, and authoritativeness of Wikidata references, at scale and in different languages, using online crowdsourcing, descriptive statistics, and machine learning. The findings help us ascertain the quality of references in Wikidata, and identify common challenges in defining and capturing the quality of user-generated multilingual structured data on the web.
This short talk will outline the main steps needed to set up and start using a custom reconciliation service between OpenRefine and any arbitrary Wikibase instance. This method has been in use by individual institutions like Rhizome since September 2020, but with the release of the OpenRefine 3.5 beta it is becoming more accessible to new users. We will also show a new box service for automated deployment developed at the Open Science Lab at TIB Hannover, in collaboration with the OpenRefine team.
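For readers unfamiliar with what travels between OpenRefine and a reconciliation service, the sketch below builds the kind of batch request defined by the Reconciliation Service API: a JSON object of queries keyed q0, q1, and so on, sent in a single "queries" form field. The function name, the sample names and the type ID are assumptions for illustration; the endpoint of any particular Wikibase service is deliberately not shown.

```python
import json


def build_reconciliation_queries(names, type_id=None, limit=3):
    """Build the form payload for a batch reconciliation request.

    Each name becomes one query keyed q0, q1, ...; the whole batch is
    JSON-encoded into a single "queries" form field.
    """
    queries = {}
    for index, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            # Restrict candidates to entities of this type.
            query["type"] = type_id
        queries[f"q{index}"] = query
    return {"queries": json.dumps(queries)}


if __name__ == "__main__":
    # "Q5" is a placeholder type ID; a custom Wikibase has its own IDs.
    payload = build_reconciliation_queries(
        ["Ada Lovelace", "Alan Turing"], type_id="Q5"
    )
    print(payload["queries"])
```

In practice OpenRefine assembles and sends these requests itself; the sketch is only meant to show what a custom service has to understand.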
In this session, as a community of Wikidata for education enthusiasts, we are going to discuss how Wikidata as a platform can enhance learning across different stages of the learner's journey through digitization, digitalization and digital transformation.
With funding from the community, we created three open source extensions for Wikibase. These provide support for local media, extended date-time formats and integration with Semantic MediaWiki. In this short session I will demo the functionality and answer your questions.
Outreachy is a diversity initiative that offers paid internships to work on open source software for three months each summer and winter. The Wikimedia Foundation participates in Outreachy, typically offering around four internships. This year, I have been working with several Outreachy students to write pywikibot scripts to synchronise content between Wikipedias and Wikidata. I will give a brief summary of Outreachy, describe the contribution period that leads to the selection of the interns (no CVs are involved!), and present the outcomes from two internships from earlier this year.
In this workshop, we will discuss edge cases in historical bibliographic data and document ways of making them fit the Wikidata data models for books and periodicals. We're bringing together people from a variety of backgrounds, some that will be more familiar with the data we're talking about, some that will be more familiar with the technical side, some who might just be observing. We're hoping to spark a conversation that will keep going.
Prior to June 2021, I had never edited a Wikipedia page or even known what Wikidata was. By August 2021, I had edited over 100 articles on Wikipedia and made 552 Wikidata edits. This presentation is about my journey into the WikiWorld as an intern with the Smithsonian American Women’s History Initiative.
My internship project focused on editing and creating Wikipedia articles and Wikidata properties about women in order to address the gender gap on Wikipedia and Wikidata. Throughout my internship, I learned how to utilize Smithsonian collections and resources to help enhance or create women’s biographies and Wikidata properties. One of my main projects was to assist with our July 27, 2021 edit-a-thon “Wiki Focus: Black Women in Food History.”
I’ll discuss the benefits of learning Wikipedia and Wikidata within the cultural heritage sector and how access to museum archives and resources set me up for success on Wikidata. Additionally, I’ll discuss in more depth the projects I took on at the Smithsonian and how the skills I learned carried over into my own Wiki passion projects.
Wikimedia Germany is excited to share a preview of Wikibase.Cloud, a new addition to the Wikibase ecosystem launching soon.
A chance to globally discuss the state of Wikidata & Education.
 With 2 years having passed since our panel at WDCon19, this panel offers the community working on implementing WD in education a chance to connect and discuss burning issues. What are some interesting initiatives globally? What has changed? What are some of our current challenges, especially during COVID? And what can we do, globally, to enhance this global effort? These are some of the questions we will discuss with invited panelists from around the world and the audience attending this session.
An essential building block of the thriving Wikibase Ecosystem, which brings diverse data to the Linked Open Data Web, is a community of developers creating extensions to the base Wikibase functionality and tools providing new workflows and new ways of editing, curating and accessing linked open data for various audiences.
Wikimedia Deutschland focuses on providing the Wikibase software platform, which is intentionally generic. An ecosystem of connected Wikibases will only truly emerge when individuals and institutions who are experts in specialized knowledge and data have powerful technical possibilities to intuitively model, curate and connect their data in the systems based on Wikibase.
Wikimedia Deutschland cannot provide those technical solutions on its own, as we are not able to handle all use cases out there. We rather see Wikibase as a software platform and a base upon which non-staff developers will build powerful and robust tools and systems. Current APIs provided by Wikibase are limited, rigid and too hard to use for this vision to come true.
In this session we invite developers and other interested parties to share their needs, ideas and plans on extensions and tools built on Wikibase. We believe this input will help our work on enabling the ecosystem of developers around Wikibase.
Bibliographic metadata is currently a large part of Wikidata. Statements about scholarly articles make up 50% of the triples in Wikidata, and are heavily used by tools like Scholia, but are only touched by 2% of all queries.
Over the past months an analysis was begun of the possibility of splitting scholarly articles out from the Wikidata graph, including the size of the subgraph and potential performance impact of the change. https://phabricator.wikimedia.org/T281854
We will discuss the ongoing results of that analysis, when it makes sense to consider splits at all, what a split might look like (how would it happen, how would queries be federated, how would this impact tools like Scholia?), and how we might reach decisions about such changes.
Panelists: Lydia Pintscher, Daniel Mietchen and Aisha Khatun; Moderator: Sam Klein
What are the challenges we encounter today when trying to teach with Wikidata in a Latin American country? Based on the experience of Wikimedia Chile incorporating Wikidata in higher education communities, this talk aims to explore the challenges and potential that the tool can represent for teaching and learning in terms of access, representation and development of skills in the 21st century.
An open discussion of the current state of citations on Wikidata, other large-scale citation wikibases, and annual WikiCite events.
eViterbo is a wiki-format platform developed by TechNetEMPIRE, a research project financed by the Portuguese Foundation for Science and Technology - https://technetempire.fcsh.unl.pt. The platform builds upon biographical data of building experts (people) and institutions across the Portuguese empire. The presentation will take a general tour of eViterbo (which uses MediaWiki software and Wikibase, with a possible future connection to Wikidata) and discuss and explore the general aim, sources, methodologies, problems and solutions involved in building this collaborative, linked-data, open-access tool.
While the world of Wikibase focuses mainly on institutional users, the increasing maturity of Docker-based distributions and the availability of hosting platforms such as WBStack and Miraheze have led to a proliferation of individual and community use outside academia, as well as several independent research projects.
This overview of the field seeks commonalities in various examples, and briefly considers some of the challenges such users face.
People leave online communities after some time. However, the likelihood that a particular user leaves the project depends on the time they have already been on the project: people who have only spent a brief time in the project are more likely to leave than people who are long-term members. This is similar to the so-called Lindy Effect: "... a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age." (from Wikipedia).
The Lindy Effect in Wikidata user retention holds only if the observed account ages follow a power-law (Pareto) probability distribution. We have tested this assumption on ~400K Wikidata accounts, obtaining individual, full revision histories and singling out active (>=5 edits) and inactive months. We have also developed a binary classifier machine learning model with XGBoost to predict whether a user will continue to contribute to Wikidata in the immediate future (next month), with satisfactory initial results. We share the datasets and the code repository with the community and briefly describe the data acquisition procedures.
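The split into active and inactive months can be illustrated with a minimal sketch, assuming that each account's revision history is available as a list of edit timestamps. The function name and the sample history are made up for illustration and are not taken from the shared code repository.

```python
from collections import Counter
from datetime import datetime


def active_months(edit_timestamps, min_edits=5):
    """Return the sorted (year, month) pairs with at least min_edits edits."""
    # Count edits per calendar month, then keep months over the threshold.
    counts = Counter((ts.year, ts.month) for ts in edit_timestamps)
    return sorted(month for month, n in counts.items() if n >= min_edits)


if __name__ == "__main__":
    # Made-up revision history for one account: 5 edits in January 2021
    # and 2 edits in February 2021.
    history = [datetime(2021, 1, day) for day in range(1, 6)]
    history += [datetime(2021, 2, 1), datetime(2021, 2, 2)]
    print(active_months(history))
```

Any month in the account's lifetime that is missing from the returned list counts as inactive under this threshold.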
The Wikibase Community User Group (WBUG) supports Wikibase as an entity distinct from Wikidata. As a WMF affiliate, its mission is to cultivate Wikibase's development and encourage like-minded developers and data analysts to create and improve related tools.
Over time, the group has taken on an operational support role for new users. Members offer interactive help over Telegram and give input to discussions via their mailing list, as well as contributing to Phabricator tickets and documentation.
WBUG's primary contacts will hold an open forum during WikidataCon's Wikibase Track, instead of the usual monthly meetup. This will introduce the group to a broader Wikidata community and offer space to reflect on what did or did not work in terms of programming and learning as the Wikibase track concludes. Any time remaining may be used to consider the group's future development and activities.
What would a birthday celebration be without presents? Every year, people involved in Wikidata prepare presents to celebrate the work and dedication of the Wikidata community. This session will be the occasion for a few people to present their gifts. If you'd like to present something, please add your project to this Etherpad up to 1 hour before the start of the session.
This group will discuss examples of Wikidata in the classroom. Projects this group will cover range from using Wikidata as part of an introductory LIS course, documenting public art in the city of Boston, using Wikidata as a corpus for data science students, and speaking to several research groups about framing Wikidata for research.
Closing of the WikidataCon 2021, saying goodbye, and see you next year for Wikidata's 10th birthday!

