WikidataCon 2021

David Lindemann


Session

10-31
10:30
25min
LOD-ifying bibliographical and lexical data using wikibase

I present ongoing work on two Wikibase instances hosted on wbstack.com.

Within the ELEXIS project, we are building LexBib (http://lexbib.elex.is), a digital bibliography for the domain of Lexicography and Dictionary Research. The LexBib Wikibase brings together bibliographical data from the LexBib Zotero group and LexVoc, a controlled vocabulary of subject headings used for subject indexing of research articles. Literal values from Zotero are reconciled against ontology items. We explain our workflow, which could well be applied to other domains, with a focus on author disambiguation and on aligning LexBib entities and entity data with Wikidata.
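As an illustration of what reconciling literal values against items can involve, here is a minimal Python sketch based on normalized label matching. It is not the LexBib workflow itself: the item IDs, labels, and function names below are hypothetical, and a real setup would more likely rely on a reconciliation service or SPARQL lookups against the Wikibase.

```python
import unicodedata


def normalize(label):
    """Lowercase, strip diacritics and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.lower().split())


def build_index(items):
    """Map every normalized label or alias to its item ID (first one wins)."""
    index = {}
    for item_id, labels in items.items():
        for label in labels:
            index.setdefault(normalize(label), item_id)
    return index


def reconcile(literals, index):
    """Return {literal: item ID, or None when no label matches}."""
    return {literal: index.get(normalize(literal)) for literal in literals}


# Hypothetical vocabulary items; IDs and labels are invented for illustration.
VOCAB = {
    "Q1": ["lexicography", "dictionary making"],
    "Q2": ["corpus linguistics"],
}

if __name__ == "__main__":
    index = build_index(VOCAB)
    print(reconcile(["Lexicography ", "CORPUS  linguistics", "phonetics"], index))
```

Literals that stay unmatched (mapped to None) would then be candidates for manual review or for creating new items.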

Funded by Wikimedia Basque Country, we have started to build http://datuak.ahotsak.eus. Our goal is to link dialectal lexical forms from a large Basque oral corpus with standard Basque forms as documented in the largest Standard Basque reference corpus available today, and with Basque lexemes on Wikidata, at the level of lemma and form. Each form is linked to its reference on the respective corpus web portal, and each lemma on Wikidata is described at sense and form level. We will summarize solved and unsolved problems, including those concerning the Wikidata model for lexical data. Basque serves as the use case: it is an agglutinative language with up to 4,000 documented inflected forms per lemma. We show different options for dealing with this, namely the path taken by the Basque Wikidata community and our own approach.
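The linking of dialectal forms to standard forms and to Wikidata lexeme forms could be represented along the following lines. This is only a sketch under stated assumptions: the record layout, the example forms, and the form ID are illustrative and are not taken from the datuak.ahotsak.eus data model. Only the general `L…-F…` shape of Wikidata form IDs is given.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormLink:
    """One dialectal form linked to a standard form and, if known, a Wikidata form."""
    dialectal_form: str
    standard_form: str
    lemma: str
    wikidata_form_id: Optional[str]  # e.g. "L0000-F1"; None if not yet aligned


def group_by_lemma(links):
    """Collect all links under their lemma, preserving input order."""
    grouped = defaultdict(list)
    for link in links:
        grouped[link.lemma].append(link)
    return dict(grouped)


def unaligned(links):
    """Return the links that still lack a Wikidata form ID."""
    return [link for link in links if link.wikidata_form_id is None]


# Illustrative records; the ID "L0000-F1" is a placeholder, not a real form.
LINKS = [
    FormLink("etxia", "etxea", "etxe", "L0000-F1"),
    FormLink("etxian", "etxean", "etxe", None),
]

if __name__ == "__main__":
    for lemma, forms in group_by_lemma(LINKS).items():
        print(lemma, [f.dialectal_form for f in forms])
    print("to align:", [f.dialectal_form for f in unaligned(LINKS)])
```

With thousands of inflected forms per lemma, keeping the Wikidata form ID optional makes it possible to store corpus links first and align them with Wikidata forms later.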

Wikibase
Room 1