Spatial Humanities 2024

Digital gazetteers: benefits and challenges of the harvesting tool gazetteers.net
2024-09-26, MG2 01.10

Recent media coverage of the Russian invasion of Ukraine clearly shows the importance, and the challenges, of managing data related to geographical names. Ukrainian place names have appeared in inconsistent spellings that sometimes look like a mixture of Russian and Ukrainian. Readers can hardly tell whether a city or a much larger administrative unit is meant. Several levels of challenge concerning place names become apparent here: the correct assignment of the relevant language; the period of validity and currency of the place names, especially since several place names have been changed in Ukraine since 2020; and the location within the respective administrative system.
Over the course of history, place names within a language have changed, for example as an expression of power relations or as a result of debates about the importance of local dialects. Similar reasons can also play a role in the official assignment of a place name in another language. Homonymy also causes confusion: not only are numerous administrative units named after their administrative seats, but various places in different countries and in different languages also share the same name.
For centuries, printed gazetteers have tried to provide orientation, but the large number of such gazetteers and their diverse structures have not made this easy. Some gazetteers cover the whole world and many languages, but their coverage of individual regions varies. In addition, global gazetteers barely cover small regional languages, which in turn become the focus of small, especially digital, initiatives.
Digital gazetteers usually do not reflect administrative changes. As a result, many towns that have been incorporated into other municipalities are not represented. Recent name changes are also ignored or updated with a delay of several years. Since there is no standard definition of place as a geographical unit (human settlement), the scope of places recorded in the individual gazetteers includes individual farms, mills and municipalities alike. Historical changes in state sovereignty, and thus in language authority over the respective areas, are also mostly not reflected in digital gazetteers.
Current developments in the digital humanities create new possibilities for the use of gazetteers, including the simultaneous comparison of different sources. Still, the sheer number of digital gazetteers, their differing geographical coverage and their differing metadata schemes make it difficult to compare existing gazetteer entries systematically and to reuse existing data in other applications. At the same time, current digital gazetteers show how geographical orders of knowledge are transformed from analogue structures (for example, printed indexes) into digital structures.
A research project of the Herder Institute for Historical Research on East Central Europe (Marburg), the Institute for Regional Geography (Leipzig) and the Justus Liebig University Gießen developed a publicly available web application, gazetteers.net, which allows users to explore the content and metadata structure of several gazetteers simultaneously. The gazetteers.net web application enables users to search several place name-related databases in a unified manner and to view and compare data from different gazetteers. The application also supports the identification of items in different databases that refer to the same geographical entity, regardless of how the individual gazetteers define a geographical place or of its administrative status. By linking corresponding items across gazetteers, the application facilitates data aggregation and comparison. In addition to the major and well-known web gazetteers, the official gazetteers and some small local gazetteers for a selected country (Poland) have been connected in order to cover regional languages and historical names. A comparison of these specific and general gazetteers has also facilitated, among other things, the identification of differences regarding languages, spelling and administrative changes throughout history.
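The linking of corresponding items across gazetteers can be illustrated with a minimal sketch. It is not the matching method used by gazetteers.net, which this abstract does not describe; it only assumes one plausible heuristic, namely that two entries from different sources refer to the same place when they share a normalised name variant and lie within a chosen distance of each other. The `Entry` type, the source labels, the 10 km threshold and the sample records (including their coordinates) are all invented for illustration.

```python
import itertools
import math
import unicodedata
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entry:
    """One place record; the field layout is hypothetical."""
    source: str              # label of the gazetteer the record comes from
    entry_id: str
    names: Tuple[str, ...]   # all recorded name variants
    lat: float
    lon: float


def normalize(name: str) -> str:
    """Casefold and strip diacritics so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name.casefold())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The Polish letter ł has no Unicode decomposition, so map it explicitly.
    return stripped.replace("ł", "l")


def distance_km(a: Entry, b: Entry) -> float:
    """Great-circle distance between two entries (haversine formula)."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def same_place(a: Entry, b: Entry, max_km: float = 10.0) -> bool:
    """Assumed heuristic: a shared normalised name and spatial proximity."""
    shared = {normalize(n) for n in a.names} & {normalize(n) for n in b.names}
    return bool(shared) and distance_km(a, b) <= max_km


def link_entries(entries: List[Entry]) -> List[Tuple[str, str]]:
    """Return ID pairs of entries from different sources judged to match."""
    return [
        (a.entry_id, b.entry_id)
        for a, b in itertools.combinations(entries, 2)
        if a.source != b.source and same_place(a, b)
    ]


# Invented sample records, not taken from any real gazetteer.
SAMPLE = [
    Entry("gazetteer_a", "a1", ("Wrocław", "Breslau"), 51.11, 17.03),
    Entry("gazetteer_b", "b7", ("Wroclaw",), 51.10, 17.04),
    Entry("gazetteer_b", "b9", ("Nowa Wies",), 53.00, 17.00),
    Entry("gazetteer_c", "c1", ("Nowa Wieś",), 50.00, 20.00),
]

links = link_entries(SAMPLE)
```

In this sketch the two Wrocław records are linked despite the different spelling, while the two homonymous villages are kept apart because they lie far from each other. A real system would also have to handle historical names, differing definitions of place and uncertain coordinates, which a simple threshold does not address.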
The project team examined existing digital gazetteers with regard to their structure (semantics, description of metadata) and content (reliability of the assignment between place names and coordinates). The team also discussed the geographical discourses inherent in existing gazetteers and examined strategies for revealing the specific power-knowledge relationships within them. Drawing on the results of this testing, project participants revised and refined the metadata structure and the web application interface. The latest version of the harvesting tool was launched online after a positive evaluation by the expert communities. Despite the current regional focus of the project, searches can also be conducted at the global level. Current work on the tool aims to find a way to incorporate more gazetteers, for example those of other countries or regions, without sacrificing clarity and responsiveness. Since the application is designed to support searches in existing gazetteers, the quality of the results depends directly on the quality of each connected source.

Dariusz Gierczak studied Geography and Slavic Studies at the Philipps University of Marburg and has held various positions at the Herder Institute for Historical Research on East Central Europe since 2008. Among other things, he was a member of the editorial team of the Encyclopaedia of the European East at the University of Klagenfurt from 2004 to 2007 and worked as an editor for the Historical-Topographical Atlas of Silesian Cities from 2008 to 2014. From 2016 to 2018 he worked on the project "Upper Silesia from the air" at the University of Siegen and in 2018/2019 on the project "Topography of the Shoah in Breslau/Wrocław 1933-1949" at the TU Dresden. From 2019 to 2022 he coordinated the project "Names change, places too. The Challenge of Developing Geodata-Based Gazetteer Research Technologies and Methods". The thematic focus of his work is on cartography, migration and urban research; he also works with controlled vocabularies and metadata structures, not only in his current position as a data researcher and standards and metadata editor.