Spatial Humanities 2024

Reconstructing and editing historical geodata. An open-source implementation of a conceptual framework
09-25, 17:30–18:00 (Europe/Amsterdam), MG1/02.05

This paper addresses the challenges encountered in, and the workflows needed for, reconstructing and editing historical geodata. It describes the results of an effort to reconstruct territorial changes in Hessen since the first half of the 19th century. The paper focuses on the implementation of an existing conceptual data modeling framework using a custom plugin for QGIS (https://qgis.org/). This plugin aims to facilitate the error-prone process of editing historical vector data and is published under an open-source license. Due to its generic design, it can easily be reused by other projects.

Introduction

Over the past years, researchers working in historical cartography have benefited from an ever-growing number of digitized maps available on the internet, provided by private and public institutions alike. Some of these maps have been georeferenced and can therefore be compared with historical and modern geodata in desktop and web-based Geographic Information Systems (GIS). However, these digital maps are still mere images: grids of raster cells with associated numerical values. Vector data can only be derived from them through critical scholarly research. Such vector data can then be used for visualization and geospatial analysis. The features extracted from a map range from topographical features and human settlement footprints to logistical infrastructure, e.g. canals, roads and railways. In historical geographical data modeling, special emphasis has been placed on the reconstruction of historical borders. While the late 1990s and early 2000s marked the heyday of national Historical GIS projects in Europe, little progress has been made in this domain since, with the exception of a recent project aiming to reconstruct the administrative boundaries of modern France. As a result, scholars who depend on such data for their research face historical vector data that varies widely in both quality and quantity.

Data Modeling

With the advent of GIS in the 1980s and 1990s, several conceptual frameworks were developed to cope with the central challenge in creating historical vector data: how to model change in space and time in a manner that is manageable for both existing software solutions and researchers while limiting the amount of duplicated data. The most prevalent concepts have been the snapshot approach, the time-variant approach and, as a variation of the latter, the Least Common Geometry (LCG) approach.

The snapshot model aims at reconstructing one or more points in time. Geometries are copied, so spatial entities whose borders have not changed are nonetheless included multiple times within the same data set. While this method is easily applied and yields fast and inexpensive initial results, it comes with significant costs in the long run: databases created this way tend to be virtually non-editable, as border changes discovered after the fact have to be added to several or all existing snapshots. Despite its obvious drawbacks, this approach is still applied in recent projects.
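The maintenance cost described above can be illustrated with a minimal sketch. It is not taken from any project's code: the years, the unit names and the use of sets of grid cells as a stand-in for polygon geometries are all hypothetical assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Cell = Tuple[int, int]        # hypothetical stand-in for a piece of area
Geometry = FrozenSet[Cell]    # a "polygon" is modeled as a set of grid cells

# Hypothetical snapshot data set: every snapshot stores a full copy of
# every unit, even if the unit did not change between snapshots.
SNAPSHOTS: Dict[int, Dict[str, Geometry]] = {
    1800: {"unit_a": frozenset({(0, 0), (0, 1)}), "unit_b": frozenset({(1, 0)})},
    1850: {"unit_a": frozenset({(0, 0), (0, 1)}), "unit_b": frozenset({(1, 0)})},
    1900: {"unit_a": frozenset({(0, 0)}), "unit_b": frozenset({(1, 0), (0, 1)})},
}


def snapshots_to_edit(
    snapshots: Dict[int, Dict[str, Geometry]], unit: str, outdated: Geometry
) -> List[int]:
    """Return the snapshot years whose copy of `unit` carries the outdated geometry."""
    return sorted(
        year for year, units in snapshots.items() if units.get(unit) == outdated
    )


# If new evidence shows that the early geometry of unit_a is wrong,
# every snapshot holding a copy of it has to be corrected by hand.
affected = snapshots_to_edit(SNAPSHOTS, "unit_a", frozenset({(0, 0), (0, 1)}))
print(affected)
```

In this toy data set a single correction touches two of the three snapshots. The number of affected copies grows with every snapshot added.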

A more complex approach is to encode validity by assigning start and end points in time to the geometric features. This concept can be extended by reconstructing the smallest entities of territorial boundaries, called Least Common Geometries (LCG), typically boroughs or parishes. Using those features as puzzle pieces, one can generate larger administrative units automatically by means of GIS-based union operations. In contrast to the snapshot model, border changes between features are recorded only once and can be edited as soon as new evidence comes to light. This is especially important in cases where no single written source registers all the territorial changes within a region. Indeed, much of our project's effort has gone into identifying the points in time at which specific border changes occurred, and the data model therefore needs to be flexible enough to incorporate new findings. While this approach is highly beneficial for the maintainability of the data product, new challenges arise with regard to data management. Researchers need to ensure that a) features do not overlap one another at any given point in time (spatial topology) and b) a continuous succession of features exists for each area (temporal topology). While a limited number of edits can be managed with standard GIS software, a growing number of border changes rapidly increases the time spent on quality assurance.
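The idea of assembling larger units from time-stamped LCGs can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's data schema: the record fields, the half-open validity interval, the parish and district names and the years are hypothetical, and a plain set union of grid cells stands in for a GIS union operation on polygons.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

Cell = Tuple[int, int]  # hypothetical stand-in for real polygon geometry


@dataclass(frozen=True)
class LcgRecord:
    lcg_id: str
    parent_unit: str      # larger administrative unit the LCG belongs to
    valid_from: int       # first year of validity (inclusive)
    valid_to: int         # first year the record is no longer valid (exclusive)
    cells: FrozenSet[Cell]


def units_at(records: List[LcgRecord], year: int) -> Dict[str, FrozenSet[Cell]]:
    """Build each larger unit for `year` by unioning the LCGs valid in that year."""
    units: Dict[str, FrozenSet[Cell]] = {}
    for rec in records:
        if rec.valid_from <= year < rec.valid_to:
            units[rec.parent_unit] = units.get(rec.parent_unit, frozenset()) | rec.cells
    return units


# Hypothetical data: parish_b moves from district_x to district_y in 1850.
# The change is recorded once, as two consecutive records for the same LCG.
RECORDS = [
    LcgRecord("parish_a", "district_x", 1800, 1900, frozenset({(0, 0), (0, 1)})),
    LcgRecord("parish_b", "district_x", 1800, 1850, frozenset({(1, 0)})),
    LcgRecord("parish_b", "district_y", 1850, 1900, frozenset({(1, 0)})),
    LcgRecord("parish_c", "district_y", 1800, 1900, frozenset({(1, 1)})),
]

before = units_at(RECORDS, 1840)
after = units_at(RECORDS, 1860)
print(sorted(before["district_x"]), sorted(after["district_x"]))
```

If later evidence dates the transfer to a different year, only the two `parish_b` records need to be adjusted, and every derived district boundary follows from the next union run.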

QGIS Time Editor

To facilitate the process of editing time-variant features, we developed the Time Editor plugin (source code: https://github.com/hil-mr/time-editor, documentation: https://wms.hlgl.uni-marburg.de/docs/time-editor/) for QGIS, a well-established open-source GIS. The plugin provides several checks that address the challenges associated with the practical application of the conceptual framework, the most important being the Temporal Integrity and Spatial Integrity checks. The Temporal Integrity check ensures that the features associated with an administrative unit do not overlap temporally. As historical administrative units might be dissolved and later reestablished, users can define exceptions for all existing integrity checks. The Spatial Integrity check ensures that at any point in time there are no intersections between adjacent features. All checks can be restricted by filter expressions and/or prior feature selections. The plugin was designed to be as generic as possible and has been extensively tested in different project contexts. In addition to integrity checks, the plugin provides functions that facilitate the creation of new features.
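The logic behind the two checks can be sketched in plain Python. The sketch does not reproduce the Time Editor's code or its API: the `Feature` type, the function names, the half-open validity intervals and the sample data are hypothetical, and sets of grid cells again stand in for polygon geometries.

```python
from itertools import combinations
from typing import FrozenSet, List, NamedTuple, Tuple


class Feature(NamedTuple):
    feature_id: str
    unit: str                           # administrative unit the feature belongs to
    valid_from: int                     # inclusive
    valid_to: int                       # exclusive
    cells: FrozenSet[Tuple[int, int]]   # stand-in for polygon geometry


def periods_overlap(a: Feature, b: Feature) -> bool:
    """True if the two half-open validity periods share at least one year."""
    return a.valid_from < b.valid_to and b.valid_from < a.valid_to


def temporal_conflicts(features: List[Feature]) -> List[Tuple[str, str]]:
    """Pairs of features of the same unit whose validity periods overlap."""
    return [
        (a.feature_id, b.feature_id)
        for a, b in combinations(features, 2)
        if a.unit == b.unit and periods_overlap(a, b)
    ]


def spatial_conflicts(features: List[Feature]) -> List[Tuple[str, str]]:
    """Pairs of features of different units that coexist in time and share area."""
    return [
        (a.feature_id, b.feature_id)
        for a, b in combinations(features, 2)
        if a.unit != b.unit and periods_overlap(a, b) and a.cells & b.cells
    ]


# Hypothetical test data with one temporal and one spatial error.
FEATURES = [
    Feature("f1", "unit_a", 1800, 1850, frozenset({(0, 0), (0, 1)})),
    Feature("f2", "unit_a", 1840, 1900, frozenset({(0, 0)})),          # overlaps f1 in time
    Feature("f3", "unit_b", 1800, 1900, frozenset({(0, 1), (1, 1)})),  # shares (0, 1) with f1
]

print(temporal_conflicts(FEATURES))
print(spatial_conflicts(FEATURES))
```

Comparing every pair of features is quadratic in the number of features. For larger data sets, a spatial index or a prior filter, comparable to the filter expressions mentioned above, would reduce the number of comparisons.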

Summary

With the methods described in this paper, we aim to facilitate the editing of historical vector datasets. We hope that the workflows and software solutions developed will benefit other projects in this domain. In addition, special emphasis is placed on openness, both in the software development process and in the licensing of the resulting data products.

Niklas Alt works at the Hessische Institut für Landesgeschichte (Hessian Institute for Regional History) in the field of digital history with a special emphasis on spatial humanities. He is involved in the creation of workflows and applications for the digitization and online presentation of historical maps.