Spatial Humanities 2024

Leon van Wissen

Leon van Wissen has a background in Dutch literature and computational linguistics and works in the Humanities Labs of the University of Amsterdam. As a Data Engineer in Cultural Data Collection and Linking at the UvA's Data Science Centre, he models cultural heritage data as Linked Open Data and contributes to the development of research infrastructures such as the GLOBALISE project (KNAW-HuC) and the Amsterdam Time Machine (UvA).


Session

09-25
17:30
30min
Click, See, Explore: A Multimodal Approach to Better Understand the Early Modern Colonial World through Old Maps
Leon van Wissen, Lodewijk Petram

Unlocking archives demands more than words alone. In the case of the paper archives of the Dutch East India Company (VOC), localizing toponyms through (historical) maps facilitates the interpretation of the giant collection of letters, reports, and ledgers. These maps can act as interpreters, bridging the gap between past and present place names (often changed by succeeding colonial regimes, independence, or other historical events) and revealing how cartographers and their commissioners perceived and exploited bodies of land and water. By taking information from maps into account, one might gain a richer understanding of the spatial context in written archives, moving beyond the mere textual representation of people, places and what happened to them.

At the GLOBALISE project,[1] funded by the Dutch Research Council (NWO), we aim to make the textual archives of the VOC (covering the period 1605-1799) searchable and researchable by recognizing handwritten text on almost 5 million scans, and by annotating and identifying entities such as persons, places, and polities, including the events they were part of. This large textual corpus is the starting point for computer-assisted research into the history of Dutch colonization as seen from the perspective of the VOC.

To improve our understanding of the colonial context, we have complemented our textual corpus by interlinking it with a visual counterpart: a corpus of colonial maps. In this presentation, we present a pilot that involves a three-layered enrichment of maps sourced from two collections (1584-1813) of the Dutch National Archives. We intend to develop the pilot further with more maps from other collections and archives, such as the Royal Netherlands Institute of Southeast Asian and Caribbean Studies (Leiden) and the Allard Pierson Museum (Amsterdam).[2]

Ahead of the enrichment steps, we convert each collection's Encoded Archival Description (EAD) file to a IIIF Collection (with subcollections and manifests, cf. the IIIF Presentation API[3]) to replicate its archival hierarchy and context. Next, we supplement these IIIF Collections with different kinds of (web) annotations, grouped by purpose in separate layers.
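As a rough illustration of this conversion step, the sketch below maps a nested archival hierarchy onto a IIIF Collection with subcollections and manifest references. It assumes the EAD file has already been parsed into plain Python dictionaries; the input structure, the `to_iiif` helper, the identifiers, the titles, and the example.org base URL are hypothetical and not taken from the GLOBALISE pipeline.

```python
CONTEXT = "http://iiif.io/api/presentation/3/context.json"
BASE = "https://example.org/iiif"  # hypothetical base URL


def to_iiif(node, top_level=True):
    """Turn one node of an already parsed archival hierarchy into IIIF JSON.

    Nodes with children become Collections; leaf nodes become
    references to Manifests.
    """
    has_children = bool(node.get("children"))
    kind = "Collection" if has_children else "Manifest"
    resource = {
        "id": f"{BASE}/{kind.lower()}/{node['id']}",
        "type": kind,
        "label": {"en": [node["title"]]},
    }
    if top_level:
        # Only the outermost resource carries the JSON-LD context.
        resource = {"@context": CONTEXT, **resource}
    if has_children:
        resource["items"] = [
            to_iiif(child, top_level=False) for child in node["children"]
        ]
    return resource


# Hypothetical hierarchy: collection > series > single map.
hierarchy = {
    "id": "coll",
    "title": "Example map collection",
    "children": [
        {
            "id": "series-1",
            "title": "Example series",
            "children": [{"id": "map-1", "title": "Example map"}],
        }
    ],
}

collection = to_iiif(hierarchy)
```

Keeping the archival levels as nested Collections means a viewer can present the maps in the same order and grouping as the finding aid.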
In our first enrichment layer, we link the early modern map view of colonies and other territories to a modern map using the Allmaps tool (https://allmaps.org/), which adds IIIF Georeference Extension[4] annotations. This layer bridges the historical representation of a place and its contemporary one, and illustrates how the area was seen through the colonial lens: a high level of detail on a large scale likely means that the area was considered of great importance and that there was considerable colonial influence.
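To give an impression of what such a georeference annotation contains, the sketch below assembles one by hand: a pixel mask around the map area plus control points that tie pixel positions to longitude and latitude. The shape follows our reading of the Georeference Extension; in practice a tool such as Allmaps generates it. The `georef_annotation` helper, all URLs, the image size, and the coordinates are made-up placeholders.

```python
def georef_annotation(annotation_id, image_service, width, height,
                      mask, control_points):
    """Build a georeference annotation for one map image.

    mask: list of (x, y) pixel coordinates outlining the map area.
    control_points: list of ((x, y), (lon, lat)) pairs.
    """
    polygon = " ".join(f"{x},{y}" for x, y in mask)
    features = [
        {
            "type": "Feature",
            "properties": {"resourceCoords": [x, y]},
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        }
        for (x, y), (lon, lat) in control_points
    ]
    return {
        "@context": [
            "http://iiif.io/api/extension/georef/1/context.json",
            "http://iiif.io/api/presentation/3/context.json",
        ],
        "id": annotation_id,
        "type": "Annotation",
        "motivation": "georeferencing",
        "target": {
            "type": "SpecificResource",
            "source": {
                "id": image_service,
                "type": "ImageService3",
                "width": width,
                "height": height,
            },
            "selector": {
                "type": "SvgSelector",
                "value": f'<svg><polygon points="{polygon}" /></svg>',
            },
        },
        "body": {
            "type": "FeatureCollection",
            "transformation": {"type": "polynomial", "options": {"order": 1}},
            "features": features,
        },
    }


# Placeholder values, for illustration only.
annotation = georef_annotation(
    "https://example.org/anno/georef-1",
    "https://example.org/iiif/image/map-1",
    6000, 4000,
    mask=[(0, 0), (6000, 0), (6000, 4000), (0, 4000)],
    control_points=[
        ((1200, 900), (106.8, -6.1)),
        ((4800, 950), (107.0, -6.1)),
        ((3000, 3500), (106.9, -6.3)),
    ],
)
```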

The second layer is about named places. We apply a text spotter model to automatically extract labels from the maps,[5] transcribe these labels with a handwritten-text recognition model,[6] and connect them to the places in our written corpus by linking them to our knowledge graph, external thesauri, and other gazetteers.
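One possible way to store a single result of this layer is a web annotation that targets the label's region on the map and carries both the transcription and a link to a place identifier. The sketch below is an assumption about the data shape, not the project's actual output: the `label_annotation` helper, the choice of a bounding-box selector, the purposes assigned to the bodies, the example label, and all URIs are hypothetical.

```python
def label_annotation(annotation_id, canvas_id, polygon, text, place_uri=None):
    """Describe one spotted map label as a web annotation.

    polygon: list of (x, y) pixel coordinates returned by a text spotter.
    The polygon is reduced to its bounding box for the xywh selector.
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x, y = min(xs), min(ys)
    w, h = max(xs) - x, max(ys) - y

    # First body: the transcription. Optional second body: the linked place.
    bodies = [{"type": "TextualBody", "value": text, "purpose": "supplementing"}]
    if place_uri:
        bodies.append(
            {"type": "SpecificResource", "source": place_uri, "purpose": "identifying"}
        )

    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": annotation_id,
        "type": "Annotation",
        "motivation": "supplementing",
        "body": bodies,
        "target": {
            "type": "SpecificResource",
            "source": canvas_id,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Made-up example label and identifiers.
label = label_annotation(
    "https://example.org/anno/label-1",
    "https://example.org/iiif/manifest/map-1/canvas/1",
    polygon=[(10, 20), (110, 20), (110, 60), (10, 60)],
    text="Batavia",
    place_uri="https://example.org/places/batavia",
)
```

A polygon-based SVG selector would preserve the exact outline of curved or rotated labels; the bounding box is used here only to keep the example short.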
Finally, the third layer deals with geographic iconography. We have prepared a sample training set for use in a segmentation model that annotates and classifies icons and symbols on the maps, such as a Dutch flag representing a Dutch settlement, or trees signifying plantations and colonial exploitation. This layer is crucial both for comprehending the Dutch colonial worldview and for tracing its evolution in the early modern era.

Each of these layers brings in a specific type of interpretation that can be viewed independently or analyzed in combination with the other enrichments. For instance, toponyms on maps can be linked to their corresponding icons. For convenience and maximum interoperability, we aggregate a pointer to the image itself, its metadata, and our enrichments in a single IIIF Manifest. This manifest exposes all enrichments (ours and potentially those of others) and so acts as a unified container for the layered information. The project's research environment can potentially call upon this container to help researchers get a grip on the historical material: analyzing the textual materials and these three layers together, and through time, paints a multifaceted picture of how the colonizer perceived the world. From commissioned maps to embedded references, this combined analysis unlocks crucial context for interpreting the early modern world through the colonizer's lens.
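The aggregation described above can be sketched as a minimal IIIF Presentation 3 Manifest with one Canvas: the image is attached as a painting annotation, and each enrichment layer is referenced as a separate AnnotationPage under `annotations`. The `build_manifest` helper, the URL patterns, the image size, and the label are assumptions for illustration only.

```python
def build_manifest(manifest_id, label, image_service, width, height, layer_pages):
    """Bundle one map image and its enrichment layers in a single Manifest.

    layer_pages: URLs of AnnotationPages, one per enrichment layer.
    """
    canvas_id = f"{manifest_id}/canvas/1"
    painting = {
        "id": f"{canvas_id}/painting/1",
        "type": "Annotation",
        "motivation": "painting",
        "body": {
            "id": f"{image_service}/full/max/0/default.jpg",
            "type": "Image",
            "format": "image/jpeg",
            "width": width,
            "height": height,
            "service": [
                {"id": image_service, "type": "ImageService3", "profile": "level1"}
            ],
        },
        "target": canvas_id,
    }
    canvas = {
        "id": canvas_id,
        "type": "Canvas",
        "width": width,
        "height": height,
        # The image itself.
        "items": [
            {"id": f"{canvas_id}/page/1", "type": "AnnotationPage", "items": [painting]}
        ],
        # The enrichment layers, referenced rather than embedded.
        "annotations": [{"id": url, "type": "AnnotationPage"} for url in layer_pages],
    }
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "items": [canvas],
    }


# Placeholder identifiers for the three layers.
manifest = build_manifest(
    "https://example.org/iiif/manifest/map-1",
    "Example map",
    "https://example.org/iiif/image/map-1",
    6000, 4000,
    layer_pages=[
        "https://example.org/anno/map-1/georeferencing",
        "https://example.org/anno/map-1/toponyms",
        "https://example.org/anno/map-1/iconography",
    ],
)
```

Referencing the layers by URL rather than embedding them keeps the manifest small and lets enrichments made by others be added as further entries.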

References
- Li, Z., Chiang, Y.-Y., Tavakkol, S., Shbita, B., Uhl, J. H., Leyk, S., & Knoblock, C. A. (2020). An automatic approach for generating rich, linked geo-metadata from historical map images. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3290-3298).
- Petram, L., & van Rossum, M. (2022). Transforming historical research practices – a digital infrastructure for the VOC archives (GLOBALISE). International Journal of Maritime History, 34(3), 494-502.


[1] Petram and Van Rossum (2022), see also: https://globalise.huygens.knaw.nl/
[2] The Leupe collection of foreign maps (1584-1813, 4.VEL and 4.VELH). See https://www.nationaalarchief.nl/onderzoeken/archief/4.VEL and https://www.nationaalarchief.nl/onderzoeken/archief/4.VELH respectively.
[3] https://iiif.io/api/presentation/3.0/
[4] https://iiif.io/api/extension/georef/
[5] We use the model created by Li et al. (2020) and a slightly modified version of their pipeline. See: https://github.com/machines-reading-maps/map-kurator
[6] An open-source HTR pipeline has been developed by the KNAW Humanities Cluster, Amsterdam: https://github.com/knaw-huc/loghi

Open spatial data (Chair: Christoph Schlieder)
MG1/02.05