The computational humanities and social sciences (CHSS) increasingly rely on large, text-rich datasets – from digital corpora and learner corpora to discourse-annotated datasets and historical archives – yet researchers often struggle with the limitations of existing tools for large-scale data processing, modeling, and reproducibility. This minisymposium introduces Julia as a powerful, expressive, and high-performance solution for CHSS research. We showcase how Julia’s strengths (speed, a solid type system, first-class multiple dispatch, and seamless interoperability with Python and R) enable both rapid experimentation and production-grade analysis. Through case studies ranging from corpus statistics and collocation networks to mixed-effects modeling of experimental data and large-scale language data pipelines, we highlight established Julia packages alongside emerging ones such as TextAssociations.jl, and demonstrate how Julia can substantially expand what researchers in the humanities and social sciences can achieve. The minisymposium aims to build bridges between Julia developers and CHSS scholars while fostering a new community of users working with rich textual, linguistic, and sociocultural data.
3 organizers: Alexandros Tantos (Aristotle University of Thessaloniki), Julia Mueller (Universitaet Freiburg) and Axel Bohmann (Universitaet Koeln)
Overview
The computational humanities and social sciences (CHSS) constitute a rapidly growing area that works with massive amounts of unstructured textual, linguistic, and sociocultural data. Despite the scale and complexity of these datasets, CHSS researchers overwhelmingly rely on tools that are either slow (e.g., pure Python), fragmented across ecosystems, or difficult to integrate into modern workflows that require both prototyping and high-performance computation.
This minisymposium aims to introduce Julia as an ideal language for the next generation of CHSS research. Julia’s combination of speed, expressiveness, solid type system, and multiple dispatch makes it uniquely suited for humanities research where interpretability, clarity of modeling, and independence from opaque black-box pipelines are essential. Moreover, Julia’s seamless interoperability with Python and R lowers the barrier for researchers transitioning from more traditional CHSS workflows.
The minisymposium presents real examples of how Julia is already being adopted in CHSS, including large-scale corpus processing, collocation and association analyses, network-theoretic models of discourse, and mixed-effects modeling. We highlight emerging packages such as TextAssociations.jl, which provides high-performance computation of linguistic association measures; demonstrate how Julia’s statistical and data ecosystems support complex modeling and linguistic experimentation; and show pathways for community growth, education, and contribution.
Our goals are to:
- Introduce Julia to researchers in CHSS, where awareness remains low but potential impact is high.
- Demonstrate concrete use cases for text-rich, corpus-driven research that requires both speed and expressiveness.
- Connect Julia developers with domain researchers, fostering collaborations on text data, linguistics, and social-science applications.
Draft Schedule (3 hours)
Introduction: Why CHSS researchers should adopt Julia (15 minutes)
- The landscape of text analysis, corpus linguistics, and digital humanities today
- Limitations of common tools (Python, R, command-line pipelines)
- Why Julia’s design (types, speed, expressiveness) matters for humanities modeling and interpretability
- Easy and convenient interoperability with existing CHSS workflows
Efficient Corpus Processing and Text Pipelines in Julia (30 minutes)
- Handling large textual datasets (historical corpora, learner corpora, annotation repositories)
- String processing performance and benchmarking
- Integrating Julia with Python and R tools for NLP
- Case study: Preparing corpora for downstream statistical modeling
Linguistic Association Measures & Collocation Networks with TextAssociations.jl (40 minutes)
- Why association measures matter for humanities research
- Demonstration of TextAssociations.jl: speed, type stability, reproducibility
- Collocation networks, discourse networks, and their relevance for social and linguistic interpretation
- Opportunities for community contributions and package extensions
Statistical Modeling for Humanities Research in Julia (45 minutes)
- Mixed-effects models for linguistic, discourse, or educational datasets
- Practical examples combining corpus data with hierarchical models
- Bridging exploratory and confirmatory approaches
Panel and Q&A (20 minutes)
- Discussion with presenters
- How can Julia developers and CHSS researchers collaborate?
- Identifying barriers and next steps
Building a Julia-CHSS Community (30 minutes)
- Opportunities for package development
- Teaching Julia in humanities departments
- Interoperability and reproducible workflows
- Creating a Julia-CHSS special interest group
Alexandros Tantos is an Associate Professor of Text and Computational Linguistics at the Department of Philology, Aristotle University of Thessaloniki (AUTH). He completed his postgraduate studies in Natural Language Processing at the University of Manchester Institute of Science and Technology (UMIST) in 2003. From 2004 to 2008, Alexandros contributed as a research associate to the SFB 471 project, Variation and Development in the Lexicon, while completing his PhD at the University of Konstanz, Germany. Upon returning to Greece in 2008, he taught Computational Linguistics at the Universities of Crete, Aegean, Thrace, and AUTH. In 2010, he was appointed as a Lecturer in Text Linguistics at AUTH. Since then, he has led multiple research initiatives, including the development of two significant linguistic resources for Greek: ESKEIMATH and C58. Between 2020 and 2023, he served as the scientific director for the project Latent Aspects in L2 Acquisition, funded by the Hellenic Foundation for Research and Innovation (H.F.R.I.). Currently, his research centers on computational semantics and pragmatics, the application of Large Language Models for first and second language learning and teaching, as well as corpus linguistics, with a strong focus on the development, maintenance, and utilization of linguistic resources.