PyConDE & PyData Berlin 2024

Analyzing COVID-19 Protest Movements: A Multidimensional Approach Using Geo-Social Media Data
2024-04-23, B07-B08

The COVID-19 pandemic and the associated policy measures led to worldwide protest movements that were characterized by the spread of misinformation and conspiracy theories, predominantly on social media platforms. Publicly available social media data is therefore a powerful proxy for studying these protest movements. The data, consisting of user locations, follower relationships, and content information, makes it possible to understand the geographical centers of activity, the network structure, and the key themes of conspiracy movements.

This talk will present a multi-dimensional network analysis of the Austrian COVID-19 protest movement using Python libraries like geopandas, networkx and gensim.
In particular, it will demonstrate how to identify geospatial hot spots using spatial statistics, densely connected clusters within the network using community detection techniques, and dominant content themes using topic modeling approaches.
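
To make the idea of a geospatial hot spot concrete, here is a minimal sketch of one common hot spot statistic, the Getis-Ord Gi* z-score, written with numpy only. The abstract does not say which spatial statistic the talk uses, so the choice of Gi* is an assumption, as are the function name `getis_ord_gi_star`, the binary distance-band weights, and the toy grid of activity counts.

```python
import numpy as np


def getis_ord_gi_star(values, coords, radius):
    """Return a Gi* z-score per location (hypothetical helper, not from the talk).

    Uses binary weights: two locations are neighbors if they lie within
    `radius` of each other. Each location counts as its own neighbor,
    which is what distinguishes Gi* from Gi.
    """
    x = np.asarray(values, dtype=float)
    pts = np.asarray(coords, dtype=float)
    n = len(x)

    # Pairwise Euclidean distances between all locations.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    w = (dist <= radius).astype(float)

    mean = x.mean()
    std = np.sqrt((x ** 2).mean() - mean ** 2)

    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)

    numerator = w @ x - mean * w_sum
    denominator = std * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator


# Made-up data: a 5x5 grid of cells with an activity count per cell.
# The four cells in one corner have much higher counts than the rest.
coords = [(row, col) for row in range(5) for col in range(5)]
counts = np.ones(25)
for hot_cell in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    counts[coords.index(hot_cell)] = 10

# A radius of 1.5 makes the eight surrounding cells neighbors.
z_scores = getis_ord_gi_star(counts, coords, radius=1.5)
hottest = coords[int(np.argmax(z_scores))]
print("Cell with the highest Gi* z-score:", hottest)
```

High positive z-scores mark cells whose neighborhood has unusually high activity compared with the whole study area, and strongly negative scores mark cold spots. On real data the cells would typically be administrative units or grid cells built with geopandas.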

The presentation highlights how data-driven analysis enables further understanding of movements that may pose threats to democracy, alongside the importance of publicly available social media data for addressing societal challenges.


The talk will walk through the steps undertaken in the analysis of a protest network using Twitter data. It will explain the methods, present the results, and show the code and libraries used, roughly following this outline:

  1. Motivation: What was special about the COVID-19 protest movement and why a multi-dimensional view is crucial for understanding it.
  2. The Data: The information retrieved via Twitter's API and the necessary pre-processing steps.
  3. Spatial Analysis: The statistical means to understand the movement's spatial manifestation, including an explanation of the methods used and a presentation of the results.
  4. Network Analysis: Social network analysis alone is not enough for understanding protest movements. Including the spatial information yields deeper insights by geospatially mapping network communities and centralities.
  5. Semantic Analysis: Understanding the dominant themes in the protest network with semantic analysis: generating the document embeddings, clustering topics, and dealing with a large dataset of tweets.
  6. Conclusion: Importance of multi-dimensional analysis and the availability of social media data for studying societally important phenomena.
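
The network analysis step in the outline can be sketched roughly as follows: detect communities in a follower graph, then summarize each community by a geographic centroid and its most central account. This is only an illustration under stated assumptions. The toy graph, the account names, the coordinates, and the choice of greedy modularity maximization and degree centrality are all hypothetical and not taken from the talk.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical follower network: two tightly knit groups joined by one edge.
group_east = ["a", "b", "c", "d"]
group_west = ["e", "f", "g", "h"]

G = nx.Graph()
for group in (group_east, group_west):
    G.add_edges_from(combinations(group, 2))
G.add_edge("d", "e")  # single bridge between the two groups

# Made-up (longitude, latitude) per account, roughly around Vienna and Salzburg.
locations = {
    "a": (16.37, 48.21), "b": (16.40, 48.20), "c": (16.35, 48.22), "d": (16.30, 48.19),
    "e": (13.05, 47.80), "f": (13.04, 47.81), "g": (13.06, 47.79), "h": (13.03, 47.82),
}
nx.set_node_attributes(G, locations, "lonlat")

# Community detection by greedy modularity maximization (one option among many).
communities = greedy_modularity_communities(G)
centrality = nx.degree_centrality(G)

summary = []
for members in communities:
    lons = [G.nodes[node]["lonlat"][0] for node in members]
    lats = [G.nodes[node]["lonlat"][1] for node in members]
    summary.append({
        "members": sorted(members),
        # Plain mean of coordinates; adequate only for small regions.
        "centroid": (sum(lons) / len(lons), sum(lats) / len(lats)),
        # Account with the highest degree centrality inside the community.
        "hub": max(sorted(members), key=centrality.get),
    })

for entry in summary:
    print(entry)
```

With real data, the centroids and hubs could be loaded into a geopandas GeoDataFrame to plot the communities on a map.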

Python libraries that were used (among others): geopandas, networkx, bertopic, lda and friends.


Expected audience expertise: Domain:

None

Expected audience expertise: Python:

None

Abstract as a tweet (X) or toot (Mastodon):

Understanding protest movements through multi-dimensional geo-social media data analysis.

I am a researcher and PhD candidate at the Department of Geoinformatics at the University of Salzburg. My research focuses on the spatio-temporal analysis of online communication networks based on social media data. Before that, I worked as a software engineer on scalable big data processing for machine learning applications.