JuliaCon 2024

BERTopic to accelerate Ukrainian aid by the Red Cross
2024-07-11, Else (1.3)

Topic Modeling extracts the topics discussed in a set of documents. BERTopic is a Topic Modeling method that uses Large Language Models. This talk gives a high-level overview of how BERTopic works, how it can be evaluated, and how it is applied in a use case of the Netherlands Red Cross. In this use case, BERTopic helps to gain insight into the needs of Ukrainian refugees as expressed through social media.


510, the Data & Digital initiative of the Netherlands Red Cross, performs Social Media Listening on Telegram channels of Ukrainian refugees to obtain insights into their needs. As a result, the Red Cross can direct its aid more purposefully. Obtaining such insights can be time-intensive, while in an emergency one would like to initiate aid as quickly as possible. BERTopic has proven to be a powerful tool for quickly obtaining an overview of the discussion on social media.

In this presentation, we give a high-level explanation of BERTopic, an LLM-based method that obtains topics from a set of documents and categorizes the documents accordingly. Other than basic machine learning knowledge, no background knowledge is required. Rather than explaining how an LLM works, we will focus on its task within BERTopic. Next, we will discuss the evaluation of BERTopic, which is challenging because it is an unsupervised method. Three different metrics are proposed, including an innovative approach that creates a test set with GPT. Finally, we will discuss the application of BERTopic to the described use case and its results. This will show the practicality of BERTopic and hopefully inspire other applications.
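
To make "obtaining topics from a set of documents" more concrete, the following is a minimal sketch of the last step of BERTopic, the class-based TF-IDF (c-TF-IDF) weighting that picks descriptive words per cluster. It is not the code used in the talk: the messages, cluster names, stopword list, and function names are invented for illustration, and the earlier steps (embedding the documents with a language model and clustering the embeddings) are assumed to have already produced the two clusters.

```python
import math
from collections import Counter

# Hypothetical toy data: messages already grouped into clusters.
# In BERTopic this grouping comes from embedding + clustering.
clusters = {
    "housing": ["looking for a room to rent", "need a room near the city"],
    "health": ["where can I find a doctor", "need a doctor for my child"],
}

# Tiny illustrative stopword list (an assumption, not BERTopic's default).
STOPWORDS = {"a", "the", "to", "for", "i", "my", "can", "where", "need"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def c_tf_idf(clusters):
    """Score each term per cluster as tf * log(1 + A / f_t).

    tf  = count of the term in the cluster (all documents joined),
    A   = average number of words per cluster,
    f_t = count of the term across all clusters.
    """
    tf = {
        name: Counter(w for doc in docs for w in tokenize(doc))
        for name, docs in clusters.items()
    }
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    avg_words = sum(total.values()) / len(tf)
    return {
        name: {t: c * math.log(1 + avg_words / total[t]) for t, c in counts.items()}
        for name, counts in tf.items()
    }


def top_terms(scores, n=2):
    """Return the n highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        name: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for name, s in scores.items()
    }


if __name__ == "__main__":
    print(top_terms(c_tf_idf(clusters), n=1))
```

Terms that occur often within one cluster but rarely elsewhere receive the highest scores, which is why such words can serve as a short label for the topic.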

Time breakdown:
- Introduction to the Red Cross use case: 5 min
- BERTopic: 10 min
- Evaluation: 5 min
- Application and results: 5 min
- Q&A: 5 min