Dr. Lisa Andreevna Chalaguine
Lisa is an accomplished educator, researcher, and freelancer specializing in data science, natural language processing (NLP), and artificial intelligence. With a PhD in Intelligent Systems from UCL and a master's from Imperial College London, Lisa has extensive experience in academia and industry, having taught at UCL and contributed to impactful projects such as those with Cancer Research UK.
A digital nomad at heart, Lisa teaches corporate clients and supervises university students worldwide, focusing on Python, machine learning, and NLP. Known for their engaging teaching style and passion for problem-solving, they are currently developing innovative courses and creating a YouTube channel featuring masterclasses on data analysis and machine learning.
Driven by a love for teaching, research, and helping others succeed, Lisa is exploring opportunities to return to academia, with aspirations to lecture in Eastern Europe and Central Asia. Multilingual and versatile, they are shaping the future of data science education while continuing to inspire learners globally.
Session
Topic modelling has come a long way, evolving from traditional statistical methods to approaches that leverage embeddings and neural networks. Python's diverse library ecosystem includes tools such as gensim's implementation of Latent Dirichlet Allocation (LDA), Top2Vec, BERTopic, and Contextualized Topic Models (CTM). This talk evaluates these popular approaches using a dataset of UK climate change policies, considering use cases relevant to organisations like DEFRA (Department for Environment, Food & Rural Affairs). The analysis explores real-time integration, dynamic topic modelling (tracking how topics change over time), adding new documents to an existing model, and retrieving similar documents. Attendees will learn the strengths, limitations, and practical applications of each library so they can make informed decisions for their own projects.