Kavit Tolia PyData London 2026

Kavit Tolia

The speaker spent over 12 years working in quantitative roles in investment management before returning to academia to study Artificial Intelligence. They are currently completing a Master’s degree in AI and ML in Science, and are particularly interested in how modern machine learning systems behave in practice, especially where modelling assumptions quietly break down.

Session

06-06

15:30

45min

Do Multilingual Embeddings Really Share a Semantic Space? Practical Lessons Across Scripts and Languages

Kavit Tolia

Multilingual embeddings are often assumed to place different languages into a shared semantic space. In practice, that alignment breaks down in systematic ways.

This talk explores where multilingual embeddings work, where they fail, and why. Using examples across multiple languages, I show how tokenisation, training data imbalance, and semantic ambiguity shape embedding behaviour in practice, and I present practical diagnostics for evaluating multilingual embeddings.

Hardwick Hub
