Kavit Tolia
The speaker spent over 12 years working in quantitative roles in investment management before returning to academia to study Artificial Intelligence. They are currently completing a Master’s degree in AI and ML in Science, and are particularly interested in how modern machine learning systems behave in practice, especially where modelling assumptions quietly break down.
Session
Multilingual embeddings are often assumed to place different languages into a shared semantic space. In practice, that alignment breaks down in systematic ways.
This talk explores where multilingual embeddings work, where they fail, and why. Using examples across multiple languages, I show how tokenisation, training data imbalance, and semantic ambiguity shape embedding behaviour, and I present practical diagnostics for evaluating multilingual embeddings.
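As a rough illustration of what such a diagnostic could look like (not necessarily the method presented in the talk), the sketch below scores how well two sets of sentence embeddings line up across languages. It assumes that row i of `src` and row i of `tgt` are embeddings of the same sentence in two languages; the function names `retrieval_accuracy` and `alignment_gap` are hypothetical, and the vectors would come from whichever embedding model is under evaluation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieval_accuracy(src, tgt):
    """Fraction of source sentences whose nearest target (by cosine) is their own translation.

    Assumption: src[i] and tgt[i] embed the same sentence in two languages.
    """
    if not src or len(src) != len(tgt):
        raise ValueError("src and tgt must be non-empty and the same length")
    hits = 0
    for i, u in enumerate(src):
        sims = [cosine(u, v) for v in tgt]
        best = max(range(len(tgt)), key=sims.__getitem__)
        hits += int(best == i)
    return hits / len(src)


def alignment_gap(src, tgt):
    """Mean cosine of translation pairs minus mean cosine of mismatched pairs.

    A gap near zero suggests translations are no closer than unrelated sentences.
    """
    n = len(src)
    if n < 2 or n != len(tgt):
        raise ValueError("need at least two aligned pairs")
    matched = [cosine(src[i], tgt[i]) for i in range(n)]
    mismatched = [cosine(src[i], tgt[j]) for i in range(n) for j in range(n) if i != j]
    return sum(matched) / len(matched) - sum(mismatched) / len(mismatched)


if __name__ == "__main__":
    # Toy vectors for illustration only; real inputs would be model outputs.
    english = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    other = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.2, 0.8]]
    print("retrieval accuracy:", retrieval_accuracy(english, other))
    print("alignment gap:", alignment_gap(english, other))
```

Running the two scores per language pair, rather than pooling all languages together, is one way to surface pairs where alignment is systematically weaker.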