PyConDE & PyData Berlin 2024

Daryna Dementieva

Hi, I'm Daryna 👋🇺🇦 I am a postdoctoral researcher in the Social Computing Research Group at the Technical University of Munich 🇩🇪. Before that, I obtained my PhD at the Skolkovo Institute of Science and Technology under the supervision of Alexander Panchenko, on the topic "Method for Fighting Harmful Multilingual Textual Content" 📜. Currently, I am continuing this line of research by taking part in an eXplainable AI (XAI) project and by working on multilingual NLP, developing models for the Ukrainian language.


X / Twitter handle –

@iamdddaryna

Github –

https://github.com/dardem

LinkedIn –

https://www.linkedin.com/in/daryna-dementieva/


Session

04-23
14:10
30min
How to Do Monolingual, Multilingual, and Cross-lingual Text Classification in April, 2024
Daryna Dementieva

In 2023, the field of NLP was shaken up once again: the appearance of powerful closed- and open-source LLMs opened new possibilities for text processing. However, many questions about the usability of these models for typical NLP tasks remain open. One of them is quite simple: if we want a classification model for some task, can we rely on LLMs, or is it still better to fine-tune our own model? It might be easier to obtain a classifier for English, but what if the target language is not so resource-rich? This presentation describes the main "recipes" for obtaining the best text classifier depending on the language and data availability.

PyData: Natural Language Processing & Computer Vision
B07-B08