Siddharth Gupta
Computational Cognitive Science researcher at the University of Potsdam, Potsdam, Germany
Sessions
Do you find it daunting to learn a traditional web framework just to put your data science work online? Worry no more! Streamlit can speed things up: it is designed for exactly this purpose, creating beautiful data-related web apps that can be deployed in minutes.
In this hands-on tutorial, we’ll go through various features of Streamlit and build a small lyric fetcher app based on a curated dataset of around 24K Billboard top-100 songs.
What can go wrong with tokenizer encodings? Everything! I will share my experience of understanding, misunderstanding, and ultimately learning to work with tokenization in LLMs. I will discuss what surprisal is, its relevance to my research, and its connection to tokenization. The talk will include various examples illustrating how misunderstandings of tokenization can arise, as well as strategies for debugging and preventing these issues.
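As a rough illustration of the link between surprisal and tokenization mentioned above, here is a minimal sketch. It assumes the standard definition of surprisal as the negative log probability of a token given its context, and it uses made-up probabilities rather than output from a real model. The helper names `token_surprisal` and `word_surprisal` are hypothetical and are not taken from the talk.

```python
import math


def token_surprisal(prob: float) -> float:
    """Surprisal in bits of one token: -log2 p(token | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def word_surprisal(token_probs: list[float]) -> float:
    """Surprisal of a word that the tokenizer split into several tokens.

    By the chain rule, the word's probability is the product of its
    tokens' conditional probabilities, so the surprisals add up.
    """
    return sum(token_surprisal(p) for p in token_probs)


# Hypothetical probabilities, chosen only to keep the arithmetic simple.
# Case 1: the word is a single token with p = 0.125.
one_token = word_surprisal([0.125])

# Case 2: the same word split into two tokens with p = 0.5 and p = 0.25.
two_tokens = word_surprisal([0.5, 0.25])

print(one_token, two_tokens)
```

One common pitfall this points at: taking only the first or last sub-token's surprisal as the word's surprisal, instead of summing over all of the word's tokens, gives a different number whenever the tokenizer splits the word.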