Gabi Kadlecova
I am a Machine Learning Researcher at distil labs, where I work on knowledge distillation and tool calling for small language models. I did my PhD at Charles University in Prague, focusing on Neural Architecture Search and surrogate models.
I believe not every problem needs a large, complex model. Both during my PhD and at distil labs, I have been exploring how small models fare against the state of the art. I enjoy analyzing the problem first - understanding the limitations of both small and large models is what helps us really solve it.
Session
Large language models have been widely used in tool-calling workflows thanks to their strong performance in generating appropriate function calls. However, due to their size and cost, they are inaccessible to small-scale builders, and their reliance on server-side computing makes data privacy challenging. Small language models (SLMs) are a promising, affordable alternative that can run on local hardware, ensuring higher privacy.
Unfortunately, SLMs struggle with this task - they pass wrong arguments when calling functions with many parameters, and make mistakes when the conversation spans multiple turns. On the other hand, for production applications with specific API sets, we often don't need general-purpose LLMs - we need reliable, specialized models.
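The argument errors described above can be caught mechanically by validating a model-generated call against its tool schema. The sketch below illustrates this with a hypothetical `book_flight` tool and a minimal validator (the function, its parameters, and the validator are illustrative, not from the talk):

```python
import json

# Hypothetical tool schema in the JSON-Schema style commonly used by
# chat-completion APIs; the function and parameter names are made up.
BOOK_FLIGHT_SCHEMA = {
    "name": "book_flight",
    "parameters": {
        "type": "object",
        "properties": {
            "origin": {"type": "string"},
            "destination": {"type": "string"},
            "date": {"type": "string"},
            "passengers": {"type": "integer"},
            "cabin_class": {"type": "string"},
        },
        "required": ["origin", "destination", "date"],
    },
}

def validate_call(schema: dict, arguments: str) -> list[str]:
    """Return a list of problems with a model-generated tool call."""
    errors = []
    try:
        args = json.loads(arguments)
    except json.JSONDecodeError:
        return ["arguments are not valid JSON"]
    params = schema["parameters"]
    # Check that every required parameter was passed.
    for name in params["required"]:
        if name not in args:
            errors.append(f"missing required argument: {name}")
    # Check that passed arguments exist in the schema and have the right type.
    type_map = {"string": str, "integer": int}
    for name, value in args.items():
        prop = params["properties"].get(name)
        if prop is None:
            errors.append(f"unknown argument: {name}")
        elif not isinstance(value, type_map[prop["type"]]):
            errors.append(f"wrong type for argument: {name}")
    return errors

# A small model might emit a call like this, dropping a required field
# and hallucinating a parameter the schema does not define:
bad_call = '{"origin": "PRG", "destination": "BER", "seat": "12A"}'
print(validate_call(BOOK_FLIGHT_SCHEMA, bad_call))
# → ['missing required argument: date', 'unknown argument: seat']
```

Validation like this catches malformed calls, but only fine-tuning fixes the model that produces them - which is where distillation comes in.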
This talk demonstrates how to increase the accuracy of SLMs (under 8B parameters) on custom tool-calling tasks. We will share how knowledge distillation helps get the most out of SLMs in low-data settings - they can even outperform LLMs! We will present the whole pipeline, from data generation through fine-tuning to local deployment.