Luca Baggi
AI Engineer at xtream by day and open source maintainer by night. I strive to be an active part of the Python and PyData communities, e.g. as an organiser of PyData Milan. Feel free to reach out!
Session
04-23
14:30
30min
LLM Inference Arithmetics: the Theory behind Model Serving
Luca Baggi
Have you ever asked yourself how an LLM's parameters are counted, or wondered why Gemma 2B is actually closer to a 3B model? No clue what a KV-cache is? (And, before you ask: no, it's not a Redis fork.) Do you want to find out how much GPU VRAM you need to run your model smoothly?
If you answered "yes" to any of these questions, or have other questions about LLM inference, such as batching or time-to-first-token, this talk is for you. Well, except for the Redis part.
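As a taste of the kind of arithmetic the talk covers, the VRAM question can be sketched with a back-of-envelope estimate: weight memory plus KV-cache memory. The model figures below (layer count, head dimension, parameter count) are illustrative assumptions, not the specs of any particular model.

```python
# Back-of-envelope VRAM estimate for serving an LLM.
# All model figures below are illustrative assumptions, not official specs.

def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights: fp16/bf16 uses 2 bytes per parameter."""
    return n_params * bytes_per_param / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                seq_len: int, batch: int = 1, bytes_per_val: int = 2) -> float:
    """KV-cache stores one key and one value vector per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_val / 1e9

# Hypothetical ~2.5B-parameter model served in fp16 at an 8k context:
weights = weight_memory_gb(2.5e9)  # 5.0 GB just for the weights
cache = kv_cache_gb(n_layers=18, n_kv_heads=1, head_dim=256, seq_len=8192)
total = weights + cache
print(f"weights: {weights:.2f} GB, kv-cache: {cache:.2f} GB, total: {total:.2f} GB")
```

Note how the KV-cache grows linearly with both sequence length and batch size, which is why long contexts and high-throughput batching dominate memory planning even when the weights themselves fit comfortably.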
PyData: Generative AI
Titanium3