2025-04-23, Europium2
This intermediate-level talk demonstrates how to use the Language Model Query Language (LMQL) for structured generation and tool usage with open-source models such as Llama. You will learn how to build a RAG system that enforces output constraints, handles tool calls, and maintains a consistent response structure, all while relying on open-source components. The presentation includes hands-on examples in which audience members can experiment with LMQL prompts, showcasing real-world applications of constrained generation in production environments.
- Introduction to structured generation with LMQL and open-source LLMs
  - Key differences between constrained and free-form generation
  - Why structure matters for production applications
  - Setting up LMQL with Llama
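To make the setup step above concrete, here is a minimal sketch assuming LMQL's llama.cpp backend; the weight path, tokenizer name, and the example constraints are placeholders to adapt to your own environment.

```python
import lmql

# Placeholder paths: point these at your own GGUF weights and a matching tokenizer.
llama = lmql.model(
    "local:llama.cpp:/path/to/llama.gguf",
    tokenizer="huggyllama/llama-7b",  # assumption: any HF tokenizer compatible with the weights
)

@lmql.query(model=llama)
def capital(country):
    '''lmql
    "Q: What is the capital of {country}?\n"
    # The where clause constrains decoding: keep the answer short and stop at the first period.
    "A: [ANSWER]" where len(TOKENS(ANSWER)) < 20 and STOPS_AT(ANSWER, ".")
    return ANSWER.strip()
    '''

print(capital("France"))
```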
- Building a RAG system with structured outputs
  - Implementing context retrieval with constraints
  - Enforcing response formats through LMQL decorators
  - Handling edge cases and error states
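As a rough illustration of the RAG flow outlined above, the sketch below pairs a stand-in retriever with an LMQL query that fixes the response format at decode time; `retrieve` and the exact constraint set are hypothetical and would be replaced by your own retriever and schema.

```python
import lmql

def retrieve(question: str) -> list[str]:
    # Stand-in retriever: swap in your vector store or BM25 index here.
    return [
        "LMQL lets you attach decoding constraints to template variables.",
        "Constraints are enforced during generation, not checked afterwards.",
    ]

@lmql.query  # pass model=... as in the setup sketch to run against Llama
def answer_with_structure(question):
    '''lmql
    # Build the prompt from retrieved passages.
    context = "\n".join(retrieve(question))
    "Context:\n{context}\n\n"
    "Question: {question}\n"

    # Enforce the response format through constraints.
    "Answer: [ANSWER]" where STOPS_AT(ANSWER, "\n") and len(TOKENS(ANSWER)) < 120
    "Confidence: [CONFIDENCE]" where CONFIDENCE in ["low", "medium", "high"]

    return {"answer": ANSWER.strip(), "confidence": CONFIDENCE}
    '''
```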
- Tool usage and function calling
  - Implementing tool calls through LMQL
  - Managing tool execution flow
  - Error handling and fallbacks
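One way the tool-calling flow above can look in LMQL, sketched under the assumption that tools are plain Python callables dispatched from inside the query program; the tool names, the `TOOLS` registry, and the error handling are illustrative only.

```python
import lmql
from datetime import date

def calculator(expression: str) -> str:
    # Demo-only arithmetic; do not eval untrusted input in production.
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as err:
        return f"tool error: {err}"  # fallback: surface the failure to the model

def today(_: str) -> str:
    return date.today().isoformat()

TOOLS = {"calculator": calculator, "today": today}

@lmql.query  # pass model=... as in the setup sketch to run against Llama
def agent(question):
    '''lmql
    "Question: {question}\n"
    # Constrain the model to the known tool names, so dispatch cannot fail.
    "Tool: [TOOL]\n" where TOOL in ["calculator", "today"]
    "Tool input: [ARG]" where STOPS_AT(ARG, "\n")

    # Execute the chosen tool in plain Python and feed the observation back.
    result = TOOLS[TOOL](ARG.strip())
    "Observation: {result}\n"
    "Final answer: [ANSWER]" where STOPS_AT(ANSWER, "\n")
    return ANSWER.strip()
    '''
```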
- Interactive segment
  - Audience members will write and test their own LMQL prompts through a live demo environment
- Production considerations
  - Scaling structured generation
  - Monitoring and logging strategies
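For the monitoring point, a hypothetical, LMQL-agnostic wrapper that logs latency and failures of query calls; the logger name and the fields recorded are choices to adapt, not anything prescribed by LMQL.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("structured-generation")

def timed_query(query_fn, *args, **kwargs):
    """Run an LMQL query function, logging latency and surfacing failures."""
    start = time.perf_counter()
    try:
        result = query_fn(*args, **kwargs)
        log.info("query=%s latency=%.2fs status=ok",
                 getattr(query_fn, "__name__", "query"), time.perf_counter() - start)
        return result
    except Exception:
        log.exception("query=%s latency=%.2fs status=error",
                      getattr(query_fn, "__name__", "query"), time.perf_counter() - start)
        raise

# Usage: timed_query(answer_with_structure, "What does LMQL constrain?")
```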
Attendees will leave with practical knowledge of how to implement structured generation with LMQL in their own projects, covering both the technical implementation and best practices for production deployment.
Intermediate
Expected audience expertise: Python: Intermediate
On a mission to structure unstructured text with NLP
Ex-cofounder with 8 years of experience in NLP
I come from a mixed Hungarian-Dutch background and currently live in Nuremberg
In my free time I enjoy improv theatre and swimming