A case study on building our first LLM feature – how to balance speed + quality :: PyCon AU 2025

A case study on building our first LLM feature – how to balance speed + quality
2025-09-12 11:00–11:30, Ballroom 1

We’re all building AI features now, or will be soon. But working with teams that are building with LLMs brings its own challenges. How can we bring in the latest research, consider AI ethics, and weigh the cost of different models without blowing past delivery dates? And how do we make sure that the features we build will be stable, reliable, and maintainable in the future?

In this talk, I’ll share a case study of how we built our first LLM feature. In one month, we ran experiments, developed evaluation methods, assessed the risks, and worked through the ethical concerns. Specifically, over this period we did a literature review, consulted academic experts, labelled data, experimented with models, assessed costs, and finally did all the ML engineering to launch the feature into production. The outcome: <1% extreme misclassification and zero hallucinations.

We’ll cover our approach to building LLM features: how we partnered with academia (without being delayed by their timelines), what tooling we used, and how we made the cost tradeoffs that kept business stakeholders happy. As one example, we’ll show how important evaluation data was for building our features: it helped us improve our definitions and revealed gender differences in how people perceive feedback. We’ll also share the principles we used to balance rigorous, robust practices against cost and timeline considerations, and the frameworks that actually helped us make the right calls, avoid expensive do-overs, and navigate the AI ethics side.
You'll walk away with hands-on tools for leading the AI conversation within your own organization – including how to identify ethical issues early, address them efficiently, and still deliver on time and on budget.

Vivek Katial

Vivek is Lead Data Scientist and a founding team member at Multitudes, where he drives and executes the company's Data and AI strategy. He is passionate about building data products that create positive change. As co-founder and Executive Director of The Good Data Institute (GDI), he has led an Australian NGO that helps charities build data capabilities, supporting over 65 organisations on more than 80 projects globally. In his spare time, Vivek loves to jam on data science, ML and ethics. He has recently spoken at Tech 4 Social Justice, OPTIMA-CON, and LAST Conf, and he holds a PhD in Optimisation via Quantum Computing.
