PyConDE & PyData Berlin 2024

Would you rely on ChatGPT to dial 911? A talk on balancing determinism and probabilism in production machine learning systems
04-23, 16:35–17:05 (Europe/Berlin), B07-B08

Over the last year, hardly a day has passed without news of a new generative AI innovation that promises to enhance some aspect of our lives. On a number of tasks, large probabilistic systems now outperform humans, or at least they do so “on average”. “On average” means most of the time, but in many real-life scenarios “average” performance is not enough: we need correctness ALL of the time, for example when you ask the system to dial 911.

In this talk we will explore the synergy between deterministic and probabilistic models to enhance the robustness and controllability of machine learning systems. Tailored for ML engineers, data scientists, and researchers, the presentation delves into the necessity of using both deterministic algorithms and probabilistic model types across various ML systems, from straightforward classification to advanced Generative AI models.

You will learn about the unique advantages each paradigm offers and gain insights into how to combine them most effectively in real-world applications. I will walk you through my past and current experience working with simple and complex NLP models, and show you the pitfalls to avoid and the shortcuts and tricks you can use to deliver models that are both competent and reliable.

The session will be structured into a brief introduction to both model types, followed by case studies in classification and generative AI, concluding with a Q&A segment.


Objective and Outline:
This talk addresses the often-overlooked need for integrating deterministic and probabilistic models in machine learning, which is crucial in complex production environments. We begin by defining deterministic and probabilistic models, highlighting their distinct roles in ML systems. The talk then showcases practical examples where the synergy of these models enhances system performance, focusing on classification and Generative AI models.
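
As a rough illustration of the kind of combination described here (a minimal sketch, not taken from the talk itself), a deterministic rule can guard a safety-critical intent while a probabilistic classifier handles everything else. The pattern, the toy model, its scores, the intent names, and the 0.8 threshold are all hypothetical placeholders chosen for the example.

```python
import re
from typing import Callable, Dict

# Hypothetical deterministic rule: an explicit request to call or dial 911
# is always routed to the emergency intent, without consulting the model.
EMERGENCY_PATTERN = re.compile(r"\b(call|dial)\s+911\b", re.IGNORECASE)


def toy_probabilistic_model(utterance: str) -> Dict[str, float]:
    """Stand-in for a trained intent classifier; the scores are made up."""
    if "music" in utterance.lower():
        return {"play_music": 0.92, "emergency_call": 0.01, "other": 0.07}
    return {"play_music": 0.30, "emergency_call": 0.25, "other": 0.45}


def route(
    utterance: str,
    model: Callable[[str], Dict[str, float]] = toy_probabilistic_model,
    threshold: float = 0.8,
) -> str:
    # 1. Deterministic layer: must be correct every time, so it runs first.
    if EMERGENCY_PATTERN.search(utterance):
        return "emergency_call"

    # 2. Probabilistic layer: pick the highest-scoring intent.
    scores = model(utterance)
    intent, confidence = max(scores.items(), key=lambda kv: kv[1])

    # 3. Deterministic fallback: do not act on low-confidence predictions.
    return intent if confidence >= threshold else "ask_for_clarification"


if __name__ == "__main__":
    for text in ["Please dial 911 now", "Play some music", "What is the weather?"]:
        print(f"{text!r} -> {route(text)}")
```

In this sketch the rule layer gives a hard guarantee for the one case where "on average" is not good enough, and the confidence threshold keeps the model from acting on uncertain predictions. A production system would need far broader rule coverage and a calibrated model.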

Target Audience and Expected Background Knowledge:
Intended for ML engineers, data scientists, and academic researchers, this presentation assumes familiarity with basic machine learning concepts and models. It's particularly beneficial for those involved in designing, implementing, or managing ML systems in production environments.

Key Takeaways:

  • Understanding the strengths and limitations of deterministic and probabilistic models in ML.
  • Strategies for effectively combining these models in various ML systems.
  • Real-world examples demonstrating the improved robustness and controllability achieved through this integration.
  • Insights into future trends and potential developments in model integration.

Time Breakdown:

  • Minutes 0-10: Introduction to deterministic and probabilistic models
  • Minutes 10-20: Synergies of approaches in real-world examples
  • Minutes 20-30: Applications for Generative AI models, including Q&A

Additional Information:
No prerequisites are required beyond a basic understanding of machine learning concepts. The presentation will be informative with a focus on practical applications, providing attendees with actionable knowledge and a deeper appreciation of model integration in ML systems.


Expected audience expertise: Domain

Novice

Expected audience expertise: Python

Novice

Abstract as a tweet (X) or toot (Mastodon)

Combining deterministic and probabilistic models to boost ML system robustness. Learn their benefits and applications in AI, backed by NLP case studies. #AIInnovation #MLTech #RobustAI

Nicolas is a Sr. ML Engineer at GitGuardian, where he develops NLP-based technologies to detect vulnerabilities in code and provide remediation. He was previously a Sr. Applied Scientist at Amazon Alexa, where he developed the models that power Alexa's core understanding capabilities. He has published multiple academic papers at top-tier NLP conferences in the field of semantic parsing. Nicolas has hands-on experience with a variety of NLP models applied to client-facing applications.