PyData Boston 2025

Uncertainty-Guided AI Red Teaming: Efficient Vulnerability Discovery in LLMs
2025-12-10, Thomas Paul

AI red teaming is crucial for identifying security and safety vulnerabilities in Large Language Models (LLMs), such as jailbreaks, prompt injection, and harmful content generation. However, manual and brute-force adversarial testing is resource-intensive, and much of that time and compute is spent exploring low-risk regions of the input space.
This talk introduces a practical, Python-based methodology for accelerating red teaming with model uncertainty quantification (UQ).
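
As a taste of the approach, here is a minimal sketch, not the talk's actual implementation: it scores each candidate adversarial prompt by the mean token-level entropy of the model's response and tests the highest-uncertainty prompts first. The helper get_logprobs is a hypothetical stand-in for whatever client returns top-k token log-probabilities.

    import math

    def mean_token_entropy(token_logprob_dists):
        """Average Shannon entropy (in nats) over a response's token positions.

        token_logprob_dists is a list of dicts mapping candidate tokens to
        log-probabilities, e.g. the top-k logprobs many LLM APIs can return.
        """
        entropies = []
        for dist in token_logprob_dists:
            probs = [math.exp(lp) for lp in dist.values()]
            total = sum(probs)  # renormalize the truncated top-k distribution
            entropies.append(-sum((p / total) * math.log(p / total) for p in probs))
        return sum(entropies) / len(entropies)

    def triage(prompts, get_logprobs):
        """Order prompts so the highest-uncertainty ones are tested first."""
        scored = [(mean_token_entropy(get_logprobs(p)), p) for p in prompts]
        return [p for _, p in sorted(scored, reverse=True)]

Entropy is only one possible uncertainty signal; the talk focuses on a DST-based one (see the outline below), but the triage loop has the same shape.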


Planned outline of the talk:
1. What AI red teaming is and why it is important.
2. Current AI red teaming approaches and their drawbacks.
3. What model uncertainty quantification (UQ) is and how it can benefit AI red teaming.
4. One UQ approach that fits this setting: Dempster-Shafer Theory (DST); a minimal sketch follows this outline.
5. Putting it all together in Python.
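
As a preview of items 4 and 5, here is a minimal sketch of Dempster's rule of combination, the core DST operation for fusing evidence from independent sources. The frame of discernment ({safe, vulnerable}) and the two evidence sources (judge, heuristic) with their mass assignments are illustrative assumptions, not material from the talk.

    from itertools import product

    def combine(m1, m2):
        """Dempster's rule of combination for two mass functions.

        Each mass function maps frozenset hypotheses to masses summing to 1.
        """
        combined, conflict = {}, 0.0
        for (a, wa), (b, wb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass on contradictory intersections
        if conflict >= 1.0:
            raise ValueError("total conflict: the sources are irreconcilable")
        return {h: w / (1.0 - conflict) for h, w in combined.items()}

    # Frame of discernment: is the model vulnerable to this prompt family?
    SAFE, VULN = frozenset({"safe"}), frozenset({"vulnerable"})
    THETA = SAFE | VULN  # mass on the whole frame = explicit "don't know"

    judge = {VULN: 0.6, SAFE: 0.1, THETA: 0.3}  # e.g. an LLM-judge signal
    heuristic = {VULN: 0.5, THETA: 0.5}         # e.g. a keyword heuristic

    print(combine(judge, heuristic))  # belief concentrates on {"vulnerable"}

Unlike a single probability, DST keeps mass on the whole frame as explicit "don't know", which is what lets a red-teaming loop flag prompts where the evidence is genuinely uncertain rather than merely split.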

This talk is technical and suitable for Python engineers with a basic understanding of probability and LLMs, but it does not assume a heavy background in AI security.
Attendees will walk away with a better understanding of the challenges in AI security and of how to address some of them using model uncertainty quantification.


Prior Knowledge Expected: Previous knowledge expected (basic probability and familiarity with LLMs)

Data professional with 15+ years of experience in software, data engineering, analytics, data science, and AI/ML. Graduate degrees in Computer Science and Statistics. Domain knowledge and background in multiple verticals, including media and entertainment, marketing analytics, and finance.