2025-06-19, Main Stage
At CIRCL, we're working to make sense of the ever-growing stream of vulnerability data—structured, unstructured, and everything in between. From public advisories to dark web intelligence collected through the AIL project, the data is rich but often inconsistent, fragmented, and difficult to navigate.
To help tackle this, we’re using Natural Language Processing (NLP) and large language models (LLMs) to extract insights—like estimating vulnerability severity from free-text descriptions when no CVSS score is available.
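To give an idea of what this looks like in practice, here is a minimal sketch of estimating severity from a free-text description with a Hugging Face text-classification pipeline. The model identifier is illustrative only; check CIRCL's Hugging Face organisation page for the actual released checkpoints.

```python
# Sketch: severity estimation from a free-text vulnerability description.
# The model name below is an assumption for illustration, not necessarily
# the exact checkpoint released by CIRCL.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="CIRCL/vulnerability-severity-classification-roberta-base",  # assumed name
)

description = (
    "A buffer overflow in the parsing routine allows a remote attacker "
    "to execute arbitrary code via a crafted packet."
)

result = classifier(description)[0]
print(f"Predicted severity: {result['label']} (confidence: {result['score']:.2f})")
```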
Our custom NLP model is trained on real-world data using our own infrastructure and updated regularly to reflect new trends. We’ve also released everything we’ve built, from raw datasets and training code to the final models, all available on Hugging Face. And with our ML-Gateway tool, anyone can easily retrieve an AI-generated severity score, no structured metadata required.
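Below is a hedged sketch of how a client might query an ML-Gateway instance over HTTP to obtain a severity score. The base URL, route name, and response fields are assumptions for illustration; consult the ML-Gateway documentation for the real API.

```python
# Sketch: requesting an AI-generated severity score from an ML-Gateway instance.
# The URL, endpoint path, and JSON schema below are hypothetical.
import requests

GATEWAY_URL = "http://127.0.0.1:8000"  # assumed local ML-Gateway deployment

description = "An SQL injection in the login form allows authentication bypass."

response = requests.post(
    f"{GATEWAY_URL}/severity-classification",  # hypothetical route name
    json={"text": description},
    timeout=10,
)
response.raise_for_status()
print(response.json())  # e.g. a severity label plus a confidence score
```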
This session will walk through:
- The challenges of working with messy, real-world vulnerability data
- How we’re using AI to structure, score, and make sense of it
- What we've built, what we’ve learned, and what’s next
It’s an inside look at how we're combining human context with machine learning to improve how we understand and act on vulnerability information.
Third-year cybersecurity engineering student in Paris and AI vulnerability management intern at CIRCL