2026-06-04 –, Track 2
Many organizations are developing LLM‑based applications to improve productivity, supported by the growing number of platforms that simplify their creation. However, integrating LLMs into applications introduces new security risks, as adversaries can exploit models through natural‑language–based attacks such as prompt injections and jailbreaks. Successful attacks can lead to sensitive data leakage, reputational harm, or deeper compromise of internal digital environments.
These risks highlight the need for structured, repeatable, and context‑aware security testing for LLM‑enabled applications. Therefore, we would like to present ProViLE: a systematic approach and supporting open‑source tool for prompt‑based security testing of LLM‑enabled applications. ProViLE emphasizes that effective tests are highly dependent on the context of the application. The approach guides practitioners through four key steps: (1) defining potential attack objectives, (2) identifying relevant attack techniques, (3) formulating corresponding attack prompts, and (4) evaluating the LLM application’s responses to the attack prompts.
The ProViLE tool automates the final two steps by using LLMs to (3) generate attack prompts from objectives and techniques, and (4) evaluate whether a response constitutes a successful attack based on the objective and a scoring rubric. This enables scalable and consistent testing across diverse application contexts. The result is a structured overview of the security posture of an LLM‑based application across custom security considerations.
ProViLE aims to facilitate the penetration‑testing workflow for LLM applications, but can also be used by development teams to conduct initial baseline assessments before deployment. By open‑sourcing our work, we hope to support the broader development of secure LLM‑based systems.
Outline:
During the talk, we will cover several parts of the paper and tool. Both are publicly available, and the tool is open source. With the talk, we hope to give the listeners more insight on how to make a better indication of the risks an LLM may introduce in their applications. We aim to make the talk interesting for both beginners and more experienced cyber specialists in the LLM area.
The following (sub)points will be discussed during the talk
- Why LLM Security Is a Growing Concern
- LLMs are widely adopted, meaning that many modern applications now include LLM in a way.
- LLMs are still relatively new and therefore lack mature pentesting practices.
- Specific attacks, such as prompt-based attacks, are often successful.
- Why Prompt Based Attacks Actually Work
- LLMs are trained to fulfil the users’ requests. This instruction can intervene with given security guidelines.
- Some other guardrails, such as in- and output filters, can be bypassed.
- Challenges in Testing LLM Applications
- Traditional vs ‘LLM Pentesting’
- Hallucinations
- LLMs are non-deterministic, making it harder to find vulnerabilities.
- Introducing ProViLE: Goals and Approach
- 4-step approach to facilitate Prompt Based Testing for LLMs.
- How to systematically find vulnerabilities in LLM based applications.
- The Four Step Framework
- (1) Defining attack objectives
- (2) Identifying relevant attack techniques
- (3) Prompting the LLM
- (4) Evaluate the response
- How the PRoViLE Tool Automates Prompt Generation & Evaluation
- Use of attacker and judge LLM.
- Structured attacker and judge prompt templates.
- Single shot vs multi shot prompting.
- Demo Run
- Small live demonstration of ProViLE on an LLM-based application.
- How Teams Can Start Using ProViLE Today
- Open source tooling
- Code is on GitHub, paper/flyer can be used as a ‘deep dive’ into LLM application testing.
- Limitations & Future Enhancements
- Currently focussed on LLMs with Extensions, such as RAGs.
- Future enhancements may include AI Agent support and agentic support.
- We aim to build an active open-source community, hoping to support the broader development of secure LLM-based systems.
- Conclusion & Takeaways
- Pentesting LLM-based applications is fundamentally different than traditional pentesting.
- Pentesting your LLM application is important and should not be underestimated / seen as an afterthought.
- The ProViLE approach and tooling enable structured identification of vulnerabilities that are specific to the context in which an LLM-based application is deployed.
Rajeck Massa is a Cyber Security scientist at TNO, where he contributes to applied research across system and software security, AI security, and advanced detection and innovation. He holds an MSc in Computer Science from Leiden University and joined TNO after completing an internship there. In his work, Rajeck is involved in research projects that study how complex technical systems can be abused under realistic adversarial conditions, ranging from low‑level software components to modern AI‑enabled applications. His interests include developing and validating practical security testing methodologies, particularly in areas where existing approaches fall short. Through his research, he aims to help bridge the gap between emerging technologies and actionable security practice.