Wintermute: an LLM pen-testing buddy
10-19, 13:40–13:45 (Europe/Luxembourg), Salle Europe

The lightning talk will introduce an LLM-guided privilege-escalation tool designed for evaluating different LLMs and prompt strategies against a novel pen-testing benchmark.

TL;DR: you've got a new pen-testing buddy who can help you hack away.

We analyze the impact of different prompt designs, the benefits of in-context learning, and the advantages of offering high-level guidance to LLMs. We discuss challenging areas for LLMs, including maintaining focus during testing and coping with errors, and finally compare them with both stochastic parrots and human hackers.

The research will be published on the week of

Aaron worked at the national CERT of Austria from 2008 to 2020 and has a background in maths and computer science. Since 2020 he has freelanced mostly for EC-DIGIT-CSIRC, the IT security team of the European Commission. He is the co-founder of (community wifi mesh network),, a tool for automating the typical tasks of IT security teams. He believes in using automation, open source and machine learning to improve the lives of DFIR folks.
