2026-05-07, IFEN room 2, Workshops and AI Security Village (Building D)
With AI agents gaining traction across the offensive security space, from phishing to fuzzing and penetration testing, it was inevitable that malware would follow suit. While most discussions focus on using AI to generate malicious payloads at runtime, or on "vibe coding" them, we went a step further: we built a system in which AI is the sole participant in the malware creation process itself.
We will begin with how we got to this point and what sparked the idea, then compare different models: which produced the best code, which was most evasive, which prompts worked best, and which we ultimately used in the agent.
We will then dig into the generation process itself: the challenges with earlier implementations and how we solved them, how to build the workflow to maximize the malware's capability and randomization, and even how it managed to break signatures.
We will finish by showing how the resulting malware performs, comparing different samples and showing how each one defeated several static malware analyzers, and discuss what's next for this agent and for the domain of AI-generated malware.
Modern AI systems have moved far beyond rule-based automation and are now capable of generating complex, functional software. While most discussions focus on productivity benefits like code generation and vibe coding, the same capabilities can also be applied to offensive security. This session explores a research project that examines how AI models can be orchestrated to autonomously generate new malware samples, and what this means for both attackers and defenders.
The talk focuses on understanding the process and experimentation space behind AI-driven malware generation: how model behavior changes depending on prompts, model selection, validation workflows, and code restructuring techniques.
The main topics explored in the presentation:
Prompt design and task framing (what the model is asked to do)
Directly asking a model to write ransomware often fails due to safety controls or poor results. By reframing tasks, such as generating behavioral descriptions first and then implementing them in code, it becomes possible to produce working implementations while avoiding many common failure modes.
Model selection and orchestration (which models do what)
Different models excel at different tasks. The agent combines uncensored local models for unrestricted generation, stronger coding models for fixes, and remote models for validation. This multi-model approach improves reliability compared to relying on a single model.
Automated generation and validation loops (ensuring working output)
Generated code is automatically compiled, tested, and fed back into models when errors occur. This loop allows the system to fix compilation issues, improve functionality, and produce working samples without manual intervention.
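The generate-validate-feedback loop described above can be sketched generically. This is a minimal illustration, not the agent's actual implementation: `ask_model` is a hypothetical stand-in for a real model call (stubbed here so the loop runs standalone), and validation is an ordinary syntax check on a benign snippet.

```python
def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a model call. The stub returns a repaired
    snippet once it sees an error message echoed back in the prompt."""
    if "error" in prompt.lower():
        return "def run():\n    return 42\n"
    return "def run(:\n    return 42\n"  # first attempt: syntax error

def validate(source: str):
    """Compile-check a candidate; return (ok, error_message)."""
    try:
        compile(source, "<candidate>", "exec")
        return True, ""
    except SyntaxError as exc:
        return False, str(exc)

def generation_loop(task: str, max_rounds: int = 5) -> str:
    """Ask for code, compile it, and feed any error back until it builds."""
    prompt = task
    for _ in range(max_rounds):
        source = ask_model(prompt)
        ok, err = validate(source)
        if ok:
            return source  # a working sample, with no manual intervention
        # Append the error so the model can repair its own output.
        prompt = f"{task}\nPrevious attempt failed with error: {err}"
    raise RuntimeError("no working sample produced")
```

The same shape generalizes beyond a syntax check: `validate` could invoke a real compiler or a test harness, with its stderr fed back into the next prompt.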
Code diversity and detection evasion (how “new” samples are created)
By allowing models to choose different implementations, encryption methods, structures, and even programming languages, each generated sample can look structurally different while performing essentially the same task.
Feature expansion (beyond basic malware behavior)
When prompted appropriately, models sometimes add additional behaviors such as persistence, system discovery, evasion checks, or data exfiltration attempts, demonstrating how AI can generate increasingly complex malware variants.
What you can gain from this talk
A practical view of how AI models can be chained together to generate functional malware samples.
An understanding of how prompts, model choice, and validation workflows affect output reliability and detectability.
A framework that researchers and defenders can use to generate diverse samples for testing detection systems.
While the presentation uses ransomware generation as the running example, the broader takeaway is about how generative AI changes the scale and variability of offensive tooling, and how the same techniques can also be leveraged by defenders to strengthen security systems.
Arad Donenfeld is an attacks and exploits developer at SafeBreach, with a background in security research from several roles (including Deep Instinct, where this research was conducted). With a strong foundation in development, security, and operating-system internals, Arad builds tools for offensive operations, detection methods, and workflow automation. He focuses on practical techniques to identify and exploit vulnerabilities and breaches, while testing and improving defenses across broad environments.