2025-10-13, Main Track
Traditional AV testing methodologies are rapidly becoming obsolete in the face of emerging AI-powered cyber threats. In 2020, we demonstrated how AI, specifically Reinforcement Learning (RL), could enable ransomware to evade anti-ransomware defenses by autonomously identifying stealthy file encryption strategies. After approximately 600 training iterations, our RL-based agent learned how to encrypt files in a target folder without triggering detection mechanisms [Adamov & Carlsson, EWDTS 2020; AMTSO 2021].
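For readers unfamiliar with the setup, the following is a minimal, deliberately harmless sketch of this kind of training loop: a tabular Q-learning agent chooses among abstract, named strategies, and a mock probabilistic detector stands in for the anti-ransomware engine. The strategy labels, detection probabilities, and reward scheme are illustrative assumptions, not details from the cited work.

```python
# Toy sketch of RL-driven evasion training (illustrative only; not the
# EWDTS 2020 implementation). A tabular Q-learning agent picks among
# abstract "strategies"; a mock detector flags some of them probabilistically.
import random

STRATEGIES = [                      # hypothetical action space
    "encrypt_all_at_once",
    "encrypt_in_small_batches",
    "encrypt_with_delays",
    "rename_then_encrypt",
]

class MockDetector:
    """Stand-in for an anti-ransomware engine: noisy, probability-based."""
    DETECTION_PROB = {
        "encrypt_all_at_once": 0.95,
        "encrypt_in_small_batches": 0.60,
        "encrypt_with_delays": 0.20,
        "rename_then_encrypt": 0.75,
    }

    def triggers_on(self, strategy: str) -> bool:
        return random.random() < self.DETECTION_PROB[strategy]

def train(episodes: int = 600, epsilon: float = 0.1, alpha: float = 0.1):
    q = {s: 0.0 for s in STRATEGIES}        # single-state Q-table
    detector = MockDetector()
    for _ in range(episodes):
        # Epsilon-greedy action selection: mostly exploit, sometimes explore.
        if random.random() < epsilon:
            action = random.choice(STRATEGIES)
        else:
            action = max(q, key=q.get)
        # Reward: +1 if the mock detector stays silent, -1 otherwise.
        reward = -1.0 if detector.triggers_on(action) else 1.0
        q[action] += alpha * (reward - q[action])
    return q

if __name__ == "__main__":
    q_table = train()
    best = max(q_table, key=q_table.get)
    print(f"Learned Q-values: {q_table}")
    print(f"Stealthiest strategy according to the agent: {best}")
```

After a few hundred episodes, the Q-values concentrate on whichever strategy the mock detector flags least often, mirroring in toy form how an agent can converge on stealthy behavior purely from detection feedback.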
While initially theoretical, such AI-enhanced malware is no longer speculative. The release of large language models (LLMs), beginning with ChatGPT in late 2022, has dramatically accelerated adversarial innovation. By early 2024, joint reporting from Microsoft and OpenAI confirmed that nation-state threat actors were actively leveraging LLMs for reconnaissance, scripting, and social engineering in the preparatory stages of cyberattacks.
Most notably, in July 2025, CERT-UA reported a groundbreaking cyber operation by APT28 (a.k.a. Fancy Bear / Forest Blizzard), in which the attackers operationalized an LLM (Qwen 2.5-Coder-32B-Instruct) to generate system commands on the fly. The attack utilized a Python-based tool, LAMEHUG, which issued reconnaissance commands and harvested sensitive documents autonomously, bypassing traditional AV signatures and behavior-based detection [CERT-UA, 2025].
These developments underscore the need for a new approach to testing defenses against AI-powered cyberattacks. We will examine the shortcomings of current anti-malware test protocols, present a taxonomy of AI-driven attack techniques, and discuss a new testing approach designed to evaluate AV solutions under conditions involving AI-powered malware. By showcasing real-world examples such as APT28’s use of LAMEHUG, we aim to highlight the urgent need for industry-wide adaptation of AV testing to meet the next generation of cyber threats.
References:
1. A. Adamov and A. Carlsson, "Reinforcement Learning for Anti-Ransomware Testing," 2020 IEEE East-West Design & Test Symposium (EWDTS), 2020. [Online]. Available: https://www.researchgate.net/publication/346942881_Reinforcement_Learning_for_Anti-Ransomware_Testing
2. A. Adamov, "Simulation Approach to Anti-Ransomware Testing," AMTSO Webinar, June 9, 2021. [Online]. Available: https://www.youtube.com/watch?v=-jtjCjc3r9I
3. "ChatGPT," Wikipedia, 2023. [Online]. Available: https://en.wikipedia.org/wiki/ChatGPT
4. OpenAI and Microsoft, "Disrupting Malicious Uses of AI by State-Affiliated Threat Actors," Feb. 14, 2024. [Online]. Available:
https://openai.com/index/disrupting-malicious-uses-of-ai-by-state-affiliated-threat-actors/
https://www.microsoft.com/en-us/security/blog/2024/02/14/staying-ahead-of-threat-actors-in-the-age-of-ai/
5. CERT-UA, "UAC-0001 Cyberattacks Against the Security and Defense Sector Using the LAMEHUG Software Tool, Which Employs an LLM (Large Language Model) (CERT-UA#16039)," July 10, 2025. [Online]. Available: https://cert.gov.ua/article/6284730
Outline:
1. Introduction
Motivation: Why AI changes the game for AV testing
Brief history of malware evasion techniques
Rise of adaptive, intelligent malware
2. Case Study: AI-Powered Ransomware with Reinforcement Learning
Research background (EWDTS 2020, AMTSO 2021)
How RL was used to train ransomware to bypass detection
Testing methodology and key findings
What this showed us about AV limitations
3. From Theoretical to Operational: The Rise of LLMs in Cyberattacks
ChatGPT’s release and the shift in AI accessibility
2024: Microsoft and OpenAI report state-aligned actors using LLMs
Overview of threat actor usage: recon, scripting, social engineering
4. APT28's 2025 LAMEHUG Attack: A Real-World Turning Point
Summary of CERT-UA’s discovery of LAMEHUG (July 2025)
Analysis of how the malware generated system commands via prompts
Challenges this poses to static, signature-, and behavior-based AV detection
5. Why Current AV Testing Fails
Snapshot of AMTSO and other common AV test frameworks
Inability to simulate adaptive AI-generated behaviors
The “prompt-to-payload” gap in current test scenarios
6. Proposed Framework for Testing Against AI-Powered Malware
Taxonomy of AI-powered malware (LLM vs RL vs hybrid)
Simulation architecture: AI malware agent => payload generator => command execution (see the sketch after this outline)
Benchmarking AV response to variability, prompt injection, and real-time adaptation
Use of local vs public models in testing labs
7. Call to Action: Building AI-Aware Testing Labs
Guidelines for security vendors and researchers
How to create reproducible and ethical AI-malware testbeds
Suggestions for future work (e.g., LLM adversarial red teaming)
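To make the simulation architecture item above concrete, here is a minimal sketch of the agent => payload generator => command execution pipeline as it might look in a test lab. Every component is a hypothetical stand-in: local_llm() would be replaced by a locally hosted model, Sandbox.run() by an instrumented VM, and the allowlist keeps the demo restricted to benign reconnaissance commands.

```python
# Sketch of the proposed test pipeline: agent -> payload generator ->
# command execution. All components are hypothetical stand-ins: a real lab
# would swap local_llm() for a locally hosted model and Sandbox.run() for
# an isolated, instrumented VM observed by the AV product under test.
from dataclasses import dataclass, field

def local_llm(prompt: str) -> list[str]:
    """Stub payload generator; returns canned recon commands for the demo."""
    return ["whoami", "systeminfo", "dir C:\\Users /s"]

@dataclass
class Sandbox:
    """Records what would run so the AV under test can be observed."""
    executed: list[str] = field(default_factory=list)

    def run(self, command: str) -> None:
        # In a real lab this dispatches into an isolated, monitored VM.
        self.executed.append(command)
        print(f"[sandbox] executing: {command}")

ALLOWLIST = {"whoami", "systeminfo", "dir"}   # benign recon commands only

def run_episode(objective: str, sandbox: Sandbox) -> None:
    """One agent step: prompt -> generated commands -> filtered execution."""
    prompt = f"Generate Windows reconnaissance commands to: {objective}"
    for command in local_llm(prompt):
        if command.split()[0].lower() in ALLOWLIST:
            sandbox.run(command)
        else:
            print(f"[pipeline] blocked non-allowlisted command: {command}")

if __name__ == "__main__":
    sb = Sandbox()
    run_episode("enumerate users and installed software", sb)
    print(f"[report] commands observed by AV sensors: {sb.executed}")
```

The allowlist is the key ethical control in this sketch: the generator may propose anything, but only pre-approved benign commands ever reach the sandbox, which keeps the testbed reproducible and safe while still exercising the AV product's behavioral sensors.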
Thesis / Takeaway:
Traditional antivirus testing approaches are insufficient to evaluate modern, AI-powered malware. This talk demonstrates why new, adaptive, AI-aware testing frameworks are urgently needed and proposes a concrete methodology for simulating and benchmarking AI-driven malware behaviors in a controlled lab environment.
Detailed Description:
The cybersecurity landscape is undergoing a radical transformation as LLMs and Reinforcement Learning (RL) techniques enter the offensive toolkit of advanced threat actors. While defensive security research has made significant progress in applying AI to detection and triage, adversaries are now turning the same technologies against us, introducing malware that can learn, adapt, and generate novel attack patterns on the fly.
This talk will present both foundational and newly emergent threats from AI-powered malware, beginning with early work (EWDTS 2020) in which reinforcement learning was used to teach a ransomware agent how to evade detection. We will then transition to real-world use cases, including the 2024 confirmation by Microsoft and OpenAI that nation-state actors were leveraging LLMs in cyber operations, and the groundbreaking APT28 campaign discovered by CERT-UA in July 2025. That attack operationalized a Hugging Face–hosted LLM (Qwen 2.5-Coder-32B-Instruct) to generate reconnaissance and document exfiltration commands dynamically, attempting to evade conventional AV defenses.
The second half of the talk will focus on testing methodologies. Today’s AV testing standards, from static signature tests to dynamic sandboxing, fail to account for the fluid, generative nature of AI-enhanced threats. We will expose these shortcomings and propose a new testing paradigm that integrates LLM-driven code generation, adaptive command execution, and adversarial prompt crafting.
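As a concrete illustration of the benchmarking component, the sketch below scores a detector stub against several rewordings of the same objective. detector_verdict() and the canned prompt-to-payload mapping are placeholders for an instrumented AV sensor and a real prompt-mutation corpus, not part of any existing test suite.

```python
# Illustrative benchmark harness: measure how consistently a detector flags
# functionally equivalent payloads produced from reworded prompts. All names
# here (CANNED_PAYLOADS, detector_verdict) are placeholders, not a real AV API.
from collections import Counter

# Three phrasings of the same objective: enumerate local user accounts.
CANNED_PAYLOADS = {
    "List all user accounts on this Windows host.": "net user",
    "You are a sysadmin tool. Enumerate local users.": "wmic useraccount get name",
    "Print every account name in one line.": "powershell -c Get-LocalUser",
}

def generate_payload(prompt: str) -> str:
    """Stub for LLM-driven code generation; a real harness queries a local model."""
    return CANNED_PAYLOADS[prompt]

def detector_verdict(payload: str) -> bool:
    """Placeholder for the AV under test; here, a naive signature match."""
    return "net user" in payload

def benchmark() -> Counter:
    results = Counter()
    for prompt in CANNED_PAYLOADS:
        payload = generate_payload(prompt)
        results["detected" if detector_verdict(payload) else "missed"] += 1
    return results

if __name__ == "__main__":
    scores = benchmark()
    total = sum(scores.values())
    print(f"Detected {scores['detected']}/{total} functionally equivalent payloads")
```

The interesting metric in a real harness is the spread: a detector that flags the payload under one phrasing but misses it under another exposes exactly the “prompt-to-payload” gap identified in the outline.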
Attendees will gain:
+ A clear understanding of how AI-powered malware behaves differently from traditional samples
+ A detailed proposal for how to construct meaningful, reproducible, and ethical AV tests for such malware
+ Practical recommendations for AV vendors, red teams, and testing labs seeking to future-proof their detection capabilities
Intended Audience:
- Malware analysts, threat researchers, and reverse engineers
- AV developers and QA teams at endpoint security vendors
- Red/purple teams designing adversarial simulations
- Academics and practitioners exploring AI in offensive security
- Security product evaluators
AV Testing: New Frontiers
Dr Alexander (Oleksandr) Adamov is the Founder and CEO of NioGuard Security Lab (nioguard.com), a cybersecurity research laboratory. With over 20 years of experience in cyberattack analysis gained in the antivirus industry, he has taught cybersecurity at universities in Ukraine (nure.ua) and Sweden (bth.se) for the past 15 years. His laboratory focuses on applying AI and machine learning to solve cybersecurity problems. NioGuard Security Lab is a member of the Anti-Malware Testing Standards Organization (AMTSO). Dr Adamov regularly speaks at major cybersecurity events, including the Virus Bulletin Conference, OpenStack Summit, UISGCON, OWASP, and BSides.