Smack my LLM up!
2025-10-22 , Europe

This session dissects a real-world case study where an actor weaponized automation flaws in Meta’s LLM-based compliance system to hijack high-value accounts via orchestrated botnet abuse, prompt injection, and linguistic manipulation. The attacker exploited vulnerabilities in the very safeguards designed to protect users, triggering account suspension and negotiating “restoration” through AI-manipulated support flows.

This case is not an isolated incident—it is a signal of broader systemic risks that emerge when generative models and automation pipelines are integrated without robust adversarial testing. Beyond the technical compromise, the attack leveraged prompt engineering as social engineering, revealing the cognitive blind spots of model-aligned trust systems.

In response, I introduce foundational forensic linguistic techniques and NLP-based detection methods for identifying AI-generated text in compromised communications. By combining stylometry, perplexity analysis, and syntax anomaly detection in Python, we illuminate detection opportunities hidden in prompts and narrative structure. With few more tips from cloud security area to protect the LLM deployments.
The talk closes with a reflection on the ethical tensions in detecting synthetic media.

This talk will blend live demonstration, code walkthroughs, and operational insights from an investigation that didn’t just uncover an exploit—but a philosophy of misuse.


In this talk, I present a forensic case study detailing how a threat actor compromised Meta’s LLM-driven moderation system to systematically hijack verified accounts, using prompt injection, linguistic manipulation, and automation loopholes to trigger platform-enforced takedowns and force ransom-based negotiations. The incident involved a globally active cybercrime network and exposed critical flaws in the trust models of cloud-native AI enforcement systems.

We will walk through the forensic process behind the investigation, analyze prompt-level exploit vectors, and demonstrate how attackers craft model-passing language to elicit beneficial outcomes from black-box systems. Moving from the operational to the analytical, I will also introduce Python-based techniques from forensic linguistics and stylometry that aid in detecting AI-generated text, model hallucinations, and adversarial prompt traces—applicable to both post-mortem analysis and real-time detection pipelines.

Finally, the talk explores the ethical grey zones emerging at the intersection of synthetic content detection, digital identity, and model-assisted enforcement. In the wrong hands, detection tools can become instruments of censorship or control—making it critical to understand both the how and the why behind these systems.

This is a talk for red teamers, detection engineers, AI researchers, and anyone standing at the fault line between automation and abuse.

Audience Takeaways:

Real-world attack path exploiting LLM-based automation in platform support

Techniques for detecting AI-generated text via forensic linguistic analysis

Python-based NLP tools for stylometry and prompt anomaly detection (will share my code)

Ethical considerations around AI-generated content detection in moderation and compliance

Operational guidance for improving LLM security posture in cloud deployments.

Jindřich is a Lead Security Researcher in Rapid7. His research work focuses on the domains of cognitive warfare, cyber espionage, AI threats and cyber threat intelligence. You might also recognise him as the security data scientist known as 4n6strider.