hack.lu 2023

Avoiding the basilisk's fangs: State-of-the-art in AI LLM detection
10-19, 10:00–10:40 (Europe/Luxembourg), Salle Europe

The world is awash in large language model (LLM) AI (e.g., ChatGPT) news, predictions, and of course, content (all for good and ill). This talk takes a step back from the posturing and hype to look at how these models work, and how to detect the content they produce. We will cover the fundamentals of LLM-generated text detection and compare the best-of-breed detectors (GPTZero, RoBERTa, and OpenAI's detector) with a novel detector, ZipPy.
ZipPy is a new, open-source LLM text detector developed by Thinkst Labs that is 60-100x faster than the competition, over 1000x smaller (< 200KB), and, for many types of content, more accurate. We will explain the intuition behind ZipPy, show how it works, and cover the types of content it struggles with. Finally, we look at where LLMs can improve their stealth, and at the fundamental shortcomings in their design that enable detection over the long term.


Are LLMs going to upend, or just end, the world? Will malevolent AIs spread disinformation and FUD to enslave humanity in a world of fear? Will Roko's Basilisk come to pass? To help stave off these dramatic end-times, LLM detectors are here! We can build safe, AI-free zones to limit the digital "noise" that these models can blast out at scale, if only we can reliably detect and classify the origin of a piece of content.
This talk takes a deep dive into the leading LLM text detectors, both open-source and commercial, and compares them against a number of different datasets. Next, we throw into the mix ZipPy, a novel open-source detector based on code written in the mid-1980s that outperforms the state of the art along a number of dimensions. ZipPy is simple (less than 200 lines of Python), and it codifies an intuition about a core difference between LLMs and humans that no additional data or training cores can overcome: being unique! Using ZipPy, we walk through the features that differentiate a text's origin and show how, with a simple, embedded detector, we can build a human-centric world where LLMs are used only to help us rather than subvert us.
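To give a flavour of the compression intuition hinted at above, here is a minimal, illustrative sketch in Python. It is not ZipPy's actual implementation: the prelude file name, the choice of LZMA, and the threshold are assumptions made purely for this example. The idea is that LLM output tends to be statistically "average", so it adds comparatively little new information when compressed alongside a corpus of known LLM text, while unique human writing adds more.

    # Illustrative compression-ratio sketch (not ZipPy's code).
    import lzma

    # Hypothetical prelude: a sample of known LLM-generated text.
    AI_PRELUDE = open("ai_generated_samples.txt", encoding="utf-8").read()

    def compressed_size(text: str) -> int:
        """LZMA-compressed size of a string, in bytes."""
        return len(lzma.compress(text.encode("utf-8"), preset=9))

    def compression_delta(sample: str) -> float:
        """Extra compressed bytes the sample adds per input byte, relative
        to the AI prelude. Lower values mean the sample shares more
        structure with known LLM text."""
        baseline = compressed_size(AI_PRELUDE)
        combined = compressed_size(AI_PRELUDE + "\n" + sample)
        return (combined - baseline) / max(len(sample.encode("utf-8")), 1)

    def classify(sample: str, threshold: float = 0.45) -> str:
        """Threshold is illustrative only; a real detector would calibrate
        it against labeled human and LLM datasets."""
        return "AI" if compression_delta(sample) < threshold else "Human"

    if __name__ == "__main__":
        print(classify("Paste a paragraph of text here to score it."))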

Jacob is the Head of Labs at Thinkst Applied Research. Prior to that, he managed the HW/FW/VMM security team at AWS and was a Program Manager at DARPA's Information Innovation Office (I2O). At DARPA he managed a cyber security R&D portfolio including the Configuration Security, Transparent Computing, and Cyber Fault-tolerant Attack Recovery programs. Starting his career at Assured Information Security, he led the Computer Architectures group, performing bespoke research into low-level systems security and programming languages. Jacob has been a speaker and keynote speaker at conferences around the world, from Black Hat USA to SyScan to TROOPERS, and many more. When not in front of the computer, he enjoys trail running, volunteering as a firefighter/EMT, and hiking with his family.