Understanding file type identifiers hack.lu 2024

2024-10-24 16:45–17:15, Europe - Main Room

Yara, LibMagic (file, binwalk, polyfile), TrID, Magika, PEiD, PRONOM, FDD, ShareMime, DiE...
How do they work? What are their pros and cons, their limitations, their risks?

There are many misconceptions about file type identification and scanning:
the existing tools have different needs and use cases, requirements and limitations (some of which can be abused).

Warning: contains raw bytes.


Ange Albertini

A reverse engineer since the 80s, he started his infosec career as a malware analyst decades ago.

His wide knowledge of file formats is available in his hundreds of Corkami posters and visualisations, and is essential for projects like Magika, the AI-powered file type detection at Google.
His passion for retrocomputing and funky files leads him to explore the darkest corners of the file landscape:
bypassing security with ancient techniques, analyzing parsers and breaking them with extreme files, writing tools to evade detection via mock files or polyglots such as PoC||GTFO, exploiting AES-GCM via crypto-polyglots, or colliding SHA-1 via Shattered.
