BSides Cape Town 2024

Attacking Pipelines: Large Scale Exploitation of Workflow Files
2024-12-07 13:05-13:50 (Africa/Abidjan), Track 2

In this talk, we present a tool for large-scale scanning of GitHub repositories to identify potential expression injection vulnerabilities in their workflow files. Our system scrapes repositories efficiently, concurrently pulling and analysing workflow configurations for insecure patterns. Through this mining process, we discovered that expression injection vulnerabilities are surprisingly prevalent, even among popular projects, and often go unnoticed. We have reached out to affected vendors for remediation, and we hypothesise that this prevalence is attributable to a lack of detection mechanisms and key documentation on GitHub’s end. Additionally, we found that even when vulnerabilities are patched, they can easily be reintroduced by interpolating sanitised values. Our findings underscore the need for better tooling and awareness around securing GitHub workflows. Finally, we release our tool as open source so that both blue and red team security researchers can benefit from it.


In this talk, we introduce a tool that we developed for large-scale scanning of GitHub repositories, aimed at identifying expression injection vulnerabilities within workflow files. The motivation for this project arose from an incident in which a white-hat hacker exploited such a vulnerability in a client's workflow and used the exploit itself to patch the flaw. This incident highlighted the prevalence and potential severity of expression injection in GitHub workflows, where attackers can inject malicious code through interpolated GitHub variables. This type of vulnerability can lead to the unauthorised exposure of sensitive information, such as the highly privileged GITHUB_TOKEN.

Our tool is designed to scrape repositories efficiently, interacting with the GitHub API to concurrently pull and analyse workflow configuration files. By parsing these YAML files and detecting insecure patterns, we uncovered a surprising prevalence of expression injection vulnerabilities across a wide range of repositories, including some of the most popular open-source projects. The system mines continuously while adhering to GitHub's rate limits, allowing it to run in the background without overwhelming the platform.
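To make the idea of "detecting insecure patterns" concrete, here is a minimal sketch of such a check. It is not the tool described in this talk (which is written in Scala and parses the YAML properly); it is a line-based heuristic in Python, and the list of untrusted contexts, the sample workflow, and the function name `find_injections` are all assumptions made for illustration.

```python
import re

# Illustrative, deliberately coarse list of attacker-controllable contexts (an assumption).
UNTRUSTED = re.compile(
    r"github\.(?:head_ref|event\.(?:issue|pull_request|comment|review)\.(?:title|body))"
)
EXPRESSION = re.compile(r"\$\{\{(.*?)\}\}")
RUN_KEY = re.compile(r"^(\s*)(-\s+)?run:\s*(.*)$")


def find_injections(workflow_text):
    """Return (line_number, expression) pairs for untrusted expressions in run scripts."""
    findings = []
    run_column = None  # column of the active `run:` key, or None outside a script
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        key = RUN_KEY.match(line)
        indent = len(line) - len(line.lstrip())
        if key:
            run_column = len(key.group(1)) + len(key.group(2) or "")
            script = key.group(3)
        elif run_column is not None and (not line.strip() or indent > run_column):
            script = line  # continuation line of a multi-line script
        else:
            run_column = None  # left the script; this sketch ignores expressions here
            continue
        for expression in EXPRESSION.findall(script):
            if UNTRUSTED.search(expression):
                findings.append((number, expression.strip()))
    return findings


# Hypothetical workflow: the first step interpolates the issue title into the
# shell script, the second passes it through an environment variable instead.
SAMPLE = """\
name: Triage
on: issues
jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      - name: Unsafe echo
        run: |
          echo "New issue: ${{ github.event.issue.title }}"
      - name: Safe echo
        env:
          TITLE: ${{ github.event.issue.title }}
        run: echo "New issue: $TITLE"
"""

print(find_injections(SAMPLE))
```

Only the first step is reported: the expression on line 12 of the sample sits in an `env` mapping rather than in a script, so the shell receives it as data and never evaluates it as code.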

A key aspect of our findings is that even when these vulnerabilities are patched, they are often reintroduced through seemingly benign changes, such as interpolating sanitised values back into workflows. This creates a cyclical security risk that many teams may not even be aware of. We observed that existing mitigations, such as restricting permissions for each step and cautiously using the env directive to safely insert GitHub variables, are not prioritised in the documentation and are inadequately enforced in real-world projects.
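One plausible reading of this reintroduction, offered here as an assumption rather than as the exact pattern reported in the talk, is a value that was safely moved into an environment variable and is later expanded with `${{ env.NAME }}` inside a script instead of being read as `$NAME` by the shell. GitHub substitutes expressions before the script runs, so the untrusted text ends up in the script again. The sketch below flags that case; `reintroduced_vars` and the variable names are hypothetical.

```python
import re

# Matches expressions such as ${{ env.TITLE }} and captures the variable name.
ENV_EXPRESSION = re.compile(r"\$\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def reintroduced_vars(run_script, tainted_env_names):
    """Names of tainted env variables that the script expands via ${{ env.NAME }}."""
    found = set(ENV_EXPRESSION.findall(run_script))
    return sorted(found & set(tainted_env_names))


# Assume TITLE was populated from an untrusted context in the step's env mapping.
tainted = {"TITLE"}

patched = 'echo "New issue: $TITLE"'                # shell reads the variable
regressed = 'echo "New issue: ${{ env.TITLE }}"'    # expression is expanded first

print(reintroduced_vars(patched, tainted))
print(reintroduced_vars(regressed, tainted))
```

The patched script yields no findings, while the regressed one reports `TITLE`, even though both appear to use the same "sanitised" variable.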

Written entirely in Scala, our application serves as a robust scanner that not only parses workflows and identifies risky patterns but also allows for interactive review of the findings. Through this ongoing effort, we hope to drive awareness of this often-overlooked class of security issues in CI/CD pipelines. Additionally, we have made this tool open source, allowing both blue team (defensive) and red team (offensive) security researchers to benefit from it.

Key Takeaways:

  • Expression injection vulnerabilities in GitHub workflows are more common than previously thought, even in widely used repositories.
  • Attackers can exploit this vulnerability to extract privileged data like the GITHUB_TOKEN, leading to further compromise.
  • Patching vulnerabilities is not always enough: interpolating sanitised values back into a workflow can reintroduce the problem.
  • Proper mitigations, such as limiting permissions and safely handling GitHub variables, require more comprehensive documentation and awareness.
  • Our tool, written in Scala, interfaces with the GitHub API and is capable of continuous background mining while adhering to API rate limits.
  • We are open-sourcing this tool to promote better security practices and aid both security researchers and developers in safeguarding their workflows.

David is the Director of Research & Development at Whirly Labs, specialising in static program analysis. He develops automated tools for vulnerability detection and code exploration, used by both internal teams and external clients, including pentesters and SAST vendors. David has presented his research at leading international conferences like ICSE and ESORICS, and delivered his first BSides CPT talk in 2023.

🔐 Software Developer and Security Professional, merging development expertise with offensive security skills. Transforming a childhood passion for Arduino tinkering into a career in tech innovation and application security.

💻 Technical Portfolio:
Full-stack development focusing on secure, scalable solutions
Extensive experience in Python, C++, C#, and Pascal
Web application security and exploitation specialist
Active CTF competitor and security researcher

🛠️ Beyond The Code:
Maker and hardware enthusiast: 3D printing, Fusion 360 design
Electronics and microcontroller projects
Automation engineering and IoT solutions

Started by copy-pasting Arduino code at age 12, evolved into architecting secure applications and hunting vulnerabilities. This journey from curious tinkerer to security-focused developer shapes my approach to every project: hands-on, creative, and security-first.
Currently securing applications at Whirly Labs while pursuing continuous learning in emerging technologies. Always eager to collaborate on projects that push technical boundaries.