Scheduler as a service - Apache Airflow at EA Digital Platform
07-06, 11:00–11:45 (US/Pacific), London Meetup, [Session starts: Monday 06.07 5pm (Monday 06.07 9am PDT)]

In this talk, we share the lessons learned while building a scheduler-as-a-service on Apache Airflow to improve stability and security for one of the largest gaming companies. The platform integrates with different data sources and meets varied SLAs across workflows owned by multiple game studios. In particular, we present a comprehensive self-serve Airflow architecture with multi-tenancy, automatic DAG generation, SSO integration, and improved ease of deployment.


Within Electronic Arts, to provide scheduler-as-a-service and to support hundreds of thousands of execution workflows, each team requires an isolated environment with access to a central data lake containing several petabytes of anonymized player and game metrics. Leveraging Airflow, each team is given a private code repository and a namespace through which it can deploy its DAGs whenever it chooses. To support agile development cycles, a private testing sandbox and auto-deployment to an isolated multi-tenant Airflow platform have been made available to game studios.

In production, a single Dockerized Airflow deployment on Kubernetes is used to ensure high availability and single-step deployment. Custom SSO integration and RBAC-based operator and sensor whitelisting allow for secure logical isolation. In addition, dynamic DAG instantiation helps address the varied SLAs of game launch seasons, which are staggered through the financial year.
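
The combination of per-team namespaces, role-based operator whitelisting, and dynamic DAG instantiation can be sketched roughly as follows. This is a minimal illustration in plain Python, not EA's implementation: the role names, the `ALLOWED_OPERATORS` table, the `DagSpec` type, the `build_dag_specs` function, and the `tenant.workflow` naming scheme are all assumptions made for the example. Operators are referred to by class name only, so the sketch runs without Airflow installed.

```python
from dataclasses import dataclass, field

# Hypothetical allow-list: which operator class names each role may use.
# The actual roles and whitelists used at EA are not described in the abstract.
ALLOWED_OPERATORS = {
    "analyst": {"HiveOperator", "PythonOperator"},
    "engineer": {"HiveOperator", "PythonOperator", "BashOperator"},
}


@dataclass
class DagSpec:
    """A lightweight description of a DAG to be instantiated later."""
    dag_id: str
    schedule: str
    tasks: list = field(default_factory=list)


def build_dag_specs(tenant, role, workflows):
    """Build namespaced DAG specs from a tenant's workflow config.

    Raises PermissionError if any task uses an operator that is not
    on the allow-list for the given role.
    """
    allowed = ALLOWED_OPERATORS.get(role, set())
    specs = []
    for wf in workflows:
        blocked = sorted({t["operator"] for t in wf["tasks"]} - allowed)
        if blocked:
            raise PermissionError(
                f"{tenant}/{wf['name']}: operators not allowed "
                f"for role {role!r}: {blocked}"
            )
        specs.append(
            DagSpec(
                # Prefix with the tenant so DAG ids cannot collide across teams.
                dag_id=f"{tenant}.{wf['name']}",
                schedule=wf["schedule"],
                tasks=[t["id"] for t in wf["tasks"]],
            )
        )
    return specs
```

In a real deployment, each resulting spec would be turned into an Airflow `DAG` object when the scheduler parses the DAG files; the check shown here only illustrates where a whitelist could be enforced before that happens.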

Software Engineer II at Electronic Arts

My name is Xiaoqin, and I'm a Software Engineer at Electronic Arts from Austin, Texas. I started working on Apache Airflow last summer, and I'm excited to see and share how far we have come in leveraging Apache Airflow at EA.

Preethi Ganeshan is currently a Software Engineer in the Data Platform team at Electronic Arts. She has made several contributions to the central data platform and ad-hoc analytics at EA, including work with engines such as Hadoop, Hive, and Presto as well as tools like Airflow and Superset.