Democratised Data Workflows at Scale
2020-07-07, Bangalore Meetup [Session starts Wednesday 08.07 at 9:30 am (Tuesday 07.07 at 9 pm PDT)]

Financial Times is increasing its digital revenue by enabling business people to make data-driven decisions. Providing an Airflow-based platform where data engineers, data scientists, BI experts and others can run language-agnostic jobs was a bold move. One of the most successful steps in the platform’s development was building an execution environment that lets stakeholders self-deploy jobs, without cross-team dependencies, on top of the elastic scale of Kubernetes.


Airflow was introduced at Financial Times in 2019 to satisfy all of our stakeholders’ needs and to replace our legacy ETL solution. Its extensibility, scalability and simplicity, but most importantly its great and growing community, were among the reasons to choose Airflow to orchestrate all of our batch jobs.

Facing the need to let stakeholders independently write and deploy jobs in Python, R and Java, as well as run Spark batch jobs, our Data team created an execution environment engine that allows the business to quickly create and deploy their models on top of our internal Airflow application running in Kubernetes.
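
The abstract does not spell out how the engine is implemented, but one common way to get this kind of language-agnostic execution on Kubernetes is Airflow's KubernetesPodOperator, which launches each job as its own container, so the scheduler never needs to know whether the code inside is Python, R or Java. A minimal sketch along those lines, using the Airflow 1.10-era import path; the DAG id, namespace and image name are hypothetical placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

with DAG(
    dag_id="self_deployed_model",        # hypothetical DAG name
    start_date=datetime(2020, 7, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    # The stakeholder's job is just a container image; the language
    # inside it is irrelevant to Airflow.
    run_model = KubernetesPodOperator(
        task_id="run_model",
        name="run-model",
        namespace="data-platform",                    # hypothetical namespace
        image="registry.example.com/team/model:1.0",  # hypothetical image
        get_logs=True,                 # stream the pod's stdout into task logs
        is_delete_operator_pod=True,   # clean the pod up when the task finishes
    )
```

With a setup like this, "self-deploying" a job reduces to publishing a new container image and a small DAG definition, with no changes needed from the platform team.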

Additionally, we are improving performance in cases where data must be shared between task instances within a particular DAG run. As a result, we plan to provide hot data access with low, or even near-zero, latency.
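
Airflow's stock mechanism for sharing data between task instances in a DAG run is XCom, which round-trips every value through the metadata database; that round trip is the latency referred to above, and an in-memory "hot" store is one way to cut it. A minimal sketch of the baseline XCom behaviour, with illustrative DAG and task names:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def produce(**context):
    # Push a small payload to XCom; by default it is persisted in
    # Airflow's metadata database.
    context["ti"].xcom_push(key="row_count", value=42)

def consume(**context):
    # Pull the value pushed by the upstream task in the same DAG run.
    row_count = context["ti"].xcom_pull(task_ids="produce", key="row_count")
    print("rows processed upstream: %s" % row_count)

with DAG(
    dag_id="xcom_sharing_demo",          # hypothetical DAG name
    start_date=datetime(2020, 7, 1),
    schedule_interval=None,
) as dag:
    produce_task = PythonOperator(
        task_id="produce",
        python_callable=produce,
        provide_context=True,  # required on Airflow 1.10 to receive **context
    )
    consume_task = PythonOperator(
        task_id="consume",
        python_callable=consume,
        provide_context=True,
    )
    produce_task >> consume_task
```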

My name is Mihail Petkov and I have more than nine years of experience in the software industry. I currently work as a Big Data Engineer at Financial Times. Being a Big Data Engineer is really exciting: I enjoy being challenged and love solving complex problems on a daily basis.

My name is Emil Todorov. I am a Software Engineer at Financial Times and have been part of the Data Platform team for almost a year and a half. Tackling data at scale has become my passion, and I believe it is the future.