Mahendra Okza Pradhana
Data Engineer at a Singaporean multinational technology company.
He has worked for four years as a Data Engineer and has been exposed to various technologies such as Python, DuckDB, Airflow, Hadoop, and Kubernetes.
Currently interested in Data Streaming and Data Lakehouse technologies.
Session
Ensuring the integrity and correctness of data is crucial, and that includes configuration files such as YAML. YAML files are used for configuration in many applications, including Kubernetes manifests and CI/CD pipelines. In our case, we use YAML files to define Airflow DAGs in a human-readable format; a generator then processes these files to produce numerous DAGs, rather than writing each one from scratch in code. With this approach, anyone can create an Airflow DAG from a YAML file.
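A YAML-driven DAG spec of the kind described might look like the following sketch. The key names (`dag_id`, `schedule`, `tasks`, `depends_on`) are illustrative assumptions, not the speaker's actual format:

```yaml
# dags/daily_sales.yaml -- hypothetical spec consumed by a DAG generator
dag_id: daily_sales
schedule: "@daily"
tasks:
  - name: extract
    operator: BashOperator
    bash_command: "python extract.py"
  - name: load
    operator: BashOperator
    bash_command: "python load.py"
    depends_on: [extract]
```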
However, YAML files are prone to human error due to their strict syntax and structure. These mistakes can lead to significant issues when deployed to production, causing downtime and unplanned maintenance.
In this talk, we will explore how to leverage custom pre-commit hooks to automate the validation of YAML files before they are committed, pushed, and deployed. By integrating these checks into your development workflow, you can catch errors early, maintain high standards of code quality, and empower your team to deploy with confidence.
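A validation hook of this kind might be built around a small schema check over the parsed YAML document. The sketch below uses only the standard library and hypothetical key names (the real hook would first load each file with a YAML parser such as PyYAML's `yaml.safe_load`):

```python
# Hypothetical schema check for a YAML-defined DAG spec.
# The required keys below are illustrative assumptions, not the
# speaker's actual spec format.
REQUIRED_KEYS = {"dag_id", "schedule", "tasks"}

def validate_dag_spec(spec):
    """Return a list of human-readable errors; an empty list means valid."""
    if not isinstance(spec, dict):
        return ["top-level document must be a mapping"]
    errors = [f"missing required key: {k}"
              for k in sorted(REQUIRED_KEYS - spec.keys())]
    tasks = spec.get("tasks")
    if isinstance(tasks, list):
        for i, task in enumerate(tasks):
            if not isinstance(task, dict) or "name" not in task:
                errors.append(f"tasks[{i}] must be a mapping with a 'name' key")
    elif "tasks" in spec:
        errors.append("'tasks' must be a list")
    return errors

good = {"dag_id": "daily_sales", "schedule": "@daily",
        "tasks": [{"name": "extract"}, {"name": "load"}]}
bad = {"dag_id": "daily_sales", "tasks": "extract"}

print(validate_dag_spec(good))  # []
print(validate_dag_spec(bad))
```

Such a script can then be wired into pre-commit as a local hook; the hook id and script path here are placeholders:

```yaml
# .pre-commit-config.yaml -- hypothetical local hook
repos:
  - repo: local
    hooks:
      - id: validate-dag-yaml
        name: Validate DAG YAML specs
        entry: python scripts/validate_dag_yaml.py
        language: system
        files: ^dags/.*\.ya?ml$
```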