2025-04-25 –, Titanium3
SQL remains a foundational component of data-driven applications, but ensuring the accuracy and reliability of SQL logic is often challenging. SQL testing can be cumbersome, time-consuming, and error-prone. These challenges can be addressed with a Python testing framework such as pytest, which enables clean, robust, and automated SQL testing.
SQL is an essential part of data-driven applications, powering everything from simple queries to complex data transformations. However, ensuring the accuracy and reliability of SQL code is often challenging, particularly when dealing with intricate logic or large-scale datasets. Deploying SQL changes to production adds further complexity, as each change requires careful validation to avoid breaking the query logic.
Fortunately, integrating a Python testing framework such as pytest into SQL workflows provides a streamlined solution to these challenges. Such an approach enables clean, efficient, and automated testing of SQL code and database logic. It lets us validate query results, enforce schema consistency, and simulate complex data scenarios, all while reducing manual effort and improving test coverage.
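As a rough illustration of what validating a query result with pytest can look like, the sketch below runs a query against a small in-memory SQLite database provided by a fixture. The `bookings` table, the `REVENUE_PER_HOTEL` query, and the `make_db` helper are hypothetical names chosen for this example, and SQLite is an assumed stand-in for whatever database the talk actually targets.

```python
import sqlite3

import pytest

# Hypothetical query under test: total revenue per hotel.
REVENUE_PER_HOTEL = """
    SELECT hotel_id, SUM(price) AS revenue
    FROM bookings
    GROUP BY hotel_id
    ORDER BY hotel_id
"""


def make_db():
    """Create an in-memory SQLite database with a small, known dataset."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bookings (hotel_id INTEGER, price REAL)")
    conn.executemany(
        "INSERT INTO bookings VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 80.0)],
    )
    return conn


@pytest.fixture
def db():
    # A fresh database per test keeps tests isolated from one another.
    conn = make_db()
    yield conn
    conn.close()


def test_revenue_per_hotel(db):
    # The input rows are known, so the expected result can be stated exactly.
    rows = db.execute(REVENUE_PER_HOTEL).fetchall()
    assert rows == [(1, 150.0), (2, 80.0)]
```

Saved as a file such as `test_revenue.py`, this would be picked up by running `pytest` in the same directory. Keeping the dataset tiny and explicit makes the expected result easy to check by hand.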
This talk will address:
- configuring lightweight database fixtures
- verifying SQL query results and testing scripts
- data mocking
- schema validation
- testing non-deterministic queries
- handling large datasets
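Two of the topics above, schema validation and testing non-deterministic queries, can be sketched as follows. This is only one possible approach under stated assumptions: the `events` table and the helpers `make_db` and `table_schema` are hypothetical, and SQLite's `PRAGMA table_info` and `RANDOM()` are used because they ship with Python's standard library; other databases expose schema metadata and randomness differently.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical example data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (name, created_at) VALUES (?, ?)",
        [("a", "2025-01-01"), ("b", "2025-01-02"), ("c", "2025-01-03")],
    )
    return conn


def table_schema(conn, table):
    """Map column name to declared type for a table.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so pass only trusted names.
    """
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}


def test_schema():
    # Schema validation: fail if a column is renamed, dropped, or retyped.
    conn = make_db()
    expected = {"id": "INTEGER", "name": "TEXT", "created_at": "TEXT"}
    assert table_schema(conn, "events") == expected


def test_random_sample():
    # Non-deterministic query: the exact rows vary between runs, so assert
    # properties that hold for every valid result instead of exact values.
    conn = make_db()
    rows = conn.execute(
        "SELECT name FROM events ORDER BY RANDOM() LIMIT 2"
    ).fetchall()
    names = [r[0] for r in rows]
    assert len(names) == 2
    assert len(set(names)) == 2
    assert set(names) <= {"a", "b", "c"}
```

The design choice in `test_random_sample` is to check invariants (row count, uniqueness, membership in the source set) rather than a fixed output, which keeps the test stable even though the query result changes from run to run.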
Attendees will gain insights into improving SQL code quality, identifying issues early in the development process, and ensuring the reliability of data-driven products. This presentation is particularly beneficial for Data Scientists, Engineers, and Analysts seeking to enhance the efficiency and precision of their testing practices.
Novice
Expected audience expertise: Python: Novice
Anna Varzina is a Data Science Engineer at Lighthouse, where she has been developing data-driven solutions for the hospitality industry since 2021. She specialises in working with large datasets and performing complex data transformations using Python and SQL to extract meaningful insights.
This is Anna's first time speaking at PyCon/PyData, and she is excited to share her experiences in overcoming the challenges of building reliable and scalable data workflows.