2026-06-07, Doddington Forum
Data work often gets blocked by the unglamorous parts: brittle pipelines, unclear ownership, slow deployments, and systems that are hard to trust. This talk is about deliberately making data infrastructure “boring” — predictable, observable, and easy to change — so that the data itself can be used in lots of exciting ways.
Climate Policy Radar is a non-profit building open, credible databases and AI-powered tools to support informed climate, nature, and development action.
Using a real-world journey from an unreliable ingest to a steadier, federated platform, this talk will walk through the principles and trade-offs that matter most: resilience over heroics, incremental delivery over big-bang rewrites, and transparency over intuition. The focus is not on specific tools, but on the engineering moves that turn data pipelines into dependable systems: orchestration that supports recovery, interfaces that unblock downstream teams, quality signals that can be acted on, and a shared layer (data lake/warehouse) that aligns definitions and reduces duplication.
Attendees will leave with a practical mental model for taking maturing data flows and making them boring — in a good way.
Data engineering succeeds when it disappears into the background. Not because it’s unimportant, but because it becomes reliable enough that other teams can build on it without thinking about it. In many organisations, the opposite happens: pipelines are fragile, changes are risky, and operational work consumes the roadmap.
This talk tells the story of moving from that state to one where the pipeline becomes a platform:
- Predictable runs and recovery: designing for frequent ingest, safe execution windows, and fast time-to-recover when things fail.
- Incremental modernisation: migrating orchestration and execution in a way that avoids running parallel “shadow pipelines” and reduces blast radius.
- System transparency: turning a black box into something teams can interrogate — what ran, what it produced, what failed, what changed, and why.
- Data quality as a product feature: creating actionable quality signals (not just logs), so improvements to text quality and search relevance can ship quickly and be measured.
- Federation and alignment via a shared layer: using a data lake/warehouse layer to consolidate outputs, align metrics across teams, and remove ad hoc transforms at the edges.
- Unblocking downstream users: improving interfaces and handoffs so application, policy, and data science teams can self-serve, iterate, and trust the numbers.
The emphasis is on the big picture: how to set goals that matter (scale, resilience, extensibility), how to define “done” in operational terms, and how to deliver tangible improvements sprint by sprint while still laying foundations for the future. The takeaway is a repeatable approach for making data infrastructure boring, so the work built on top of it can be exciting.
Lead MLOps Engineer at Climate Policy Radar
Data Engineer at Climate Policy Radar
Data Engineer at Climate Policy Radar