Dmitry Petrov
Dmitry Petrov is the creator of the open-source tool DVC (Data Version Control). He holds a PhD in Computer Science and previously worked as a Data Scientist at Microsoft. He is now the founder of DataChain.ai, a Python-first data platform for Physical AI.
Session
Text-to-SQL makes great demos, but in real systems generating queries is rarely the hard part; understanding the data is. Modern data is increasingly S3-first and multimodal, and its meaning is defined by Python workflows, not table schemas.
To work reliably, both agents and people need data context across multiple layers: storage context (what exists and where), metadata context (what’s inside files), dataset context (how files are grouped and versioned), and code context (the transformations that define semantics).
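As a rough illustration of how the four layers could fit together for a single file, here is a minimal Python sketch. It is not DataChain's API: every class name, field, and sample value (including the bucket URI and commit reference) is a hypothetical chosen for this example, and it uses only the standard library.

```python
from dataclasses import dataclass


@dataclass
class StorageContext:
    """What exists and where (hypothetical fields)."""
    uri: str
    size_bytes: int


@dataclass
class MetadataContext:
    """What is inside the file (hypothetical fields)."""
    media_type: str
    attributes: dict


@dataclass
class DatasetContext:
    """How files are grouped and versioned (hypothetical fields)."""
    name: str
    version: str


@dataclass
class CodeContext:
    """The transformation that defines the file's meaning (hypothetical fields)."""
    transform: str
    source_ref: str


@dataclass
class FileContext:
    """All four layers for one file, bundled for an agent or a person to read."""
    storage: StorageContext
    metadata: MetadataContext
    dataset: DatasetContext
    code: CodeContext


def describe(ctx: FileContext) -> str:
    """Render the layers as plain text, one line per layer."""
    return "\n".join([
        f"storage: {ctx.storage.uri} ({ctx.storage.size_bytes} bytes)",
        f"metadata: {ctx.metadata.media_type} {ctx.metadata.attributes}",
        f"dataset: {ctx.dataset.name}@{ctx.dataset.version}",
        f"code: {ctx.code.transform} at {ctx.code.source_ref}",
    ])


# Illustrative values only; none of these refer to real data.
example = FileContext(
    storage=StorageContext(uri="s3://example-bucket/scans/0001.png", size_bytes=204800),
    metadata=MetadataContext(media_type="image/png", attributes={"width": 512, "height": 512}),
    dataset=DatasetContext(name="scans", version="1.0.0"),
    code=CodeContext(transform="normalize_scan", source_ref="commit abc123"),
)

print(describe(example))
```

Keeping each layer as its own record means a layer can be filled in or refreshed independently, for example re-listing storage without re-reading file contents.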
In this talk, I’ll share a practical framework for building these context layers in Python-first systems, and show how DataChain makes multimodal workflows agent-ready in domains like Physical AI and biotech.