PyConDE & PyData Berlin 2024

The pragmatic Pythonic data engineer
2024-04-22, B05-B06

Learn to make practical decisions in data engineering with Python's vast ecosystem. Avoid blindly following market guidelines and consider the reality of your situation for better performance and architecture.


Often, we look at the success of others and try to repeat their decisions, expecting the same result. Instead, we must deal with things sensibly and realistically, based on practical rather than purely theoretical considerations. Python offers a vast ecosystem to handle all phases of data engineering. Implementing a data architecture can be complex, and many teams adopt market guidelines without pragmatically assessing their own reality; in most cases, this strategy causes major architecture and performance problems.

As part of this talk, we will walk through the process of identifying Pythonic components for data analysis, data cleaning, data ingestion, databases, file systems, serialization formats, workflows, and pipelines. As we move through those steps, my main focus is teaching the audience to think pragmatically about incorporating best practices into the data architecture process. I will also walk through strategies and explain the high-level data engineering concepts we can use.


Expected audience expertise: Domain

Novice

Expected audience expertise: Python

Novice

Abstract as a tweet (X) or toot (Mastodon)

Learn to make practical decisions in data engineering with Python's vast ecosystem. Avoid blindly following market guidelines and consider the reality of your situation for better performance and architecture.

See also: Slides (6.1 MB)

Robson has been a developer since 2003 with a multifaceted life. In 2014, he transitioned his career to data engineering and has since used Python to handle complex pipelines and glue other technologies together. He lives in Berlin and, in his free time, is an apprentice paramedical tattooer and glider pilot.