PyCon Sweden 2021

Sergei Beilin

Ph.D. in mathematics.
Independent software engineering consultant, solutions architect, focusing mostly on event-driven systems and industrializing AI solutions.
Previously worked in research and education. Taught 200+ students Python before it became mainstream :)


Session

10-21
11:30
25min
Fullstack datascientist v.2021 (how much of software engineering should a modern datascientist know)
Sergei Beilin, Natalia Beylina

Live broadcast: https://www.youtube.com/watch?v=UujU3xOo038

What are the essential software engineering skills a data scientist should have to successfully bring their own work to production? We - Sergei Beilin, Ph.D., software engineering consultant in AI/ML, and his wife Natalia Beylina, Ph.D., data scientist - will go through the most important things a modern data scientist needs to know about software engineering, from both the software engineer's and the data scientist's points of view, drawing on our own experience.

We will discuss:

  • programming language(s): how much of the language should one know?
  • execution models, orchestration, containerization - Kubernetes, Kubeflow, Airflow, Spark/Databricks, etc.
  • storage, network protocols/APIs, file formats - from CSV to Delta, from JSON to Avro
  • modern systems architecture concepts to understand
  • and how the whole system architecture and infrastructure landscape will dictate the way you deploy and run your work
  • tools and devops practices
  • processes: integrating the data scientists' workflow into a typical agile process
  • bad practices to avoid: a few examples we've seen ourselves
Data Science, AI, and Machine Learning
Data