From data analysis in Jupyter Notebooks to production applications: AI infrastructure at reasonable scale
EuroSciPy 2024

2024-08-28 13:55–14:15, Room 7

The availability of AI models and packages in the Python ecosystem has revolutionized many applications across domains. This talk discusses infrastructural decisions and best practices that bridge the gap between interactive data analyses in notebooks and production applications at a reasonable scale, suitable for both commercial and scientific contexts. In particular, the talk introduces the on-premises, Python-based AI architecture employed at MDPI, one of the largest open-access publishers. The presentation emphasizes the impact of the design on reproducibility, decoupling of different resources, and ease of use during the development and exploration phases.

While there is no shortage of tutorials on how to build AI applications in a Jupyter notebook, it can be challenging to move from a proof of concept to production-grade applications or to reliable, reproducible data analyses that inform data-driven decisions. The presentation discusses architectural decisions in a Python-based environment that bridge this gap at the scales typical of academia and industry. Splitting the system into smaller, composable building blocks provides reproducibility, more rapid development, and more efficient use of available resources, and has enabled MDPI to leverage AI at multiple stages of the business process. The concepts presented in the talk apply to a wide range of applications.

Abstract as a tweet:

Python-based AI architecture bridging the gap between Jupyter Notebooks and production applications at MDPI

Category [Machine and Deep Learning]: ML Applications (e.g. NLP, CV)
Expected audience expertise: Domain: some
Expected audience expertise: Python: some

Frank Sauerburger

Frank became a self-employed software developer and consultant while studying physics in Freiburg. During his master's studies, he specialized in data analysis for particle physics at CERN, and he obtained a doctoral degree in 2022 working with the ATLAS collaboration. Since 2023, he has been the AI Technical Leader at MDPI, one of the largest open-access publishers.
