Effective data science teams with Jupyter and databooks PyCon JP 2022

Effective data science teams with Jupyter and databooks

2022/10/15 16:00–16:30, pyconjp_3
Language: English

Jupyter notebooks have been around since 2015. Since then, they have been used in blogs, books and fields such as data science.
Nonetheless, some of the features that make notebooks great also discourage teams from using them. In this talk I'll explore these issues and some solutions, share my personal experience, and present a Python tool I've built to make notebooks friendlier for software teams.

Jupyter Notebook is a great tool for quick prototyping and exploration. For this reason, it's very popular in fields such as data science. However, its JSON-based file structure makes it hard to work with notebooks in teams, as it does not cope well with other software tools such as git.
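To make the git friction concrete, here is a minimal sketch using only the standard library. It assumes the usual nbformat 4 layout (code cells with `outputs` and `execution_count` fields). The helpers `strip_notebook` and `make_run` are hypothetical names used for illustration; this is not how databooks itself is implemented. It shows that two saves of the same code differ on disk only because of run-specific fields, and that clearing those fields makes them identical.

```python
import json


def strip_notebook(nb: dict) -> dict:
    """Return a copy of a notebook dict without outputs or execution counts."""
    cleaned = json.loads(json.dumps(nb))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def make_run(execution_count: int, output_text: str) -> dict:
    """Build a tiny nbformat-4-style notebook, as if saved after one run."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"name": "stdout", "output_type": "stream", "text": [output_text]}
                ],
                "source": ["import random\n", "print(random.random())"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# Same source code, saved after two different executions.
first_run = make_run(1, "0.42\n")
second_run = make_run(7, "0.13\n")

# The raw files differ, so git reports a change even though no code changed.
raw_differs = json.dumps(first_run, sort_keys=True) != json.dumps(second_run, sort_keys=True)

# After clearing run-specific fields, the two notebooks are identical.
stripped_equal = strip_notebook(first_run) == strip_notebook(second_run)

print(raw_differs, stripped_equal)
```

When two teammates each re-run and commit the same notebook, these run-specific fields are a common source of noisy diffs and merge conflicts.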

Helping developers version notebooks properly, and giving them better tools to compare notebooks and resolve conflicts, can greatly improve their lives. I have used Jupyter notebooks in different data science projects, have experimented with different tools that make notebooks and git work better together, and have built my own tool (databooks) for that purpose.

Program

Agenda
Introduction
Data science and notebooks
Jupyter notebooks
What it is
Where you can find them
Issues with notebooks
Solution to those issues
Databooks
Demo
What it is
How it works

This talk is not about

Why you should use notebooks
How Jupyter notebooks work
Data science
How to put notebooks in production
How git works

Murilo Cunha

Murilo Cunha is an AI tech lead at Dataroots with a background in Mechanical Engineering and an advanced master’s degree in Artificial Intelligence from KU Leuven. His main goal is to make AI both useful and accessible. To reach this goal, Murilo takes a pragmatic approach, which led him to move toward data engineering. In line with his passion for enabling AI to make an impact, Murilo developed expertise in MLOps: he advocates for automation and monitoring at all steps of ML system construction, including integration and deployment. With his experience in getting ROI on AI initiatives and as an open source supporter, he decided to fill the gap in the tooling that supports data scientists by creating databooks, an open source package that makes the life of data scientists easier.
