Reproducibility, Julia, and Renku

Reproducibility should be a central consideration for data science processes, but it requires some support to achieve. In this talk, we present the Renku reproducibility platform and show how to take advantage of it from Julia, focusing on two examples: (1) using Renku to build reproducible workflows in Julia and (2) facilitating the teaching of Julia-based courses with Renku.

Doing data science effectively means working reproducibly. Although commitment to reproducibility as an ideal is non-controversial, in practice, it can be challenging to achieve. Many know this from their own experience, and the disconnect between ideal and reality has been well documented in recent years (see the Fall 2020 issue of the Harvard Data Science Review for just one recent example).

Renku is open-source software being developed at the Swiss Data Science Center to make reproducible data science easier to achieve. It builds upon established software such as git, Docker, and Kubernetes to provide the tools necessary to work reproducibly, while offering the flexibility to support a variety of use cases and users working in any language they choose, including Python, R, and, of course, Julia.

In this presentation, we will explain the architecture of the Renku platform and show how it can interact with Julia tools, in particular the Pkg package manager, to provide scaffolding for portable, reproducible Julia projects. Once we have a project set up, we will work through an example of building a reproducible workflow in Julia using Renku.
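
As a rough illustration of the Pkg side of this scaffolding, a minimal sketch might look like the following. It assumes the code is run from the root of a project whose Project.toml and Manifest.toml are committed to the repository; that layout is an assumption made for illustration, not a description of how Renku itself is configured.

```julia
using Pkg

# Activate the environment defined by the Project.toml in the current
# directory (assumed here to be the root of the project repository).
Pkg.activate(".")

# Install the exact package versions recorded in Manifest.toml, so that
# everyone who clones the project resolves the same dependencies.
Pkg.instantiate()

# Print the packages and versions in the active environment.
Pkg.status()
```

Keeping both files under version control is what lets a fresh clone, or a freshly built container image, recreate the same package environment.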

The same tools that support an individual working reproducibly can also help overcome some of the hurdles encountered when teaching a class to many students. As an example, we will present a setup that could be used for teaching a class with Julia and compare Renku with alternatives such as Binder for this purpose.

We will conclude by looking at some more advanced topics in customizing a Julia-based environment for reproducibility by combining Renku with tools like VS Code, DrWatson.jl, or Pluto.jl.