2020-07-29, Purple Track
Today’s science requires ever-increasing amounts of computation, so ideally we should be able to move our simulations to any non-personal machine with ease and guarantee that we can reproduce the results exactly later.
To that end, I would like to talk about Singularity, a containerization tool aimed at high-performance computing, the small package that I wrote for it, and the adventure of combining JIT compilation with read-only containers.
As the use of computing in science grows, more scientists are running ever larger computations. However, while the importance and use of a lab notebook is taught in many science degrees, the corresponding best practices for computer use often are not. From personal experience, I can say that this leads to a growing pile of ad hoc scripts and result files that can no longer be matched to each other after a week. Alternatively, the code is sent to a server, and after some small tweaks there, the local and the remote versions have diverged. A Google search shows that this is not an uncommon problem, and an increasing number of blogs and papers address various aspects of it.
In general, there are three aspects to a numerical result: the parameters, the code, and the environment. Specifically in Julia, the first two can be addressed by combining DrWatson.jl and Git, which makes it very easy to store the parameters used together with the Git hash of the commit containing the code. That commit also includes the Manifest file, meaning that one can recreate the exact package environment, which is a major feature of Julia. However, this does not cover the binaries used by many packages. Examples include an NLopt algorithm that segfaults only on a specific version of Ubuntu, or a binary that clashes with MKL only on macOS.
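As a rough illustration of what recording the first two aspects can look like, here is a minimal sketch that uses only Julia's standard library and the `git` command line. It is not DrWatson.jl's API. The function names `current_commit`, `manifest_checksum`, and `provenance`, as well as the keys of the record, are hypothetical and chosen for this example only.

```julia
using Dates   # standard library: timestamp for the record
using SHA     # standard library: checksum of the Manifest file

# Return the hash of the current Git commit in `dir`, or "unknown" if
# `dir` is not inside a repository or `git` is not available.
function current_commit(dir::AbstractString = pwd())
    cmd = pipeline(Cmd(`git rev-parse HEAD`; dir = dir); stderr = devnull)
    try
        return String(strip(read(cmd, String)))
    catch
        return "unknown"
    end
end

# Return the SHA-256 checksum of Manifest.toml in `project_dir`,
# or "missing" if there is no such file.
function manifest_checksum(project_dir::AbstractString)
    path = joinpath(project_dir, "Manifest.toml")
    return isfile(path) ? bytes2hex(sha256(read(path))) : "missing"
end

# Bundle the parameters with information identifying the code and the
# package environment. System libraries and binaries are NOT captured.
function provenance(params::Dict; project_dir::AbstractString = pwd())
    return Dict(
        "parameters"      => params,
        "gitcommit"       => current_commit(project_dir),
        "manifest_sha256" => manifest_checksum(project_dir),
        "julia_version"   => string(VERSION),
        "timestamp"       => string(now()),
    )
end

# Hypothetical parameter set for a simulation run.
record = provenance(Dict("alpha" => 0.5, "steps" => 1000))
println(record["gitcommit"])
```

A record like this identifies the parameters, the code, and the Julia packages, but says nothing about the operating system or the binaries that packages link against, which is the gap a container is meant to close.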
On the surface, containerization is the solution to these problems, but in practice the group doing the science and the group understanding containers seem to be mostly disjoint. Having been in the first group until a few months ago, I would now like to present my approach and the small package [1] I wrote to facilitate it. It uses the scientific container software Singularity, which is less popular but more specialized than the widely known Docker (like Julia compared to Python). Unlike Docker, Singularity containers support various HPC hardware and software, do not require root to run, and integrate well with resource managers like SLURM.
In this talk, I would like to introduce my workflow, based on DrWatson.jl and my own measures for creating very minimal containers, which makes results fully reproducible on any machine with the Singularity runtime. This will include general aspects of Singularity and specific ones related to incorporating Julia.
Steffen Ridderbusch is a doctoral student in the 2018 cohort of the Centre for Doctoral Training in Autonomous Intelligent Machines and Systems. Holding degrees in Mathematics and Engineering, he is now part of the Control Department.