JuliaCon 2020 (times are in UTC)

Parallelization, Random Numbers and Reproducibility
2020-07-29 , Purple Track

We will show how the different types of parallelization in Julia interact with random number generators and shared state. Using the parametric bootstrap in MixedModels, we will show how to use threads effectively with a shared random number generator to give the same result as the serial version.


Random-number generators present a special problem for both thread- and process-based parallelism, especially when guaranteeing a reproducible, unbiased stream that is independent of the number of concurrent workers. In other words, we require the ability to parallelize a replicated, stochastic operation in such a way that we get the same result as the serial computation with the same RNG and seed. For thread-based parallelism, this can easily be achieved via a shared RNG with appropriate locking. We demonstrate this with the implementation of the parametric bootstrap in MixedModels. The parametric bootstrap is an embarrassingly parallel computation, yet it depends on a stochastic element and thus on random-number generation. In particular, we examine how the granularity of locking affects the 'striping' of random numbers across threads and thus reproducibility. We finish by contrasting our approach with 'fast-forwarding' and copying the RNG, and by discussing the issues that arise when generalizing these approaches to process-based parallelism.
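The two strategies described above can be sketched in a few lines. This is an illustration only, written in Python rather than Julia, and it is not the MixedModels implementation: the names `statistic`, `serial_bootstrap`, `threaded_bootstrap` and `fast_forward_bootstrap`, the replicate counts, and the use of a sample mean in place of a model refit are all assumptions made for the example. The locked version holds the lock for a whole replicate, so each replicate receives one contiguous block of the shared stream; the fast-forward version gives each replicate its own copy of the generator, advanced to that replicate's offset.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

N_REPLICATES = 200  # hypothetical number of bootstrap replicates
N_DRAWS = 50        # hypothetical number of random draws per replicate


def statistic(draws):
    """Stand-in for refitting a model to one simulated response."""
    return sum(draws) / len(draws)


def serial_bootstrap(seed):
    """Reference result: one RNG, replicates computed one after another."""
    rng = random.Random(seed)
    return [statistic([rng.random() for _ in range(N_DRAWS)])
            for _ in range(N_REPLICATES)]


def threaded_bootstrap(seed, n_threads=4):
    """Shared RNG, with the lock held for one whole replicate's draws."""
    rng = random.Random(seed)
    lock = threading.Lock()

    def one_replicate(_):
        # Coarse-grained lock: the k-th thread to get here receives the
        # k-th contiguous block of the stream, as the serial loop would.
        with lock:
            draws = [rng.random() for _ in range(N_DRAWS)]
        # The expensive part runs outside the lock.
        return statistic(draws)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(one_replicate, range(N_REPLICATES)))


def fast_forward_bootstrap(seed, n_threads=4):
    """No shared state: each replicate advances its own copy of the RNG."""

    def one_replicate(i):
        rng = random.Random(seed)
        # Naive fast-forward by discarding draws; generators with a
        # jump-ahead operation can skip to the offset far more cheaply.
        for _ in range(i * N_DRAWS):
            rng.random()
        return statistic([rng.random() for _ in range(N_DRAWS)])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(one_replicate, range(N_REPLICATES)))


if __name__ == "__main__":
    serial = serial_bootstrap(42)
    print(sorted(threaded_bootstrap(42)) == sorted(serial))
    print(fast_forward_bootstrap(42) == serial)
```

In this sketch the locked version reproduces the serial replicates as a collection, but not necessarily in the same positions, because the order in which threads acquire the lock is not fixed. Moving the lock inside the inner loop, so that it guards a single draw, would interleave the draws of different replicates and the match with the serial result would generally be lost. The fast-forward version matches the serial result position by position, at the cost of knowing in advance how many draws each replicate consumes.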

Phillip was a struggling mathematician, then a linguist and now a cognitive neuroscientist, but always a hacker.