JuliaCon 2025

Fixing Julia's task-local RNG: a bother, a bug, a breakthrough
2025-07-25 , Main Room 1 (Main stage)

Each task in Julia has its own PRNG. When a task is forked it needs to seed the child task's RNG. This talk is about the evolution of how we've done this and the novel technique we now use that solves the annoyances and bugs that plagued our previous approaches. We've generalized the DotMix algorithm designed by Leiserson et al. for the Cilk parallel runtime system, simplifying and strengthening it while retaining provable collision resistance.


Julia uses Xoshiro256++ as its default random number generator. With only four 64-bit values of state, this RNG is small enough that each task can have its own instance. This means tasks can generate random values without locking and each task has its own independent, reproducible random number sequence. But what happens when a task spawns a child task? The parent task must seed the child task’s somehow. This talk covers the history of how we used to do this, why it was annoying, how we fixed it, how that fix was discovered to be subtly but significantly broken, and how it was fixed yet again.

The final design is derived from the DotMix algorithm designed by Leiserson et al. for the Cilk parallel runtime system. DotMix requires arithmetic modulo a large prime to guarantee that so-called "pedigree coordinates" have multiplicative inverses. We noticed that by structuring the task tree a bit differently, we can ensure that it is a binary tree, which lets us use machine arithmetic instead since a coefficient of one is invertible in any modulus. We also found that the dot product construction at the heart of DotMix could be problematic as it inherently introduces linear relationships between the states of sibling tasks. These linear relationships can be masked by a good non-linear output function, which is what the original DotMix did. I failed to recognize the significance of that non-linear output function and initially omitted it from our implementation, which caused a very observable correlation between the outputs of random values from carefully constructed task families. When trying to fix this, I noticed that we can replace the dot product in DotMix with reduction by any doubly bijective binary function without losing DotMix's collision resistance property. Machine addition is bijective in each argument, of course, but we can use a non-trivial and non-linear mixing operation instead, which yields a construction that is both stronger and more efficient than the original DotMix. It's stronger because it's inherently non-linear and the non-trivial bit mixing is chained rather than only applying to the output after relatively trivial additive mixing. It's also more efficient because we no longer need a relatively expensive non-linear output function after mixing, we can use the mixed output directly. Finally, we verify empirically that this construction has the collision resistance properties that we expect by simulating a reduced-state version of it.