GerryChain.jl: detecting gerrymandering with Markov chains

Markov chain-based sampling is an increasingly popular method for detecting gerrymandering of voting districts. GerryChain.jl implements optimized versions of the flip chain and the ReCom chain (DeFord, Duchin, Solomon 2019) for sampling large ensembles of districting plans that satisfy legal criteria. The package also provides a suite of tools for preprocessing geospatial data and computing statistics over ensembles of districting plans.


A districting plan is a partitioning of a geographical region, such as a city or a state, into connected pieces with approximately equal population. U.S. law requires that states redraw their districts every ten years, and district lines have a significant effect on voter representation. Gerrymandering is the practice of drawing districts that increase the power of one group of voters at the expense of others, typically along partisan or racial lines. Recently, a community of mathematicians, computer scientists, and policy experts have developed a suite of techniques for drawing districting plans and detecting gerrymandering. Ensemble analysis—the sampling of many reasonable districting plans from a distribution, with the goal of establishing the range of typical outcomes—is an increasingly popular method in this domain that has recently been used effectively by expert witnesses in high-profile voting rights lawsuits. In these analyses, a districting plan is represented as a weight-balanced k-partitioning of the dual graph of geographic units; these units might be counties, voting precincts, or U.S. Census blocks, depending on the granularity of the analysis.

To sample from a known distribution of districting plans, it is useful to employ a Markov chain. Our research group maintains the popular Python package GerryChain (github.com/mggg/GerryChain), which implements a variety of Markov chains for redistricting and has been used in court cases and published academic research. However, this package’s performance is severely limited for many types of chains, such as the recombination (ReCom) chain developed by DeFord et al. Recombination chains are attractive because they sample from a diverse set of reasonable plans by merging two contiguous districts at a time, drawing a random spanning tree, and cutting the tree into population-balanced halves. GerryChain.jl improves on the performance of the recombination chains implemented in the original GerryChain package by over an order of magnitude in a single-core setting, and the powerful worker pool-based parallelization features in Distributed.jl yield an additional 5x speedup. Through our continuing work on GerryChain.jl, we hope to advance the state of Julia’s geospatial packages and popularize Julia in the computational public policy community. Our work is available at github.com/mggg/GerryChainJulia.