Effects.jl: Effectively Understand Effects in Regression Models
2021-07-28, 13:50–14:00 (UTC), Green

Regression models are useful but they can be tricky to interpret.
Variable centering and contrast coding can obscure the meaning of main effects.
Interaction terms, especially higher order ones, only increase the difficulty of interpretation.
Here we introduce Effects.jl, which translates the fitted model, including its estimated uncertainty, back into data space.
Using Effects.jl, it is possible to generate effects plots that enable rapid visualization and interpretation of regression models.


Regression is a foundational technique of statistical analysis, and many common statistical tests are based on regression models (e.g., ANOVA, t-tests, and correlation tests).
Despite the expressive power of regression models, users often prefer these simpler procedures because regression models themselves can be difficult to interpret.
Most notably, the interpretation of an individual regression coefficient (including its magnitude, sign, and even significance) depends on which other terms and interactions are in the model, and even on how those terms are centered or contrast coded.
For instance, a common source of confusion in regression analysis is the meaning of the intercept coefficient.
In an intercept-only model, this coefficient corresponds to the grand mean of the dependent variable, but in the presence of a contrast-coded categorical variable, it can correspond to the mean of the baseline level of that variable, the grand mean, or something else altogether, depending on the contrast coding scheme that is used.
Effects.jl provides a general-purpose tool for interpreting fitted regression models by projecting the effects of one or more terms in the model back into "data space", along with the associated uncertainty, while fixing the values of the other terms at typical or user-specified values.
This makes it straightforward to interrogate the estimated effects of any predictor at any combination of other predictors' values.
Because these effects are computed in data space, they can be plotted alongside raw or aggregated data in the same format, enabling intuitive model interpretation and sanity checks.
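
To make the point about contrast coding concrete, here is a minimal sketch in plain Python (standard library only). It is not the Effects.jl API: the data are made up, and the helper names (solve, ols, fit, effects) are hypothetical and exist only for this illustration. The same model is fit under treatment coding and under sum coding; the intercepts disagree, but the predictions for each group at a typical value of the covariate (here its mean) are the same.

```python
from statistics import mean


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up data: a two-level factor, a continuous covariate, and a response.
group = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.1, 1.9, 3.2, 4.8, 6.1, 7.0]

# Two common contrast coding schemes for the factor.
treatment_coding = {"a": 0.0, "b": 1.0}  # "a" is the baseline level
sum_coding = {"a": -1.0, "b": 1.0}       # levels are centered around zero


def fit(coding):
    """Fit y ~ 1 + group + x with the given numeric coding of group."""
    X = [[1.0, coding[g], xi] for g, xi in zip(group, x)]
    return ols(X, y)


def effects(beta, coding, x_typical):
    """Predicted response per group level, with x held at x_typical."""
    return {
        level: beta[0] + beta[1] * code + beta[2] * x_typical
        for level, code in coding.items()
    }


beta_treatment = fit(treatment_coding)
beta_sum = fit(sum_coding)

# The coefficients answer different questions under each coding ...
print("intercepts:", beta_treatment[0], beta_sum[0])

# ... but the effects in data space do not depend on the coding.
effects_treatment = effects(beta_treatment, treatment_coding, mean(x))
effects_sum = effects(beta_sum, sum_coding, mean(x))
print("effects (treatment coding):", effects_treatment)
print("effects (sum coding):      ", effects_sum)
```

The sketch leaves out the uncertainty estimates that Effects.jl reports alongside each effect; it only shows why predictions on a reference grid are easier to read than raw coefficients.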

Phillip was a struggling mathematician, then a linguist and now a neuroscientist, but always a hacker.