Probabilistic Biostatistics: Adventures with Julia from Code to Clinic
2019-07-24, Elm B

Physician-scientists conducting clinical trials are typically not statisticians or computer scientists. In a perfect world they would be, or, more realistically, they would have statisticians and computer scientists on their research teams, but that is often not the case. This leads to what we refer to as the “two-field problem.” Physician-researchers require sophisticated and powerful statistical tools to address complex inferential problems, yet these tools must be intuitive and user-friendly enough not to require advanced statistical knowledge or programming skills. Using Julia, we illustrate the application of Bayesian probabilistic biostatistics to meta-analyses of treatment effects and to clinical trials. This combination of Julia and Bayesian methods provides a solution to the “two-field problem.”


Julia is known for solving the “two-language problem”: it is both fast (performance) and easy to use (user friendliness). Combined with Bayesian MCMC machinery, Julia can also address what we call the “two-field problem” in clinical trials: clinical researchers need expertise in more than one field.

Medical research, including clinical trials, is frequently conducted by physician-researchers who have limited training in inferential statistics and computer programming. Typically, clinical research teams will have a biostatistician, though this may be an MS-level individual who performs pre-specified “off the shelf” analyses and is generally not well versed in Bayesian inferential tools. The prevalence of “five percentitis,” i.e., looking only for and reporting p-values that are “statistically significant” (p<0.05), testifies to this fact. The advances in computing power over the last several decades, along with the subsequent developments in Bayesian computational methods, are only just beginning to have an impact on this practice.

As those conducting and funding clinical RCTs recognize the high costs of these studies (e.g., medication expense, time required, and potential exposure of patients to ineffective treatments), there has been greater enthusiasm for (1) improving statistical analytic methods for RCTs, and (2) using evidence-based methods to examine existing naturalistically collected clinical data to inform clinical practice without the need for RCTs. These approaches require far greater statistical and programming knowledge and sophistication from users. Thus, there is an urgent need to provide clinician-researchers with statistical tools that are intuitive and easy to use, yet sophisticated and powerful enough “under the hood” to answer questions that simpler methods cannot.

The “Bayesian machinery” of Markov chain Monte Carlo (MCMC) methods, together with Julia, offers a solution to this “two-field problem.” MCMC methods enable exact small-sample inference and hypothesis testing for complex models without requiring the restrictive assumptions necessary to obtain analytical tractability (performance). They also allow complex models to be analyzed with basic statistical concepts: frequency distributions, density plots, means, medians, modes, standard deviations, quantiles, and posterior odds (user friendliness).
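
As a rough illustration of how posterior draws reduce to such basic summaries, the following is a minimal sketch rather than the analysis used in the talk. It assumes a hypothetical two-arm trial (14 of 20 responders on treatment, 8 of 20 on control), a flat Beta(1, 1) prior on each response probability, and a hand-written random-walk Metropolis sampler that uses only Julia's standard library. The function names (logistic, logpost, metropolis) and all tuning values are illustrative assumptions.

```julia
using Random, Statistics

logistic(x) = 1 / (1 + exp(-x))

# Log posterior density (up to a constant) of eta = logit(p) for y responders
# out of n, under a flat Beta(1, 1) prior on p. The "+ 1" terms come from the
# Jacobian of the logit transform.
function logpost(eta, y, n)
    p = logistic(eta)
    return (y + 1) * log(p) + (n - y + 1) * log(1 - p)
end

# Random-walk Metropolis on the logit scale; returns draws of p after burn-in.
# The defaults for draws, burnin, and step are illustrative assumptions.
function metropolis(rng, y, n; draws = 20_000, burnin = 2_000, step = 0.8)
    eta = 0.0
    lp = logpost(eta, y, n)
    out = Vector{Float64}(undef, draws)
    for i in 1:(burnin + draws)
        proposal = eta + step * randn(rng)
        lp_new = logpost(proposal, y, n)
        # Accept with probability min(1, posterior ratio).
        if log(rand(rng)) < lp_new - lp
            eta, lp = proposal, lp_new
        end
        if i > burnin
            out[i - burnin] = logistic(eta)
        end
    end
    return out
end

rng = MersenneTwister(2019)
p_treat = metropolis(rng, 14, 20)  # hypothetical data: 14 of 20 respond
p_ctrl = metropolis(rng, 8, 20)    # hypothetical data: 8 of 20 respond

# Posterior draws of the difference in response probabilities.
delta = p_treat .- p_ctrl
prob_benefit = mean(delta .> 0)
posterior_odds = prob_benefit / (1 - prob_benefit)

println("posterior mean of difference:   ", round(mean(delta), digits = 3))
println("posterior median of difference: ", round(median(delta), digits = 3))
println("95% credible interval:          ", round.(quantile(delta, [0.025, 0.975]), digits = 3))
println("P(treatment better than control): ", round(prob_benefit, digits = 3))
println("posterior odds of benefit:        ", round(posterior_odds, digits = 1))
```

Every reported quantity is an ordinary descriptive statistic of the draws, which is the sense in which the approach stays accessible to non-statisticians. In practice a probabilistic programming package would likely replace the hand-written sampler.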

The talk will demonstrate this approach with examples from our own research that illustrate our experience using Julia for Bayesian inference in clinical research. [The number and detail of examples will be adjusted to suit the length of the talk.]

• Reevaluating the evidence from previously conducted RCTs.
• Analysis of abandoned trials.
• Joint evaluation of tolerability and efficacy in RCTs.
• Bayesian hierarchical modeling for meta-analysis evaluating adverse events (“side effects”) in trial participants, and examining the difference between industry-sponsored and federally sponsored randomized controlled trials.

Associate Professor
Lindner College of Business
University of Cincinnati

Research interests: Bayesian inference, statistical hypothesis testing, meta-analyses, Bayesian adaptive randomized controlled trials, time series analysis.