2024-07-11 –, Method (1.5)
CEEDesigns.jl is a decision-making framework for the cost-efficient design of experiments, with direct applications in drug research, healthcare, and various other fields. Typically, a design consists of multiple experiments. Each experiment is regarded as an option to acquire additional experimental evidence and is associated with a monetary cost and an execution time. The framework, then, aims to select experiments that balance the value of acquired information and the incurred costs.
In our talk, we will illustrate the decision-making framework, CEEDesigns.jl, through the following two examples.
First, we will discuss a task from the histopathology domain: predicting the grade of gliomas, the most common primary brain tumors. The prediction is based on a set of clinical features and mutation factors. Observing the mutation factors requires conducting physical experiments, each of which incurs a cost. To solve this problem, our framework estimates the information value of each subset of features (experiments). For example, through a built-in integration with MLJ.jl, we can evaluate the predictive accuracy of a machine learning model that predicts glioma grades from a given subset of experimental features. By considering both the estimated information values and the experimental costs, we can generate Pareto-efficient designs, that is, designs for which no alternative offers more information value at the same or lower cost. In addition to minimizing the monetary cost, the framework proposes an optimal order for conducting the experiments within a design chosen from the Pareto front, thus reducing the execution time.
In general, this allows economical decisions to be made, for example, regarding how to allocate scarce resources to a set of experiments that attain some acceptable level of information (or, conversely, reduce uncertainty below some level).
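The selection of Pareto-efficient designs in the first example can be sketched in a few lines. This is a minimal illustration in Python, not the CEEDesigns.jl API: the experiment names, the costs, and the toy `information_value` function are all hypothetical stand-ins (in practice the value might be the cross-validated accuracy of a model trained on that feature subset).

```python
from itertools import combinations

# Hypothetical monetary cost of each experiment (illustrative numbers only).
COSTS = {"exp_a": 3.0, "exp_b": 2.0, "exp_c": 4.0}
# Hypothetical fraction of the remaining uncertainty each experiment removes.
GAINS = {"exp_a": 0.5, "exp_b": 0.2, "exp_c": 0.1}


def information_value(subset):
    """Toy information value of a design: 1 minus the residual uncertainty."""
    residual = 0.4  # assumed uncertainty when no experiment is run
    for name in subset:
        residual *= 1.0 - GAINS[name]
    return 1.0 - residual


def pareto_front(costs):
    """Return (cost, value, subset) triples that no other design dominates."""
    names = sorted(costs)
    designs = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            total_cost = sum(costs[n] for n in subset)
            designs.append((total_cost, information_value(subset), subset))
    # Scan designs from cheapest to most expensive; a design is efficient
    # only if it is more informative than every cheaper design seen so far.
    designs.sort(key=lambda d: (d[0], -d[1]))
    front, best_value = [], float("-inf")
    for cost, value, subset in designs:
        if value > best_value:
            front.append((cost, value, subset))
            best_value = value
    return front


front = pareto_front(COSTS)
for cost, value, subset in front:
    print(f"cost={cost:>4}  value={value:.3f}  design={subset}")
```

Under these made-up numbers, running `exp_c` alone is dominated, because `exp_a` is both cheaper and more informative. A decision-maker would then pick the cheapest design on the front whose value clears an acceptable level.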
Second, we consider "personalized" experimental designs that dynamically adjust based on the evidence gathered from the experiments. This approach is motivated by the fact that the value of information collected from an experiment generally differs across subpopulations of the entities involved in the triage process.
Internally, we conceptualize the decision-making process as a Markov decision process, in which we iteratively choose to conduct a subset of experiments and then, based on the experimental evidence, update our belief about the distribution of outcomes for the experiments that have not yet been conducted. The information value associated with a state, derived from the experimental evidence, can be modeled through any statistical or information-theoretic measure, such as the variance or uncertainty of the posterior distribution of the target variable.
In this process, our goal is to minimize the expected experimental cost required to reduce the uncertainty about the target variable (here, the glioma grade) below a specified threshold. Importantly, while constructing the experimental designs, we incorporate multiple-step-ahead lookahead to model experimental outcomes, and we consider the subsequent decisions for each outcome.
To solve the Markov decision process, the framework uses the POMDPs.jl package to define the underlying process and the MCTS.jl package, which implements the Monte Carlo Tree Search algorithm, to provide an actual solution to the process.
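To make the objective concrete, the sketch below solves a tiny instance of such a process exactly. It is an illustration in Python under stated assumptions, not the POMDPs.jl or MCTS.jl implementation: the target is binary, each hypothetical experiment is a binary test with an assumed cost, sensitivity, and specificity, uncertainty is the entropy of the belief, and an exhaustive recursion over all outcomes stands in for Monte Carlo Tree Search, which is only feasible because the problem is so small. The `PENALTY` charged when experiments run out is likewise an assumption.

```python
from functools import lru_cache
from math import log2

# Hypothetical experiments: name -> (cost, sensitivity, specificity).
EXPERIMENTS = {
    "exp_a": (1.0, 0.60, 0.99),  # cheap; a positive readout is very convincing
    "exp_b": (4.0, 0.98, 0.98),  # expensive; accurate either way
}
THRESHOLD = 0.5  # stop once the belief entropy (in bits) is at most this
PENALTY = 10.0   # assumed cost if no experiments remain and we are still unsure


def entropy(p):
    """Binary entropy of the belief p = P(target is positive)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * log2(p) + (1.0 - p) * log2(1.0 - p))


def outcomes(p, name):
    """List (probability, posterior belief) for each readout, via Bayes' rule."""
    _, sens, spec = EXPERIMENTS[name]
    p_pos = p * sens + (1.0 - p) * (1.0 - spec)
    p_neg = 1.0 - p_pos
    result = []
    if p_pos > 0.0:
        result.append((p_pos, p * sens / p_pos))
    if p_neg > 0.0:
        result.append((p_neg, p * (1.0 - sens) / p_neg))
    return result


@lru_cache(maxsize=None)
def expected_cost(p, remaining):
    """Return (minimal expected cost, best next experiment) from belief p."""
    if entropy(p) <= THRESHOLD:
        return 0.0, None          # uncertainty target reached, stop
    if not remaining:
        return PENALTY, None      # nothing left to run
    best = None
    for name in sorted(remaining):
        cost = EXPERIMENTS[name][0]
        # Look ahead over every possible readout and the decisions after it.
        future = sum(
            prob * expected_cost(posterior, remaining - {name})[0]
            for prob, posterior in outcomes(p, name)
        )
        if best is None or cost + future < best[0]:
            best = (cost + future, name)
    return best


cost, first = expected_cost(0.5, frozenset(EXPERIMENTS))
print(f"run {first} first; expected cost {cost:.2f}")
```

With these made-up numbers the recursion recommends the cheap test first: a positive readout already settles the question, and the expensive test is run only after a negative one. The expected cost is about 3.78, against 4.0 for always running the expensive test. This is the sense in which a design adapts to the evidence gathered so far.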
A decision scientist at Merck & Co., Inc.
Background in maths and molecular biology. Leading the Decision Science team in MSD R&D informatics.
Sr. Decision scientist at Merck