JuliaCon 2026

Teaching Opaque Machine Learning Models Plausible and Actionable Explanations
2026-08-13, Room 6

CounterfactualTraining.jl leverages CounterfactualExplanations.jl to make opaque machine learning models like artificial neural networks more 1) explainable, 2) sensitive to actionability constraints and 3) adversarially robust. The package is part of the Taija ecosystem for Trustworthy AI in Julia and the engine behind our IEEE SaTML 2026 paper titled Counterfactual Training: Teaching Models Plausible and Actionable Explanations.


In our research paper, we propose a novel training regime termed counterfactual training that leverages counterfactual explanations to increase the explanatory capacity of models.

Counterfactual Explanations and Algorithmic Recourse

Counterfactual explanations (CE) have emerged as a popular post-hoc explanation method for opaque machine learning models and artificial intelligence (AI): they describe how factual inputs would need to change for a model to produce some desired output. To be useful in real-world decision-making systems, counterfactuals should be plausible with respect to the underlying data and actionable with respect to feature mutability constraints. This facilitates the use of CE for algorithmic recourse (AR): helping individuals subject to opaque AI to turn negative outcomes into positive ones. Much existing research has therefore focused on developing post-hoc methods to generate counterfactuals that meet these desiderata.
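To make the idea concrete, here is a minimal, language-agnostic sketch in plain Python of a gradient-based counterfactual search for a toy logistic regression model. It is an illustration only and not the API of CounterfactualExplanations.jl or the method from our paper: the function names, the toy weights, the `mutable` mask, and all hyperparameters are assumptions made for this example. Actionability is represented only by freezing immutable features, and plausibility is not modelled at all.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of the positive class under a toy logistic regression."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def search_counterfactual(weights, bias, factual, mutable,
                          target_proba=0.75, penalty=0.01,
                          step_size=0.1, max_steps=500):
    """Gradient descent on -log p(x') + penalty * ||x' - x||^2.

    Only features flagged as mutable are updated, which is a crude
    stand-in for actionability constraints (hypothetical helper).
    """
    counterfactual = list(factual)
    for _ in range(max_steps):
        p = predict_proba(weights, bias, counterfactual)
        if p >= target_proba:
            break  # the model now predicts the desired outcome
        for i, can_change in enumerate(mutable):
            if not can_change:
                continue  # immutable feature: leave it untouched
            # Gradient of the objective with respect to feature i.
            grad = (-(1.0 - p) * weights[i]
                    + 2.0 * penalty * (counterfactual[i] - factual[i]))
            counterfactual[i] -= step_size * grad
    return counterfactual


# Toy setup (assumed values): the second feature is immutable.
weights, bias = [1.5, -2.0], -1.0
factual = [0.5, 1.0]
mutable = [True, False]

counterfactual = search_counterfactual(weights, bias, factual, mutable)
print("factual probability:", predict_proba(weights, bias, factual))
print("counterfactual:", counterfactual)
print("counterfactual probability:", predict_proba(weights, bias, counterfactual))
```

The distance penalty keeps the counterfactual close to the factual input, while the mask ensures that the suggested change only involves features an individual could actually act on.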

In Julia, CE and AR can be generated and benchmarked using Taija's CounterfactualExplanations.jl.

Counterfactual Training

In our latest research, we instead hold models directly accountable for the desired end goal: counterfactual training employs counterfactuals during the training phase to minimize the divergence between learned representations and plausible, actionable explanations. We demonstrate empirically and theoretically that our proposed method facilitates training models that deliver inherently desirable counterfactual explanations and additionally exhibit improved adversarial robustness.

Our new CounterfactualTraining.jl package was developed during the research process. To run large-scale experiments, it leverages CounterfactualExplanations.jl's support for generating CE with multi-processing.

Real-World Impact

Our approach and package enable researchers and practitioners to train more trustworthy models without changing their architecture. If, for example, a particular problem lends itself to using an artificial neural network, you can improve its trustworthiness through counterfactual training instead of training it conventionally.

Limitations

Since this package was developed during the research process, it was designed to fit that purpose. While the package is fully functional, its user-facing API, documentation and performance have room for improvement. Through this talk, we hope to receive feedback and ideas from the community and attract contributors.

Further Reading

This work is the culmination of Patrick's PhD, which he recently completed. The development of Taija has played a key role in his PhD. If you're interested in getting a broader picture, you may find his thesis and defence talk useful.

I'm a visiting researcher at Delft University of Technology, where I recently completed my PhD in Trustworthy Artificial Intelligence and Finance. My research revolves around Counterfactual Explanations and Probabilistic Machine Learning. Previously, I worked as an economist for the Bank of England.

I started working with Julia at the beginning of my PhD in late 2021 and have since developed and used various packages, some of which I presented at JuliaCon 2022, 2023 and 2024. These packages now have a common home called Taija, which stands for Trustworthy Artificial Intelligence in Julia.

You can find out more about my work on my website.