JuliaCon 2024

Trustworthy AI in Julia meets Supercomputing
07-12, 11:50–12:00 (Europe/Amsterdam), Else (1.3)

Taija is a growing ecosystem of packages geared towards Trustworthy Artificial Intelligence in Julia. Ongoing efforts towards trustworthy AI have one thing in common: they increase the computational burden involved in training and using machine learning models. This talk will introduce TaijaParallel.jl: Taija's recent venture into supercomputing.


In the wake of recent rapid advances in artificial intelligence (AI), it is more crucial than ever to ensure that the technologies we deploy are trustworthy. Efforts surrounding Taija have so far centered on explainability and uncertainty quantification for supervised machine learning models. CounterfactualExplanations.jl, for example, is a comprehensive package for generating counterfactual explanations for models trained in Flux.jl, MLJ.jl, and more.
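To make this concrete, the snippet below sketches what a single counterfactual search looks like. It loosely follows the package's documented quickstart, so the toy data and the exact helper signatures (CounterfactualData, fit_model, select_factual, GenericGenerator, generate_counterfactual) should be read as assumptions rather than a definitive example.

using CounterfactualExplanations

# Toy, linearly separable data: two Gaussian blobs with labels 1 and 2.
X = hcat(randn(2, 100) .- 2.0, randn(2, 100) .+ 2.0)
y = vcat(fill(1, 100), fill(2, 100))
counterfactual_data = CounterfactualData(X, y)

# Fit a simple linear classifier and pick a factual instance to explain.
M = fit_model(counterfactual_data, :Linear)
x = select_factual(counterfactual_data, 1)
target = 2  # the class the counterfactual should be assigned to

# Search for a counterfactual with the generic gradient-based generator.
generator = GenericGenerator()
ce = generate_counterfactual(x, target, counterfactual_data, M, generator)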

🌐 Why supercomputing?

In practice, we are often required to generate many explanations for many individuals. A firm that uses a machine learning model to screen job applicants, for example, might be required to explain to each unsuccessful applicant why they were not admitted to the interview stage. In a different context, researchers may need to generate many explanations for evaluation and benchmarking purposes. In both cases, the computational tasks involved can be parallelized through multi-threading or distributed computing.

🤔 How supercomputing?

For this purpose, we have recently released TaijaParallel.jl: a lightweight package that adds custom support for parallelization to Taija packages. Our goal has been to minimize the burden on users by facilitating different forms of parallelization through a simple macro. To multi-process the evaluation of a large set of counterfactuals using the MPI.jl backend, for example, users first load the backend and instantiate the MPIParallelizer,

using CounterfactualExplanations, TaijaParallel
import MPI
MPI.Init()  # initialize the MPI environment
parallelizer = MPIParallelizer(MPI.COMM_WORLD)  # wrap the global communicator

and then just use the @with_parallelizer macro followed by the parallelizer object and the standard API call to evaluate counterfactuals:

@with_parallelizer parallelizer evaluate(counterfactuals)

Under the hood, we use standard MPI.jl routines for distributed computing. To avoid a hard dependency on MPI.jl, this functionality lives in a package extension. Similarly, the ThreadsParallelizer can be used for multi-threading, in which case we rely on Base.Threads routines (see the sketch below). It is also possible to combine both forms of parallelization.
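For reference, a minimal sketch of the multi-threaded variant follows, assuming the ThreadsParallelizer constructor takes no arguments and that Julia is started with multiple threads (e.g. julia --threads 4):

using CounterfactualExplanations, TaijaParallel

parallelizer = ThreadsParallelizer()  # uses Base.Threads under the hood
@with_parallelizer parallelizer evaluate(counterfactuals)  # same API call as above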

🏅 Benchmarking Counterfactuals (case study)

This new functionality has already powered research that will be published at AAAI 2024. The project involved large benchmarks of counterfactual explanations that had to be run on a supercomputer. During the talk, we will use this as a case study to discuss the challenges we encountered along the way and the solutions we have come up with.

🎯 What is next?

While we have so far focused on CounterfactualExplanations.jl, parallelization is also useful for other Taija packages. For example, some of the methods for predictive uncertainty quantification used by ConformalPrediction.jl rely on repeated model training and prediction. This is currently done sequentially and represents an obvious opportunity for parallelization.
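Purely as an illustration (this is not the ConformalPrediction.jl API), the kind of pattern we have in mind looks something like the sketch below, where fit_and_predict is a hypothetical stand-in for whatever training-and-prediction routine a resampling scheme requires:

using Base.Threads

function parallel_resample(fit_and_predict, resamples)
    results = Vector{Any}(undef, length(resamples))
    @threads for i in eachindex(resamples)
        # Each resample is independent, so iterations can run on separate threads.
        results[i] = fit_and_predict(resamples[i])
    end
    return results
end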

👥 Who is this talk for?

This talk should be useful for anyone interested in trustworthy AI, parallel computing, or both. We are not experts in parallel computing ourselves, so the level of this talk should also be appropriate for beginners.


I’m a PhD student at Delft University of Technology working at the intersection of Trustworthy Artificial Intelligence and Finance. My current research revolves around Counterfactual Explanations and Probabilistic Machine Learning. Previously, I worked as an Economist for the Bank of England.

I started working with Julia at the beginning of my PhD in late 2021 and have since developed and used various packages, some of which I presented at JuliaCon 2022 and 2023. These packages now have a common home called Taija, which stands for Trustworthy Artificial Intelligence in Julia.

You can find out more about my work on my website.
