EuroSciPy 2026

From Black to White Boxes: Interpretable Regression with the trust-free Python package
2026-07-22 , Room 1.19 (Ground Floor, Shannon)

Machine Learning practitioners often face a trade-off: high accuracy with complex, black-box models (like XGBoost or Random Forests) or lower accuracy with transparent models (like decision trees or linear models). What if you didn't have to choose?
This 90-minute tutorial introduces TRUST (Transparent, Robust, and Ultra-Sparse Trees), a new interpretable regression framework that combines decision trees with sparse linear models to deliver accuracy comparable to that of a Random Forest. The algorithm is implemented in the Python package trust-free (installable with pip install trust-free). We will demonstrate how TRUST autonomously recovers the WHO obesity threshold (BMI = 30) from raw data to inform medical risk pricing.
By the end, you will be able to train high-performing, interpretable regression models, generate automated, natural-language explanation reports for individual predictions, and compute deterministic feature importance scores.


Tutorial Summary and Learning Objectives

This hands-on workshop introduces TRUST (Transparent, Robust, and Ultra-Sparse Trees), a novel interpretable machine learning framework, and demonstrates its implementation using the Python package trust-free.

The tutorial addresses a central trade-off in industrial ML, the one between predictive accuracy and model interpretability. Attendees will move beyond black-box models and learn how to fit high-performing regression trees where every split, every leaf, and every final prediction is inherently explainable.

By the end of this 90-minute session, attendees will be able to:

  1. Successfully fit a high-accuracy, interpretable regression model using the trust.TRUSTRegressor() class.

  2. Understand the difference between standard Decision Trees (CART) and Linear Model Trees (LMTs), and interpret the sparse linear models generated at the leaves of the TRUST framework.

  3. Generate and interpret automated, natural-language explanation reports for any single prediction using the powerful .explain() method.

  4. Use the unique .compare() method to contrast two observations head-to-head, immediately highlighting the features responsible for prediction differences.
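
To make the contrast in objective 2 concrete, here is a minimal NumPy sketch of a single split with two kinds of leaves: a CART-style leaf that predicts a constant, and an LMT-style leaf that fits a least-squares line. It is not the TRUST algorithm and does not use the trust-free package. The data are synthetic, with a breakpoint placed at 30 purely for illustration, and the helper names (sse_constant, sse_linear, best_split) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (NOT the tutorial dataset): the response rises gently with x,
# then jumps and rises faster once x crosses a breakpoint placed at 30.
x = rng.uniform(15, 45, size=400)
y = np.where(x < 30, 100 + 2 * x, 400 + 5 * x) + rng.normal(0, 5, size=400)


def sse_constant(y_part):
    """Squared error when a leaf predicts its mean (CART-style leaf)."""
    return float(np.sum((y_part - y_part.mean()) ** 2))


def sse_linear(x_part, y_part):
    """Squared error when a leaf fits a least-squares line (LMT-style leaf)."""
    design = np.column_stack([np.ones_like(x_part), x_part])
    coef, *_ = np.linalg.lstsq(design, y_part, rcond=None)
    return float(np.sum((y_part - design @ coef) ** 2))


def best_split(x, y, leaf="linear", min_leaf=20):
    """Scan midpoints between sorted x values; return (threshold, total SSE)."""
    xs = np.unique(x)  # sorted unique values
    candidates = (xs[:-1] + xs[1:]) / 2
    best = (None, np.inf)
    for t in candidates:
        left, right = x < t, x >= t
        if left.sum() < min_leaf or right.sum() < min_leaf:
            continue
        if leaf == "linear":
            total = sse_linear(x[left], y[left]) + sse_linear(x[right], y[right])
        else:
            total = sse_constant(y[left]) + sse_constant(y[right])
        if total < best[1]:
            best = (float(t), total)
    return best


t_lin, sse_lin = best_split(x, y, leaf="linear")
t_con, sse_con = best_split(x, y, leaf="constant")
print(f"linear leaves:   threshold={t_lin:.2f}, SSE={sse_lin:.0f}")
print(f"constant leaves: threshold={t_con:.2f}, SSE={sse_con:.0f}")
```

With the same single split, the linear leaves capture the within-segment trend that constant leaves can only approximate with further splits, which is the intuition behind getting shallower trees from model-tree approaches.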

Prerequisites and Setup

  • Target Audience: Data Scientists and Data Analysts focused on building highly accurate, accountable, and interpretable regression models. The tutorial is especially valuable for those needing to communicate model outputs clearly to non-technical stakeholders (e.g., line managers, regulators, or the general public).
  • Required Knowledge: Intermediate Python (familiarity with pandas, numpy, scikit-learn, and Jupyter notebooks) and basic knowledge of regression concepts (R², feature importance, cross-validation).
  • Required Software: Attendees should ideally have the following installed prior to the tutorial:
      • Python 3.11 or 3.12
      • The trust-free package (pip install trust-free)
      • Jupyter Notebook or a similar notebook environment
  • Compatibility Note: A link to a Google Colab notebook environment will be provided to ensure all participants can run the code immediately, regardless of their local machine setup or operating system architecture.
  • Dataset: We will use a pre-cleaned, publicly available regression dataset (the famous Medical Insurance Charges dataset).
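
The software requirements above can be checked before the session with a few lines of standard-library Python. This is only a sketch: the import name trust is inferred from the trust.TRUSTRegressor() reference in this description and is treated here as an assumption, and check_environment is a hypothetical helper, not part of the package.

```python
import importlib.util
import sys


def check_environment(module_name="trust", supported=((3, 11), (3, 12))):
    """Report the interpreter version and whether a module can be located.

    module_name defaults to "trust", which is an assumed import name.
    """
    version = sys.version_info[:2]
    return {
        "python_version": f"{version[0]}.{version[1]}",
        "python_supported": version in supported,
        "package_found": importlib.util.find_spec(module_name) is not None,
    }


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If package_found is False, installing with pip install trust-free or switching to the provided Google Colab notebook are the two routes mentioned above.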

90-Minute Detailed Outline

This workshop is structured with a strong emphasis on practical application, dedicating approximately 70% of the time to live coding and guided exercises.

  • [0-10 min] LMT Theory and Setup: Conceptual introduction to Linear Model Trees (LMTs) and how TRUST achieves sparsity and accuracy. Environment check and quick review of the starter code repository.
  • [10-25 min] Model Fitting: Loading data and preparing it for regression. Hands-on Exercise 1: Fitting the trust.TRUSTRegressor() model and introduction to key parameters.
  • [25-40 min] Global Interpretation of the Fitted Model: Understanding the decision process and how TRUST defines splits. Code Demo: Visualizing the full tree structure (.plot_tree()), interpreting the sparse linear model coefficients within the leaf nodes, and computing global variable importance scores.
  • [40-65 min] Individual Prediction Explanations: The power of the .explain() method. How to generate automated, comprehensive reports that justify a single prediction, including local variable importance and human-readable text summaries. Hands-on Exercise 2: Generating and analyzing individual explanation reports.
  • [65-75 min] Head-to-Head Instance Comparison: Using the .compare() feature to contrast two data points visually and statistically and show why they received different predictions. Hands-on Exercise 3: Comparing the profiles of two example observations.
  • [75-90 min] Wrap-up and Q&A: Summary of key takeaways and resources for further exploration. Final Q&A.
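
The idea behind the individual explanations and head-to-head comparisons in the outline can be sketched with plain NumPy. When a leaf holds a sparse linear model, a prediction decomposes into a baseline plus one additive term per active feature, and the gap between two predictions in the same leaf decomposes the same way. The snippet below is an illustration only: it does not call the trust-free .explain() or .compare() methods, and the feature names, coefficients, and helper functions (explain_leaf, compare_pair) are all hypothetical.

```python
import numpy as np

# Hypothetical sparse leaf model: most coefficients are exactly zero.
feature_names = ["age", "bmi", "children", "smoker"]
intercept = 2000.0
coef = np.array([250.0, 0.0, 0.0, 15000.0])


def contributions(obs):
    """Per-feature contribution coef_j * x_j for one observation."""
    return coef * np.asarray(obs, dtype=float)


def predict(obs):
    """Leaf prediction: intercept plus the sum of contributions."""
    return float(intercept + contributions(obs).sum())


def explain_leaf(obs):
    """Plain-text summary listing only features with non-zero coefficients."""
    parts = [
        f"{name} = {value:g} contributes {c:+.0f}"
        for name, value, c, w in zip(feature_names, obs, contributions(obs), coef)
        if w != 0
    ]
    return f"Prediction {predict(obs):.0f} = baseline {intercept:.0f}; " + "; ".join(parts)


def compare_pair(obs_a, obs_b):
    """Features whose contributions differ between two observations."""
    diff = contributions(obs_a) - contributions(obs_b)
    return {name: float(d) for name, d in zip(feature_names, diff) if d != 0}


# Two made-up observations that fall in the same (hypothetical) leaf.
a = [40, 31.0, 2, 1]
b = [40, 24.0, 0, 0]

print(explain_leaf(a))
print(explain_leaf(b))
print("Difference explained by:", compare_pair(a, b))
```

Because the coefficients for bmi and children are zero in this made-up leaf, those features never appear in the explanation even though the two observations differ on them, which is what sparsity buys in terms of readability.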

Expected audience expertise: Domain: expert
Expected audience expertise: Python: some
Supporting material: Supporting material
Project homepage or Git: Project homepage or Git
Your relationship with the presented work/project: Original author or co-author, Active contributor, Developed the presented feature, Maintainer of the presented library/project

Albert Dorador is an Adjunct Professor of Mathematics (Universitat Pompeu Fabra) and Statistics (BarcelonaTech), and leads Whitebox Lab, a research lab focused on developing cutting-edge, inherently interpretable machine learning models for tabular data. He holds a PhD in Statistics from the University of Wisconsin–Madison and previously served at the European Central Bank, specializing in financial risk management and machine learning applications. Albert is the creator of the TRUST and Renet algorithms and the maintainer of the trust-free Python library. His work focuses on the intersection of high-performance statistical modeling and auditable machine learning for high-stakes regulatory environments.