2026-06-05, Hardwick Hub
This hands-on tutorial takes participants from zero to confident use of tabular foundation models. Using real datasets, we will run TabICL-style models, benchmark them rigorously against XGBoost and Random Forest, diagnose their behavior, and build intuition for when they help and when they don't.
Tabular foundation models are generating excitement, but most practitioners haven't used them yet. This 90-minute hands-on tutorial bridges that gap.
Participants will work through four progressive notebooks on real-world datasets of varying difficulty. By the end, they won't just know about tabular FMs — they'll have run them, broken them, and compared them against familiar baselines.
Who is this for?
Data scientists and ML engineers who:
- Use sklearn / XGBoost / LightGBM regularly
- Are curious about tabular FMs but haven't tried them
- Want to build informed opinions grounded in hands-on experience
What we'll use
- Models: any TFM (TabICL, TabPFN, or Neuralk's proprietary model, available with free credits), XGBoost, Random Forest
- Datasets: 3 curated real-world datasets chosen to expose different behaviors:
  - A small medical dataset (~500 rows, 12 features), where TFMs tend to shine
  - A medium e-commerce dataset (~5K rows, 40+ features with mixed types), a realistic "grey zone"
  - A large, noisy dataset (~50K rows), where trees typically dominate
- Stack: Python 3.9+, sklearn, tabicl, xgboost, matplotlib, pandas
Detailed outline (90 min)
| Time | Phase | What participants do | Expected output |
|---|---|---|---|
| 0–15 | Conceptual grounding | Short lecture: what tabular FMs are, how they differ from fitted models, what to expect. No code yet. | Shared mental model before touching code |
| 15–30 | Notebook 1: First predictions | Install a TFM, load the small medical dataset, generate predictions. Compare API with sklearn's .fit()/.predict() pattern. | Working predictions; comfort with the API |
| 30–45 | Notebook 2: Rigorous benchmarking | Run XGBoost and Random Forest on all 3 datasets with proper cross-validation. Compare with TFMs using the same splits. Discuss evaluation pitfalls (leakage, metric choice). | A comparison table with confidence intervals across 3 datasets |
| 45–60 | Notebook 3: When things break | Deliberately stress-test the TFMs: add noisy features, increase dataset size, introduce high-cardinality categoricals. Observe where performance degrades relative to trees. | Intuition for failure modes, backed by their own experiments |
| 60–75 | Notebook 4: Diagnostics & interpretation | Apply SHAP to both TFMs and XGBoost on the same dataset. Compare explanations. Discuss: are these explanations trustworthy? What can we still learn? Calibration plots and confidence analysis. | Practical diagnostic skills; awareness of interpretability caveats |
| 75–85 | Wrap-up: Decision framework | Collaborative exercise: given 3 new dataset descriptions, participants vote on which model they'd choose and why. We discuss as a group. | Internalized decision criteria |
| 85–90 | Q&A and next steps | Open discussion. Pointers to further resources, papers, and community. | |
Requirements
- Laptop with Python 3.9+
- Familiarity with sklearn (fit/predict/cross_val_score)
- No deep learning experience needed
- All materials (notebooks + datasets + environment setup) will be distributed via a public GitHub repository at least 2 weeks before the event
Note on materials: The repository is currently being prepared and will contain all notebooks, datasets, and a requirements.txt for easy setup. A link will be shared with organizers as soon as it is live.
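Before the session, a quick check along these lines can confirm that a laptop meets the requirements. It is a minimal sketch using only the standard library. The package list is taken from the stack named in this description, using import names (`sklearn` is the import name of scikit-learn), and it may not match the final requirements.txt.

```python
import importlib.util
import sys

# Import names for the stack listed in this description (assumed, not final).
REQUIRED = ["sklearn", "tabicl", "xgboost", "matplotlib", "pandas"]


def check_environment(packages=REQUIRED, min_python=(3, 9)):
    """Return (python_ok, missing) without importing any heavy package."""
    python_ok = sys.version_info >= min_python
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return python_ok, missing


python_ok, missing = check_environment()
print("Python 3.9+:", "yes" if python_ok else "no")
print("Missing packages:", ", ".join(missing) if missing else "none")
```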
What attendees will be able to do after this tutorial
- Run tabular foundation models on their own datasets using a familiar sklearn-compatible API
- Benchmark TFMs against tree-based baselines with proper cross-validation and meaningful metrics
- Diagnose model behavior: identify when a TFM is failing, why, and what to do about it
- Interpret TFM outputs using SHAP while understanding the limitations of post-hoc explanations on learned priors
- Decide whether to adopt a tabular FM for a new project based on concrete, experience-backed criteria
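As one concrete example of the diagnostics mentioned above, the numbers behind a calibration plot can be computed in a few lines. The sketch below is a plain-Python illustration for binary classification, not the tutorial's own code. The function name and the choice of 10 equal-width bins are assumptions. It compares, per bin, the mean predicted probability of the positive class with the observed fraction of positives, then averages the gaps weighted by bin size.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binary calibration error from labels and positive-class probabilities."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")

    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0
        # is placed in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((label, prob))

    total = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(label for label, _ in members) / len(members)
        predicted = sum(prob for _, prob in members) / len(members)
        ece += (len(members) / total) * abs(observed - predicted)
    return ece


# Toy inputs for illustration only (not real model output).
labels = [0, 0, 1, 1]
probabilities = [0.1, 0.1, 0.9, 0.9]
print(f"ECE: {expected_calibration_error(labels, probabilities):.3f}")
```

The same function can be applied to the held-out probabilities of a TFM and of a tree baseline on identical splits, which makes over- or under-confidence directly comparable. With small test sets the per-bin estimates are noisy, so fewer bins may be preferable.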
Key takeaways
- A working local environment with tabular FM tooling ready to use
- Four completed notebooks they can reuse as templates on their own data
- Confidence to try (or deliberately skip) tabular FMs on their next project
Nicolas holds a Ph.D. in applied mathematics from Université Paris Dauphine - PSL, where his research focused on machine learning, with particular emphasis on attention mechanisms and geodesic approaches to segmentation. His work on designing advanced deep learning architectures for complex datasets has led to multiple publications at leading international conferences.
He brings hands-on expertise in self-supervised learning and large-scale optimisation, and is currently contributing to Neuralk's mission to develop the first enterprise tabular foundation model.