Histogram-based Gradient Boosting in scikit-learn 0.21 EuroSciPy 2019

2019-09-05 11:00–11:30, Track 2 (Baroja)

This presentation covers some recently introduced features of the scikit-learn machine learning library, with a particular emphasis on the new implementation of Gradient Boosted Trees.

scikit-learn 0.21 was recently released. This presentation will give an overview of its main new features and present the new implementation of Gradient Boosted Trees.

Gradient Boosted Trees (also known as Gradient Boosting Machines) are very competitive supervised machine learning models, especially on tabular data.

Scikit-learn has offered a traditional implementation of this family of methods for many years. However, its computational performance was no longer competitive: it was dramatically outperformed by specialized state-of-the-art libraries such as XGBoost and LightGBM. The new implementation in version 0.21 uses histograms of binned features to evaluate the tree node split candidates. This implementation can efficiently leverage multi-core CPUs and is competitive with XGBoost and LightGBM.
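To make the histogram idea concrete, here is a minimal pure-Python sketch of how a split can be chosen for a single feature from binned values. It is an illustration only, not the scikit-learn code: the function names (`bin_feature`, `build_histogram`, `best_split`) are hypothetical, a least-squares loss with constant hessians is assumed, and regularization, multiple features and parallelism are left out.

```python
from bisect import bisect_right


def bin_feature(values, n_bins):
    """Map raw values to integer bin indices using quantile thresholds."""
    ordered = sorted(values)
    thresholds = []
    for i in range(1, n_bins):
        t = ordered[(i * len(ordered)) // n_bins]
        # Skip repeated cut points so that bins stay distinct.
        if not thresholds or t > thresholds[-1]:
            thresholds.append(t)
    # A value v lands in bin k when exactly k thresholds are <= v.
    binned = [bisect_right(thresholds, v) for v in values]
    return binned, thresholds


def build_histogram(binned, gradients, n_bins):
    """Accumulate the gradient sum and sample count of each bin."""
    grad_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for b, g in zip(binned, gradients):
        grad_sums[b] += g
        counts[b] += 1
    return grad_sums, counts


def best_split(grad_sums, counts):
    """Scan bin boundaries (not individual samples) for the best split."""
    total_grad = sum(grad_sums)
    total_count = sum(counts)
    best_bin, best_gain = None, 0.0
    left_grad, left_count = 0.0, 0
    for b in range(len(counts) - 1):
        left_grad += grad_sums[b]
        left_count += counts[b]
        right_count = total_count - left_count
        if left_count == 0 or right_count == 0:
            continue
        right_grad = total_grad - left_grad
        # Loss reduction for least squares (hessian assumed equal to 1).
        gain = (left_grad ** 2 / left_count
                + right_grad ** 2 / right_count
                - total_grad ** 2 / total_count)
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


# Toy data: the target jumps from 0 to 10 between x = 0.4 and x = 5.0.
x = [0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3]
y = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
baseline = sum(y) / len(y)
gradients = [baseline - target for target in y]  # least-squares gradient

binned, thresholds = bin_feature(x, n_bins=4)
grad_sums, counts = build_histogram(binned, gradients, len(thresholds) + 1)
split_bin, gain = best_split(grad_sums, counts)
print(split_bin, gain, thresholds)
```

Once the values are binned, the cost of the split search depends on the number of bins rather than on the number of samples. In this sketch, samples whose bin index is at most `split_bin` go to the left child.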

We will also introduce pygbm, a Numba-based implementation of gradient boosted trees that was used as a prototype for the scikit-learn implementation, and compare the Numba and Cython developer experiences.

Project Homepage / Git:

https://scikit-learn.org

Abstract as a tweet:

Histogram-based Gradient Boosted Trees in scikit-learn 0.21

Python Skill Level: basic

Domain Expertise: some

Domains: Big Data, Machine Learning, Parallel computing / HPC, Statistics

Olivier Grisel

Olivier is a Software Engineer at Inria working on scikit-learn and related projects of the Python Data ecosystem.

This speaker also appears in:

Introduction to scikit-learn: from model fitting to model interpretation
