JuliaCon 2023

the new XGBoost wrapper
07-26, 14:40–14:50 (US/Eastern), 32-123

Gradient boosted trees are a wonderfully flexible tool for machine learning, and XGBoost is a state-of-the-art, widely used C++ implementation. Thanks to the library's C bindings, XGBoost has been usable from Julia for quite a long time. Recently, the wrapper was rewritten for its 2.0 release and offers many fun new features, some of which were previously available only in the Python, R, or JVM wrappers.

We will discuss some of the package's new features as of 2.0, including:

  • More flexible training via public-facing calls for single update rounds.
  • Tables.jl compatibility.
  • Automated Clang.jl wrapping of the full libxgboost.
  • Introspection of XGBoost internal data (DMatrix, now an AbstractMatrix).
  • Handling of missing data.
  • Introspection of the trees themselves via AbstractTrees.jl compatible tree objects.
  • Updated feature importance API.
  • Now fully documented!
  • Upcoming GPU support.
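A few of the features above can be sketched in a short session. This is a minimal, illustrative example (not taken from the talk); it assumes XGBoost.jl ≥ 2.0 is installed, and the names used (`xgboost`, `DMatrix`, `Booster`, `update!`, `trees`) reflect my reading of the 2.0 interface — consult the package documentation for the authoritative API.

```julia
using XGBoost

X = randn(100, 4)          # feature matrix
y = X[:, 1] .+ randn(100)  # target

# One-shot training; any Tables.jl-compatible table can also be
# passed in place of a plain matrix.
bst = xgboost((X, y); num_round=8, max_depth=4, objective="reg:squarederror")

# Finer-grained control: build a DMatrix (now an AbstractMatrix,
# so it can be indexed and inspected) and run single update rounds.
dm = DMatrix(X, y)
bst2 = Booster(dm)
for _ in 1:8
    update!(bst2, dm)
end

# Inspect the fitted trees as AbstractTrees.jl-compatible objects.
ts = trees(bst)
```

Because `DMatrix` behaves like an `AbstractMatrix` and the tree objects plug into AbstractTrees.jl, generic Julia tooling (indexing, tree traversal and printing) works on them without XGBoost-specific code.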

My educational background is in both experimental and theoretical high energy physics. My initial programming experience mostly centered around scientific/numerical computing in C++ and Fortran. Since receiving my PhD I have been working as a data scientist, and have been using Julia, both in and outside my job, for almost 7 years. My recent programming experience has involved both convex and non-convex optimization, including large-scale mixed-integer conic programming, as well as machine learning and statistics. My interest in Julia and other new languages such as Zig, along with my enthusiasm for the broader Linux ecosystem, has also led me to spend a lot of time with serialization, IPC, and network protocols. I also enjoy video games, which has led me to follow projects around gaming on Linux, and as a guitar player I'm also interested in audio.
