2023-07-26
Gradient boosted trees are a wonderfully flexible tool for machine learning, and XGBoost is a state-of-the-art, widely used C++ implementation. Thanks to the library's C bindings, XGBoost has been usable from Julia for quite a long time. Recently, the wrapper was rewritten for its 2.0 release, and it offers many fun new features, some of which were previously available only in the Python, R, or JVM wrappers.
We will discuss some of the new features in the 2.0 release of the package, including the following (a short sketch of several of them appears after the list):
- More flexible training via public-facing calls for single update rounds.
- Tables.jl compatibility.
- Automated Clang.jl wrapping of the full `libxgboost`.
- Introspection of XGBoost internal data (`DMatrix`, now an `AbstractMatrix`).
- Handling of `missing` data.
- Introspection of the trees themselves via AbstractTrees.jl-compatible tree objects.
- Updated feature importance API.
- Now fully documented!
- Upcoming GPU support.
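To give a feel for how several of these pieces fit together, here is a minimal sketch based on my reading of the 2.0 interface; the exact keyword arguments, the positional `DMatrix(X, y)` form, and the default of one boosting round per `update!` call are assumptions worth checking against the package documentation:

```julia
using XGBoost

# Toy regression data: 100 points, 4 features.
X, y = randn(100, 4), randn(100)

# DMatrix is now an AbstractMatrix, so it can be inspected like any
# other Julia matrix (size, indexing, etc.).
dm = DMatrix(X, y)
@show size(dm)

# Matrices containing `missing` are accepted; missing entries are
# treated as absent values internally.
Xm = Array{Union{Float64,Missing}}(X)
Xm[1, 1] = missing
dm_missing = DMatrix(Xm, y)

# Flexible training: build a Booster and run update rounds yourself
# rather than relying on a single one-shot `xgboost` call.
bst = Booster(dm; max_depth=4, objective="reg:squarederror")
for _ in 1:10
    update!(bst, dm)  # assumed: one boosting round per call by default
end

ŷ = predict(bst, X)

# The fitted trees are AbstractTrees.jl-compatible objects.
ts = trees(bst)

# Updated feature importance API (default importance type is "gain").
imp = importance(bst)
```

Thanks to the Tables.jl compatibility mentioned above, the same `DMatrix` constructor should also accept any Tables.jl-compatible table (e.g. a DataFrame) in place of the raw matrix.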
My educational background is in both experimental and theoretical high energy physics. My initial programming experience centered mostly on scientific/numerical computing in C++ and Fortran. Since receiving my PhD I have been working as a data scientist, and have been using Julia both in and outside my job for almost 7 years. My recent programming experience has involved both convex and non-convex optimization, including large-scale mixed-integer conic programming, as well as machine learning and statistics. My interest in Julia and other new languages such as Zig, along with my enthusiasm for the broader Linux ecosystem, has also led me to spend a lot of time with serialization, IPC, and network protocols. I also enjoy video games, which has led me to follow projects around gaming on Linux, and as a guitar player I'm also interested in audio.