Expanding Man (Michael Savastio)
My educational background is in both experimental and theoretical high energy physics. My initial programming experience mostly centered around scientific/numerical computing in C++ and Fortran. Since receiving my PhD I have been working as a data scientist, and I have primarily used Julia, both in and outside my job, for almost 7 years. My recent programming experience has involved both convex and non-convex optimization, including large-scale mixed integer conic programming, as well as machine learning and statistics. My interest in Julia and other new languages such as Zig, as well as my enthusiasm for the broader Linux ecosystem, has also led me to spend a lot of time with serialization, IPC and network protocols. I also enjoy video games, which has led me to follow projects around gaming on Linux, and as a guitar player I'm also interested in audio.
Sessions
The Parquet tabular data storage format has become one of the most ubiquitous storage formats, particularly in "big data" contexts, where it is arguably the only binary format to successfully supplant CSV. Despite this, there are relatively few implementations of Parquet, which has historically presented challenges for Julia. I will give a brief overview of Parquet2.jl, a pure Julia Parquet implementation, including a comparison with other tools and formats and a look at what is still needed to reach parity with pyarrow.
Gradient boosted trees are a wonderfully flexible tool for machine learning, and XGBoost is a state-of-the-art, widely used C++ implementation. Thanks to the library's C bindings, XGBoost has been usable from Julia for quite a long time. Recently, the wrapper has been rewritten as version 2.0 and offers many fun new features, some of which were previously available only in the Python, R or JVM wrappers.