JuliaCon 2024

Notes from an All-Julia Honours Thesis
07-10, 10:53–10:56 (Europe/Amsterdam), Method (1.5)

For my Honours thesis, I extended Udell et al.'s work on Generalised Low-Rank Models (GLRMs), turning GLRMs into a tool for projecting data sets into a "fair" space according to various definitions of fairness. This enables the identification of features that may act as proxy discriminators for protected characteristics such as gender and age. In this talk, I will share my experience of completing a year-long thesis, using Julia as the primary programming language, starting from zero Julia knowledge.
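
For readers unfamiliar with GLRMs, a minimal sketch of fitting a plain (not yet fairness-aware) model with Udell et al.'s LowRankModels.jl package looks roughly like this; the data matrix, loss, and rank below are placeholders rather than values from the thesis:

```julia
using LowRankModels

# Toy data matrix: rows are individuals, columns are features.
A = randn(200, 10)

# Rank-2 GLRM with quadratic loss and no regularisation; the model
# factorises A ≈ X'Y, giving each row of A a low-dimensional embedding.
glrm = GLRM(A, QuadLoss(), ZeroReg(), ZeroReg(), 2)
X, Y, ch = fit!(glrm)   # ch records the convergence history
```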


I will spend most of the three minutes talking about my journey learning Julia, from first encountering it in Udell et al.'s code to submitting an artifact written entirely in Julia. Because my thesis sits in machine learning and the computational humanities, I would like to highlight the differences, and in particular the advantages, of programming in Julia in a field where most people would reach for Python. I would also like to garner interest in building libraries for fair machine learning in Julia, which, as far as I know, do not currently exist.

Two major benefits of working with Julia were its performance and its intuitive benchmarking suite, both of which directly shaped the outcome of my thesis, especially under tight deadlines. A turning point was discovering, through Julia's benchmarking tools, that my quadratic-time loss function was too computationally expensive to produce meaningful results before a major deadline. Being able to compare heap allocations across algorithms was then a major step in developing a linear-time approximation of that loss function.
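
To give a flavour of that workflow, here is a sketch of the kind of comparison BenchmarkTools.jl makes easy; the two loss functions are simplified stand-ins rather than my actual thesis code:

```julia
using BenchmarkTools

# Stand-in quadratic-time loss: iterates over all pairs of points.
pairwise_loss(x) = sum(abs2(xi - xj) for xi in x, xj in x) / 2

# Algebraically equivalent linear-time rewrite using running sums:
# the sum over all pairs of (xi - xj)^2 / 2 equals n*sum(x.^2) - sum(x)^2.
function fast_loss(x)
    s, s2 = sum(x), sum(abs2, x)
    return length(x) * s2 - s^2
end

x = randn(1_000)
@btime pairwise_loss($x)   # @btime reports runtime and heap allocations
@btime fast_loss($x)
```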

Another major advantage is that, because Julia is a young language, its community is tight-knit and supportive. In particular, an academic at my home university who is involved in the Julia community took the time to meet with me and discuss using distributed programming in Julia to speed up my quadratic-time loss function.
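
The pattern we discussed was, roughly, the one from Julia's Distributed standard library; a minimal sketch, with a placeholder per-chunk computation standing in for the real loss, looks like:

```julia
using Distributed
addprocs(4)   # spawn four local worker processes

# Placeholder for the expensive per-chunk part of the loss.
@everywhere partial_loss(chunk) = sum(abs2, chunk)

x = randn(10_000)
chunks = [x[i:4:end] for i in 1:4]        # split the data across workers
total = sum(pmap(partial_loss, chunks))   # evaluate the chunks in parallel
```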

Finally, I will briefly mention some of the challenges I encountered while programming my thesis artifact in Julia. One issue in particular was an incompatibility between Julia's CUDA library and the matrix inverse function in its LinearAlgebra standard library, which I only managed to solve by closely following the stack trace and reading more deeply about array programming in CUDA.jl.
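
I won't reproduce the exact failure here, but a sketch of the problem's shape, along with a common stopgap (falling back to the CPU for the inverse), is below; it assumes a working GPU and that the error came from inv hitting an unsupported GPU code path, and it is shown for illustration rather than as my actual solution:

```julia
using CUDA, LinearAlgebra

A = rand(Float32, 64, 64) + 64f0 * I   # well-conditioned placeholder matrix
d_A = CuArray(A)                       # move it to the GPU

# Calling inv directly on the CuArray raised an error deep in the stack;
# round-tripping through the CPU sidesteps the GPU code path entirely,
# at the cost of device-host transfers.
d_Ainv = CuArray(inv(Array(d_A)))
```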