JuliaCon 2023

LotteryTickets.jl: Sparsify your Flux Models
07-28, 10:40–10:50 (US/Eastern), 32-D463 (Star)

We present LotteryTickets.jl, a library for finding lottery tickets in deep neural networks: pruned, sparse sub-networks that retain much of the performance of their fully parameterized counterparts. LotteryTickets.jl provides prunable wrappers for all layers defined in Flux.jl, as well as an easy macro for making a predefined Flux model prunable.

Roughly, the lottery ticket hypothesis says that only a small fraction of the parameters in a deep neural model are responsible for most of its performance. Further, a network initialized with just these parameters converges much more quickly than the fully parameterized model. The parameters of such subnetworks are called winning lottery tickets. A straightforward way to search for lottery tickets is to iteratively train, prune, and reinitialize a model until performance begins to suffer.
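The pruning step of that loop can be sketched in a few lines of plain Julia. This is a minimal illustration of magnitude pruning on a single weight matrix, not the LotteryTickets.jl API; the function name `prune` and the `sparsity` parameter are assumptions for the sake of the example.

```julia
# Zero out the smallest-magnitude fraction `sparsity` of the weights in W,
# returning the pruned matrix and a binary mask of the surviving weights.
function prune(W::AbstractMatrix, sparsity::Real)
    k = floor(Int, sparsity * length(W))
    # Threshold at the k-th smallest absolute value; prune nothing if k == 0.
    thresh = k == 0 ? -Inf : sort(vec(abs.(W)))[k]
    mask = abs.(W) .> thresh
    return W .* mask, mask
end

W = [0.9  -0.01;
     0.05 -0.8]
Wp, mask = prune(W, 0.5)
# The two smallest-magnitude entries (-0.01 and 0.05) are zeroed:
# Wp == [0.9 0.0; 0.0 -0.8]
```

In the iterative search, the surviving weights would then be rewound to their original initialization and the masked network retrained, with the mask accumulating across rounds.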

We introduce LotteryTickets.jl, a forthcoming Julia library for pruning Flux models in order to find winning lottery tickets. LotteryTickets.jl provides wrappers for Flux layers so that one can define an ordinary Flux model and then prune it to recover the lottery tickets. All of the layers defined in Flux are supported, and defining prunable wrappers for custom Flux layers is straightforward.
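The wrapper pattern described above can be sketched as a layer that carries a binary mask alongside its weights and applies it on the forward pass. The type name `PrunableDense` and its fields are illustrative assumptions for this sketch, not the actual LotteryTickets.jl types.

```julia
# A dense layer paired with a mask over its weight matrix; entries of
# `mask` that are false behave as pruned (permanently zero) weights.
struct PrunableDense{M<:AbstractMatrix,V<:AbstractVector,B<:AbstractMatrix}
    W::M
    b::V
    mask::B   # same shape as W
end

# Start fully dense: every weight survives.
PrunableDense(W, b) = PrunableDense(W, b, trues(size(W)))

# Forward pass: mask the weights, then apply the usual affine map.
(l::PrunableDense)(x) = (l.W .* l.mask) * x .+ l.b

l = PrunableDense([1.0 2.0; 3.0 4.0], [0.0, 0.0])
l.mask[1, 2] = false        # prune a single weight
y = l([1.0, 1.0])           # masked W is [1 0; 3 4], so y == [1.0, 7.0]
```

Keeping the mask separate from the weights means gradients can still flow through the surviving entries during retraining, while pruned entries stay zero.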

In addition to a brief primer on model sparsification, this talk will discuss the main interface for LotteryTickets.jl, the key implementation choices, and an example of training and pruning a model end-to-end, even in the presence of custom Flux layers.

I am a PhD student in NLP at the Tokyo Institute of Technology and a PhD student researcher at Google Tokyo. My research focuses on tokenization and neural language modeling, in particular for CJK translation.

I can be found at https://sigmoid.social/@mc and at https://theoreticallygoodwithcomputers.com.