Soss.jl: Probabilistic Metaprogramming in Julia
2019-07-25, Room 349

This talk will explore the basic ideas in Soss, a new probabilistic programming library for Julia. Soss allows a high-level representation of the kinds of models often written in PyMC3 or Stan, and offers a way to programmatically specify and apply model transformations like approximations or reparameterizations.


Probabilistic programming is sometimes referred to as “modeling for hackers”, and has recently been picking up steam with a flurry of releases including Stan, PyMC3, Edward, Pyro, and TensorFlow Probability.

As these and similar systems have improved in performance and usability, they have unfortunately also become more complex and difficult to contribute to. This is related to the more general “two-language problem”, in which performance-critical domains like scientific computing involve both a high-level language for users and a high-performance language in which developers implement algorithms. This establishes a kind of wall between the two groups, and has a harmful effect on performance, productivity, and pedagogy.

In probabilistic programming, this effect is even stronger, and it’s increasingly common to see three languages: one for writing models, a second for data manipulation, model assessment, etc., and a third for implementing inference algorithms.

Solving this “three-language problem” usually means accepting either lower performance or a restricted class of available models and inference algorithms.

It doesn’t have to be this way. The Julia language supports Python-level coding with C-level performance. In Julia, code is itself “first-class”: it can be pulled apart and manipulated as a data structure. This leads to an approach for high-level representation of models, with transformations and optimizations specific to a given model or inference family.
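
As a rough illustration of what “code as a data structure” means, here is a minimal sketch that uses only base Julia. It is not the Soss API: the helper names `sampled_names` and `swap_dist` are hypothetical, and `Normal` and `Cauchy` appear only as unevaluated symbols inside a quoted expression.

```julia
# A toy model body written in ordinary Julia syntax and captured as data.
# Nothing is evaluated here; `Normal` is only a symbol in the expression tree.
model = quote
    mu ~ Normal(0, 1)
    x ~ Normal(mu, 1)
end

# Hypothetical helper: collect the names on the left of each `~` statement.
function sampled_names(ex::Expr)
    names = Symbol[]
    for stmt in ex.args
        # Skip line-number nodes; keep only calls of the form `lhs ~ rhs`.
        if stmt isa Expr && stmt.head == :call && stmt.args[1] == :~
            push!(names, stmt.args[2])
        end
    end
    return names
end

# Hypothetical helper: recursively rewrite calls to `from` into calls to `to`.
function swap_dist(ex, from::Symbol, to::Symbol)
    ex isa Expr || return ex   # leaves (symbols, literals, line nodes) pass through
    args = Any[swap_dist(a, from, to) for a in ex.args]
    if ex.head == :call && !isempty(args) && args[1] == from
        args[1] = to
    end
    return Expr(ex.head, args...)
end

# Inspect the model, then build a transformed copy of it.
names = sampled_names(model)
heavy_tailed = swap_dist(model, :Normal, :Cauchy)
```

Here `names` holds the two sampled variable names, and `heavy_tailed` is a new expression in which every `Normal(...)` call has been replaced by `Cauchy(...)`, while `model` itself is left unchanged. A real model transformation would have to respect much more structure than this, but the mechanism of walking and rebuilding an expression tree is the same.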

This is the approach taken in Soss, a small and extensible Julia library that provides a way to represent and manipulate probabilistic models. In this talk, we’ll discuss the need for Soss, some of its concepts at a high level, and finally some recent advancements and upcoming opportunities.


Co-authors:

Chad Scherrer has been actively developing and using probabilistic programming systems since 2010, and served as technical lead for the language evaluation team in DARPA's Probabilistic Programming for Advancing Machine Learning ("PPAML") program. Much of his blog is devoted to describing Bayesian concepts using PyMC3, while his Soss.jl project aims to improve execution performance by directly manipulating source code for models expressed in the Julia programming language.

Chad is a Senior Data Scientist at Metis Seattle, where he teaches the Data Science Bootcamp.