EuroSciPy 2024

Valentin Haenel

Valentin 'esc' Haenel is a long-time "Python for Data" user and developer who
still remembers hearing Travis Oliphant's NumPy keynote at EuroSciPy 2008.
That was when he first became aware of the nascent scientific
Python stack. He started using Python for simple modeling of spiking neurons
and evaluation of data from perception experiments during his Master's degree in
computational neuroscience. Since then he has been active as a contributor
across more than 100 open source projects, for example within the Blosc
ecosystem, where he has contributed to Bcolz, Python-Blosc and Bloscpack.
Furthermore, he has acquired significant experience as a Git trainer and
consultant and published the first German-language book on the topic in
2011. In 2014 and 2015 he helped kickstart the PyData Berlin community
alongside a few other volunteers and co-organized the first two editions of the
PyData Berlin Conference. Since 2019 he has worked for Anaconda as a senior
software engineer on the Numba project. His areas of contribution to the project
so far have been social architecture, release management, mutable data
structures and, most recently, the compiler frontend.


Institute / Company

Anaconda Inc.

Homepage

https://haenel.co/

Twitter handle

@esc___

Git*hub|lab

https://github.com/esc


Session

08-29
13:55
30min
Regularizing Python using Structured Control Flow
Valentin Haenel

In this talk we will present applied research and working code to regularize
Python programs using a Structured Control Flow Graph (SCFG). This is a novel
approach to rewriting programs at the source level such that the resulting
(regularized) program is potentially more amenable to compiler optimizations,
for example when using Numba[1] to compile Python. The SCFG representation of
a program is simpler to analyze and thus significantly easier to optimize
because the higher-order semantic information regarding the program structure
is explicitly included. This can be of great benefit to many scientific
applications such as High Performance Computing (HPC), a discipline that relies
heavily on compiler optimizations to turn user source code into highly
performant executables. Additionally, the SCFG format is a first step toward
representing Python programs as Regionalized Value State Dependence Graphs
(RVSDGs), another recently proposed program representation that is
expected to unlock even more advanced compiler optimizations at the
Intermediate Representation (IR) level. The talk will give an introduction to
the theory of SCFGs and RVSDGs and demonstrate how programs are transformed. We
will start with simple Python programs containing control-flow constructs and
then show both the SCFG representation and the resulting regularized program to
illustrate the transformations.
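
To give a rough idea of what "regularizing" control flow can mean, the sketch
below rewrites a loop with two exits (a break and normal exhaustion) into a loop
with a single exit governed by one control variable. It is a hand-written
illustration only, not output of the tooling presented in the talk; the function
names and the keep_going flag are hypothetical and chosen for this example.

```python
def find_first_negative(values):
    """Original form: the loop can exit via `break` or by exhausting `values`."""
    index = -1
    for i, v in enumerate(values):
        if v < 0:
            index = i
            break
    return index


def find_first_negative_regularized(values):
    """Hand-regularized form (an assumption about what such a rewrite could
    look like): one loop with a single exit, steered by a control variable."""
    index = -1
    i = 0
    keep_going = True  # hypothetical control variable replacing `break`
    while keep_going:
        if i < len(values):
            if values[i] < 0:
                index = i
                keep_going = False  # former `break` path
            else:
                i += 1
        else:
            keep_going = False  # former "iterator exhausted" path
    return index


if __name__ == "__main__":
    for sample in ([], [1, 2, 3], [1, -2, 3], [-1, -2]):
        print(sample, find_first_negative(sample),
              find_first_negative_regularized(sample))
```

Both functions return the same result for every input; the second merely makes
the loop's exit explicit in one place, which is the kind of structure that is
easier for a compiler to reason about.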

High Performance Computing
Room 6