Erik Welch EuroSciPy 2024

Erik Welch

Erik Welch is a senior system software engineer on the RAPIDS cuGraph team at NVIDIA and a core NetworkX developer. He has 20 years' experience using Python as a scientist, engineer, and open-source developer on a wide range of data and high-performance computing problems. He primarily works on nx-cugraph, an accelerated backend to NetworkX, and is the primary maintainer of the popular toolz library.

Institute / Company:

NVIDIA

GitHub / GitLab:

https://github.com/eriknw

Sessions

08-28

11:05

30min

Understanding NetworkX's API Dispatching with a parallel backend

Erik Welch, Aditi Juneja

Hi! Have you ever wished your pure Python libraries were faster? Or wanted to fundamentally improve a Python library by rewriting everything in a faster language like C or Rust? Well, wish no more... NetworkX's backend dispatching mechanism redirects your plain old NetworkX function calls to a FASTER implementation provided by a separate backend package, by leveraging Python's entry points specification!
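
As a rough illustration of the entry-point idea, the sketch below lists whatever backends are advertised in the current environment. It assumes that backend packages register themselves under the entry-point group "networkx.backends"; the helper name discover_backends is hypothetical and is not part of NetworkX. It uses only the standard library, so it returns an empty dict when no backend package is installed.

```python
from importlib.metadata import entry_points


def discover_backends(group="networkx.backends"):
    """Return {backend name: entry-point target} for one entry-point group.

    The default group name is an assumption about how NetworkX backends
    advertise themselves; this helper is illustrative only.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a plain dict keyed by group.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    # The output depends on which backend packages are installed.
    print(discover_backends())
```

A package becomes visible to this kind of lookup simply by declaring an entry point in its packaging metadata, which is why a backend can live in a separate distribution from the library it accelerates.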

NetworkX is a popular, pure Python library used for graph (aka network) analysis. But as the graph size increases (like a network of everyone in the world), NetworkX algorithms can take days to solve a simple graph analysis problem. To address these performance issues, this backend dispatching mechanism was recently developed. In this talk, we will unveil this dispatching mechanism and its implementation details, and show how we can use it just by specifying a backend kwarg like this:

>>> import networkx as nx
>>> nx.betweenness_centrality(G, backend="parallel")

or by passing the backend graph object (type-based dispatching):

>>> import nx_parallel as nxp
>>> H = nxp.ParallelGraph(G)
>>> nx.betweenness_centrality(H)

We'll also go over the limitations of this dispatch mechanism. Then we'll use the example of nx-parallel as a guide to building our own custom NetworkX backend, and test the backend we build using NetworkX's existing test suite. We'll end with a quick dive into the details of the nx-parallel backend.
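
The two call styles described in this abstract (an explicit backend kwarg and type-based dispatching) can be illustrated with a toy dispatcher. This is a minimal sketch and not NetworkX's actual implementation: the registry, the dispatchable decorator, the FastGraph class, and the "fast" backend are all hypothetical names invented for the example, and graphs are plain adjacency dicts to keep it self-contained.

```python
import functools

# Hypothetical registry: backend name -> graph type and function table.
_BACKENDS = {}


def register_backend(name, graph_type, functions):
    """Register a toy backend (illustrative; not a NetworkX API)."""
    _BACKENDS[name] = {"graph_type": graph_type, "functions": functions}


def dispatchable(func):
    """Route calls to a backend chosen by kwarg or by the graph's type."""

    @functools.wraps(func)
    def wrapper(G, *args, backend=None, **kwargs):
        if backend is None:
            # Type-based dispatching: pick the backend that owns this graph type.
            for name, info in _BACKENDS.items():
                if isinstance(G, info["graph_type"]):
                    backend = name
                    break
        if backend is None:
            # No backend requested or inferred: run the default implementation.
            return func(G, *args, **kwargs)
        if backend not in _BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        impl = _BACKENDS[backend]["functions"].get(func.__name__)
        if impl is None:
            raise NotImplementedError(
                f"{func.__name__} is not implemented by backend {backend!r}"
            )
        return impl(G, *args, **kwargs)

    return wrapper


class FastGraph:
    """Hypothetical backend graph type wrapping an adjacency dict."""

    def __init__(self, adjacency):
        self.adjacency = adjacency


@dispatchable
def number_of_edges(G):
    # Default pure Python path: G maps each node to its set of neighbors.
    return sum(len(nbrs) for nbrs in G.values()) // 2


calls = []  # records which calls reached the backend


def fast_number_of_edges(G):
    calls.append("fast")
    # Accept either the backend graph type or a plain adjacency dict.
    adjacency = G.adjacency if isinstance(G, FastGraph) else G
    return sum(map(len, adjacency.values())) // 2


register_backend("fast", FastGraph, {"number_of_edges": fast_number_of_edges})

G = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(number_of_edges(G))                  # default implementation
print(number_of_edges(G, backend="fast"))  # explicit backend kwarg
print(number_of_edges(FastGraph(G)))       # type-based dispatching
```

All three calls return 2 for this path graph; only the last two go through the toy backend. The user-facing function name stays the same in every case, which is the point of dispatching: callers opt in to a faster implementation without changing the rest of their code.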

Community, Education, and Outreach

Dispatching, Backend Selection, and Compatibility APIs

Guillaume Lemaitre, Joris Van den Bossche, Tim Head, Erik Welch, Marco Gorelli, Sebastian Berg, Aditi Juneja, Stéfan van der Walt

Scientific Python libraries struggle with the existence of several array and dataframe providers. Many important libraries currently support mainly NumPy arrays or pandas dataframes.
However, as library authors we wish to allow users to smoothly use other array providers and to simplify, for example, the use of GPUs without the need for explicit use of CUDA-enabled libraries.

This session will be split into three related discussions around efforts to tackle this situation:
* Dispatching and backend selection discussion
* Array API adoption progress and discussion
* Dataframe compatibility layer discussion

High Performance Computing

Room 5
