2026-07-14, Intro
Python has become the dominant language in scientific computing, even in domains that demand high performance. This is largely due to the power of array-oriented programming, which separates complex problems into two parts: lightweight bookkeeping and heavy numerical computation. The latter is handled efficiently by vectorized operations that rely on fast, precompiled libraries.
This tutorial introduces array-oriented programming as a distinct mindset that encourages new ways of structuring problems. Rather than focusing on any one library, we'll cover general techniques that apply to any array library, with a particular focus on NumPy and JAX. You'll work in groups on four class projects: Conway's Game of Life using arrays, iterative computations on arrays, just-in-time (JIT) compilation for the Mandelbrot set, and exploring data in ragged arrays. The emphasis is on the thought process: each problem is solved twice, once imperatively (with for loops) and once in an array-oriented way.
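To make the contrast concrete, here is a minimal sketch of one problem written both ways, using the path length example mentioned in Lecture 1. The function names and the sample points are illustrative assumptions, not the tutorial's actual materials.

```python
import math
import numpy as np

def path_length_loop(x, y):
    """Imperative style: visit each segment in a Python for loop."""
    total = 0.0
    for i in range(len(x) - 1):
        total += math.hypot(x[i + 1] - x[i], y[i + 1] - y[i])
    return total

def path_length_array(x, y):
    """Array-oriented style: compute all segment lengths at once, then sum."""
    return np.sum(np.hypot(np.diff(x), np.diff(y)))

# Hypothetical sample path: (0, 0) -> (3, 4) -> (3, 10)
x = np.array([0.0, 3.0, 3.0])
y = np.array([0.0, 4.0, 10.0])

print(path_length_loop(x, y))   # 11.0
print(path_length_array(x, y))  # 11.0
```

In the array version, the Python code only does bookkeeping (three function calls), while the per-element arithmetic runs inside NumPy's precompiled loops.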
The tutorial will alternate between short lectures and short exercises for the audience, each followed by a guided tour through solutions, alternatives, and trade-offs. For exact time slots for each lecture and project, consult the outline below.
Part 1: Array-Oriented Programming Fundamentals
Lecture 1: Introduce array-oriented programming as a paradigm. Compare imperative, functional, and array-oriented styles using simple and complex examples (3-body problem). Demonstrate speed/memory advantages. Work through path length example.
Project 1: Attendees implement Conway's Game of Life using arrays. Given an imperative solution, attendees create a NumPy version that's significantly faster. Stretch goal: discover convolution-based solution.
Solutions: Present manual solution, boundary condition handling, and elegant convolution approach with performance comparisons.
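One possible array-oriented formulation of a Game of Life step is sketched below. It assumes periodic (wrap-around) boundaries via `np.roll`, which is only one of several boundary conditions; the function name `life_step` and the test pattern are illustrative, not the tutorial's reference solution.

```python
import numpy as np

def life_step(board):
    """Advance a 2D array of 0/1 cells by one generation (periodic boundaries)."""
    # Count live neighbors by summing the eight shifted copies of the board.
    neighbors = sum(
        np.roll(np.roll(board, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell is alive next step if it has 3 neighbors,
    # or if it is alive now and has exactly 2 neighbors.
    alive = (neighbors == 3) | ((board == 1) & (neighbors == 2))
    return alive.astype(board.dtype)

# A "blinker": three cells in a row that oscillate with period 2.
blinker = np.zeros((5, 5), dtype=np.int8)
blinker[2, 1:4] = 1

after_one = life_step(blinker)
after_two = life_step(after_one)
```

The eight shifted sums are the same arithmetic as a convolution with a 3×3 kernel of ones with a zero center, which is the idea behind the stretch goal.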
Part 2: Limitations of Array-Oriented Programming
Lecture 2: Discuss disadvantages: (1) intermediate arrays problem (quadratic formula example with timing), (2) "iterate until converged" problem (Newton's method, connection to ML epochs).
Project 2: Attendees perform tree-traversal in an array-oriented way, walking all input points down the tree simultaneously.
Solutions: Present immutable and in-place approaches, compare performance across Python, NumPy, Numba, and JAX.
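The "iterate until converged" problem can be illustrated with a small sketch: Newton's method for square roots applied to a whole array, where each element needs a different number of iterations. The function `newton_sqrt`, its tolerance, and its starting guess are assumptions chosen for illustration, and the sketch assumes strictly positive inputs.

```python
import numpy as np

def newton_sqrt(a, tol=1e-12, max_iter=100):
    """Square roots by Newton's method, updating only unconverged elements."""
    a = np.asarray(a, dtype=np.float64)
    x = np.where(a > 1, a, 1.0)              # starting guess (new array)
    active = np.ones(a.shape, dtype=bool)    # True where not yet converged
    for _ in range(max_iter):
        if not active.any():
            break
        # Newton update for f(x) = x**2 - a, restricted to active elements.
        x_new = 0.5 * (x[active] + a[active] / x[active])
        converged = np.abs(x_new - x[active]) <= tol * x_new
        x[active] = x_new
        # Shrink the active set: elements that just converged drop out.
        active[active] = ~converged
    return x

roots = newton_sqrt([1.0, 2.0, 4.0, 9.0])
```

Each pass allocates temporary arrays for the masked selections, and the mask bookkeeping has no counterpart in the imperative version, where a per-element `while` loop simply stops. That overhead is the cost this part of the tutorial discusses.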
Part 3: JIT Compilation
Lecture 3: Introduce JIT compilation as solution. Demonstrate Numba (requires imperative code) and JAX (array-oriented but limited by dynamic branching) on quadratic formula.
Project 3: Attendees accelerate Mandelbrot set computation using Numba and JAX. Compare performance of imperative Python, NumPy, and JIT-compiled versions.
Solutions: Show optimized implementations, "Mandelbrot on all accelerators," discuss GPU programming advantages.
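As a rough sketch of what Project 3 compares, the block below computes Mandelbrot escape counts two ways: with imperative loops decorated by Numba's `njit`, and with NumPy masks. The function names and grid are hypothetical. So that the example runs even where Numba is not installed, it falls back to a no-op decorator; that fallback is an assumption of this sketch, and without Numba the loop version runs as slow plain Python. A JAX version is not shown.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption of this sketch): run the loops uncompiled.
    def njit(func):
        return func

@njit
def mandelbrot_loop(x, y, max_iter):
    """Imperative version: one pixel at a time, stopping as soon as it escapes."""
    counts = np.zeros((len(y), len(x)), dtype=np.int32)
    for i in range(len(y)):
        for j in range(len(x)):
            c = complex(x[j], y[i])
            z = 0j
            count = 0
            while count < max_iter and z.real * z.real + z.imag * z.imag <= 4.0:
                z = z * z + c
                count += 1
            counts[i, j] = count
    return counts

def mandelbrot_numpy(x, y, max_iter):
    """Array-oriented version: all pixels advance together under a mask."""
    c = x[np.newaxis, :] + 1j * y[:, np.newaxis]
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=np.int32)
    for _ in range(max_iter):
        active = z.real**2 + z.imag**2 <= 4.0   # pixels that have not escaped
        z[active] = z[active] * z[active] + c[active]
        counts[active] += 1
    return counts

# Hypothetical coarse grid covering [-2, 1] x [-1.5, 1.5].
x = np.linspace(-2.0, 1.0, 5)
y = np.linspace(-1.5, 1.5, 5)
loop_counts = mandelbrot_loop(x, y, 50)
numpy_counts = mandelbrot_numpy(x, y, 50)
```

The imperative version stops early per pixel, while the NumPy version must re-evaluate a mask over the whole grid on every iteration.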
Part 4: Ragged and Nested Arrays
Lecture 4: Present ragged, nested, missing, and heterogeneous data examples.
Project 4: Attendees compute path lengths from NYC taxi trip data in Parquet format with ragged coordinate arrays.
Solutions: Present efficient solution, discuss practical handling of ragged arrays, mention additional resources.
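To give a sense of what "ragged" means in practice, here is a minimal sketch that stores variable-length trips as flat coordinate arrays plus an `offsets` array marking where each trip starts and ends, then computes one path length per trip without a Python loop over trips. It uses plain NumPy and made-up coordinates; it is not the tutorial's solution or the Awkward Array API, and it assumes every trip has at least one point.

```python
import numpy as np

# Three hypothetical trips with 3, 2, and 4 points, stored back to back.
x = np.array([0.0, 3.0, 3.0,  1.0, 1.0,  0.0, 1.0, 1.0, 0.0])
y = np.array([0.0, 4.0, 4.0,  0.0, 2.0,  0.0, 0.0, 1.0, 1.0])
offsets = np.array([0, 3, 5, 9])  # trip k is the slice offsets[k]:offsets[k+1]

def ragged_path_lengths(x, y, offsets):
    """Path length of each trip in a flat-content-plus-offsets layout."""
    # Segment k joins point k to point k + 1 in the flat arrays.
    seg = np.hypot(np.diff(x), np.diff(y))
    # Segments that join the last point of one trip to the first point
    # of the next are not real, so zero them out.
    seg[offsets[1:-1] - 1] = 0.0
    # cumulative[k] is the distance traveled from flat point 0 to point k.
    cumulative = np.concatenate(([0.0], np.cumsum(seg)))
    # Each trip's length is the difference between its last and first points.
    return cumulative[offsets[1:] - 1] - cumulative[offsets[:-1]]

lengths = ragged_path_lengths(x, y, offsets)
print(lengths)  # [5. 2. 3.]
```

Writing the offsets bookkeeping by hand is error-prone, which is part of the motivation for libraries that handle ragged and nested arrays directly.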
Here is a general outline:
- 0:00‒0:15 (15 min) Lecture 1: Array-oriented programming and its benefits
- 0:15‒0:35 (20 min) Project 1: Conway’s Game of Life using arrays
- 0:35‒0:45 (10 min) Break
- 0:45‒1:00 (15 min) Solutions to project 1
- 1:00‒1:15 (15 min) Lecture 2: Disadvantages of array-oriented programming
- 1:15‒1:35 (20 min) Project 2: Iterative computations on arrays
- 1:35‒1:45 (10 min) Break
- 1:45‒2:00 (15 min) Solutions to project 2
- 2:00‒2:15 (15 min) Lecture 3: JIT compilation with Numba and JAX
- 2:15‒2:35 (20 min) Project 3: JIT compilation of the Mandelbrot set
- 2:35‒2:45 (10 min) Break
- 2:45‒3:00 (15 min) Solutions to project 3
- 3:00‒3:15 (15 min) Lecture 4: Ragged and deeply nested arrays
- 3:15‒3:35 (20 min) Project 4: Exploring data in ragged arrays
- 3:35‒3:45 (10 min) Break
- 3:45‒4:00 (15 min) Solutions to project 4
Participants should have a basic familiarity with Python and very basic NumPy.
Prior Python Programming Level of Knowledge Expected: Basic; enough to understand loops, if statements, function calls, etc.
NumPy Level of Knowledge Expected: Basic; such as the content of the "Introduction to Numerical Computing with NumPy" tutorial.
I'm a PhD student in the Department of Physics and Astronomy at Rice University, conducting research in high-energy physics as a member of the CMS experiment at the Large Hadron Collider at CERN. My work focuses on studying Higgs boson decays into two photons, analyzing data collected by the CMS detector, and contributing to software development for large-scale scientific analyses. I'm passionate about scientific computing and open-source tools that enable reproducible and efficient research. I'm a maintainer of Awkward Array, an array library for nested, variable-sized data that uses NumPy-like idioms, and an author and maintainer of Coffea, a toolkit designed to simplify data analysis in particle physics. With deep experience in the scientific Python ecosystem, I enjoy building tools that drive insight and accelerate scientific discovery.
Jim was trained as a particle physicist with a Ph.D. from Cornell and helped commission the CMS experiment at the Large Hadron Collider (LHC). He has worked as a data scientist (at Open Data Group) and a software developer (at Princeton), and was the founder of the Awkward Array project. Jim is now at the University of Chicago's Data Science Institute, where he solves data analysis problems for nonprofit organizations.