SciPy 2026

From LiDAR to action: detecting upland gullies to combat erosion and forest fires
2026-07-15 , Thomas Swain Room

UChicago's Data Science Institute (DSI) partners with 11th Hour Project to turn data insights into action. In this talk, I'll focus on our collaboration with Occidental Arts & Ecology Center (OAEC)'s Fuels to Flows program, which stabilizes upland waterways by adding brushwood that would otherwise fuel forest fires. Gullies are hidden by trees, so we used publicly available LiDAR to cleanly identify gullies by shape with a lightweight convolutional model. I'll show how Numba made it possible to convolve hundreds of gigabytes of images with unusually large kernels and how we delivered these map layers via static hosting using PMTiles, even with interactive features like computing elevation profiles along hand-drawn lines.


Who this is for: GIS data scientists who work with rasters, face problems of scale, and ship interactive results to non-technical stakeholders with minimal infrastructure. The general SciPy audience may also be interested in this example of a "right-sized" ML approach (the "lightweight convolutional model"), as well as ways data science can contribute to nonprofits and the public good.

Motivation/context: The UChicago DSI 11th Hour group (https://datascience.uchicago.edu/outreach/11th-hour-project/) partners with 11th Hour Project grantees, spanning energy, food & agriculture, human rights, and marine ecology, to build software and data products for social and environmental impact. This talk focuses on one environmental case study within the broader pattern of supporting mission-driven organizations with tools that reduce manual work and scale beyond ad hoc analytics.

Problem: OAEC has implemented the Fuels to Flows program (https://oaec.org/our-work/wildlands/fuels-to-flows/) on its own site and at Monte Rio Redwoods Regional Park (both in Sonoma County, CA), but expanding the program requires identifying new sites, working with landowners to secure the right permits, and hiring contractors. Our work addresses the first step by making gullies, ladder fuels, and erosion patterns visible on a county-wide interactive map.

Data: Sonoma County publicly provides LiDAR-derived products: high-resolution digital elevation models (DEMs) from scans in 2013 and 2022 (1 m and 0.5 m grids), as well as proxies for ladder fuels, the vegetation that allows ground fires to climb into the forest canopy.

Method: Standard gully-finding heuristics produce rasters to search by eye; we extended this technique to (1) reduce noise by convolving images with trough-shaped, rather than point-like, kernels, (2) approximate a CNN with engineered, rather than learned, features because hand-labeled data are scarce, and (3) build a vector-based "road network" of gullies, rather than an image. This technique has a spin-off used in the DSI's Clinic course: a UChicago student adapted it to vectorize blood vessels in MRI images to predict breast cancer treatment response.
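To make point (1) concrete, here is a minimal sketch of what a trough-shaped kernel could look like. It is not the project's code: the function names, the Gaussian cross-section, the kernel size, and the set of angles are all assumptions chosen for illustration. The kernel is low along a line through its center and has zero mean, so flat or uniformly tilted terrain gives no response, while a channel aligned with the kernel gives a positive one.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def trough_kernel(size, angle_deg, width):
    """Zero-mean kernel that is lowest along a line through its center.

    `size` is assumed odd; `width` is the Gaussian half-width of the
    trough cross-section in pixels (both hypothetical parameters).
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(angle_deg)
    # Perpendicular distance from each pixel to the trough axis
    dist = np.abs(-x * np.sin(theta) + y * np.cos(theta))
    kernel = -np.exp(-0.5 * (dist / width) ** 2)
    return kernel - kernel.mean()  # zero mean: flat terrain scores 0


def trough_response(dem, size=9, width=1.5, angles=(0, 45, 90, 135)):
    """Best trough match over several orientations ('valid' region only)."""
    windows = sliding_window_view(dem, (size, size))
    responses = [
        np.einsum("ijkl,kl->ij", windows, trough_kernel(size, angle, width))
        for angle in angles
    ]
    return np.max(responses, axis=0)


# Synthetic DEM: flat ground with a 1-unit-deep channel along column 20
dem = np.zeros((40, 40))
dem[:, 20] -= 1.0
response = trough_response(dem)
```

Because the output covers only positions where the kernel fits entirely inside the image, output column j corresponds to DEM column j + size // 2. A point-like kernel would respond to any isolated pit; the elongated shape is what rewards features that continue along a line.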

Delivery: We provide GIS-ready layers, but the intended users of this work are not GIS experts and the files are unwieldy (400 GB total). Therefore, we built a specialized map app as a website that loads data on demand as the user zooms in. We also needed to minimize our maintenance burden, since this is one of many projects, so we formatted the data as PMTiles archives, flat files that can be served from static web hosting (Cloudflare, in our case) with no application-specific server logic.
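As background on "loads data on demand": tiled web maps typically address data by zoom level and tile column/row, so a viewer only fetches the tiles that intersect the current view. The sketch below shows that arithmetic, assuming the standard Web Mercator z/x/y scheme; the function names are hypothetical, and in practice a PMTiles client library does this lookup and then issues HTTP range requests against the archive.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles per side at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so points on the outer edges stay inside the tile grid
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_view(west, south, east, north, zoom):
    """List the (zoom, x, y) tiles covering a lon/lat bounding box."""
    x0, y0 = lonlat_to_tile(west, north, zoom)  # top-left tile
    x1, y1 = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return [(zoom, x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]


# Illustrative bounding box (made-up coordinates, not taken from the project)
visible = tiles_in_view(-123.1, 38.4, -122.9, 38.6, zoom=12)
```

A small on-screen view touches only a handful of tiles at any zoom, which is why a multi-hundred-gigabyte dataset can be browsed without downloading it.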

Map app: https://oaec-found-gully.vercel.app/
GitHub: https://github.com/dsi-clinic/oaec-found-gully
(currently private; I'll see if I can make it public before submitting)

What attendees will learn:
1. A practical middle ground between simple filters and deep learning when labels are scarce.
2. How to insert a custom optimization with Numba when standard functions (convolve2d in various libraries) perform poorly under unusual conditions (in our case, unusually large kernels).
3. How to deliver large maps in tiles without requiring a custom server.
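For item 2, the following is a minimal sketch of the general pattern: a hand-written, JIT-compiled loop in place of a library convolution. It is not the optimization used in the project, which the talk will describe; the function name and the row-parallel strategy are assumptions. The import fallback only exists so the sketch still runs, slowly, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback: plain Python, same results, much slower
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True)
def correlate_valid(image, kernel):
    """Direct 'valid' cross-correlation.

    out[i, j] = sum(image[i:i+kh, j:j+kw] * kernel). Flip the kernel on
    both axes beforehand to get a true convolution.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in prange(out_h):  # output rows are independent of each other
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di, j + dj] * kernel[di, dj]
            out[i, j] = acc
    return out


# Small random inputs, just to exercise the function
rng = np.random.default_rng(0)
image = rng.normal(size=(20, 25))
kernel = rng.normal(size=(5, 7))
result = correlate_valid(image, kernel)
```

Owning the loop is what makes case-specific changes possible, such as skipping zero kernel entries or processing a raster one block at a time; whether any of these helps depends on the kernel and the hardware.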

Jim was trained as a particle physicist with a Ph.D. from Cornell and helped commission the CMS experiment at the Large Hadron Collider (LHC). He has worked as a data scientist (at Open Data Group) and a software developer (at Princeton), and was the founder of the Awkward Array project. Jim is now at the University of Chicago's Data Science Institute, where he solves data analysis problems for nonprofit organizations.
