Neven Caplar
Neven is a research scientist at the University of Washington / Rubin Observatory who develops scalable tools and methods for analyzing time-domain data from large surveys. His current research involves building software, including contributions to LSDB and HATS, which support cross-matching and large-scale analytics across datasets from sources such as the Rubin Observatory. He received his PhD from ETH Zurich and previously worked at Princeton University, where he developed a data reduction pipeline for the Prime Focus Spectrograph at the Subaru Telescope.
Session
In recent years, the exponential growth of large survey catalogs has introduced new challenges in the joint analysis of astronomical datasets, particularly as we move towards handling petabytes of data. Our demo will showcase the latest advancements in our Large Survey DataBase (LSDB) framework. The framework partitions large datasets spatially into hierarchical shards and stores the data in Parquet. This approach enables efficient and scalable cross-matching and analysis of big datasets.
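As a rough illustration of what hierarchical spatial sharding means, the sketch below splits a flat (RA, Dec) plane into quadtree tiles until no tile holds more than a chosen number of rows, so dense regions end up in deeper, smaller tiles. It is a toy model using only the Python standard library, not LSDB's actual partitioning scheme or API; the function name `partition` and its parameters `max_rows` and `max_depth` are hypothetical.

```python
from collections import defaultdict


def partition(points, max_rows, max_depth=16):
    """Split (ra, dec) points into quadtree tiles of at most max_rows points.

    Toy model only: a flat RA/Dec rectangle stands in for the sphere.
    Returns a dict mapping a tile key (tuple of quadrant indices) to its points.
    """
    def recurse(pts, ra0, ra1, dec0, dec1, depth, key):
        # Stop splitting once the tile is small enough or the depth limit is hit.
        if len(pts) <= max_rows or depth == max_depth:
            return {key: pts} if pts else {}
        ra_mid, dec_mid = (ra0 + ra1) / 2, (dec0 + dec1) / 2
        quads = defaultdict(list)
        for ra, dec in pts:
            # Quadrant index 0..3: the high bit is the RA half, the low bit the Dec half.
            quads[(ra >= ra_mid) * 2 + (dec >= dec_mid)].append((ra, dec))
        tiles = {}
        for q, sub in quads.items():
            r0, r1 = (ra_mid, ra1) if q >= 2 else (ra0, ra_mid)
            d0, d1 = (dec_mid, dec1) if q % 2 else (dec0, dec_mid)
            tiles.update(recurse(sub, r0, r1, d0, d1, depth + 1, key + (q,)))
        return tiles

    return recurse(list(points), 0.0, 360.0, -90.0, 90.0, 0, ())


# A dense clump of 100 sources plus two isolated ones (made-up coordinates).
points = [(10 + 0.01 * i, 20 + 0.01 * i) for i in range(100)]
points += [(200.0, -30.0), (300.0, 60.0)]
tiles = partition(points, max_rows=30)

for key, pts in sorted(tiles.items()):
    print(f"tile {key}: {len(pts)} rows")
```

In this picture each tile would correspond to one Parquet file of comparable size, which is what lets work be distributed evenly regardless of how unevenly sources are spread over the sky.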
In this demo, we will explore the new features in LSDB, such as support for nested Pandas/Dask, which makes it easier to work with time-domain and spectral data by storing all observations of the same astronomical object in the same dataframe row. We will demonstrate how users can start their analysis on a small subsection of the sky and easily scale it up after initial testing. We will showcase cross-matching across large datasets and demonstrate real-world applications by applying analysis functions to complete wide-sky synoptic datasets. We will highlight our collaborations with our partners (such as STScI, IPAC, Rubin, and CDS) to provide various catalogs in this format and show how we can use the Fornax cloud platform to work with diverse datasets in a unified cloud-based framework.
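The two ideas above, nesting each object's observations into its own row and cross-matching catalogs by sky position, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the LSDB or nested Pandas/Dask API: the functions `nest_observations`, `separation_arcsec`, and `crossmatch`, the column names, and all data values are hypothetical, and the brute-force matching loop is only suitable for tiny catalogs.

```python
import math
from collections import defaultdict


def nest_observations(objects, observations):
    """Return one row per object with its time-sorted observations nested inside."""
    by_id = defaultdict(list)
    for obs in observations:
        by_id[obs["object_id"]].append((obs["mjd"], obs["mag"]))
    return [
        {**obj, "lightcurve": sorted(by_id.get(obj["object_id"], []))}
        for obj in objects
    ]


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600


def crossmatch(left, right, radius_arcsec=1.0):
    """Brute-force nearest-neighbour match of left rows to right rows within a radius."""
    matches = []
    for a in left:
        best, best_sep = None, radius_arcsec
        for b in right:
            sep = separation_arcsec(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= best_sep:
                best, best_sep = b, sep
        if best is not None:
            matches.append((a, best, best_sep))
    return matches


# Made-up example data: two objects, three observations, and a second catalog.
objects = [
    {"object_id": 1, "ra": 10.0, "dec": 20.0},
    {"object_id": 2, "ra": 50.0, "dec": -5.0},
]
observations = [
    {"object_id": 1, "mjd": 60001.0, "mag": 18.2},
    {"object_id": 1, "mjd": 60000.0, "mag": 18.0},
    {"object_id": 2, "mjd": 60000.5, "mag": 20.1},
]
other_catalog = [
    {"name": "A", "ra": 10.0001, "dec": 20.0},
    {"name": "B", "ra": 120.0, "dec": 45.0},
]

nested = nest_observations(objects, observations)
matches = crossmatch(nested, other_catalog, radius_arcsec=1.0)

# Per-object analysis runs directly on the nested light curve of each matched row.
for obj, counterpart, sep in matches:
    mags = [mag for _, mag in obj["lightcurve"]]
    print(obj["object_id"], counterpart["name"], round(sep, 3), sum(mags) / len(mags))
```

Because each matched row carries its full light curve, an analysis function can be applied per row without a separate join back to the observation table, which is the convenience the nested layout is meant to provide.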