Matthew Whiting
Matthew Whiting is a Team Leader in ATNF Science at CSIRO Space & Astronomy, and the head of Data Operations for CSIRO's ASKAP radio telescope. He leads the development and operations of the high-performance processing pipelines for ASKAP that run at the Pawsey Supercomputing Centre.
Matthew has extensive experience in both astronomy and software development, having been part of the ASKAPsoft development team for the life of ASKAP. He is the sole developer of the Duchamp source-finder, which has been adapted to form the Selavy source-finder used in ASKAPsoft, and has expertise in developing and running highly parallel software in supercomputing environments.
Matthew has a strong background in astronomical research: he is a member of several ASKAP survey science teams, with particular interests in quasars, AGN, and absorption-line studies, and he also has a background in optical and infrared observational astronomy.
Session
The Australian Square Kilometre Array Pathfinder (ASKAP) is a new-technology radio telescope operated by CSIRO at Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory in the Western Australian outback. Its innovative receivers, with their wide field-of-view, generate very large data rates, necessitating high-performance computing to create the required calibrated images and catalogues, and deposit them in the CSIRO ASKAP Science Data Archive (CASDA) for use by astronomers.
The processing is orchestrated by the ASKAP pipeline, a scripted workflow that interfaces with the Slurm workload manager to run all necessary data preparation, calibration, imaging, and source-extraction tasks. The computationally intensive processing is done using a custom-written imaging package called ASKAPsoft, specially designed to handle the scale of data produced by ASKAP. Crucially, the pipeline must run in near-real-time to keep up with the incoming data rate, allowing the telescope to efficiently survey the entire sky.
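As a rough illustration of how a scripted workflow can hand a sequence of stages to Slurm, the sketch below chains jobs with sbatch's `--dependency=afterok` option so that each stage starts only once its prerequisites have succeeded. It is not the ASKAP pipeline's actual implementation: the stage names, script filenames, and helper functions are hypothetical, and only the `sbatch` options `--parsable` and `--dependency` are taken from Slurm itself.

```python
import subprocess
from typing import Callable, Dict, List, Sequence, Tuple

# (stage name, batch script, names of prerequisite stages), in submission order.
Stage = Tuple[str, str, Sequence[str]]


def build_sbatch_command(script: str, depends_on: Sequence[str] = ()) -> List[str]:
    """Build an sbatch command that waits for the given job IDs to succeed."""
    cmd = ["sbatch", "--parsable"]  # --parsable prints just the job ID
    if depends_on:
        cmd.append("--dependency=afterok:" + ":".join(depends_on))
    cmd.append(script)
    return cmd


def slurm_submit(cmd: List[str]) -> str:
    """Submit to a real Slurm cluster and return the new job ID."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # With --parsable the output is "jobid" or "jobid;cluster".
    return result.stdout.strip().split(";")[0]


def submit_chain(stages: Sequence[Stage],
                 submit: Callable[[List[str]], str]) -> Dict[str, str]:
    """Submit every stage up front, wiring dependencies by stage name."""
    job_ids: Dict[str, str] = {}
    for name, script, prereqs in stages:
        cmd = build_sbatch_command(script, [job_ids[p] for p in prereqs])
        job_ids[name] = submit(cmd)
    return job_ids


# Hypothetical stages mirroring the task types named in the abstract.
STAGES: List[Stage] = [
    ("prepare", "prepare.sbatch", []),
    ("calibrate", "calibrate.sbatch", ["prepare"]),
    ("image", "image.sbatch", ["calibrate"]),
    ("find_sources", "find_sources.sbatch", ["image"]),
]
```

On a cluster this would be called as `submit_chain(STAGES, slurm_submit)`. Passing a stand-in for `submit` that only records the commands gives a dry run without Slurm. Submitting all stages at once and letting the scheduler resolve the dependencies is one way such a workflow can run with little manual intervention.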
The ASKAP pipeline is operational, with regular survey observing resulting in large amounts of data (currently more than 3.8 PB since full surveys started in late 2022) being made publicly available through CASDA. ASKAP processing demonstrates what is possible with a large, complex, nearly autonomous supercomputing workflow, and provides important lessons for planning the even larger workflows anticipated for future instruments.
This talk will describe the design decisions that went into creating and scaling up the workflow, and how it has been set up to run on the supercomputers at the Pawsey Supercomputing Centre. This will include the range of different types of processing jobs and their contrasting requirements, the impact of the high I/O on overall processing efficiency, and lessons learned from both developing and running the pipeline at scale. We also look ahead to planned upgrades, and to considerations for implementing processing for future facilities such as the SKA.