Lessons learned from building LOFAR data pipelines
In this presentation I will show how automated data processing opens up great opportunities for producing robust and efficient results in astronomical research. As research code and data processing pipelines grow ever more complex, it has become more important than ever that scientists have access to frameworks that facilitate the validation of their results and ensure that those results are fully reproducible.
I will demonstrate the current state of pipeline development for processing data from the International LOFAR Telescope, how this pipeline leverages the familiarity of common software tools and community-supported frameworks, and how research software can be embedded into it to create complex yet understandable and consistent processing steps that reliably produce science-ready results.
Another point I want to address is the importance of interdisciplinary communication and coding standards. Such standards allow a larger part of the scientific community to collaborate on shared goals, of which data processing is a prominent example, and allow us as developers to create and maintain tools that anticipate future scaling needs. I will show how the pipeline I present is partly a product of such collaboration.
Finally, during this talk I would like to reflect on the broader lessons I have learned while developing this pipeline as someone with no prior experience as a scientific software developer. I hope that by sharing my experiences I can inspire others to build and improve on them, and that I, in turn, can learn from the experiences of others.