Jesus Salgado


Sessions

11-06
08:30
0min
SKA Regional Centres Architecture: One data lake, multiple nodes
Jesus Salgado

The SKA Observatory is a next-generation radio astronomy facility that will help to revolutionise our understanding of the Universe and the laws of fundamental physics. The observatory has three locations: in South Africa's Karoo region (SKA_MID), Western Australia's Murchison Shire (SKA_LOW) and the Global Headquarters in the United Kingdom. The SKA_MID and SKA_LOW locations will be capable of producing a stream of science data products on the order of 700 PB/year. This large data volume is unprecedented for the astronomical community and thus poses unique challenges for curating and providing access to the datasets and resources required to analyse them in order to derive the final scientific insights. The approach chosen is the development and adoption of the SKA regional centre concept in the form of a loose SRCNet association consisting of regionally funded contributions.

The SRCNet data lake will be centrally managed, but distributed and federated at the level of the storage elements. Known challenges of data lakes must be addressed, such as exploiting the data through the integration of data and computing, and the data latency caused by distributed repositories. We present the architecture design being developed for the SRCNet to allow scientific analysis of SKA data from the SRCNet data lake while minimising, as far as possible, the drawbacks of federated data lakes.
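To give a sense of scale for the latency challenge, the following minimal sketch converts the quoted 700 PB/year into a sustained average rate and estimates how long it would take to move a dataset between nodes. The 1 PB dataset size and the 100 Gbit/s link speed are illustrative assumptions, not figures from the SRCNet design, and decimal petabytes are assumed.

```python
# Back-of-the-envelope figures for a federated data lake.
# Assumptions (not from the abstract): decimal petabytes, a 365.25-day year,
# and a hypothetical 1 PB dataset moved over a hypothetical 100 Gbit/s link.

SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600
PB = 10**15  # bytes in a decimal petabyte


def sustained_rate_gbit_s(petabytes_per_year: float) -> float:
    """Average rate in Gbit/s needed to ingest the given yearly volume."""
    bytes_per_second = petabytes_per_year * PB / SECONDS_PER_YEAR
    return bytes_per_second * 8 / 1e9


def transfer_days(dataset_pb: float, link_gbit_s: float) -> float:
    """Days needed to move a dataset over a fully used link."""
    seconds = dataset_pb * PB * 8 / (link_gbit_s * 1e9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"700 PB/year is about {sustained_rate_gbit_s(700):.0f} Gbit/s sustained")
    print(f"1 PB over 100 Gbit/s takes about {transfer_days(1, 100):.2f} days")
```

Under these assumptions, copying even a single petabyte-scale dataset to the user takes on the order of a day, which is one way to see why integrating computing with the stored data matters.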

Science with data archives: challenges in multi-wavelength and time domain data analysis
Posters
11-06
08:30
0min
Prototyping access from visualisation tools to SKA science images and cubes stored in a rucio DataLake through IVOA discovery and access services
François Bonnarel, Susana, Marco Molinaro, Pierre Fernique, Vincenzo Galluzzi, Thomas Boch, Manuel Parra-Royón, Caroline Bot, Mark Allen, Jesus Salgado, Matthieu Baumann, Alessandra Zanichelli

M. Allen, R. Barnsley, M. Baumann, F. Bonnarel, T. Boch, C. Bot, R. Butora, J. Collinson, P. Fernique, V. Galluzzi, R. Joshi, M. Molinaro, M. Parra-Royon, J. Sanchez-Castaneda, S. Sanchez-Exposito, G. Tudisco, F. Vitello, A. Zanichelli.

SKA is the major low-frequency radio astronomy project of the future, with several major scientific applications. It will increase the amount of available science data by several orders of magnitude, eventually requiring more than 700 petabytes of storage per year. The SKA Observatory will perform the initial data processing to deliver observatory data products, while the SKA Regional Center network (SRC) will provide storage for those products, together with processing capabilities to deliver and store advanced data products for the user community.
Within the scope of the SRC network, the Orange (visualisation), Magenta (data management) and Coral (node implementation) teams have prototyped the discovery, access and visualisation of science data. Our visualisation tools VisIVO and Aladin discover, access and visualise test science data produced by SKA pathfinders and stored in the Rucio DataLake. The Magenta team has added science metadata functionality to the Rucio data lake prototype to demonstrate a means of enabling IVOA-compliant data discovery and server-side processing.
VisIVO, Aladin Desktop and Aladin Lite are able to query the discovery service built on the ObsCore and SCS IVOA protocols.
This allows them to load DataLink responses that link to a SODA cutout service, developed by the Orange team, which extracts subcubes or images directly from the datasets stored in the Rucio DataLake.
The Rucio Storage Element and SODA developments have been deployed and configured on the Spanish SRC node, which provides computing and storage resources managed by Coral team members. This prototype paves the way to collaborative development in the SKA regional center network and shows the possible integration of VO services and visualisation tools in DataLakes and science platforms.
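
As a rough illustration of the discovery and cutout steps described above, the sketch below builds an ObsCore cone-search query in ADQL and a SODA cutout request URL. The column names and the `ID`, `CIRCLE` and `BAND` parameters come from the IVOA ObsCore and SODA standards; the endpoint URL, the dataset identifier and the function names are hypothetical, and the intermediate step of parsing the DataLink response is omitted. It is not the prototype's actual client code.

```python
from urllib.parse import urlencode


def obscore_cone_query(ra_deg, dec_deg, radius_deg, product_type="cube"):
    """Build an ADQL query on the ObsCore table for products in a sky circle."""
    return (
        "SELECT obs_publisher_did, access_url, access_format "
        "FROM ivoa.obscore "
        f"WHERE dataproduct_type = '{product_type}' "
        "AND 1 = CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg}))"
    )


def soda_cutout_url(soda_endpoint, dataset_id, ra_deg, dec_deg, radius_deg,
                    band_m=None):
    """Build a SODA cutout URL; band_m is an optional (min, max) wavelength in metres."""
    params = [
        ("ID", dataset_id),
        ("CIRCLE", f"{ra_deg} {dec_deg} {radius_deg}"),
    ]
    if band_m is not None:
        params.append(("BAND", f"{band_m[0]} {band_m[1]}"))
    return f"{soda_endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and identifier, for illustration only.
    print(obscore_cone_query(10.0, -30.0, 0.1))
    print(soda_cutout_url("https://example.org/soda/sync",
                          "ivo://example/ds?1", 10.0, -30.0, 0.1,
                          band_m=(0.20, 0.22)))
```

In the workflow described above, the SODA endpoint and the dataset identifier would be taken from the DataLink response for a discovered product rather than hard-coded.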

Science with data archives: challenges in multi-wavelength and time domain data analysis
Posters