Ludovic Räss
Geo-HPC, Julia GPU & Supercomputing.
Sessions
Julia offers the flexibility of a high-productivity language while providing control, performance, and compatibility with high-performance computing (HPC) hardware. This workshop demonstrates how Julia makes modern HPC accessible, covering resource configuration, distributed computing, code optimization for CPUs and GPUs, and versatile workflows. Participants then put these concepts into practice in a hands-on session on a GPU-powered supercomputer.
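As a flavor of the hands-on material, here is a minimal sketch of distributed computing with Julia's standard Distributed library; the worker count and the `compute` function are illustrative, not part of the workshop's actual exercises:

```julia
using Distributed
addprocs(4)  # launch 4 local worker processes (on a cluster, a cluster manager would be used)

# define the work function on all workers
@everywhere compute(x) = sum(sqrt(i) for i in 1:x)

# distribute the tasks across the workers and collect the results
results = pmap(compute, 10_000:10_000:40_000)
```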
We introduce FastIce.jl, a novel ice flow model for massively parallel architectures written in Julia. FastIce.jl implements a thermo-mechanically coupled full-Stokes ice flow model, leveraging GPUs (Nvidia, AMD) and supporting distributed computing through MPI.jl. We present performance testing of FastIce.jl in single-node and distributed scaling benchmarks on LUMI, the largest European supercomputer.
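To illustrate the kind of distributed GPU computing MPI.jl enables (not FastIce.jl's actual communication code), a minimal sketch of a 1D halo exchange between neighboring ranks, assuming a GPU-aware MPI build so device arrays can be passed directly:

```julia
using MPI, CUDA

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
# neighbors in a 1D decomposition; PROC_NULL turns boundary exchanges into no-ops
left  = rank == 0                        ? MPI.PROC_NULL : rank - 1
right = rank == MPI.Comm_size(comm) - 1  ? MPI.PROC_NULL : rank + 1

A    = CUDA.rand(Float64, 128)   # device array; GPU-aware MPI sends it without host staging
halo = CUDA.zeros(Float64, 1)    # receive buffer for the neighbor's boundary value

# send our rightmost value to the right neighbor, receive the left neighbor's into `halo`
MPI.Sendrecv!(@view(A[end:end]), halo, comm; dest=right, source=left)

MPI.Finalize()
```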
The rapidly increasing amount of data in the geosciences requires new tools and workflows to process it, along with software for modelling the physics of the governing processes. Julia offers a strong basis for this by combining the benefits of a high-level language, such as ease of use and interactivity, with the features of a low-level language, including speed, efficiency, scalability, and native GPU support.
Modeling subglacial water flow and its link to ice dynamics is necessary for accurate predictions of ice sheet response to a warming climate. Here we present a re-implementation of the widely used GlaDS model to run on GPUs. We showcase the matrix-free implementation, which leverages the full capabilities of the GPU, present model runs of test cases, show the model's scalability, and provide an outlook towards inversion schemes and high-resolution, continental-scale applications.
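To clarify what "matrix-free" means here: rather than assembling a sparse system matrix, the operator is applied on the fly, element by element, which maps naturally onto GPU threads. A minimal sketch with a 1D Laplacian standing in for the actual GlaDS operators:

```julia
# Matrix-free application of a 1D Laplacian: y = A*x without ever forming A.
# Each index i is independent, so on a GPU each thread can handle one i.
function apply_laplacian!(y, x, dx)
    n = length(x)
    @inbounds for i in 2:n-1
        y[i] = (x[i-1] - 2x[i] + x[i+1]) / dx^2
    end
    y[1] = 0.0; y[n] = 0.0  # homogeneous Dirichlet boundaries (illustrative choice)
    return y
end

x = rand(64); y = similar(x)
apply_laplacian!(y, x, 1/63)
```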
We present a successful approach for the sustainable development of stencil-based HPC applications in Julia. The approach includes automatic performance optimization of hardware-agnostic high-level kernels, data layout abstractions that enable memory layouts optimized per backend, and GPU-aware inter-process communication that can be automatically hidden behind computation. We demonstrate near-optimal performance and scaling on thousands of GPUs across multiple examples.
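One realization of such hardware-agnostic high-level kernels is ParallelStencil.jl; a minimal 3D diffusion step in that style is sketched below (array sizes and parameter values are illustrative). Swapping `Threads` for `CUDA` or `AMDGPU` in the initialization retargets the same kernel to GPUs:

```julia
using ParallelStencil
using ParallelStencil.FiniteDifferences3D
@init_parallel_stencil(Threads, Float64, 3)  # swap Threads for CUDA/AMDGPU to retarget

@parallel function diffusion_step!(T2, T, λ, dt, dx, dy, dz)
    # high-level stencil kernel; the backend generates optimized CPU or GPU code
    @inn(T2) = @inn(T) + dt*λ*(@d2_xi(T)/dx^2 + @d2_yi(T)/dy^2 + @d2_zi(T)/dz^2)
    return
end

T  = @rand(64, 64, 64)
T2 = copy(T)
@parallel diffusion_step!(T2, T, 1.0, 1e-4, 0.1, 0.1, 0.1)
```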