2026-08-10, Room 2
High-Performance Computing (HPC) empowers modern science and engineering by enabling the simulation and analysis of complex systems at unprecedented scales on cutting-edge supercomputers. Julia, a dynamic programming language designed for scientific computing, combines the ease of high-level syntax with execution speed close to that of C and Fortran, making it a compelling vehicle for performance engineering on supercomputers.
This workshop offers a hands-on introduction to performance engineering with Julia on modern HPC systems, guiding participants through the workflow of analyzing, optimizing, and scaling Julia codes in a real HPC environment. Using interactive Jupyter notebooks backed by the Otus system at the Paderborn Center for Parallel Computing (PC2), participants will experiment with live Julia code and performance analysis tools to gain practical proficiency in optimization techniques.
The workshop has three parts. The first introduces the organization of modern HPC clusters and how it influences Julia code performance, the second examines the performance engineering workflow for identifying and solving optimization problems, and the third presents a case study of the workflow applied to a real scientific application.
Through the accompanying hands-on exercises, participants will not only understand the core principles of performance engineering but also actively practice optimizing and scaling Julia programs on the Otus HPC system at PC2.
Overview and Goals
This 3-hour workshop provides essential knowledge and practical skills for performance engineering with Julia on modern high-performance computing (HPC) systems. Participants will understand the hierarchy of an HPC system and CPU architectures from a Julia programmer's perspective, learn systematic workflows for analyzing, optimizing, and parallelizing Julia programs, and apply these methods to real-world scientific computing applications.
Session 1: Demystifying Modern Supercomputers and Processor Architectures
This session introduces the hierarchical architecture of modern HPC clusters for Julia programmers. Using the state-of-the-art Otus system at the Paderborn Center for Parallel Computing (PC2), participants will explore the structure of an HPC cluster, from interconnected compute nodes and the node-level memory hierarchy to the core architecture of modern CPUs, such as AMD 5th-generation Turin processors with AVX-512 vector units. Interactive Julia code examples, executed through the PC2 JupyterHub service running on Otus, will accompany the presentation.
This session concludes with an introduction to the Roofline Performance Model, which helps participants reason about computational performance, identify bottlenecks, and recognize optimization opportunities. This session establishes the foundation for performance engineering on modern HPC systems.
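As a rough illustration of the Roofline idea, the attainable performance of a kernel can be estimated as the minimum of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The following is a minimal sketch only: the function names `roofline` and `ridge_point` are made up for this example, and the hardware numbers are hypothetical placeholders, not specifications of Otus.

```julia
# Roofline model sketch: attainable performance is bounded either by the
# peak compute rate or by memory bandwidth times arithmetic intensity.
# All hardware numbers below are hypothetical placeholders, not Otus specs.

"""
    roofline(peak_gflops, bandwidth_gbs, intensity)

Attainable performance in GFLOP/s for a kernel with arithmetic intensity
`intensity` (FLOP per byte of memory traffic).
"""
roofline(peak_gflops, bandwidth_gbs, intensity) =
    min(peak_gflops, bandwidth_gbs * intensity)

# Intensity at which a kernel moves from memory-bound to compute-bound.
ridge_point(peak_gflops, bandwidth_gbs) = peak_gflops / bandwidth_gbs

peak = 2000.0       # hypothetical peak compute rate, GFLOP/s
bandwidth = 400.0   # hypothetical memory bandwidth, GB/s

# y .= a .* x .+ y in Float64: 2 FLOP per element and 24 bytes of traffic
# (load x, load y, store y).
axpy_intensity = 2 / 24

println("ridge point: ", ridge_point(peak, bandwidth), " FLOP/byte")
println("axpy bound:  ", roofline(peak, bandwidth, axpy_intensity), " GFLOP/s")
```

A kernel whose intensity lies below the ridge point is limited by memory traffic, so reducing data movement is likely to help more than adding arithmetic throughput.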
Session 2: Performance Engineering Workflow in Julia
This session focuses on practical performance optimization in Julia, illustrating how to fully exploit the computing power of modern CPUs. Following an introduction to performance analysis, participants will learn to profile Julia codes. Key topics include writing type-stable Julia code, optimizing memory layout for cache efficiency, reducing heap allocations, and leveraging SIMD vectorization. Participants will conduct performance measurements using BenchmarkTools.jl and LIKWID.jl to understand the interplay between hardware performance counters and performant Julia programming.
All topics will be illustrated through simple Julia examples executed on the Otus system, allowing participants to follow along and gain hands-on experience.
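To give a flavor of such an example, the sketch below contrasts a type-unstable accumulation loop with a type-stable one that also permits SIMD vectorization. It is an illustration under assumptions, not material from the workshop: the function names are invented here, and it relies only on Base (`Base.return_types`, `@elapsed`) to stay self-contained, whereas in practice `@code_warntype` and `@btime` from BenchmarkTools.jl would be the more usual tools.

```julia
# Type-unstable: `acc` starts as an Int and becomes a Float64 inside the loop,
# so the compiler has to handle a Union{Int, Float64} accumulator.
function sum_unstable(x::Vector{Float64})
    acc = 0
    for v in x
        acc += v
    end
    return acc
end

# Type-stable: the accumulator has the element type from the start, and
# @inbounds/@simd permit the compiler to vectorize the reduction.
function sum_stable(x::Vector{Float64})
    acc = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        acc += x[i]
    end
    return acc
end

x = rand(10^6)
sum_unstable(x); sum_stable(x)   # warm-up: compile before timing

# Inferred return types show the instability without any external package.
println("inferred (unstable): ", Base.return_types(sum_unstable, (Vector{Float64},)))
println("inferred (stable):   ", Base.return_types(sum_stable, (Vector{Float64},)))

# Single-shot timings are noisy; they only hint at the difference.
println("time unstable: ", @elapsed(sum_unstable(x)), " s")
println("time stable:   ", @elapsed(sum_stable(x)), " s")
```

Julia handles small unions fairly efficiently, so the timing gap on a given machine may be modest; the inferred return types are the more reliable indicator of the problem.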
Session 3: Case Study: Molecular Dynamics Simulation of Liquid Argon (ArgonMD)
The final session applies the learned techniques to a realistic scientific application, ArgonMD, a molecular dynamics simulation of liquid argon based on the Lennard-Jones potential under periodic boundary conditions. Through a live demonstration, participants will observe how successive optimization steps yield performance gains. The case study progresses from a single-core baseline to parallelization across multiple compute nodes on Otus, consolidating the complete workflow for performance engineering introduced in earlier sessions.
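For readers unfamiliar with the model, a single-core baseline for a Lennard-Jones system with periodic boundaries could look roughly like the sketch below. This is not the ArgonMD code, which is not shown here; the function names, the reduced units (epsilon = sigma = 1), the cubic box, and the default cutoff of half the box length are all assumptions made for illustration.

```julia
# Lennard-Jones pair energy from the squared distance r2.
# Reduced units (epsilon = sigma = 1) are an assumption of this sketch.
function lj_energy(r2; epsilon=1.0, sigma=1.0)
    s6 = (sigma^2 / r2)^3          # (sigma/r)^6 computed without a sqrt
    return 4 * epsilon * (s6^2 - s6)
end

# Minimum-image convention: wrap a displacement component into [-L/2, L/2].
minimum_image(d, L) = d - L * round(d / L)

# Total potential energy of particles in a cubic box of edge length L.
# `pos` is a 3 x N matrix so that each particle's coordinates are contiguous
# in Julia's column-major memory layout.
function total_energy(pos::Matrix{Float64}, L::Float64; rcut=L / 2)
    n = size(pos, 2)
    rcut2 = rcut^2
    energy = 0.0
    @inbounds for i in 1:n-1, j in i+1:n
        r2 = 0.0
        for k in 1:3
            d = minimum_image(pos[k, i] - pos[k, j], L)
            r2 += d * d
        end
        if r2 < rcut2
            energy += lj_energy(r2)
        end
    end
    return energy
end

# Two particles separated by the distance of the potential minimum, 2^(1/6).
pos = [0.0 2^(1/6); 0.0 0.0; 0.0 0.0]
println("pair energy at the minimum: ", total_energy(pos, 10.0))
```

The O(N^2) double loop over pairs is the kind of hot spot that profiling, cache-friendly data layout, vectorization, and finally parallelization would target in turn.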
By the end of this workshop, participants will be able to:
- Understand the structure of modern HPC systems and CPU architectures from a Julia programmer's perspective.
- Apply the Roofline model to analyze and reason about the performance of Julia codes.
- Use profiling tools to identify computational bottlenecks in Julia programs.
- Utilize systematic optimization techniques to write high-performance Julia codes.
- Employ multithreading and distributed computing to scale Julia applications on HPC systems.
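As a small taste of the multithreading objective above, the following sketch splits a reduction into chunks and processes each chunk in its own task using Julia's built-in `Threads.@spawn`. The helper `threaded_sum` is hypothetical and written only for this illustration; it assumes Julia was started with several threads (for example `julia -t 4`) and does not cover distributed, multi-node execution.

```julia
using Base.Threads

# Sum f(i) for i in 1:n by splitting the index range into contiguous chunks,
# one task per chunk. Each task keeps a private accumulator, so no locking
# is needed; the partial results are combined at the end.
function threaded_sum(f, n::Int; ntasks::Int = nthreads())
    chunk = cld(n, ntasks)                 # chunk length, rounded up
    tasks = map(1:ntasks) do t
        lo = (t - 1) * chunk + 1
        hi = min(t * chunk, n)             # an empty range if lo > hi
        Threads.@spawn begin
            acc = 0.0
            for i in lo:hi
                acc += f(i)
            end
            acc
        end
    end
    return sum(fetch, tasks)
end

println("threads available: ", nthreads())
println("sum of 1/i^2 up to 10^6: ", threaded_sum(i -> 1 / i^2, 10^6))
```

Keeping one accumulator per task avoids data races and reduces contention on a shared variable, which is a common first step before tuning chunk sizes or thread pinning.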
This workshop integrates fundamental concepts, benchmarking, live Julia code examples, and a case study to equip participants with the knowledge and methods to develop performant Julia applications for state-of-the-art supercomputers.
Target Audience
This workshop is designed for a broad audience of Julia programmers, from those new to HPC environments to experienced computational scientists and HPC software developers seeking to leverage Julia as a high-performance and high-productivity language for their scientific research.
No prior experience with HPC is required for Session 1; only basic Julia programming is assumed. Session 2 builds upon this foundation and presents practical workflows for performance engineering in Julia. Session 3 is particularly suited to domain scientists interested in developing optimized and scalable Julia codes that can take full advantage of modern HPC systems.
Detailed Outline
This workshop is planned for 3 hours and combines presentations, live Julia code demos, and hands-on exercises for an interactive learning experience.
Session 1: Demystifying Modern Supercomputers and Processor Architectures
- Speaker: Prof. Dr. Christian Plessl (Chair Professor W3 for High-Performance Computing, Managing Director of PC2, Paderborn University)
- Format: Presentation with interactive Jupyter notebooks
- Topics:
  - Modern supercomputers: the Otus system at PC2
  - Accessing Otus via PC2 JupyterHub
  - Hierarchy of an HPC cluster system
  - CPU architectures from a Julia programmer's perspective
  - Roofline Performance Model
Session 2: Performance Engineering Workflow in Julia
- Speaker: Alex Wiens (HPC Advisor at PC2, Paderborn University)
- Format: Presentation with interactive Jupyter notebooks
- Topics:
  - Introduction to code optimization in Julia
  - Profiling
  - Benchmarking with BenchmarkTools.jl
  - Writing type-stable Julia code
  - Hardware performance counters with LIKWID.jl
  - Memory optimization and efficient cache utilization
  - SIMD vectorization
Session 3: Case Study: Molecular Dynamics Simulation of Liquid Argon (ArgonMD)
- Speaker: Dr. Xin Wu (Scientific Advisor Theoretical Physics/Chemistry at PC2, Paderborn University)
- Format: Presentation with live demos
- Topics:
  - Applying performance engineering workflows to scientific computing
  - Progressive optimization of ArgonMD on a single CPU core
  - Node-level and multi-node parallelization on the Otus system
Hands-on Exercises
Participants only need a web browser to access the PC2 JupyterHub platform, where all exercises run directly on the Otus system. Each exercise is provided as a Jupyter notebook, containing guided explanations, starter code, and space for experimentation. Hints and reference solutions will also be provided to ensure participants progress smoothly throughout the workshop.
Alex Wiens works as a High-Performance Computing (HPC) advisor at the Paderborn Center for Parallel Computing (PC2). His work focuses on performance analysis, consultation, and training.
Dr. Xin Wu is a Scientific Advisor for Theoretical Physics/Chemistry at the Paderborn Center for Parallel Computing (PC2), Paderborn University. His doctoral research focused on GPU-accelerated quantum chemistry for high-performance computing. At PC2, he is responsible for HPC training, user support and consultation, as well as code optimization and parallelization, with particular emphasis on FPGA-accelerated kernels for quantum chemistry calculations.
Christian Plessl is a professor (W3) for High-Performance Computing in the Department of Computer Science at Paderborn University. He is also managing director of the Paderborn Center for Parallel Computing, which is a central scientific institute of Paderborn University and a National High-Performance Computing center in the NHR alliance. He is a member of the board of directors of the NHR association.
Dr. Plessl earned a PhD degree (Dr. sc. ETH) in Computer Engineering from ETH Zurich in 2006, and an MSc degree in Electrical Engineering in 2001, also from ETH Zurich. He has been a principal investigator in numerous national and transnational research projects funded by the German Research Foundation (DFG), the German Ministry of Education and Research (BMBF), the State of North Rhine-Westphalia, and the European Commission. His research has also received support from industry, for example, through grants from AMD/Xilinx, Intel/Altera, Fujitsu, and others.
Dr. Plessl has authored and co-authored more than 100 peer-reviewed publications, and his research has been honored with several awards, e.g., the Significant Paper Award 2015 of the FPL conference, the best paper awards at HEART 2023, ReConFig 2014 and 2012, the Paderborn University Research Award 2018 and 2009, and the SEW-EURODRIVE Studienpreis award in 2001. He is a senior member of the IEEE and a member of the ACM, the Gesellschaft für Informatik (GI), and the HiPEAC Network of Excellence. He is a regular reviewer for scientific journals and serves on the program committees of major international conferences. His research interests include architecture and tools for high-performance parallel and reconfigurable computing, scientific computing, and adaptive computing systems.