JuliaCon 2024

Extrae.jl: Advanced profiling of Julia code in HPC clusters
07-12, 11:00–11:30 (Europe/Amsterdam), Else (1.3)

Extrae is a tracing and sampling profiler developed at the Barcelona Supercomputing Center (BSC). In this talk we will showcase its features and demonstrate how to use it to optimise the performance of Julia applications, in particular in the high-performance computing domain.


Julia has revolutionized the development of high-performance applications. Yet its native profiling capabilities are limited to callstack sampling (i.e. periodic statistical sampling of the process callstack). Although useful for identifying bottlenecks, callstack sampling cannot explain the source of performance degradation. Without querying hardware counters, it is impossible to know whether the vector units are fully used or whether the memory access pattern is causing many cache misses.
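As a point of reference, native callstack sampling can be sketched with the Profile standard library. The workload below (strided_sum and its array size) is a hypothetical example chosen for illustration, not code from the talk. The report shows where samples landed, but it carries no cache-miss or vectorisation data, which is the gap described above.

```julia
using Profile

# Hypothetical workload: a strided traversal that is deliberately cache-unfriendly.
function strided_sum(a::Vector{Float64}, stride::Int)
    s = 0.0
    for start in 1:stride
        for i in start:stride:length(a)
            s += a[i]
        end
    end
    return s
end

a = rand(2^22)
strided_sum(a, 1024)        # warm-up call so that compilation is not profiled

Profile.clear()
@profile for _ in 1:20
    strided_sum(a, 1024)
end

# Flat report: sample counts per source line, sorted by count.
Profile.print(format = :flat, sortedby = :count)
```

The report should attribute most samples to the inner loop of strided_sum. Whether that time is spent waiting on memory or on arithmetic cannot be read from the samples alone.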

In this work, we present Extrae.jl, the Julia bindings to the Extrae tracing and sampling profiler developed at BSC.
They let you:

  • Annotate user regions
  • Sample hardware counters
  • Inspect the callstack inside C libraries
  • Mark inter-node, inter-process and inter-thread communication
  • Intercept MPI, CUDA and OpenMP calls
  • Emit custom user events

We will showcase the performance evaluation of several scientific applications written in Julia on x86_64 and AArch64 architectures. Some of these platforms feature interesting capabilities, such as a scalable vector ISA (i.e. SVE) or unified memory between CPU and GPU (e.g. NVIDIA Grace Hopper). Extrae proves vital for understanding the performance behaviour of these applications and, later, for optimising them.

See also: GitHub