2019-07-25, 17:15–17:25, Elm B
TimerOutputs.jl is a tool that lets you annotate sections in your code so that after execution, a nicely formatted table with information about how much time and allocations were spent in each section can be shown.
Julia has for a long time come with a built-in profiler, which is available as the Profile standard library. This profiler is a sampling profiler which means that it collects snapshots of the stacktrace at regular intervals. After execution of the profiled code, the number of traces that were collected in each function can be displayed which gives a good estimate where the code is spending its time. While sampling profiling is a very useful and cheap way of profiling code, there are a few drawbacks:
Since only samples of the execution is collected, exact information of e.g. the time spent in functions or the number of calls to a function is not available.
Not being able to annotate what parts of the code you care about mean that the output from the profiler is sometimes noisy and hard to interpret for non-experts.
Allocations can not be attributed to certain sections of the code.
TimerOutputs is a Julia package that provides an instrumenting profiler which requires you to annotate your code with labeled sections. While the code is running, TimerOutputs records timing and allocation data in these sections and can then print a summary back to the user. This can sometimes make it easier to get a high level overview of the performance characteristics of the code.
An example of the output from TimerOutputs is shown below.
─────────────────────────────────────────────────────────────────────────────── Time Allocations ────────────────────── ─────────────────────── Tot / % measured: 6.89s / 97.8% 5.20GiB / 85.0% Section ncalls time %tot avg alloc %tot avg ─────────────────────────────────────────────────────────────────────────────── assemble 6 3.27s 48.6% 545ms 3.65GiB 82.7% 624MiB inner assemble 240k 1.92s 28.4% 7.98μs 3.14GiB 71.1% 13.7KiB linear solve 5 2.73s 40.5% 546ms 108MiB 2.39% 21.6MiB create sparse matrix 6 658ms 9.77% 110ms 662MiB 14.6% 110MiB export 1 78.4ms 1.16% 78.4ms 13.1MiB 0.29% 13.1MiB ───────────────────────────────────────────────────────────────────────────────
The timing and allocations for each section is presented and the call graph (e.g. that assemble calls inner assemble) is shown by indentation.
This lightning talk gives a short summary of the implementation, syntax and use cases of TimerOutputs.