2024-09-25, Louis Armand 1 - Est
Almost all modern CPUs have a vector processing unit, making it possible to write faster code for a large category of problems, at the cost of portability: there are many different instruction sets in the wild! The xsimd library makes it possible to write portable C++ code that targets different architectures and sub-architectures. The specialization choice can be made at compile time or at runtime, using a provided dispatching mechanism. Intel, ARM, RISC-V and WebAssembly are supported, and the library has already been adopted by xtensor, Pythran, Apache Arrow and Firefox.
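As an illustration of the runtime path mentioned above, here is a minimal sketch of the dispatching mechanism: a functor whose call operator is templated on an architecture tag is handed to xsimd::dispatch, which picks the best implementation available on the running CPU. The names sum_kernel and sum are made up for this example, and the exact dispatch signature may vary between xsimd versions (older releases also spell the horizontal reduction hadd rather than reduce_add).

    #include <cstddef>
    #include "xsimd/xsimd.hpp"

    // Functor whose call operator is templated on the architecture tag.
    struct sum_kernel
    {
        template <class Arch>
        float operator()(Arch, float const* data, std::size_t size) const
        {
            using batch = xsimd::batch<float, Arch>;
            batch acc(0.0f);
            std::size_t const vec_end = size - size % batch::size;
            for (std::size_t i = 0; i < vec_end; i += batch::size)
                acc += batch::load_unaligned(data + i);
            float total = xsimd::reduce_add(acc);        // horizontal sum of the lanes
            for (std::size_t i = vec_end; i < size; ++i) // scalar tail
                total += data[i];
            return total;
        }
    };

    float sum(float const* data, std::size_t size)
    {
        // Builds a callable that selects, at runtime, the best implementation
        // among the instruction sets this binary was compiled with.
        auto dispatched = xsimd::dispatch(sum_kernel{});
        return dispatched(data, size);
    }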
From smartphones running basic ARM chips to HPC workstations with the latest Intel CPUs, most modern CPUs have a dedicated vector processing unit, making SIMD computing an optimization angle for many applications. Unfortunately, each chipset has its own instruction set, with different flavors across CPU families and even within the same family: performance at the price of portability.
The xsimd template library abstracts those differences to provide a common high-level API, bridging the gap between the different architectures and providing composite fallbacks when a one-to-one mapping with the ISA doesn't exist. It started as a vectorization backend for the xtensor library and has since been adopted by Pythran, Krita, Apache Arrow and, lately, Firefox to power its AI-based offline translation engine. It supports all Intel SSE variants, AVX, AVX2 and several AVX-512 extensions. ARM NEON with a few extensions, WebAssembly and RISC-V are also supported targets. In addition to these architectures, xsimd provides a special scalar target, so the same API can express both scalar and vector code, and an emulated target for testing purposes.
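To give a feel for that high-level API, here is a small sketch of a kernel written once against the architecture-agnostic batch type; the function name axpy is made up for the example, and the vector-body-plus-scalar-tail structure is one common way to use the library rather than the only one.

    #include <cstddef>
    #include "xsimd/xsimd.hpp"

    // y[i] = a * x[i] + y[i], written once against the architecture-agnostic
    // batch type; the instruction set is chosen from the compiler flags.
    void axpy(float a, float const* x, float* y, std::size_t n)
    {
        using batch = xsimd::batch<float>;          // default architecture
        constexpr std::size_t width = batch::size;  // number of lanes
        std::size_t const vec_end = n - n % width;

        for (std::size_t i = 0; i < vec_end; i += width)
        {
            batch bx = batch::load_unaligned(x + i);
            batch by = batch::load_unaligned(y + i);
            (batch(a) * bx + by).store_unaligned(y + i);
        }
        for (std::size_t i = vec_end; i < n; ++i)   // scalar tail
            y[i] = a * x[i] + y[i];
    }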
This talk will present the xsimd API, showcase a few examples and discuss the differences with intrinsic-based programming and compiler auto-vectorization approaches.
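For contrast with intrinsic-based programming, the same kind of kernel written directly against SSE intrinsics is hard-wired to one architecture and one register width; axpy_sse is again a made-up name used only for illustration.

    #include <cstddef>
    #include <xmmintrin.h>  // SSE intrinsics

    // Same computation, but hard-wired to 4-wide SSE registers; porting it to
    // AVX, NEON, WebAssembly SIMD or RISC-V V means rewriting the loop body.
    void axpy_sse(float a, float const* x, float* y, std::size_t n)
    {
        __m128 const va = _mm_set1_ps(a);
        std::size_t const vec_end = n - n % 4;
        for (std::size_t i = 0; i < vec_end; i += 4)
        {
            __m128 vx = _mm_loadu_ps(x + i);
            __m128 vy = _mm_loadu_ps(y + i);
            _mm_storeu_ps(y + i, _mm_add_ps(_mm_mul_ps(va, vx), vy));
        }
        for (std::size_t i = vec_end; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }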
Sometimes a compiler engineer, sometimes a wood chopper, but also a retired wizard of the coast. Enjoys telling stories, be they old legends or modern bit yarns.