Statistician and SIMD-enthusiast.
BLAS & LAPACK are an integral component of many numerical algorithms. Due to their importance, a lot of effort has gone into optimizing ASM/C/Fortran implementations.
Nonetheless, early work demonstrated Julia implementations were often faster than competitors, while laying groundwork for new routines specialized for new problems.
We discuss a roadmap toward providing Julia BLAS and LAPACK libraries, from optimizations in LoopVectorization to libraries like Octavian and RecursiveFactorization.