07/28, 17:10–17:40 (UTC), Red
Arpack is a library for computing eigenvalues and eigenvectors of a linear operator. It has been used in many technical computing packages. The goal of the GenericArpack.jl package is to create a Julia translation of Arpack. Right now, the Julia GenericArpack.jl methods produce bitwise-identical results to the Arpack_jll methods for Float64 types in all test cases. The new library has zero dependency on BLAS and supports element types beyond those in Arpack, such as DoubleFloats.jl.
Summary of key points
Arpack is a library for iteratively computing eigenvalues and eigenvectors of a linear operator. It has been widely used, debugged, and implemented in many technical computing packages. The goal of the GenericArpack.jl package is to create a Julia translation of the Arpack methods. (Currently, only the symmetric solver has been translated.)
The new library has zero dependency on a system-level BLAS, allowing it to support matrix element types beyond those in Arpack, such as those in DoubleFloats.jl and MultiFloats.jl. Other advantages of the GenericArpack.jl package include thread safety and the use of Julia features to optionally avoid the reverse communication interface. Despite not using the system BLAS, the goal was to make the Julia output equivalent to an alternative compilation of the Arpack source code for Float64 types.
An anticipated future use is for the GenericArpack.jl package to give WebAssembly Julia implementations tools for iterative eigensolvers. This would enable a wide variety of in-browser analyses, including finite element methods, pseudospectra, and spectral graph theory.
Talk overview
The talk will discuss some interesting challenges that arose:

- representing state information in Julia for the statically located Fortran save variables in Arpack
- sensitivity to the norm operation, and implementing an exact replacement for the OpenBLAS dnrm2 on x86 architectures that uses the 80-bit floating point features of x86 CPUs (without calling the OpenBLAS function)
- getting the same random initialization vectors as Arpack (i.e., porting the Fortran dlarnv function)
- designing an interface that allows us to compare results between Arpack_jll and GenericArpack.jl at internal methods in Arpack call chains
- how much code it took to translate the symmetric eigensolver to a Hermitian eigensolver (which is not in Arpack)
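To make the first challenge concrete, here is a minimal sketch of one way to mirror Fortran save variables in Julia: bundle the persistent locals into a mutable struct that the caller threads through each call. The type name and fields here are illustrative, not the actual GenericArpack.jl design.

```julia
# Hypothetical sketch: Fortran `save` variables keep their values between
# calls to a subroutine. A mutable struct passed to each call can play the
# same role in Julia while staying thread safe (each caller owns its state).
mutable struct SaitrState
    first::Bool      # like Fortran's `logical, save :: first` one-time flag
    j::Int           # loop counter preserved across calls
    rnorm::Float64   # saved norm value
end
SaitrState() = SaitrState(true, 0, 0.0)

function step!(state::SaitrState)
    if state.first
        state.first = false   # one-time initialization branch, as in Fortran
        state.j = 1
    else
        state.j += 1          # state survives between calls via the struct
    end
    return state.j
end

state = SaitrState()
step!(state)   # 1
step!(state)   # 2
```

This also removes the global state that makes the original Fortran save variables thread-unsafe.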
It will also discuss some tools created that may be useful elsewhere:

- a tridiagonal eigensolver that only computes a single row of the eigenvector matrix (the Arpack dstqrb function)
- allocation analysis that automatically runs, parses output, and cleans up after a --track-allocation run of Julia
- Julia implementations of a few LAPACK/BLAS functions, and the details needed to match OpenBLAS calls bitwise (on macOS)
Initial rough talk slide ideas

teaser: the world's most precise estimate of the largest few singular values of the Netflix ratings matrix (100M nonzeros) ... or something similar.

reveal: the code... using GenericArpack; svds(...)

pitch: a drop-in replacement for Arpack.jl (for symmetric problems).

what is Arpack and why is it important?

Arpack and reverse communication.
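For readers unfamiliar with the pattern, here is a toy sketch of reverse communication, using a stand-in solver rather than the real dsaupd signature: the solver returns control to the caller whenever it needs a matrix-vector product, signaled through an ido flag.

```julia
# Toy sketch of the reverse-communication pattern (illustrative; not the
# actual Arpack dsaupd interface). The "solver" asks the caller for y = A*x
# by setting ido, and the caller re-enters the solver after computing it.
function toy_aupd!(ido, x, y)
    if ido[] == 0
        ido[] = 1          # request a matrix-vector product from the caller
    else
        ido[] = 99         # done
    end
    return nothing
end

A = [2.0 0.0; 0.0 3.0]
x = [1.0, 1.0]; y = zeros(2)
ido = Ref(0)
while true
    toy_aupd!(ido, x, y)
    ido[] == 99 && break
    y .= A * x             # caller performs the product, then re-enters
end
```

The key point is that the library never sees the matrix; the caller supplies products on demand, which is what makes Arpack matrix-free.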

summary of project goals: why translation (same input/same output) and not something else (new algorithms, etc.); also why minimal dependencies.

basic translation approach: an exercise in @view / subarrays.
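As a concrete illustration of that exercise (with made-up offsets, not Arpack's actual ipntr layout): Fortran code addresses one big workspace array through integer offsets, and @view gives a non-allocating window into the same storage.

```julia
# Sketch: Fortran addresses segments of one workspace, e.g. workd(ipntr(1)).
# In Julia, @view gives a no-copy window into the same array, so a
# translation can preserve Fortran's single-workspace layout.
n = 4
workd = zeros(n * 3)              # Fortran-style workspace: three length-n segments
ipntr = (1, n + 1, 2n + 1)        # illustrative starting offsets

x = @view workd[ipntr[1]:ipntr[1]+n-1]   # first segment, no allocation of a copy
y = @view workd[ipntr[2]:ipntr[2]+n-1]   # second segment
x .= 1.0
y .= 2 .* x                       # writes land in the underlying workspace
workd[n+1]                        # 2.0: the view aliased the big array
```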

getting bitwise-identical output: the Lanczos/Arnoldi information seems close, but somewhat different from Arpack
- key issue: well, it turns out this is very sensitive to the norm function.
- real problem: the OpenBLAS norm uses 80-bit FP operations. (And why they can get away with sqrt(sum(x.^2)) and you can't!)
- solution 1: use double-double arithmetic to simulate it! (but it's slow)
- solution 2: just use ideas from BitFloats.jl and LLVM intrinsics instead
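A small demonstration of why the naive formula is dangerous in pure Float64, while an 80-bit x87 accumulation, with its much wider exponent range, survives it:

```julia
# In Float64, the squares in sqrt(sum(x.^2)) can overflow long before the
# true norm does. A rescaled (or extended-precision) accumulation avoids
# this, which is one reason OpenBLAS can use the naive formula internally
# in 80-bit x87 registers while a pure-Float64 port cannot.
using LinearAlgebra

x = fill(1e200, 4)
naive = sqrt(sum(abs2, x))   # 1e200^2 overflows to Inf in Float64
safe  = norm(x)              # LinearAlgebra's norm rescales internally
# naive == Inf, while safe == 2.0e200 (the correct value)
```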

So, I've written everything, and it passes tests, etc. Why does it use so many allocations? (When it should use zero, like the Fortran code!)
- tools for hunting down allocations (well, really just parsing --track-allocation output)
- a curiosity: why does the line while true allocate?
- I wish there were a "strict" mode that didn't allow quite so much flexibility.
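The kind of zero-allocation check used in such a hunt can be sketched as follows; the axpy-style kernel is illustrative, and --track-allocation then gives the per-line breakdown when a check like this fails.

```julia
# Sketch of a zero-allocation check for a hot kernel: warm it up so that
# compilation isn't counted, then assert the call itself allocates nothing,
# as the equivalent Fortran code would.
function axpy_kernel!(y, a, x)
    @inbounds for i in eachindex(x, y)
        y[i] += a * x[i]
    end
    return y
end

function measure(y, x)
    axpy_kernel!(y, 2.0, x)                   # warm up: compile before measuring
    return @allocated axpy_kernel!(y, 2.0, x)
end

bytes = measure(zeros(100), rand(100))        # expect 0: the loop allocates nothing
```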

because we can: from Arpack's ido parameter (really, what you the user do!) to a Julia idonow interface that avoids reverse communication
because we can: going from the symmetric real-valued Arpack methods to Hermitian complex-valued methods (which do not exist in Arpack)
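The contrast between the two interfaces can be sketched with illustrative names (toy_solver and applyA! are not the GenericArpack.jl API): instead of bouncing control back to the caller for every product, the operator is handed in up front and the iteration loop stays inside the solver.

```julia
using LinearAlgebra

# Illustrative "idonow"-style interface: the caller supplies the operator as
# a callback, so there is no reverse-communication loop on the caller's side.
function toy_solver(applyA!, x; maxiter = 10)
    y = similar(x)
    for _ in 1:maxiter
        applyA!(y, x)        # the callback replaces the ido round-trips
        x = y / norm(y)      # toy power-iteration step, for illustration only
    end
    return x
end

A = [2.0 0.0; 0.0 1.0]
v = toy_solver((y, x) -> mul!(y, A, x), [1.0, 1.0])
# v is (approximately) the dominant eigenvector of A
```

Because Julia can pass and inline the callback, this style costs little over the reverse-communication loop while being much harder to misuse.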

because someone will ask: comparing performance. This will show the current state of performance. At the moment, for a problem Arpack solves in 15 ms, GenericArpack.jl takes 23 ms, although there has been only minor performance tuning so far.

A list of future work: porting the non-Hermitian, complex-valued case; an "AbstractEigenspace.jl" package that multiple people could implement; handling differences.

The vision: why this would be super useful. Iterative eigenvalues in the browser for Pluto.jl running via WebAssembly... for really cool demos akin to pseudospectra... for finite elements in the browser... for interactive spectral graph analysis in the browser... for mixed-precision Arpack computations (Lanczos/Arnoldi information in high precision, vectors in low precision).
David Gleich is an associate professor of computer science at Purdue University.