JuliaCon 2024

Improving the life-cycle of tensor algorithm development
07-10, 17:00–17:30 (Europe/Amsterdam), Method (1.5)

In this talk we showcase recent advancements in the low-level implementation of ITensors, which seek to close the gap between rapid algorithmic development and efficient exascale implementations. These advancements were attained through a generic redevelopment of the backend NDTensors module. We demonstrate our work by accelerating the DMRG optimization of one- and two-dimensional model Hamiltonians with controllable accuracy using GPU accelerators.


As access to advanced computational resources, such as CPUs, GPUs, and QPUs, becomes commonplace for researchers across theoretical and computational disciplines, it is more important than ever to construct robust and generic software frameworks that facilitate rapid algorithmic development on such resources. One of the biggest challenges researchers face in algorithmic development is the complexity of exascale and heterogeneous computer architectures. Some challenges are quite involved, such as constructing efficient memory-distributed algorithms, while others are simple yet cumbersome, such as the syntactic differences between vendor-developed, processor-specific languages. In our most recent work on the ITensors software suite, we tackle these issues head-on using generic programming.
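As a toy illustration of what generic programming buys here (this is not code from NDTensors itself), a single Julia function written against the AbstractArray interface runs unchanged on CPU arrays and, with CUDA.jl loaded, on GPU arrays; the function name below is purely illustrative.

```julia
using LinearAlgebra
# using CUDA   # uncomment to run the identical code on an NVIDIA GPU

# A backend-agnostic helper: written only against the AbstractArray
# interface, with no vendor- or processor-specific syntax.
function normalized_product(A::AbstractMatrix, B::AbstractMatrix)
    C = A * B          # dispatches to BLAS on the CPU, cuBLAS on a CuArray
    return C / norm(C)
end

A, B = rand(8, 8), rand(8, 8)
normalized_product(A, B)                      # CPU run

# The same function runs on the GPU once the data lives there:
# normalized_product(CuArray(A), CuArray(B))  # GPU run, no code changes
```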
The goal of the ITensors package is to provide a domain-specific language for tensor arithmetic and tensor algebra, built from the ground up on Julia's robust features in coordination with thoughtful generic programming practices. ITensors is the meeting point between the lower-level tensor implementation and higher-level, problem-specific algorithm development. In this talk, I will focus on the low-level implementation in the NDTensors module, where we are developing generic practices to handle implementation complexities such as dense and sparse tensor arithmetic, buffered memory management, linear and multilinear algebra, GPU support, task management and scheduling, and beyond. We underscore the power of our generic programming strategy by showing multiple methods for accelerating the DMRG optimization of the one- and two-dimensional Hubbard and Heisenberg models, which require only a small number of changes to existing ITensor code and can take advantage of GPU accelerators.
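To make the "small number of changes" concrete, here is a minimal sketch of a one-dimensional Heisenberg-chain DMRG run in the style of the standard ITensor examples; the GPU variant differs only in moving the MPO and initial MPS to the device (shown with CUDA's `cu`). Exact function names such as `random_mps` may vary between ITensors/ITensorMPS versions, and the commented GPU lines assume CUDA.jl and an NVIDIA device are available.

```julia
using ITensors, ITensorMPS
# using CUDA   # uncomment on a machine with an NVIDIA GPU

let
    # S=1 Heisenberg chain, following the standard ITensor DMRG example.
    N = 100
    sites = siteinds("S=1", N)

    os = OpSum()
    for j in 1:(N - 1)
        os += 0.5, "S+", j, "S-", j + 1
        os += 0.5, "S-", j, "S+", j + 1
        os += "Sz", j, "Sz", j + 1
    end
    H = MPO(os, sites)
    psi0 = random_mps(sites; linkdims=10)

    # CPU run with a controllable truncation cutoff:
    energy, psi = dmrg(H, psi0; nsweeps=5, maxdim=[10, 20, 100, 200], cutoff=1e-10)

    # GPU run: the algorithm is unchanged; only the tensor storage moves:
    # energy_gpu, psi_gpu = dmrg(cu(H), cu(psi0);
    #                            nsweeps=5, maxdim=[10, 20, 100, 200], cutoff=1e-10)
end
```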