How Fast Can We Go? Build a NumPy-Like Backend in C++ with Python Bindings
NumPy is fast, but can we go faster? In this talk, we will show how to build a faster array-processing backend by writing high-performance code from scratch. We will explore how to construct a NumPy-like array engine in C++, integrate hardware acceleration features such as single instruction, multiple data (SIMD), and expose it to Python using pybind11. We will then compare its runtime performance against NumPy to assess how well the approach works and to plan further improvements. This talk will provide practical insights for Python users interested in low-level performance optimization and extension development.
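
To give a flavor of the kind of code involved, here is a minimal sketch of an elementwise kernel over contiguous storage. It is not the engine presented in the talk: the names `Array1D` and `add` are hypothetical, the element type is fixed to `float` for brevity, and the sketch relies on compiler auto-vectorization (for example at `-O3`) rather than hand-written SIMD intrinsics. The pybind11 binding step is left out so that the example compiles with the standard library alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 1-D array type; the name Array1D is illustrative only.
struct Array1D {
    std::vector<float> data;  // contiguous storage, similar to a C-contiguous ndarray

    explicit Array1D(std::size_t n, float fill = 0.0f) : data(n, fill) {}
    std::size_t size() const { return data.size(); }
};

// Elementwise addition over two arrays of equal length.
// The loop body is a plain stride-1 traversal with no branches, which is the
// shape of loop that optimizing compilers can typically turn into SIMD code.
inline Array1D add(const Array1D& a, const Array1D& b) {
    assert(a.size() == b.size());  // no broadcasting in this sketch

    Array1D out(a.size());
    const float* pa = a.data.data();
    const float* pb = b.data.data();
    float* po = out.data.data();
    const std::size_t n = a.size();

    for (std::size_t i = 0; i < n; ++i) {
        po[i] = pa[i] + pb[i];
    }
    return out;
}
```

Calling `add(x, y)` on two `Array1D` values of the same length returns a new array holding the elementwise sums. Keeping the data in one contiguous buffer is the design choice that matters here, since it is what makes both vectorization and a zero-copy handoff to Python plausible.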