2025/09/27 –, Dahlia 1
NumPy is fast, but how can we go faster? In this talk, we show how to build a faster backend for array processing by writing high-performance code from scratch. We explore how to construct a NumPy-like array engine in C++, integrate hardware acceleration features such as single instruction, multiple data (SIMD), and expose the engine to Python using Pybind11. We compare its runtime performance against NumPy to assess how well it works and to plan for continuous improvement. This talk provides practical insights for Python users interested in low-level performance optimization and extension development.
NumPy is a popular choice for array operations and numerical computation in Python, offering a simple interface and excellent performance. However, its architecture—built primarily around the Python C-API—can be difficult to integrate into applications where tighter control over system components is required.
In our project, we implemented a custom array computation backend in C++ to meet integration requirements specific to our numerical application. To keep the usability benefits of Python, we exposed this backend via Pybind11.
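To make the idea of a C++ array backend concrete, here is a minimal sketch of what the core of such an engine might look like: a contiguous buffer plus shape and row-major strides. The class name `SimpleArray` and its methods are hypothetical and are not taken from the project described in this talk; the Pybind11 binding layer is left out so the sketch stays self-contained.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 2-D array of doubles (illustrative names only).
// Storage is one contiguous, zero-initialized buffer in row-major order.
class SimpleArray {
public:
    SimpleArray(std::size_t rows, std::size_t cols)
        : shape_{rows, cols}, strides_{cols, 1}, data_(rows * cols, 0.0) {}

    // Element access: the flat offset is computed from the strides.
    double& at(std::size_t i, std::size_t j) {
        assert(i < shape_[0] && j < shape_[1]);
        return data_[i * strides_[0] + j * strides_[1]];
    }

    std::size_t rows() const { return shape_[0]; }
    std::size_t cols() const { return shape_[1]; }
    std::size_t size() const { return data_.size(); }

    // Raw pointer to the buffer, e.g. for handing to optimized kernels.
    double* data() { return data_.data(); }

private:
    std::array<std::size_t, 2> shape_;    // number of rows and columns
    std::array<std::size_t, 2> strides_;  // element steps per dimension
    std::vector<double> data_;            // contiguous element storage
};
```

Keeping the elements in a single contiguous buffer is a common design choice because it lets optimized kernels walk the data linearly.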
In this talk, I will share our experience building this system, with a focus on achieving runtime performance comparable to NumPy through low-level optimization and exposing the backend with a clean Python interface.
Performance matters for every kind of program, but especially for numerical computation. Working at the level of the low-level implementation gives developers a chance to achieve better performance.
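As an illustration of the kind of low-level optimization meant here, the following sketch adds two float arrays element by element, using AVX intrinsics when the compiler enables them and a plain scalar loop otherwise. The function names `add_arrays` and `add_vectors` are hypothetical, and the choice of AVX is an assumption for the example; the talk does not state which SIMD instruction set the project uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Element-wise c[i] = a[i] + b[i] for float arrays of length n.
// Hypothetical helper, not the project's actual kernel.
inline void add_arrays(const float* a, const float* b, float* c, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX__)
    // Process 8 floats per iteration with 256-bit registers.
    // Unaligned loads/stores are used so no alignment is assumed.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(c + i, _mm256_add_ps(va, vb));
    }
#endif
    // Scalar loop: handles the tail, or the whole array without AVX.
    for (; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Convenience wrapper over std::vector; inputs must have equal length.
inline std::vector<float> add_vectors(const std::vector<float>& a,
                                      const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    add_arrays(a.data(), b.data(), c.data(), a.size());
    return c;
}
```

With GCC or Clang, the AVX path is compiled in only when a flag such as `-mavx` or `-march=native` is passed; without it the function silently falls back to the scalar loop, which modern compilers may still auto-vectorize.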
Concrete knowledge and know-how the audience can take away: This talk demonstrates how array computation can be accelerated and optimized through low-level implementation that exploits hardware features.
Prerequisite knowledge expected of the audience: Basic experience with NumPy numeric operations
Audience experience level: Intermediate
Language of the talk: English
Language of the presentation materials: English