2026-03-21, Teresa Yuchengco Auditorium (Main Hall)
Learn how Python leverages AVX-512, hyper-threading, and GPUs for extreme performance. We'll dive into hardware internals, code patterns, and scaling strategies for high-load systems, and look at how compute modules are made from sand and how they execute your Python code.
Python has evolved from a slow scripting language into a high-performance tool capable of directly interfacing with cutting-edge CPU/GPU technologies: AVX-512/SVE, NVLink, and HBM. In this talk, we’ll dive deep into how exactly CPython interacts with "silicon" and which hardware internals (hyper-threading, NUMA nodes, issue ports, cache lines, and SIMD gather/scatter) determine your code’s performance.
Then, on to practice: we’ll benchmark CPU tools (NumPy 2 SIMD, Numba @vectorize) against GPU libraries on workloads ranging from multidimensional FFTs and SPH simulations to horizontally scalable ML inference and processing of billion-record datasets.
Specializes in CPython internals, optimization, and high-performance computing, with a focus on GPU acceleration and CPU vectorization. Moved from ML systems to a CPython core research engineer role. 8+ years leading teams in AI, maths, and physics. PyCon speaker ("Free Threading: Future of CPython"). Lecturer at Moscow Institute of Physics and Technology, the top-ranked Russian university. Open to talks and collaboration.
