Petr Andreev
Specializes in CPython internals, optimization, and high-performance computing.
Driven by GPU acceleration and CPU vectorization. Moved from ML systems engineering to CPython core research.
8+ years leading teams in AI, maths, and physics. PyCon speaker.
Lecturer at the Moscow Institute of Physics and Technology, Russia's top-ranked university.
Open to talks and collaboration.
3 wild facts about me:
- In 2025 I hosted ≈70 dogs (not at once). I love mammals, especially 🐶 🐴. People literally pay me to pet-sit so their pets can live out their best moments in life ✨
- I’ve been to ≈20 countries and lived in 5.5. Korea felt the most exotic/interesting. Indonesia was my favorite.
- In 2024 I went unusually hard on fitness: daily run/swim/bike for 16 weeks. I ran every day for 3 months, logged 75 h of vigorous + 208 h of moderate activity (≈2.5 h/day), lost ≈12 kg, and got stronger 💪
Session
In 90 minutes, you’ll learn a repeatable workflow to accelerate real numeric kernels using CPU SIMD, GPU arrays + custom kernels, and TPU/XLA compilation, all from Python.
For each acceleration tier we follow the same loop: theory → minimal working code → benchmark that confirms (or disproves) the theory. You’ll leave with a small benchmark harness you can reuse, plus a decision checklist for when SIMD is enough, when GPUs pay off, and when XLA/TPU is the right move.
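To give a feel for the "minimal working code → benchmark" step, here is a rough sketch of what a reusable harness could look like. It is not the session's actual harness: the names `bench`, `compare`, `dot_loop`, and `dot_builtin` are hypothetical, and the two pure-Python kernels are only stand-ins for a real baseline and an accelerated tier (NumPy, a GPU array library, or an XLA-compiled function would slot in the same way). It uses only the standard library.

```python
import timeit
from math import isclose


def dot_loop(xs, ys):
    """Baseline kernel: dot product with an explicit Python loop."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_builtin(xs, ys):
    """Candidate kernel: same math via built-ins (stand-in for a faster tier)."""
    return sum(x * y for x, y in zip(xs, ys))


def bench(fn, *args, repeat=5, number=10):
    """Return the best per-call wall time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: fn(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def compare(baseline, candidate, *args, rel_tol=1e-9, repeat=5, number=10):
    """Verify the candidate matches the baseline, then time both."""
    expected = baseline(*args)
    got = candidate(*args)
    if not isclose(expected, got, rel_tol=rel_tol):
        raise ValueError(f"result mismatch: {expected!r} vs {got!r}")
    t_base = bench(baseline, *args, repeat=repeat, number=number)
    t_cand = bench(candidate, *args, repeat=repeat, number=number)
    return {"baseline_s": t_base, "candidate_s": t_cand, "speedup": t_base / t_cand}


if __name__ == "__main__":
    n = 100_000
    xs = [float(i) for i in range(n)]
    ys = [0.5] * n
    print(compare(dot_loop, dot_builtin, xs, ys))
```

The correctness check runs before any timing on purpose: a speedup only counts if the candidate still computes the same result. Taking the minimum over several repeats is a common way to reduce noise from other processes; for GPU or XLA tiers the timed call would also need to wait for the device to finish, which this sketch does not cover.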