Damian Bogunowicz PyCon DE & PyData Berlin 2023

Damian Bogunowicz

Engineer, roboticist, software developer, and problem solver. Previous experience in autonomous driving (Argo AI), AI in industrial robotics (Arrival), and building machines that build machines (Tesla). Currently working at Neural Magic, focusing on the sparse future of AI computation.
Works towards unlocking creative and economic potential with intelligent robotics while avoiding the uprising of sentient machines.

Twitter handle:

@dtransposed

LinkedIn:

https://www.linkedin.com/in/dbogunowicz/

Session

04-19

10:00

45min

Why GPU Clusters Don't Need to Go Brrr? Leverage Compound Sparsity to Achieve the Fastest Inference Performance on CPUs

Damian Bogunowicz

Forget specialized hardware. Get GPU-class performance on your commodity CPUs with compound sparsity and sparsity-aware inference execution.
This talk will demonstrate the power of compound sparsity for model compression and inference speedup for NLP and CV domains, with a special focus on the recently popular Large Language Models. The combination of structured + unstructured pruning (to 90%+ sparsity), quantization, and knowledge distillation can be used to create models that run an order of magnitude faster than their dense counterparts, without a noticeable drop in accuracy. The session participants will learn the theory behind compound sparsity, state-of-the-art techniques, and how to apply it in practice using the Neural Magic platform.
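As a rough illustration of two of the ingredients named above, the following minimal sketch applies unstructured magnitude pruning followed by symmetric int8 quantization to a toy weight vector. It is an assumption-laden example in plain Python and does not use or represent the Neural Magic platform: the function names, the 75% sparsity target, and the weight values are hypothetical, and knowledge distillation is not shown.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights (unstructured pruning).

    `sparsity` is the fraction of weights to set to zero (illustrative helper).
    """
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to the int8 range [-127, 127].

    Returns the integer values and the scale needed to dequantize them.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


# Toy weights, chosen only for illustration
weights = [0.02, -0.9, 0.05, 0.4, -0.01, 0.03, -0.07, 0.6]

pruned = magnitude_prune(weights, sparsity=0.75)
quantized, scale = quantize_int8(pruned)
achieved_sparsity = pruned.count(0.0) / len(pruned)

print(pruned)             # only the two largest-magnitude weights survive
print(quantized, scale)   # zeros stay exactly zero after quantization
print(achieved_sparsity)
```

Note that pruned weights remain exactly zero after quantization, so the two steps compose cleanly; a sparsity-aware runtime can then skip the zeroed entries while computing on low-precision integers.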

PyData: Deep Learning
