Haowen Huang
Haowen Huang is currently a Senior Developer Advocate at Amazon Web Services (AWS). He has over 20 years of experience in the telecommunications, internet, and cloud computing industries. He has previously worked for companies such as Microsoft, Sun, and China Telecom. He currently focuses on creating and sharing technical content in the areas of generative AI, large language models (LLMs), machine learning, and data science, and empowering developers around the world.
Hong Kong
Company / Organisation – Amazon
Session
This talk covers optimizing Large Language Models (LLMs) with Python, from getting started to optimizing availability and throughput. It explores cutting-edge techniques such as model compilation, model compression, batched model inference, distributed training, and Large Model Inference (LMI) containers. It also presents practical examples of optimizing open-source models using LMI containers, Low-Rank Adaptation (LoRA), Fully Sharded Data Parallelism (FSDP), Paged Attention, Rolling Batch, and more.
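As a taste of one technique named above, here is a minimal NumPy sketch of the Low-Rank Adaptation (LoRA) idea: instead of updating a full weight matrix, train two small low-rank factors and add their product to the frozen weights. All names and dimensions below are illustrative, not taken from the talk's actual examples.

```python
import numpy as np

# LoRA sketch: W_eff = W + (alpha / r) * B @ A, with rank r << d.
# W stays frozen; only the small factors A and B are trained.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # zero-initialized: no change at start

def lora_forward(x):
    """Forward pass through the LoRA-adapted linear layer."""
    return x @ (W + (alpha / r) * B @ A).T

x = rng.normal(size=(2, d_in))
# With B zeroed, the adapted layer matches the frozen layer exactly.
assert np.allclose(lora_forward(x), x @ W.T)

# Trainable-parameter savings: d_out*d_in for full fine-tuning
# versus r*(d_in + d_out) for the LoRA factors.
full_params = d_out * d_in
lora_params = r * (d_in + d_out)
print(f"full: {full_params}, LoRA: {lora_params}")  # full: 4096, LoRA: 512
```

In practice this is handled by libraries such as Hugging Face PEFT rather than hand-rolled matrices, but the parameter-count arithmetic above is why LoRA makes fine-tuning large models tractable.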