PyCon Hong Kong 2024

Haowen Huang

Haowen Huang is a Senior Developer Advocate at Amazon Web Services (AWS). He has over 20 years of experience in the telecommunications, internet, and cloud computing industries, and has previously worked for companies such as Microsoft, Sun, and China Telecom. He currently focuses on creating and sharing technical content on generative AI, large language models (LLMs), machine learning, and data science to empower developers around the world.


Country / City

Hong Kong

Company / Organisation

Amazon


Session

11-16
10:30
30min
[Sponsored Keynote] Large Language Models Optimization with Python
Haowen Huang

This talk covers several aspects of optimizing Large Language Models (LLMs) with Python, including quick start, availability optimization, and throughput optimization. It explores cutting-edge techniques in areas such as model compilation, model compression, model inference batching, distributed training, and Large Model Inference (LMI) containers. It also presents practical examples of optimizing open-source models using techniques such as LMI containers, Low-Rank Adaptation (LoRA), Fully Sharded Data Parallel (FSDP), Paged Attention, Rolling Batch, and more.

LLM
LT9