Python Conference APAC 2024

Optimizing and scaling Generative AI applications and systems
2024-10-27, CLASS #2
Language: English

Over the last few years, we’ve seen a significant increase in the number of Generative AI-powered applications being developed by organizations and professionals globally. Because there are so many ways to build these types of systems, companies often struggle to identify and implement the most effective approach. After a few weeks of running their Gen AI application, they start to encounter issues and challenges related to performance, scalability, and cost management.

In this session, we will discuss best practices and strategies for building, scaling, and optimizing Generative AI systems. We'll tackle the challenges that organizations experience after they have deployed the first version of their Gen AI application, and we'll discuss multiple optimization strategies to reduce costs and improve the application's performance.


While a lot of organizations utilize third-party APIs and vector databases to build these types of applications, we are seeing more teams set up custom RAG implementations that use a self-hosted large language model (LLM) inside a private cloud environment. For some organizations, this is not an option, and they start with a serverless implementation of their Generative AI application instead. Given the number of options available and the lack of resources and experience, organizations end up experiencing issues and challenges related to performance, scalability, and cost management after they have deployed their application.

With this in mind, we will dive deep into multiple strategies for managing, evaluating, optimizing, and scaling Generative AI systems, with a focus on reducing costs and improving the application's performance.

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO of 3 Australian-owned companies and as the Director for Software Development and Engineering for multiple e-commerce startups. Years ago, he and his team won 1st place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero and has been sharing his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management. He is the author of the books "Machine Learning with Amazon SageMaker Cookbook", "Machine Learning Engineering on AWS", and "Building and Automating Penetration Testing Labs in the Cloud". Due to his proven track record in leading digital transformation within organizations, he has been recognized as one of the winners of the prestigious Orange Boomerang: Digital Leader of the Year 2023 award.

Sophie Soliven is the Director of Operations for Edamama. She has over 9 years of experience in e-commerce, fintech, and retail. Over the years, she has also been sharing her knowledge and experience on both the local and the international scene.
