2024-04-05, Room 228
Developer tools power many LLM-based chat and Retrieval-Augmented Generation (RAG) applications today. However, newcomers face a non-trivial knowledge barrier that can hinder the developer experience. Our discussion offers actionable insights into building and maintaining generative AI solutions securely and economically, thereby improving the developer experience in this generative AI wave.
The onus is on the developer to handle concerns such as fine-grained setup of multiple components (e.g., the LLM and the vector DB) with various hyper-parameters, client-side chat-state management, and security. This talk takes a deeper dive into each RAG pipeline component and explains why certain customizations are offered, so you can better choose what is best suited to your needs when building and maintaining generative AI applications.
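To make the components the talk enumerates concrete, here is a toy sketch of a RAG loop wiring together an embedder, a vector store with a `top_k` hyper-parameter, an LLM call, and client-side chat state. All names (`VectorStore`, `fake_llm`, the bag-of-words "embedding") are illustrative placeholders, not the API of any real library:

```python
# Toy RAG pipeline sketch: embedder -> vector store -> LLM, plus chat state.
# The bag-of-words "embedding" stands in for a real embedding model.
from collections import Counter
from math import sqrt

def embed(text):
    """Placeholder embedding: a bag-of-words token count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Minimal in-memory vector store; top_k is one tunable hyper-parameter."""
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((embed(text), text))

    def search(self, query, top_k=2):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]

def fake_llm(prompt):
    """Stand-in for an LLM call; a real app would invoke a model API here."""
    return f"Answer based on: {prompt}"

chat_history = []  # client-side chat-state management lives with the caller

store = VectorStore()
store.add("Vector databases index embeddings for similarity search.")
store.add("LLMs generate text conditioned on a prompt.")

question = "How do vector databases work?"
context = store.search(question, top_k=1)  # retrieval step
prompt = f"Context: {context[0]}\nHistory: {chat_history}\nQ: {question}"
chat_history.append((question, fake_llm(prompt)))  # generation + state update
```

Even in this toy form, the knobs the talk discusses are visible: the choice of embedder, the retrieval `top_k`, how history is folded into the prompt, and where chat state is stored.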
I lead applied research efforts for our EMEA team at Clarifai, scaling AI for production workloads. My team has been solving search/ranking, retrieval, and multimodal problems. Previously, I led the development of custom ML solutions for enterprise customers.