Philip Meier
Philip is a Senior Software Engineer at Quansight. His recent work has focused on Ragna (https://ragna.chat), an OSS RAG orchestration framework.
Session
As we descend from the peak of the hype cycle around Large Language Models (LLMs), chat-based document inquiry systems have emerged as a high-value practical use case. Retrieval-Augmented Generation (RAG) is a technique that supplies LLMs with relevant context and external information retrieved from vector storage, making their answers more powerful and accurate.
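To make the retrieval-then-generation flow concrete, here is a minimal, self-contained sketch of the RAG pattern: embed document chunks, retrieve the ones most similar to the question, and prepend them to the prompt. The bag-of-words "embedding" here is a toy stand-in for illustration only; in the tutorial a real embedding model, vector database, and LLM take its place.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k document chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Augment the question with the retrieved context before calling the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

chunks = [
    "Ragna is an open source RAG orchestration framework.",
    "Vector databases store embeddings for fast similarity search.",
    "The hype cycle describes the maturity of emerging technologies.",
]
prompt = build_prompt("What is Ragna?", retrieve("What is Ragna?", chunks))
print(prompt)  # pass this augmented prompt to the LLM of your choice
```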
In this hands-on tutorial, we’ll dive into RAG by creating a personal chat app that accurately answers questions about your selected documents. We’ll use a new OSS project called Ragna that provides a friendly Python and REST API designed for this particular use case. We’ll test the effectiveness of different LLMs and vector databases, including an offline (local) LLM running on GPUs on the cloud machines provided to you. We’ll conclude by demonstrating how to quickly build personal or company-level chat-based document interrogation systems.
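For a sense of what the tutorial builds toward, a chat over your own documents can be sketched with Ragna's Python API roughly as below. This is an assumption based on the project's quickstart; the demo components, class names, and exact signatures may differ between Ragna versions, so consult https://ragna.chat for the current API. In the tutorial, the demo source storage and assistant are swapped for real vector databases and LLMs.

```python
import asyncio

from ragna import Rag
from ragna.assistants import RagnaDemoAssistant       # swap for a real LLM assistant
from ragna.source_storages import RagnaDemoSourceStorage  # swap for a real vector database

async def main():
    rag = Rag()
    # Assumed usage following the Ragna quickstart: ingest documents into the
    # source storage, then answer questions against the retrieved context.
    async with rag.chat(
        documents=["document.txt"],  # path to a document on disk
        source_storage=RagnaDemoSourceStorage,
        assistant=RagnaDemoAssistant,
    ) as chat:
        answer = await chat.answer("What is Ragna?")
        print(answer)

asyncio.run(main())
```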