Python Conference APAC 2024

Building graph-based RAG application for accurate, complete, and explainable AI
2024-10-27 , CLASS #5
Language: English

RAG (retrieval-augmented generation) is a technique for enhancing the accuracy and reliability of LLMs with facts fetched from external sources. However, the naive RAG approach that relies solely on vector databases falls short for complex use cases. Representing knowledge and facts as structured, connected data (a “knowledge graph”) can improve the results because the facts are explicitly encoded in a graph structure, resulting in accurate and deterministic answers. In this session, I will discuss graph-based RAG and demonstrate how to easily build a graph-based RAG application in Python using freely available tools and libraries.


This talk is structured as follows:

  1. The challenge of the naive RAG approach that relies solely on a vector database
    RAG is a popular technique for enhancing the accuracy and reliability of LLMs with facts fetched from external sources. While a vector database is the popular choice of data store for RAG, I will discuss the drawbacks that make it fall short for more complex use cases.

  2. Graph-based RAG for better accuracy & deterministic results
    One way to improve the accuracy of a RAG application is to represent knowledge and facts as structured, connected data (a graph). In a graph, the facts are structured explicitly as the domain experts intended, as opposed to being stored as vectors in a vector database. I will discuss in more detail why a graph representation allows information retrieval that is complete, accurate, and deterministic.

  3. How to build a graph-based RAG application in Python
    The next step is to put this into practice. Python is the most commonly used language in AI/ML and data science, so it is a natural choice for building a RAG application. Its extensive libraries save time compared with starting from scratch. I will demonstrate building a graph-based RAG application using Python libraries such as LangChain to integrate with an open-source LLM, and Neo4j Community Edition to store the graph data. I will also demonstrate the improvement that graph-based RAG offers over the naive RAG approach.
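
The contrast drawn in items 1 and 2 can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the code demonstrated in the talk: the triples, the bag-of-words `embed` function (a stand-in for a real embedding model), and all helper names are hypothetical. It shows how top-k similarity search can return an incomplete answer to a "list all" question, while a lookup over explicit relationships returns every matching fact.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy facts as (subject, relation, object) triples.
TRIPLES = [
    ("Alice", "WORKS_ON", "Project Atlas"),
    ("Bob", "WORKS_ON", "Project Atlas"),
    ("Carol", "WORKS_ON", "Project Atlas"),
    ("Dave", "WORKS_ON", "Project Borealis"),
    ("Project Atlas", "USES", "Neo4j"),
]

# Naive RAG view: every fact becomes a free-text chunk.
CHUNKS = [f"{s} {r.replace('_', ' ').lower()} {o}" for s, r, o in TRIPLES]


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def vector_top_k(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(q, embed(chunk)), reverse=True)
    return ranked[:k]


def build_graph(triples):
    """Index subjects by (relation, object) so relationships can be followed directly."""
    incoming = defaultdict(list)
    for subject, relation, obj in triples:
        incoming[(relation, obj)].append(subject)
    return incoming


GRAPH = build_graph(TRIPLES)


def who_works_on(graph, project):
    """Return everyone with a WORKS_ON edge to the project, in sorted order."""
    return sorted(graph.get(("WORKS_ON", project), []))


if __name__ == "__main__":
    # Top-k retrieval is capped at k chunks, so one of three team members is missed.
    print(vector_top_k("Who works on Project Atlas?", CHUNKS, k=2))
    # The graph lookup follows explicit edges and returns the full set.
    print(who_works_on(GRAPH, "Project Atlas"))
```

In a real application the dictionary lookup would typically be a graph query. With Neo4j, and assuming a hypothetical `Person`/`Project` schema, that could be a Cypher statement such as `MATCH (p:Person)-[:WORKS_ON]->(:Project {name: $name}) RETURN p.name`.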

This session is aimed at attendees who have basic knowledge of Python and are interested in building RAG applications with better accuracy.

A data professional.