Enhancing RAG-based apps by constructing and leveraging knowledge graphs with open-weights LLMs
2024-09-25, Louis Armand 1 - Est

Graph Retrieval Augmented Generation (Graph RAG) is emerging as a powerful addition to traditional vector search retrieval methods. Graphs are well suited to representing and storing heterogeneous, interconnected information in a structured manner, capturing complex relationships and attributes across different data types. Using open-weights LLMs removes the dependency on an external LLM provider and keeps complete control over the data flows and over how the data is shared and stored. In this talk, we construct a knowledge graph and leverage the structured nature of graph databases, which organize data as nodes and relationships, to improve the depth and contextuality of the information retrieved for RAG-based applications built on open-weights LLMs. We will show these capabilities with a demo.


The idea behind Retrieval Augmented Generation (RAG) applications is to provide Large Language Models (LLMs) with additional context at query time for answering the user’s question. Graph Retrieval Augmented Generation (Graph RAG) is emerging as a powerful addition to traditional vector search retrieval methods. Graphs are well suited to representing and storing heterogeneous, interconnected information in a structured manner, capturing complex relationships and attributes across diverse data types. In contrast, vector databases often struggle with such structured information, as their strength lies in handling unstructured data through high-dimensional vectors.
A Graph RAG application works as follows:
- When a user asks a question, it first goes through an embedding model to calculate its vector representation.
- The next step is to find the most relevant nodes in the database by computing the cosine similarity between the embedding of the user’s question and the embeddings of the candidate nodes.
- Once the relevant nodes are identified, the application retrieves additional information from the nodes themselves and by traversing their relationships in the graph.
- We can combine structured graph data with vector search over unstructured text to achieve the best of both worlds.
- The context retrieved from both sources is combined with the user question and additional instructions into a prompt that is passed to an LLM to generate the final answer, which is then sent to the user.
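
The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code used in the demo: the toy graph, the bag-of-words `embed` function (a stand-in for a real embedding model), and the helper names `retrieve`, `neighbourhood`, and `build_prompt` are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy graph: node name -> short text description.
NODES = {
    "Marie Curie": "physicist and chemist who studied radioactivity",
    "Pierre Curie": "physicist who studied magnetism and radioactivity",
    "Nobel Prize in Physics": "annual award for contributions to physics",
}
# Relationships stored as (source, relationship type, target).
EDGES = [
    ("Marie Curie", "MARRIED_TO", "Pierre Curie"),
    ("Marie Curie", "WON", "Nobel Prize in Physics"),
    ("Pierre Curie", "WON", "Nobel Prize in Physics"),
]


def embed(text):
    """Toy bag-of-words vector; a real application would call an embedding model."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, k=1):
    """Return the k node names most similar to the question."""
    q = embed(question)
    ranked = sorted(
        NODES,
        key=lambda name: cosine(q, embed(name + " " + NODES[name])),
        reverse=True,
    )
    return ranked[:k]


def neighbourhood(node):
    """Traverse one hop: every relationship that touches the node."""
    return [f"{s} {r} {t}" for s, r, t in EDGES if node in (s, t)]


def build_prompt(question, k=1):
    """Combine node text, graph relationships, and the question into one prompt."""
    lines = []
    for node in retrieve(question, k):
        lines.append(f"{node}: {NODES[node]}")
        lines.extend(neighbourhood(node))
    context = "\n".join(lines)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("Who was Marie Curie married to?"))
```

The resulting prompt string is what would be passed to the LLM. In a real application the similarity search would typically run inside a vector index and the one-hop traversal would be a graph query, but the flow stays the same.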

Using open-weights LLMs removes the dependency on an external LLM provider and keeps complete control over the data flows and over how the data is shared and stored.
In this talk, we first construct a knowledge graph from the provided documents with an open-weights LLM. We then leverage the structured nature of graph databases, which organize data as nodes and relationships, to improve the depth and contextuality of the information retrieved for RAG-based applications.
We will show these capabilities with a demo. The tools used in the demo will be Kùzu + LanceDB + Llama.cpp + Outlines + LangChain.
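
To make the knowledge-graph construction step concrete, here is a minimal sketch of how structured LLM output could be turned into nodes and relationships. It does not use the demo's tools and is not the demo's code: we assume the model has been constrained to emit a JSON list of subject/relation/object triples, and the `LLM_OUTPUT` string, the `Triple` class, and the helpers `parse_triples` and `build_graph` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    object: str


# Hypothetical model output; in practice this string would come from an
# open-weights LLM constrained to produce JSON in this shape.
LLM_OUTPUT = """
[
  {"subject": "Marie Curie", "relation": "won", "object": "Nobel Prize in Physics"},
  {"subject": "Marie Curie", "relation": "married_to", "object": "Pierre Curie"},
  {"subject": "Marie Curie", "relation": "WON", "object": "Nobel Prize in Physics"}
]
"""


def parse_triples(raw):
    """Parse the JSON, normalise relation names, and drop duplicate triples."""
    triples = []
    for item in json.loads(raw):
        triple = Triple(
            subject=item["subject"].strip(),
            relation=item["relation"].strip().upper(),
            object=item["object"].strip(),
        )
        if triple not in triples:
            triples.append(triple)
    return triples


def build_graph(triples):
    """Collect the node set and the list of (source, type, target) relationships."""
    nodes = set()
    edges = []
    for t in triples:
        nodes.update((t.subject, t.object))
        edges.append((t.subject, t.relation, t.object))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_graph(parse_triples(LLM_OUTPUT))
    print(sorted(nodes))
    print(edges)
```

Constraining the output format matters here because every triple has to parse cleanly before it can be written to a graph database. Normalising and deduplicating at this stage keeps repeated mentions of the same fact from creating parallel relationships.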

Alonso Silva is currently a Researcher on Verifiable AI at Nokia Bell Labs in the Machine Learning and Systems Research Lab. He has previously worked at Safran in the Department of Mathematics and Temporal Data; in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley, working with Professor Jean Walrand; at INRIA Paris Rocquencourt, working with Dr. Philippe Jacquet; and as a Research Consultant/Intern at Bell Labs Headquarters in Murray Hill, New Jersey, working with Dr. Iraj Saniee. He carried out his Ph.D. research at INRIA Sophia-Antipolis under the direction of Dr. Eitan Altman and obtained his Ph.D. degree in Physics from the École Supérieure d'Électricité in June 2010.