Helium [3rd Floor]
Traditional RAG systems struggle to understand holistic connections in distributed, constantly changing knowledge sources that characterize real-world organizations. While document-based approaches using vector embeddings provide basic retrieval, they fail to capture relationships and answer complex questions about interconnected information. Graph-based RAG offers a solution, but existing implementations like Microsoft's GraphRAG explicitly avoid dynamic operations due to complexity, requiring costly rebuilds when knowledge changes.
This talk introduces a production-ready dynamic knowledge graph system that supports real-time insertion, querying, and deletion of information. Through practical implementation details, you will learn to build maintainable knowledge graphs that evolve with data, handle ambiguous entities, and preserve information lineage.
Like many organizations, we at VisualVest face the challenge of distributed and constantly evolving knowledge sources. Documentation lives across repositories, internal wikis, JIRA tickets, and various file formats in cloud storage. With ~250 employees making daily changes, our source of truth is highly dynamic. While traditional document-based RAG using semantic embeddings solved some of these pain points, it couldn't answer holistic questions or understand relationships between sources, leading us to explore graph-based approaches.
The challenge? Real-world knowledge sources are inherently dynamic. When thinking about information management and retrieval, we cannot ignore this reality if we want to create powerful, machine-readable and actually useful products. Microsoft's popular GraphRAG library explicitly rejected dynamic features (like deletion) due to complexity concerns. However, we believe that constantly rebuilding entire graphs isn't feasible for production systems.
This talk presents our solution: a truly dynamic knowledge graph with full insertion, query, and deletion capabilities. We are also working on reducing the high computational cost of building knowledge graphs. Through caching strategies and small language model fine-tuning, we are trying to both minimize computational effort and strengthen our independence from cloud providers.
What you'll learn:
- An industry perspective on the challenges of distributed knowledge sources
- Formal definition and properties of dynamic knowledge graphs
- Our transformation pipeline
- Experiments with fine-tuned small language models
- Implementation details:
  - Inserting nodes and edges while preventing ambiguity through similarity matching
  - Tracking information origin across sources
  - Safely deleting documents from the graph without breaking relationships
- Graph inference strategies
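To make the insertion, lineage, and deletion items above concrete, here is a minimal sketch in Python. It is not the system presented in the talk: the class `DynamicGraph`, its method names, the 0.85 threshold, and the use of plain string similarity (`difflib`) in place of whatever similarity matching the speakers use are all assumptions made for illustration. The idea shown is that every node and edge remembers which documents support it, so deleting a document only removes what no other source still backs.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class DynamicGraph:
    # Assumed cutoff above which two entity names are treated as the same node.
    threshold: float = 0.85
    # node name -> ids of the documents that mention it (information lineage)
    nodes: dict[str, set[str]] = field(default_factory=dict)
    # (head, relation, tail) -> ids of the documents that state this relation
    edges: dict[tuple[str, str, str], set[str]] = field(default_factory=dict)

    def resolve(self, name: str) -> str:
        """Return the most similar existing node, or `name` if none is close enough."""
        best, best_score = name, 0.0
        for existing in self.nodes:
            score = SequenceMatcher(None, name.lower(), existing.lower()).ratio()
            if score > best_score:
                best, best_score = existing, score
        return best if best_score >= self.threshold else name

    def insert(self, head: str, relation: str, tail: str, doc_id: str) -> None:
        """Add one relation, merging near-duplicate entities and recording its source."""
        head, tail = self.resolve(head), self.resolve(tail)
        for node in (head, tail):
            self.nodes.setdefault(node, set()).add(doc_id)
        self.edges.setdefault((head, relation, tail), set()).add(doc_id)

    def delete_document(self, doc_id: str) -> None:
        """Withdraw a document; drop only nodes and edges no other document supports."""
        for store in (self.edges, self.nodes):
            for key in list(store):
                store[key].discard(doc_id)
                if not store[key]:
                    del store[key]


# Hypothetical document ids and entities, for illustration only.
graph = DynamicGraph()
graph.insert("Payment Service", "depends_on", "Auth API", "wiki/payments")
graph.insert("payment-service", "owned_by", "Team Core", "ticket-1")
graph.delete_document("wiki/payments")
print(sorted(graph.nodes))
```

Because an edge's sources are always a subset of the sources of both of its endpoints, removing a document in this sketch cannot leave an edge pointing at a deleted node. A production system would likely compare embeddings rather than raw strings and would persist the graph in a database.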
By the end of this talk, you'll understand why real-world knowledge graphs should be dynamic and how to build one yourself, as well as the limitations and future directions of our approach.
Hey, I'm Jakob. I studied Data Science for my Bachelor's and Master's degrees and currently work at a fintech, where I'm involved in all kinds of projects. My main goal is creating things that are actually useful and not just full of buzzwords. I'm a big fan of visualizing things, and I always make sure that anyone who is interested in the topics I'm working on can follow the reasoning behind the chosen approach.