David Batista
I’m an experienced machine learning engineer and software developer with a strong background in Natural Language Processing. I completed my Ph.D. in 2016, focusing on semantic relationship extraction. I'm currently based in Berlin, where I work as an NLP Engineer and Software Developer at deepset and contribute to the development of Haystack, an open-source framework for building end-to-end, production-ready LLM-based applications.
https://github.com/deepset-ai/haystack
https://github.com/deepset-ai/haystack-core-integrations
https://github.com/deepset-ai/haystack-experimental
https://github.com/MantisAI/nervaluate
Session
Good retrieval performance is key to an effective retrieval-augmented generation (RAG) system: it determines whether relevant information is selected, which directly affects the quality of the augmentation and generation steps. My presentation focuses on RAG indexing and retrieval. It explores methods for converting text into searchable formats, compares these techniques, and analyzes their advantages, disadvantages, and performance on an annotated dataset, with the goal of improving document retrieval for user queries.