PyConDE & PyData Berlin 2024

Noé Achache

I am a Lead Data Scientist at Sicara, where I have worked on a wide range of projects, mostly related to vector databases, computer vision, prediction with structured data and, more recently, LLMs.
I currently lead GenAI development at the company.

Here is a list of the talks I have given:

Great Practices for RAG in Production @GenAI London Meetup

How to Choose a Vector Database in 2023 @DVC Meetup

Advanced Visual Search Engine with Self-Supervised Learning (SSL) @PyConDE & PyData Berlin 2023

Great Practices for RAG in Production @GenAI Paris Meetup

Generating Millions of text boxes with a GAN @Meetup Computer Vision Paris


X / Twitter handle

@noe_achache

Github

https://github.com/NoAchache

LinkedIn

https://www.linkedin.com/in/noe-achache/details/certifications/


Session

04-22
11:25
45min
RAG for a medical company: the technical and product challenges
Noé Achache

RAG (Retrieval-Augmented Generation) is the process of querying a (large) set of documents in natural language, leveraging vector search and LLMs. While it has recently become easy to build a proof-of-concept RAG using OpenAI and one of the various open-source libraries (e.g. LangChain), building a performant RAG that brings value to users remains challenging.
This talk will focus on lessons learned from building a RAG for a medical company that allows doctors to query drug documentation in natural language, using tools like Chainlit, Qdrant and LangSmith.
Naturally, a product question emerged: how can LLMs, which can never guarantee 100% accuracy, be used effectively in the health sector?
We will explain how we addressed this challenge, as well as the various technical improvements we implemented to improve both the retrieval (vector search) and generation (LLM) metrics of our RAG.

PyData: Generative AI
B05-B06