Noé Achache

I am an Engineering Manager (for Data Science projects) at Sicara, where I have worked on a wide range of projects, mostly related to vector databases, computer vision, prediction with structured data and, more recently, LLMs.
I am currently leading the GenAI development in the company.
You can find all my talks and articles here: https://www.sicara.fr/en/noe-achache, e.g.
- https://www.sicara.fr/blog-technique/how-to-choose-your-vector-database-in-2023
- https://www.youtube.com/watch?v=aX_hdQEintc


Session

09-26
13:50
30min
Towards a deeper understanding of retrieval and vector databases
Noé Achache

Retrieval is the process of searching a large database for items (images, text, …) that are similar to one or more query items. A classical approach is to transform the database items and the query item into vectors (also called embeddings) with a trained model so that they can be compared via a distance metric. It has many applications in various fields, e.g. to build a visual recommendation system like Google Lens or a RAG (Retrieval-Augmented Generation) system, where retrieval is used to inject specific knowledge into an LLM depending on the query.
Vector databases ease the management, serving and retrieval of vectors in production and implement efficient indexes to search rapidly through millions of vectors. They have gained a lot of attention over the past year due to the rise of LLMs and RAG.
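
As an illustration of the embedding comparison described above, here is a minimal sketch of brute-force retrieval with cosine similarity. The three-dimensional vectors and the item names are made up for the example; in practice the embeddings would come from a trained model, and a vector database would replace the exhaustive scan with an approximate index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, database, k=2):
    """Return the k (item_id, score) pairs most similar to the query.

    This scans every stored vector, which is exactly the cost that
    vector database indexes are designed to avoid.
    """
    scored = [(item_id, cosine_similarity(query, vec))
              for item_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings, not the output of a real model.
database = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_photo": [0.7, 0.3, 0.1],
    "invoice_scan": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

top = retrieve(query, database, k=2)
for item_id, score in top:
    print(item_id, round(score, 3))
```

The exhaustive scan is linear in the number of stored vectors, which is why indexes become necessary at the scale of millions of items.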

Although people working with LLMs are increasingly familiar with the basic principles of vector databases, the finer details and nuances often remain obscure. This lack of clarity hinders the ability to make optimal use of these systems.

In this talk, we will detail two real-life projects (deduplication of real estate adverts using the image embedding model DINOv2, and RAG for a medical company using the text embedding model Ada-2), then dive deep into retrieval and vector databases to demystify their key aspects and highlight their limitations: the HNSW index, a comparison of providers, metadata filtering (the drop in performance when too many nodes are filtered out, and how indexing partially mitigates it), partitioning, reciprocal rank fusion, the performance and limitations of the representations created by SOTA image and text embedding models, …
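
Of the topics listed above, reciprocal rank fusion is simple enough to sketch in a few lines. The example below assumes the common formulation in which each document receives 1 / (k + rank) from every ranked list it appears in, with the widely used default k = 60. The two input rankings (a vector search and a keyword search) and the document names are hypothetical, and this is not necessarily the setup presented in the talk.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document gets 1 / (k + rank) per list it appears in (ranks
    start at 1); documents are returned by decreasing total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from two retrievers for the same query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because only ranks are used, the method does not require the scores of the different retrievers to be on comparable scales.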

Louis Armand 2 - Ouest