Maria Knorps

Maria's professional goal is to improve the environment by first understanding it in the language
of mathematics and then applying that knowledge.
After graduating in applied mathematics, Maria began research on two-phase turbulent flows.
Her grounding in mathematical modeling helped her better understand small-scale physical effects
and model two-phase turbulence more accurately while reducing computational cost.

After completing her PhD, Maria began to work as a data scientist.
She was responsible for all stages of data processing, from building ETL pipelines through modeling
to visualizing results, and led projects of two to five people.
Her inclination towards implementation and design drew her towards functional programming.
She integrated Haskell into parts of her data processing pipelines, finding its type system and
expressiveness closer to the language of mathematics. Maria is also dedicated to maintaining neat,
reusable, and well-documented code.

Outside of her technical pursuits, Maria is passionate about promoting diversity in the IT industry
and inspiring girls and women to engage in programming. Balancing her career with being a mother of three,
she finds limited but cherished time for personal hobbies. When the opportunity arises,
Maria enjoys the thrill of motorcycle rides beyond the city limits.


Sessions

09-25
10:30
30min
Evaluating the evaluator: RAG eval libraries under the loop
Nour El Mawass, Maria Knorps

Retrieval-augmented generation (RAG) has become a key application for large language models (LLMs), enhancing their responses with information from external databases. However, RAG systems are prone to errors, and their complexity has made evaluation a critical and challenging area. Various libraries (such as RAGAS and TruLens) have introduced evaluation tools and metrics for RAG systems, but these evaluations use one LLM to assess another, raising questions about their reliability. Our study examines the stability and usefulness of these evaluation methods across different datasets and domains, focusing on how the choice of evaluation LLM, query reformulation, and dataset characteristics affect measured RAG performance. It also assesses the stability of the metrics across multiple runs of the evaluation and how the metrics correlate with each other. The talk aims to guide users in selecting and interpreting LLM-based evaluations effectively.
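
As a flavor of what such library-based evaluation looks like in practice, here is a minimal sketch
using the ragas package, assuming its 0.1-era API and an LLM-provider key (OpenAI by default)
configured in the environment; the example data is hypothetical:

    from datasets import Dataset
    from ragas import evaluate
    from ragas.metrics import faithfulness, answer_relevancy

    # One hypothetical RAG interaction; real evaluations use many examples.
    eval_data = Dataset.from_dict({
        "question": ["What is retrieval-augmented generation?"],
        "contexts": [["RAG augments LLM answers with documents retrieved "
                      "from an external store."]],
        "answer": ["RAG enriches an LLM's response with retrieved documents."],
    })

    # Each metric is scored by a judge LLM, so results may vary between runs;
    # that run-to-run stability is exactly what the talk investigates.
    scores = evaluate(eval_data, metrics=[faithfulness, answer_relevancy])
    print(scores)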

Louis Armand 1 - Est
09-25
15:00
30min
On the structure and reproducibility of Python packages - data crunch
Maria Knorps, Zhihan Zhang

Did you know that all top PyPI packages declare their third-party dependencies? In contrast, only about 53% of scientific projects do the same. The question arises: how can we reproduce Python-based scientific experiments if we don't know which libraries the environment requires?
In this talk, we delve into the Python packaging ecosystem and employ a data-driven approach to analyze the structure and reproducibility of packages. We compare two distinct groups of Python packages: the most popular ones on PyPI, which we anticipate to adhere more closely to best practices, and a selection from biomedical experiments. Through our analysis, we uncover common development patterns in Python projects and utilize our open-source library, FawltyDeps, to identify undeclared dependencies and assess the reproducibility of these projects.
This discussion is especially valuable for enthusiasts of clean Python code, as well as for data scientists and engineers eager to adopt best practices and improve reproducibility. Attendees will leave with actionable insights on enhancing the transparency and reliability of their Python projects, thereby advancing the cause of reproducible scientific research.
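
For readers who want to try this on their own projects before the talk, a minimal FawltyDeps
session looks roughly like the following (flags as documented for recent versions; output omitted):

    pip install fawltydeps
    # Run from the project root:
    fawltydeps --check-undeclared   # imports with no matching declared dependency
    fawltydeps --check-unused       # declared dependencies that are never imported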

Louis Armand 2 - Ouest