BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.com//euroscipy-2024//speaker//ZHUVWQ
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T020000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-euroscipy-2024-MFF7GE@pretalx.com
DTSTART;TZID=CET:20240828T132000
DTEND;TZID=CET:20240828T135000
DESCRIPTION:When it comes to designing machine learning predictive models\,
  it is reported that data scientists spend over 80% of their time preparin
 g the data to input to the machine learning algorithm.\n\nCurrently\, no a
 utomated solution exists to address this problem. However\, the `skrub` Py
 thon library is here to alleviate some of the daily tasks of data scientis
 ts and offer an integration with the `scikit-learn` machine learning libra
 ry.\n\nIn this talk\, we provide an overview of the features available in 
 `skrub`.\n\nFirst\, we focus on the preprocessing stage closest to the dat
 a sources. While predictive models usually expect a single design matrix a
 nd a target vector (or matrix)\, in practice\, it is common that data are 
 available from different data tables. It is also possible that the data to
  be merged are slightly different\, making it difficult to join them. We w
 ill present the `skrub` joiners that handle such use cases and are fully c
 ompatible with `scikit-learn` and its pipeline.\n\nThen\, another issue wi
 dely tackled by data scientists is dealing with heterogeneous data types (
 e.g.\, dates\, categorical\, numerical). We will present the `TableVectori
 zer`\, a preprocessor that automatically handles different types of encodi
 ng and transformation\, reducing the amount of boilerplate code to write w
 hen designing predictive models with `scikit-learn`. Like the joiner\, thi
 s transformer is fully compatible with `scikit-learn`.
DTSTAMP:20260515T110342Z
LOCATION:Room 7
SUMMARY:Skrub: prepping tables for machine learning - Guillaume Lemaitre\, 
 Vincent Maladiere\, Jérôme Dockès
URL:https://pretalx.com/euroscipy-2024/talk/MFF7GE/
END:VEVENT
END:VCALENDAR
