BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.com//pyconde-pydata-2026//speaker//7LKG3C
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T040000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-pyconde-pydata-2026-PFXR9G@pretalx.com
DTSTART;TZID=CET:20260414T122500
DTEND;TZID=CET:20260414T131000
DESCRIPTION:The timeless phrase “garbage in\, garbage out” is even more
  important today with the growing usage of non-deterministic generative ne
 ural networks\, which amplifies the effect of bad data quality. This pre
 sentation describes Data Quality Monitor — a tool to bring transparency 
 into data quality and help drive real improvements.\n\nIn the talk\, we'l
 l cover what defines a successful data quality monitoring solution and sha
 re findings from our initial evaluation of available open-source framework
 s. Next\, we'll showcase our implementation based on DQX. DQX is a lightwe
 ight\, open-source framework for performing row-level data quality checks 
 programmatically\, with business rules organized in manageable YAML files.
  DQX\, originally developed by Databricks Labs\, integrates seamlessly wit
 h PySpark\, making it easy and affordable to run data quality checks withi
 n our IoT data lake. Finally\, we will discuss the organizational processe
 s and structures required to effectively respond to data quality issues.
DTSTAMP:20260412T141856Z
LOCATION:Helium [3rd Floor]
SUMMARY:Fight your garbage data: implementation of a pythonic data quality 
 monitoring framework in PySpark - Rostislaw Krassow\, Joshua Finger
URL:https://pretalx.com/pyconde-pydata-2026/talk/PFXR9G/
END:VEVENT
END:VCALENDAR
