Open Data Hub Day 2025

Automated Data Quality Testing Tool
2025-05-30, Seminar room 3

This talk introduces a novel tool that tests the data quality of a given dataset from the Open Data Hub. Besides formal tests of the data fields, the tool uses a machine learning approach to test the quality of images and whether the content of an image agrees with its description. Hence the tool can provide valuable insights into potential errors or issues with the data and images.


The tool also provides graphical representations of historical trends in data quality, allowing users to track improvements over time and identify patterns or inconsistencies. The machine learning component for analyzing images is a key feature of the tool: it automatically assesses image quality and can identify discrepancies between the content of an image and its description with high precision.

Davide Montesin is the CEO of Catch Solve, a start-up based in the NOI Techpark in Bolzano/Bozen, South Tyrol. Founded in 2019, the start-up offers an integrated platform that monitors the quality of apps, websites and digital services and helps find useful resources to solve software bugs easily, quickly and efficiently.

A South Tyrol native currently living in Bolzano/Bozen, he has a great passion for programming and open source / free software and 20 years of experience participating in and leading software projects for the public administration, private companies and the tourism sector.

Chris Mair is a software developer and trainer who has worked as a freelancer since 2003. Over the years, he has contributed to software development for over 25 companies, specializing in system and network programming, web development, data processing, analysis and computation, machine learning, embedded programming, and relational database design and programming.

In addition to development, he has provided consultancy services to over 15 companies, primarily focusing on database programming and performance, database migrations with a specialization in PostgreSQL, and the utilization of Open Source software and Linux.

As a trainer, he has conducted over 200 courses, both onsite and online, for companies and public institutions. His current areas of expertise in training include software development, database systems and machine learning.