Open Data Hub Day 2025

Beyond Data Sharing: Enabling Federated Learning Services in Data Spaces with IDSA Protocol
2025-05-30 , Seminar room 1

Existing Data Spaces have focused primarily on secure, policy-compliant data sharing. Our work takes the next step: enabling the sharing of services, specifically Federated Learning (FL) services, as part of a data ecosystem.

FL is a privacy-preserving machine learning approach in which multiple participants collaboratively train a model without sharing raw data. Each party trains the model locally, and only model updates (not the data itself) are exchanged and aggregated. This makes FL particularly suited to sectors where data sensitivity, sovereignty, or compliance is critical, such as healthcare, energy, mobility, or finance.
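The local-training-then-aggregation loop described above can be illustrated with a minimal sketch. It assumes sample-weighted averaging of client weights (in the style of federated averaging) and a one-parameter linear model; the talk does not state which aggregation strategy the framework uses, and the names `local_update`, `federated_average`, `run_rounds`, `org_a`, and `org_b` are hypothetical.

```python
from typing import Dict, List, Tuple

Weights = List[float]
Sample = Tuple[float, float]  # (x, y) pair held privately by one participant


def local_update(weights: Weights, data: List[Sample], lr: float) -> Weights:
    """One gradient step on the local mean squared error of y = w * x."""
    w = weights[0]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return [w - lr * grad]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def run_rounds(clients: Dict[str, List[Sample]], rounds: int, lr: float = 0.05) -> Weights:
    """Run several training rounds; only weights leave a client, never samples."""
    global_weights: Weights = [0.0]
    for _ in range(rounds):
        updates = [
            (local_update(global_weights, data, lr), len(data))
            for data in clients.values()
        ]
        global_weights = federated_average(updates)
    return global_weights


# Two hypothetical organizations whose private data follows y = 2x.
clients = {
    "org_a": [(1.0, 2.0), (2.0, 4.0)],
    "org_b": [(3.0, 6.0)],
}
print(run_rounds(clients, rounds=30))
```

With this toy data the printed weight should approach 2.0, even though neither organization ever sees the other's samples.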

In this talk, we present a novel framework that integrates FL into the Data Space Protocol of the International Data Spaces Association (IDSA). Our framework allows organizations to advertise their FL capabilities via the protocol and enables interested parties to dynamically join collaborative training rounds. This evolution from data sharing to policy-compliant collaborative model training represents a significant extension of today’s Data Space paradigms.

One of the key challenges we address is automated usage policy enforcement across distributed, multi-party FL pipelines. Unlike traditional data exchanges, FL introduces complex service-level interactions that occur outside the strict boundaries of a data space connector. We identify and design policy enforcement points (PEPs) tailored to federated workflows and embed them into the FL framework architecture. This is achieved through policy injection at the orchestration level and runtime monitoring hooks that trigger enforcement.
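One way to picture policy injection plus a runtime hook is the sketch below. It is an illustration under assumptions, not the architecture presented in the talk: the classes `PolicyEnforcementPoint`, `Orchestrator`, and `RoundContext`, and the rule helpers `max_rounds` and `allowed_participants`, are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass
class RoundContext:
    """Facts about a training round that a rule may inspect (hypothetical)."""
    round_number: int
    participants: List[str]


# A rule returns None when satisfied, or a violation message otherwise.
Rule = Callable[[RoundContext], Optional[str]]


class PolicyViolation(Exception):
    """Raised by the enforcement point when at least one rule fails."""


class PolicyEnforcementPoint:
    def __init__(self, rules: Iterable[Rule]) -> None:
        self.rules = list(rules)

    def check(self, ctx: RoundContext) -> None:
        violations = [msg for msg in (rule(ctx) for rule in self.rules) if msg]
        if violations:
            raise PolicyViolation("; ".join(violations))


def max_rounds(limit: int) -> Rule:
    def rule(ctx: RoundContext) -> Optional[str]:
        if ctx.round_number > limit:
            return f"round {ctx.round_number} exceeds limit {limit}"
        return None
    return rule


def allowed_participants(allowed: Set[str]) -> Rule:
    def rule(ctx: RoundContext) -> Optional[str]:
        unknown = sorted(set(ctx.participants) - allowed)
        return f"unauthorized participants: {unknown}" if unknown else None
    return rule


class Orchestrator:
    def __init__(self, pep: PolicyEnforcementPoint) -> None:
        self.pep = pep  # policy is injected when the orchestrator is built

    def run_round(self, ctx: RoundContext) -> str:
        self.pep.check(ctx)  # runtime hook: enforce before any training starts
        return f"round {ctx.round_number} executed"


pep = PolicyEnforcementPoint([max_rounds(2), allowed_participants({"org_a", "org_b"})])
orchestrator = Orchestrator(pep)
print(orchestrator.run_round(RoundContext(1, ["org_a", "org_b"])))
```

Placing the check inside `run_round` means the policy is evaluated on every round rather than only once at contract negotiation, which is the distinction the paragraph draws between connector-level and service-level enforcement.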

To ensure interoperability, we’ve also developed an ontology for FL, which defines core concepts and relationships critical for FL-aware data ecosystems. The ontology captures key concepts such as Federated Aggregation, Training Rounds, Model Artifacts, Policy Definitions, and Incentives, ensuring semantic alignment between participants and their connectors. Beyond providing a common vocabulary, the ontology is now being adopted as a foundation for defining machine-readable usage policies, allowing constraints and governance rules to be explicitly tied to these concepts. Examples include restricting how model artifacts can be reused, setting participation limits per training round, and expressing incentive conditions for contributors. Moreover, the ontology is integrated into the IDSA information model to facilitate FL service advertisement and discovery.
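As a rough illustration of a machine-readable usage policy tied to ontology concepts, the sketch below encodes a participation limit per training round and evaluates it. It assumes an ODRL-style structure (permission, constraint, `leftOperand`/`operator`/`rightOperand`); the `fl:` terms are invented placeholders and are not taken from the authors' ontology, and `unmet_constraints` is a hypothetical helper.

```python
import operator
from typing import Any, Dict, List

# ODRL-style operator names mapped to Python comparisons.
OPERATORS = {
    "eq": operator.eq,
    "lt": operator.lt,
    "lteq": operator.le,
    "gt": operator.gt,
    "gteq": operator.ge,
}

# Hypothetical policy: a training round may be used only with at most
# five participants. The "fl:" identifiers are placeholders.
policy: Dict[str, Any] = {
    "@type": "Set",
    "permission": [
        {
            "target": "fl:TrainingRound",
            "action": "use",
            "constraint": [
                {
                    "leftOperand": "fl:participantCount",
                    "operator": "lteq",
                    "rightOperand": 5,
                }
            ],
        }
    ],
}


def unmet_constraints(policy: Dict[str, Any], target: str, facts: Dict[str, Any]) -> List[str]:
    """Return the left operands of constraints on `target` that `facts` fail.

    A constraint whose left operand is missing from `facts` counts as unmet.
    """
    failed: List[str] = []
    for permission in policy.get("permission", []):
        if permission.get("target") != target:
            continue
        for constraint in permission.get("constraint", []):
            name = constraint["leftOperand"]
            compare = OPERATORS[constraint["operator"]]
            if name not in facts or not compare(facts[name], constraint["rightOperand"]):
                failed.append(name)
    return failed


print(unmet_constraints(policy, "fl:TrainingRound", {"fl:participantCount": 3}))
print(unmet_constraints(policy, "fl:TrainingRound", {"fl:participantCount": 8}))
```

Because the constraint names a shared vocabulary term rather than an implementation detail, any connector that understands the vocabulary could evaluate the same policy.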

We have implemented a proof-of-concept using the Flower Federated Learning framework, demonstrating real-world feasibility. Our prototype showcases FL server discovery, policy negotiation, server orchestration, and enforcement in a multi-organizational setup, providing insights for future adoption within data space ecosystems.


Our talk is closely aligned with the data space initiatives spearheaded by the International Data Spaces Association (IDSA). The framework we present is built upon the IDSA protocols, which provide a secure and interoperable data-sharing environment. By leveraging these protocols, our framework goes beyond traditional, static data sharing and fosters a dynamic, collaborative approach that facilitates model training.

Through federated learning (FL), we enable both data providers and consumers to collaborate effectively by sharing machine learning (ML) models rather than raw data. This not only enhances privacy and security but also promotes the responsible use of data. Our framework further strengthens the value of open data, offering an environment where entities can engage in real-time collaboration while ensuring the protection of sensitive information.

The approach also underscores the importance of open standards, as our framework supports the extension of open-source ontologies to ensure compatibility and interoperability across various ML models used by different stakeholders. This ensures that data, whether shared between providers or consumers, can be integrated and analysed seamlessly, fostering innovation and knowledge-sharing across sectors.

In addition, the principles of data governance and privacy protection are at the core of our framework. The framework is designed to align with the Data Governance Act and the AI Act, so that the use of data within the data space is not only efficient but also compliant with current regulation. This makes our solution particularly suitable for industries with stringent privacy and governance requirements, such as the mobility, tourism, food, green, automotive, and automation sectors.

Ultimately, our framework represents a forward-thinking, secure, and compliant approach to the challenges of open data and collaborative model training, driven by the growing need for data analytics, data science, and data integration in today's rapidly evolving technological landscape.