Beyond Data Sharing: Enabling Federated Learning Services in Data Spaces with IDSA Protocol
While existing Data Spaces have focused primarily on secure and policy-compliant data sharing, our work takes the next step: enabling the sharing of services, specifically Federated Learning (FL) services, as part of a data ecosystem.
FL is a privacy-preserving machine learning approach in which multiple participants collaboratively train a model without sharing raw data. Each party trains the model locally, and only model updates (not the data itself) are exchanged and aggregated. This makes FL particularly well suited to sectors where data sensitivity, sovereignty, or compliance is critical, such as healthcare, energy, mobility, or finance.
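To make the aggregation step concrete, here is a minimal sketch of weighted federated averaging (FedAvg), a common aggregation rule. It is an illustration only: the abstract does not state which aggregation strategy the framework uses, and the function name and the flat-list representation of model weights are assumptions.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine client weight vectors, weighting each by its local example count.

    Each update is (weights, num_examples). Only these values leave a client;
    the raw training data never does.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("at least one client must contribute examples")

    dim = len(updates[0][0])
    aggregated = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of the same length")
        for i, w in enumerate(weights):
            aggregated[i] += w * n / total
    return aggregated


# Two hypothetical clients: one trained on 1 example, the other on 3.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

The client with more local examples pulls the global model further toward its own weights, which is the usual motivation for weighting by example count.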
In this talk, we present a novel framework that integrates FL into the International Data Spaces Association (IDSA)’s Data Space Protocol. Our framework allows organizations to advertise their FL capabilities via IDSA’s Protocol and enables interested parties to dynamically join collaborative training rounds. This evolution from data sharing to policy-compliant collaborative model training represents a significant extension of today’s Data Space paradigms.
One of the key challenges we address is automated usage policy enforcement across distributed, multi-party FL pipelines. Unlike traditional data exchanges, FL introduces complex service-level interactions that occur outside the strict boundaries of a data space connector. We identify and design policy enforcement points (PEPs) tailored to federated workflows and embed them into the FL framework architecture. This is achieved through policy injection at the orchestration level and runtime monitoring hooks that trigger enforcement.
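The idea of injecting a policy at the orchestration level and enforcing it through runtime hooks can be sketched as follows. This is a simplified illustration, not the framework's actual design: the class names, the stage name "before_round", and the policy fields are all hypothetical, and no real Flower or connector API is used.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List


class PolicyViolation(Exception):
    """Raised by a hook when a workflow event violates the usage policy."""


@dataclass(frozen=True)
class UsagePolicy:
    # Hypothetical policy fields, chosen only for illustration.
    max_participants_per_round: int
    allowed_participants: FrozenSet[str]


Hook = Callable[[UsagePolicy, dict], None]


@dataclass
class Orchestrator:
    policy: UsagePolicy  # injected once, when the FL job is orchestrated
    hooks: Dict[str, List[Hook]] = field(default_factory=dict)

    def register(self, stage: str, hook: Hook) -> None:
        self.hooks.setdefault(stage, []).append(hook)

    def emit(self, stage: str, event: dict) -> None:
        # Acts as a policy enforcement point: every hook registered for this
        # stage runs before the workflow is allowed to continue.
        for hook in self.hooks.get(stage, []):
            hook(self.policy, event)


def check_participants(policy: UsagePolicy, event: dict) -> None:
    clients = event["clients"]
    if len(clients) > policy.max_participants_per_round:
        raise PolicyViolation("too many participants in this round")
    unknown = set(clients) - policy.allowed_participants
    if unknown:
        raise PolicyViolation(f"participants not covered by policy: {sorted(unknown)}")


policy = UsagePolicy(2, frozenset({"org-a", "org-b", "org-c"}))
orchestrator = Orchestrator(policy)
orchestrator.register("before_round", check_participants)
orchestrator.emit("before_round", {"clients": ["org-a", "org-b"]})  # passes
```

Keeping the checks in hooks rather than inside the training loop means the same policy object can be enforced at several points in the workflow without changing the FL logic itself.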
To ensure interoperability, we’ve also developed an ontology for FL, which defines core concepts and relationships critical for FL-aware data ecosystems. The ontology captures key concepts such as Federated Aggregation, Training Rounds, Model Artifacts, Policy Definitions, and Incentives, ensuring semantic alignment between participants and their connectors. Beyond providing a common vocabulary, the ontology is now being adopted as a foundation for defining machine-readable usage policies, allowing constraints and governance rules to be explicitly tied to these concepts. Examples include restricting how model artifacts can be reused, setting participation limits per training round, and expressing incentive conditions for contributors. Moreover, the ontology is integrated into the IDSA information model to facilitate FL service advertisement and discovery.
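As a rough illustration of tying machine-readable constraints to ontology concepts, the sketch below expresses two constraints in a shape loosely modeled on ODRL-style constraints (left operand, operator, right operand) and evaluates them against a request context. The `fl:` terms are hypothetical placeholders, not the actual identifiers of the ontology, and the abstract does not specify the concrete policy syntax.

```python
import operator
from typing import Any, Callable, Dict, List

# Comparison operators supported by this sketch.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "lteq": operator.le,
    "gteq": operator.ge,
    "eq": operator.eq,
}

# Hypothetical constraints referring to ontology concepts by name.
POLICY_CONSTRAINTS: List[Dict[str, Any]] = [
    # Participation limit per training round.
    {"leftOperand": "fl:participantsPerRound", "operator": "lteq", "rightOperand": 5},
    # Restriction on how the resulting model artifact may be reused.
    {"leftOperand": "fl:modelArtifactPurpose", "operator": "eq", "rightOperand": "research"},
]


def is_permitted(constraints: List[Dict[str, Any]], context: Dict[str, Any]) -> bool:
    """Return True only if every constraint holds for the given context."""
    for constraint in constraints:
        left = constraint["leftOperand"]
        if left not in context:
            return False  # a missing value is treated as a violation
        compare = OPERATORS[constraint["operator"]]
        if not compare(context[left], constraint["rightOperand"]):
            return False
    return True


request = {"fl:participantsPerRound": 4, "fl:modelArtifactPurpose": "research"}
allowed = is_permitted(POLICY_CONSTRAINTS, request)
```

Because the constraints refer to shared concept names rather than implementation details, any participant whose connector understands the same vocabulary could evaluate them in the same way.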
We have implemented a proof-of-concept using the Flower Federated Learning framework, demonstrating real-world feasibility. Our prototype showcases FL server discovery, policy negotiation, server orchestration, and enforcement in a multi-organizational setup, providing insights for future adoption within data space ecosystems.