OFA Symposium 2025: Open Technology Impact in Uncertain Times

A Cartography of Collaboration in Open Source AI: Mapping Collaboration in the Development and Reuse Lifecycle of 12 Open Large Language Models
18/11/2025, Main Room

As generative artificial intelligence (AI) models become increasingly prevalent and are released under various open and permissive licenses, there is a critical need to understand how they are built and who contributes to them. Little research to date maps open collaboration practices across the stages of development and reuse of models and their constituent artifacts (e.g. training datasets, software, model weights, evaluation benchmarks). Our research therefore aims to map and characterize open collaboration (specifically, the “collaboration on-ramps”) at different stages in the development and reuse lifecycle of open generative AI models, with a focus on open large language models (LLMs). Drawing on qualitative interviews with 12 open LLM developers (i.e. Allen Institute for AI, EleutherAI, Cohere Labs, Hugging Face, Meta, Alibaba, the BigScience Workshop, AI Singapore, SpeakLeash, SCB 10X, Fraunhofer IAIS, and the National Library of Norway), this study presents a comprehensive cartography of collaboration practices throughout the lifecycle of open LLMs across diverse organizational contexts, from grassroots initiatives to large technology companies, and across world regions. It provides researchers, developers, business leaders, policymakers, and the wider community with empirical insights into collaboration practices, including motivations, opportunities, and challenges, in the emerging open source AI community, as well as practical recommendations for participating in or promoting open source AI collaboration.



Senior Researcher at RISE Research Institutes of Sweden and Adjunct Assistant Professor at Lund University.


Cailean Osborne is a Senior Researcher at the Linux Foundation, where he conducts strategic research and advocacy for promoting openness in AI. He has a PhD in Social Data Science from the University of Oxford, where he researched collaboration dynamics in the open source AI ecosystem. During his PhD, he was a visiting researcher at the Open Source Software Data Analytics Lab at Peking University. Previously, he was the International Policy Lead at the UK Government's Centre for Data Ethics and Innovation, where he co-authored the UK's National AI Strategy and served as a UK Delegate at intergovernmental AI governance initiatives at the OECD and Council of Europe. He is based in Berlin, Germany.
