2025-07-24, Main Room 4
We present MOSS (The Map of Open Source Science), a system that consolidates data from GitHub, OpenAlex, CrossRef, and other scholarly sources to build an integrated knowledge graph of open research and software projects. We will demonstrate how MOSS can be configured to zero in on the Julia community, mapping key Julia repositories to associated papers, contributors, institutions, topics, and other objects.
In this talk, we introduce The Map of Open Source Science (MOSS), a large-scale data integration effort aimed at providing a holistic view of the Julia community’s impact on research and open-source development. While the Julia ecosystem has grown impressively in recent years, it remains challenging to identify use, impact, and relationships throughout the ecosystem. Our goal with MOSS is to unify disparate data streams, enabling a comprehensive understanding of the ecosystem’s evolution, areas of collaboration, academic influence, and ultimately, importance within research, the economy, and society.
Explore our early mappings at https://opensource.science/moss
Motivation and Goals
Julia's expanding footprint in scientific computing, data science, and machine learning demands an up-to-date, data-driven picture of how code repositories, researchers, and research institutions converge. MOSS addresses this need by capturing not only static metadata (e.g., repository descriptions or paper titles) but also the dynamic relationships: who contributed to which package, which institutions are tied to key Julia projects, and how academic citations reference these repositories. By mapping these relationships in a knowledge graph, MOSS provides insights into the flow of knowledge and collaboration across the Julia ecosystem.
System Architecture and Data Flows
MOSS builds its knowledge graph from multiple sources:
- GitHub: Repositories, contributors, pull requests, issues, and CITATION.cff files that contain DOIs to relevant academic publications.
- Scholarly APIs (OpenAlex, CrossRef, Semantic Scholar): Paper metadata, references, citations, and institutional affiliations.
A data transformation layer standardizes fields, merges conflicting information, and resolves aliases. The resulting property graph uses well-defined node types such as Repository, Paper, Person, Institution, and Topic, each linked by relationships.
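As a rough illustration of the transformation step described above, the following Python sketch shows one way records from different sources could be merged into a property graph keyed by a normalized identifier. It is not the MOSS implementation: the `PropertyGraph` class, the `normalize_doi` helper, the `CITES` edge label, the "first non-empty value wins" merge rule, and the placeholder repository name and DOI are all assumptions made for the example. Only the five node types come from the description above.

```python
from dataclasses import dataclass, field

# Node types named in the abstract; everything else here is illustrative.
NODE_TYPES = {"Repository", "Paper", "Person", "Institution", "Topic"}


def normalize_doi(raw: str) -> str:
    """Reduce common DOI spellings to one lowercase key (hypothetical helper)."""
    doi = raw.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


@dataclass
class PropertyGraph:
    nodes: dict = field(default_factory=dict)  # (type, key) -> properties
    edges: set = field(default_factory=set)    # (source, label, target)

    def upsert(self, node_type: str, key: str, **props) -> tuple:
        """Insert a node, or merge new properties into an existing one."""
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        node_id = (node_type, key)
        existing = self.nodes.setdefault(node_id, {})
        for name, value in props.items():
            # Assumed merge rule: keep the first non-empty value seen.
            if value and not existing.get(name):
                existing[name] = value
        return node_id

    def link(self, source: tuple, label: str, target: tuple) -> None:
        self.edges.add((source, label, target))


graph = PropertyGraph()
repo = graph.upsert("Repository", "example-org/Example.jl", description="demo package")

# The same paper arrives from two sources with differently formatted DOIs
# (placeholder DOI, not a real publication).
paper_a = graph.upsert("Paper", normalize_doi("https://doi.org/10.1234/EXAMPLE"), title="Example paper")
paper_b = graph.upsert("Paper", normalize_doi("doi:10.1234/example"), year=2024)

graph.link(repo, "CITES", paper_a)
```

Because both DOI spellings normalize to the same key, the two records collapse into a single Paper node carrying both the title and the year. A production pipeline would presumably need a richer conflict-resolution policy than the one assumed here.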
Demonstration: Mapping Julia Packages to Research
We will showcase how MOSS reveals new insights into the Julia world:
- Automatically linking Julia repositories with the papers they implement or reference.
- Identifying which contributors are affiliated with which institutions, and how their code impacts various domains (e.g., computational physics, numerical optimization).
- Quantifying how often key Julia packages are cited in academic works, and in which fields they hold the most influence.
- Enabling modular, open impact algorithms.
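To make the citation-quantifying item above concrete, here is a minimal Python sketch of a per-field citation tally over paper-to-package links. The input rows, the package names, and the `citations_by_field` function are hypothetical. The actual MOSS queries and impact algorithms are not described in this abstract and may work quite differently.

```python
from collections import Counter, defaultdict

# Hypothetical citation links: (citing paper id, cited package, field of the citing paper).
citations = [
    ("paper-1", "PackageA.jl", "computational physics"),
    ("paper-2", "PackageA.jl", "numerical optimization"),
    ("paper-3", "PackageA.jl", "computational physics"),
    ("paper-3", "PackageB.jl", "computational physics"),
    ("paper-1", "PackageA.jl", "computational physics"),  # duplicate link from a second source
]


def citations_by_field(rows):
    """Count distinct citing papers per package, broken down by field."""
    counts = defaultdict(Counter)
    seen = set()
    for paper, package, field_name in rows:
        if (paper, package) in seen:
            continue  # the same link reported twice should count once
        seen.add((paper, package))
        counts[package][field_name] += 1
    return counts


counts = citations_by_field(citations)
top_field, top_count = counts["PackageA.jl"].most_common(1)[0]
print(f"PackageA.jl is cited most in {top_field} ({top_count} papers)")
```

Deduplicating on the (paper, package) pair matters when several scholarly sources report the same citation, which is likely once multiple APIs feed one graph.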
Implications for the Julia Community
By visualizing the interplay between code contributions, scholarly publications, and institutional relationships, MOSS serves both new and experienced community members. Developers can discover underexplored collaborations, researchers can trace code dependencies in scientific papers, and institutions can better understand where their affiliated teams and labs fit within the broader Julia ecosystem.
Future Directions
MOSS is designed to be modular and extensible. Future work could include:
- Integration with additional scholarly sources (e.g., ORCID for contributor disambiguation).
- Automatic detection of “hot topics,” measuring the growth of specific Julia subdomains like machine learning or HPC clusters.
- Expanded entity types, such as grants or patents, to reflect the comprehensive lifecycle of research.
Through MOSS, we aim to encourage data-driven community building, highlight impactful software and research, and support the transparent, reproducible ethos that underpins both open source development and open science. By attending this talk, you’ll learn how knowledge graph techniques can uncover hidden relationships in the Julia world, fostering a richer, more collaborative ecosystem for all.
Jonathan Starr is the program manager of the Open Source Science Initiative out of NumFOCUS where he drives development of MOSS and other OSSci initiatives seeking to connect users of open source research software with their engineers and communities. Outside of NumFOCUS he contributes to several open source projects and start-ups developing technologies that enable open science practices through novel infrastructure and incentive mechanisms. He is also co-founder of The SciOS collaborative and The Institute of Open Science Practices, facilitating connections and workshops between researchers and deep-infrastructure technologists to build technology that enables open science from an empowered grassroots scientific community.