Computational narrative tools like Jupyter, MyST Markdown, R Markdown, and Quarto are amazing for doing science. You can combine narrative, code, data, and images, conducting your analysis while also creating material to share. However, the usual workflow is that notebooks are where you do the work, while a PDF article is what you publish to advertise it, and that PDF is the research output most people see. That process not only creates extra work, it also loses key information, amazing graphics, interactive visualizations, and the connection to the code and data.
Flattening science into a published PDF sacrifices reproducibility and the valuable context others need to build on the research. We’re continuing to share our science in 19th-century ways, as if we need to mail printed, physical copies of our work to people. This is a boring and ineffective way to communicate science, and it reduces the visibility and value of the modular components of research. The data, images, and code all have individual value, especially as we think about new ways for humans and machines to build on existing science for new impact.
The Open Exchange Architecture (OXA, https://oxa.dev) is a community standard for scientific publishing built for modular and computational science. Initial contributors include Stencila, eLife, Posit, PLOS, openRxiv, Curvenote, NeuroLibre, and Creative Commons. OXA is a new document format that brings together the best of Jupyter Notebooks, Quarto, MyST Markdown, and publishing/archiving standards to enable new scientific publishing experiences and workflows. It also allows many tools and existing formats to connect with each other and with traditional publishing workflows, such as the Journal Article Tag Suite (JATS XML) and Manuscript Exchange Common Approach (MECA). This means that what you share is interactive and engaging, and your research products, such as large-scale microscopy images (e.g. OME-Zarr), are first-class citizens: image datasets, notebooks, and other research products are highlighted, not hidden.
In this talk we share more on the technical architecture of the format and on a pilot between openRxiv (the non-profit organization behind the largest biomedical preprint servers, bioRxiv and medRxiv) and Curvenote (a scientific content management system that also hosts the SciPy Proceedings) to migrate 500k preprints (8.1 TB) to OXA. We show real-world examples of interactive scientific content and modular attribution, and what’s possible when the pieces are connected: scientific research that is open and engaging and that matches what our current technology can do, changing the way we share and do science. This isn’t a future vision; it is already happening today.