GOOD 2026

Workflow Management in Open OnDemand
2026-03-11, Main Hall

Executing multi-task computational research workflows on HPC systems often requires manual scripting and expertise in resource managers and schedulers. This creates barriers for researchers and educators, especially at smaller institutions or in multidisciplinary collaborations. Although Open OnDemand (OOD) provides an accessible web-based interface for launching standalone applications as jobs, it has lacked an integrated mechanism to easily compose, manage, and execute multiple tasks with dependencies among them as Workflows. Workflows enable researchers to visually construct, execute, and monitor simple compositions of multiple “launchers” (i.e., independent tasks/jobs) that perform data pre- or post-processing, simulation, and other computational tasks connected via output-to-input dependencies.


One of the recent developments in OOD is the Project Manager (PM), which replaces the legacy Job Composer. The PM introduced shared or private projects, an abstraction that categorizes and manages jobs at the project level. Tasks (Launchers) in PM allow users to create jobs from single scripts, but real-world scientific research rarely consists of a single, isolated task. Instead, it involves pipelines of data pre-processing, simulation, post-processing, and visualization, where the output of one task serves as the input for the next. While PM has simplified the launch of standalone interactive applications, it has historically lacked a native, user-friendly mechanism for orchestrating complex, multi-step workflows. Users have been forced to leave the web UI and return to manual CLI scripting to manage these dependencies.
The Workflow Manager, integrated into PM, extends standalone Launchers to support multi-step research pipelines. It provides a web-based visual editor where users can construct Directed Acyclic Graphs (DAGs) to define task dependencies and conditional branching without writing complex manual scripts. By converting these DAGs into topological orderings, the system automates job scheduling across various resource managers such as Slurm or PBS. Workflows use YAML-based metadata and a real-time polling mechanism. This allows the interface to track job IDs and provide human-readable diagnostic messages about job states and dependencies directly in the browser. This approach simplifies error management by explaining why jobs may be pending or failed, while remaining agnostic to the underlying scheduler.
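
The DAG-to-topological-ordering step described above can be illustrated with a minimal Python sketch. This is not the Workflow Manager's actual code: the launcher names, the `submission_plan` helper, and the placeholder job IDs are all hypothetical, and the only scheduler-specific detail assumed is Slurm's `--dependency=afterok:<jobid>` option.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each launcher maps to the launchers it depends on.
workflow = {
    "preprocess": [],
    "simulate": ["preprocess"],
    "postprocess": ["simulate"],
    "visualize": ["postprocess"],
}


def submission_plan(dag):
    """Return (launcher, dependency_flag) pairs in a valid submission order.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    order = list(TopologicalSorter(dag).static_order())
    # Placeholder job IDs (assumption); a real system would capture the IDs
    # returned by the scheduler at submission time.
    job_ids = {name: 1000 + i for i, name in enumerate(order)}
    plan = []
    for name in order:
        deps = [str(job_ids[d]) for d in dag[name]]
        # Slurm-style flag: start only after all listed jobs finish successfully.
        flag = f"--dependency=afterok:{':'.join(deps)}" if deps else ""
        plan.append((name, flag))
    return plan


if __name__ == "__main__":
    for name, flag in submission_plan(workflow):
        parts = ["sbatch", flag, f"{name}.sh"]
        print(" ".join(p for p in parts if p))
```

Because every launcher is submitted only after its predecessors have job IDs, the dependency flags can always be filled in; a different resource manager would need only a different flag format, which is one way a scheduler-agnostic layer could be structured.
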
The Workflow Manager currently handles task dependencies based on execution order but lacks a mechanism for explicit data flow between tasks. Future development will introduce a workflow-scoped temporary directory to serve as a shared execution context for all tasks, enabling users to inspect outputs and browse prior runs through the GUI. Launchers will be updated to support both input and output directory specifications. By default, these directories will resolve to the workflow's shared temporary space, automatically binding the output of one launcher to the input of the next. While scaling this to multiple outputs remains a design challenge, the community is exploring solutions such as named outputs and artifact descriptors to evolve the tool into a fully data-aware system without sacrificing its accessibility for novice users.
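
The default binding described above, where one launcher's output directory becomes the next launcher's input, might look roughly like the following sketch. Since this feature is still planned, everything here is an assumption: the `bind_directories` helper, the one-subdirectory-per-launcher layout, and the example path are hypothetical and cover only a linear chain with a single output per launcher.

```python
from pathlib import Path


def bind_directories(workflow_tmp, ordered_launchers):
    """Map each launcher to input/output directories in a shared workflow space.

    Assumed layout (hypothetical): one output subdirectory per launcher, with
    each launcher reading from the previous launcher's output directory.
    """
    root = Path(workflow_tmp)
    bindings = {}
    previous_output = None  # the first launcher has no upstream input
    for name in ordered_launchers:
        output_dir = root / name
        bindings[name] = {"input": previous_output, "output": output_dir}
        previous_output = output_dir
    return bindings


if __name__ == "__main__":
    # Example path is illustrative only.
    for name, dirs in bind_directories("/tmp/wf-run", ["preprocess", "simulate"]).items():
        print(name, dirs["input"], dirs["output"])
```

Supporting multiple outputs per launcher would require replacing the single `output` entry with something like named outputs, which is the open design question the abstract mentions.
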

See also: A recent poster we showcased at the SC25 Exhibit (1.8 MB)

Harshit Soora is a Master’s student in the Department of Computer Science at the University of Maryland. He works as a Graduate Research Assistant under Prof. Alan Sussman, collaborating closely with the Ohio Supercomputer Center as part of its core developer team. Before joining UMD, he spent over three years at NVIDIA, specializing in cloud computing and scaling AI models. He holds a Bachelor’s degree in Computer Science and Engineering from IIT Kharagpur.