2024-07-12 –, Else (1.3)
In this talk, we present DistributedWorkflows.jl, a serializer-independent Julia interface to a task-based workflow management system. The package aims to simplify the process of writing a distributed application. Given a workflow pattern as a Petri net and the code for the workflow tasks, it can be used on a cluster (e.g. with Slurm) to automate the application's parallel deployment. DistributedWorkflows.jl is therefore a useful addition to the growing set of high-performance computing packages in Julia.
Over the last few years, Julia has gained increasing attention in the high-performance computing (HPC) community. Similarly, Julia developers have created packages that make HPC tools approachable for domain scientists. Both trends owe much to Julia's design, which combines performance with ease of use. DistributedWorkflows.jl is an addition to this growing collection of useful HPC packages in Julia.
In this talk, we will showcase DistributedWorkflows.jl, a Julia interface to a task-based workflow management system that enables users to build domain-specific workflows in the form of high-level Petri nets and rely on the runtime system to schedule the workflow's tasks. The underlying workflow manager automatically manages application runs through dynamic scheduling, built-in distributed memory transfer, and distributed task execution.
An important feature of DistributedWorkflows.jl is that it is independent of the serialization format, which makes it flexible to use and compatible with all serialization formats supported in Julia. The main objective of this package is to simplify the process of writing a parallel distributed application and to enable HPC experts as well as domain scientists to easily deploy their workflow code from within the Julia environment.
The workflow is defined as a high-level Petri net, which at the moment must be written as an xpnet file that follows a specific XML schema. In the future, we will provide Petri net generation as a feature of this package for a better user experience. Using this interface, users can test the deployment of their parallel application locally before launching it on expensive cluster resources with Slurm or other cluster management tools. We will demonstrate this locally with several examples, starting with a simple one and moving on to a more advanced one with a complex Petri net workflow design.
In the future, this package could also be paired with Distributed, a package from Julia's standard library. Distributed could take care of resource management while this package handles workflow management. This would make our package even more attractive for users who rely on Distributed for their applications.
At the end of the talk, we will present a list of features and upgrades to look forward to with regard to distributed computing, and we plan to expand this package based on user feedback. An easy installation process and tutorials are available on the GitHub repository, https://github.com/FiroozehDastur/DistributedWorkflows.jl, for anyone interested. Our aim is to give everyone the opportunity to benefit from the automated deployment of their distributed application, making it possible to compute large examples that were previously considered infeasible in their respective domains.
I am an algebraic and tropical geometer who has developed an interest in algorithms and architectures for high-performance computing and computer algebra systems.