Workflow Managers
Within Ramble’s variants configuration section, users can control which workflow manager is used for a set of experiments.
The selected workflow manager controls many aspects of an experiment, and is intended to map broadly to a workload manager. Experiments in a workspace can select different workflow managers. Workflow managers can also define additional templates and logic to construct complex workflows, such as getting an experiment’s status, cancelling an experiment, or submitting an experiment to a scheduler.
Configuring Workflow Managers
Workflow managers are controlled through a config option in the variants configuration section. The following example shows how to select one:
variants:
  workflow_manager: <workflow_manager_name>
The default workflow manager is user-managed, which only requires the user
to define the mpi_command and batch_submit variables to construct a working
workflow. The value of the workflow manager variant can be a reference to a
variable, and will be expanded following Ramble’s variable definitions logic.
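As a minimal sketch of the two points above, the fragment below selects the workflow manager through a variable reference and defines the two variables that user-managed relies on. Several details are assumptions rather than something stated here: that the variants section sits under the top-level ramble key of the workspace configuration file, that variable references use Ramble's curly-brace expansion syntax, the hypothetical variable name wm_name, and the example values given for mpi_command and batch_submit.

```yaml
ramble:
  variants:
    # Assumed: expanded like any other variable reference
    workflow_manager: '{wm_name}'
  variables:
    # Hypothetical variable name, used only for illustration
    wm_name: user-managed
    # Illustrative values; adjust for the MPI launcher and system in use
    mpi_command: mpirun -n {n_ranks}
    batch_submit: '{execute_experiment}'
```

Keeping the workflow manager name in a variable can make it easier to switch managers for a group of experiments without editing the variants section itself.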
Supported Workflow Managers
Some of the currently supported workflow managers in Ramble include:
gke-mpi
google-batch
slurm
slurm-intel-mpi
slurm-pyxis
user-managed
GKE MPI Workflow Manager
Selecting the gke-mpi workflow manager will cause generated experiments to
create YAML-formatted GKE manifest files. Additionally, the commands for
submitting and querying experiments will be handled through kubectl.
Note that GKE generally relies on containerized workloads. Ramble supports additional mechanisms for testing containerized workflows, but users might need to provide their own container for running in this mode.
Google Batch Workflow Manager
Selecting the google-batch workflow manager will enable the use of the Google
Cloud Batch workflow. With this, experiments will have YAML-formatted job
manifests generated that can be submitted to the Google Cloud Batch scheduler.
Additionally, submission and query commands use the gcloud CLI.
SLURM* Workflow Managers
There is a family of workflow managers that each layer Slurm functionality
onto the generated experiments. The most basic of these is the slurm workflow
manager, which uses a traditional Slurm workflow, including the use of sbatch
and srun.
The slurm-intel-mpi workflow manager augments the base slurm workflow manager
to use Intel MPI with srun.
The slurm-pyxis workflow manager instead uses srun to run containerized
workloads. Experiments using it are generally expected to have the
pyxis-enroot modifier applied as well, to ensure the proper format.
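The pairing described above might look like the following sketch. It assumes that the variants and modifiers sections both sit under the top-level ramble key of the workspace configuration file, and that modifiers are listed by name. Any container-related variables the modifier needs are omitted here, so treat this as an outline rather than a complete configuration.

```yaml
ramble:
  variants:
    workflow_manager: slurm-pyxis
  # Assumed layout: modifiers applied at the workspace level
  modifiers:
  - name: pyxis-enroot
```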
User Managed Workflow Manager
The user-managed workflow manager can be selected if a user wants to control
all aspects of the workflow themselves within the workspace. In this case, the
user should specify all required variables in the workspace’s configuration
file.