.. Copyright 2022-2026 The Ramble Authors

   Licensed under the Apache License, Version 2.0 or the MIT license, at your
   option. This file may not be copied, modified, or distributed
   except according to those terms.

.. _basic_application_tutorial:

=====================================================
1) Writing a basic application definition
=====================================================

This tutorial will provide an introduction to writing an application definition
in Ramble. In this tutorial, you will create and test an application definition
file to run the ``hostname`` Linux utility as your application.

It is a good idea to have a basic working understanding of how to create and
use Ramble workspaces before starting this tutorial. You should at least be
familiar with the content of the :ref:`Hello World Tutorial`.

This tutorial is intended to be a practical, hands-on guide to creating a
simple application definition. For a more comprehensive reference on all
available directives and advanced features, please see the
:ref:`Application Definition Developers Guide`.

Installation
============

To install Ramble, see the :doc:`../getting_started` guide.

**NOTE**: This tutorial does not require a package manager to be installed or
configured.

.. include:: shared/repository_create.rst

Hostname Application Definition
===============================

For the remainder of this tutorial, we will be writing and testing the contents
of the hostname application definition.

Create Application Definition
-----------------------------

To begin with, we will create an empty application definition file. We will
populate this file throughout the remainder of this tutorial. To create it,
issue the following commands:

.. code-block:: console

    $ mkdir -p tutorial-repo/applications/hostname
    $ touch tutorial-repo/applications/hostname/application.py

For the remainder of this tutorial, ``ramble edit hostname`` will open this
file with the editor specified by your ``EDITOR`` environment variable.

Application Class
-----------------

Ramble provides a module (``ramble.appkit``) which imports a large portion of
the features needed to write application definitions. Each application
definition should import this using:

.. code-block:: python

    from ramble.appkit import *

Every application definition in Ramble contains a Python class that defines the
characteristics of the application. The name of this class matches the
directory name for the application, but converted to CamelCase. Since our
application name (``hostname``) does not have any hyphens or underscores in the
name, the Python class name will be ``Hostname``.

Ramble also provides a base class, ``ExecutableApplication``, which handles
applying the application language to the object. Every application definition
should inherit from either this class or something else that inherits from it.

Object definitions should have a name attribute that matches the directory name
as well. In this case: ``name = 'hostname'``.

As a result, our class definition file should contain the following to start.

.. code-block:: python

    from ramble.appkit import *


    class Hostname(ExecutableApplication):
        name = 'hostname'

At this stage, our application should show up in the output of ``ramble list``,
and ``ramble info hostname`` should show limited information about this
application.

Definition Experiment Constructs
--------------------------------

To begin with, we will create some basic constructs within this application
definition that will help us test the creation of experiments.

Application Executables
^^^^^^^^^^^^^^^^^^^^^^^

The lowest level construct in an application definition is an executable.
Executables relate to arbitrary commands that you would normally execute when
running your specific application. Since we are writing an application
definition for hostname, our command simply looks like ``hostname``.

The language feature ``executable`` can be used to define how Ramble should use
these. The documentation for the ``executable`` directive can be found at
:py:meth:`ramble.language.application_language.executable`.

For the purposes of this tutorial, we will begin by assuming we will not
execute ``hostname`` under an ``mpi`` runtime. The starting executable
definition will be:

.. code-block:: python

    executable(
        "local-execute",
        "hostname",
    )

This defines a new executable in the hostname application definition named
``local-execute``, and the template for the executable is simply the
``hostname`` command. The remaining arguments are left at their defaults, which
disables MPI for this executable.

Application Workloads
^^^^^^^^^^^^^^^^^^^^^

Workloads are the construct that users refer to within workspaces to create
experiments. Workloads are the pairing of one or more executables with zero or
more input files. In the case of ``hostname``, no input files are required to
run this application.

The documentation for the workload directive can be seen at:
:py:meth:`ramble.language.application_language.workload`.

In our case, we will create a new workload that simply uses the
``local-execute`` executable. This directive should look like:

.. code-block:: python

    workload("local", executables=["local-execute"])

At this stage, our application definition should look like the following:

.. code-block:: python

    from ramble.appkit import *


    class Hostname(ExecutableApplication):
        name = "hostname"

        executable(
            "local-execute",
            "hostname"
        )

        workload("local", executables=["local-execute"])

With this application definition, we are now at a point where experiments can
be constructed to test this definition file.

.. _basic_definition_test:

Testing Application Definitions
-------------------------------

To exercise the application definition, we need to construct a workspace. To do
this, execute the following:

.. code-block:: console

    $ ramble workspace create -d tutorial-workspace

**NOTE**: If you have an active workspace (e.g., if you used
``ramble workspace create -a`` in a previous session or this one), you must
first deactivate it with ``ramble workspace deactivate`` or unset the
``RAMBLE_WORKSPACE`` environment variable to avoid conflicts when creating a
new workspace. Also, creating a workspace *without* the ``-a`` (activate) flag
means you will need to use the ``-D`` flag with subsequent ``ramble`` commands
to specify which workspace to operate on.

The following command can be used to add an experiment with the workload we
defined earlier:

.. code-block:: console

    $ ramble workspace manage experiments hostname -v "n_nodes=1" -v "n_ranks=1" --overwrite

This will add a single experiment, named ``generated``, to the workspace that
will use the hostname application and the local workload (since this is the
only defined workload at the moment).

The experiments can then be set up using:

.. code-block:: console

    $ ramble workspace setup

The contents of the experiment's ``execute_experiment`` script can be examined
to ensure it looks correct. It should be inside the workspace in the
``experiments/hostname/local/generated/execute_experiment`` path. The contents
should look similar to the following (with some expected path differences):

.. code-block:: console

    #!/bin/bash
    # This is a template execution script for
    # running the execute pipeline.
    #
    # Variables surrounded by curly braces will be expanded
    # when generating a specific execution script.
    # Some example variables are:
    # - experiment_run_dir (Will be replaced with the experiment directory)
    # - command (Will be replaced with the command to run the experiment)
    # - log_dir (Will be replaced with the logs directory)
    # - experiment_name (Will be replaced with the name of the experiment)
    # - workload_run_dir (Will be replaced with the directory of the workload
    # - application_name (Will be replaced with the name of the application)
    # - n_nodes (Will be replaced with the required number of nodes)
    # Any experiment parameters will be available as variables as well.

    # ****************************************************
    # * No workflow is used with this experiment
    # * Execution command: /tmp/tutorial-workspace/experiments/hostname/local/generated/execute_experiment
    # * If this file is not the same as the above path, it is unlikely that this script
    # * is used when `ramble on` executes experiments.
    # ****************************************************

    cd "/tmp/tutorial-workspace/experiments/hostname/local/generated"

    rm -f "/tmp/tutorial-workspace/experiments/hostname/local/generated/generated.out"
    touch "/tmp/tutorial-workspace/experiments/hostname/local/generated/generated.out"
    export OMP_NUM_THREADS="1";
    hostname >> "/tmp/tutorial-workspace/experiments/hostname/local/generated/generated.out" 2>&1

The last line of this file shows that the ``hostname`` command will be run, and
that its output will be redirected to the experiment's log file.

At this stage, this experiment can be executed (if you have access to the
``hostname`` binary) to ensure it executes properly. This can be accomplished
using:

.. code-block:: console

    $ ramble on

After executing, the output of the hostname command should exist in the
``generated.out`` file inside the experiment's directory.

This workflow can be used in the future to continue testing our application
definition.

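When repeating this test workflow, it can help to know where the generated
script and log file should appear. The following is a minimal sketch that
rebuilds the paths shown above from their parts. The ``experiment_paths``
helper is hypothetical (it is not part of Ramble), and the log file name
``<experiment_name>.out`` is an assumption inferred from the ``generated.out``
example in this tutorial.

.. code-block:: python

    from pathlib import PurePosixPath


    def experiment_paths(workspace_root, application, workload, experiment):
        """Return (script, log) paths for one experiment.

        Hypothetical helper: mirrors the layout seen in this tutorial,
        experiments/<application>/<workload>/<experiment>/.
        """
        run_dir = (
            PurePosixPath(workspace_root)
            / "experiments"
            / application
            / workload
            / experiment
        )
        # Assumption: the log file is named after the experiment.
        return run_dir / "execute_experiment", run_dir / f"{experiment}.out"


    script, log = experiment_paths(
        "/tmp/tutorial-workspace", "hostname", "local", "generated"
    )
    print(script)
    print(log)

Printing both paths gives a quick way to check that a workspace was set up
where you expect before running ``ramble on``.
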

Workload Variables
------------------

Workload variables are a mechanism that application definition developers can
use to expose aspects of an application or workload that users might want to
control. These can be anything from input flags and arguments to parameter
definitions.

We will use workload variables to allow users to control the execution flags on
the ``hostname`` binary. While most users will want the default behavior of
``hostname``, adding this functionality makes the application definition more
flexible for users in the future.

The :py:meth:`ramble.language.application_language.workload_variable` directive
is used to create a variable that users can easily discover. We will now create
a variable named ``input_arguments`` using the following directive:

.. code-block:: python

    workload_variable(
        "input_arguments",
        default="",
        description="Input arguments for hostname",
        workloads=["*"],
    )

In this example, we set the default value to ``""``, which retains the default
``hostname`` behavior. The description tells users what the purpose of this
variable is, and the ``workloads`` argument controls which workloads this
variable is associated with. Here, we use ``["*"]`` to glob all workloads so
that the variable is available on every workload. However, selecting specific
workloads can allow developers to change the default value of a variable based
on the workload selected.

Now that we have a variable, we need to update the executable definition to
make sure it is used. The new ``local-execute`` definition should look like the
following:

.. code-block:: python

    executable(
        "local-execute",
        "hostname {input_arguments}"
    )

This definition allows the ``local-execute`` executable to expand the value of
the ``input_arguments`` variable and append it to the ``hostname`` command. If
a user had the following in their workspace config:

.. code-block:: yaml

    variables:
      input_arguments: "-i"

the resulting rendered ``execute_experiment`` script would contain the ``-i``
flag, and the output should be an IP address instead of a hostname.

At this point, :ref:`the basic test <basic_definition_test>` can be used to see
how the ``input_arguments`` variable applies to experiments.

**NOTE**: When using the ``workload`` or ``workloads`` arguments on the
``workload_variable`` directive, the directive needs to show up after the
workloads it is attached to within the Python class. Using workload groups
(``workload_group``) can mitigate this restriction.

Parallel Executables
--------------------

Some applications need to be executed under some parallel runtime, such as MPI.
Within application definitions, developers can convey this to Ramble by adding
the ``use_mpi=True`` argument when defining new executables. When this argument
is set to ``True``, Ramble will prepend the ``mpi_command`` variable definition
to the command line within the resulting execution script.

Users can control the value of ``mpi_command`` from their workspace, or the
definition can come from other object definitions (such as
``workflow_managers``). Either way, ``use_mpi=True`` is the mechanism by which
an executable declares that it should be executed in parallel.

To see how this behaves, we will create a new workload that will represent the
parallel execution of ``hostname``. In general, ``hostname`` only needs to be
executed once per node in the job. As a result, we will override the
``n_ranks`` variable definition to match the ``n_nodes`` value.

.. code-block:: python

    executable(
        "parallel-execute",
        "hostname {input_arguments}",
        use_mpi=True,
        variables={"n_ranks": "{n_nodes}"},
    )

    workload("parallel", executables=["parallel-execute"])

To test this, we can follow the steps in
:ref:`the basic test <basic_definition_test>` from earlier, which will now
create experiments for each of the two workloads.

**NOTE**: If you had set a value for ``input_arguments``, running the commands
as-is could remove it from your workspace.

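To build intuition for how ``use_mpi=True``, ``mpi_command``, and the
``n_ranks`` override combine, the following standalone sketch expands nested
``{name}`` placeholders with Python's ``str.format``. This is only an
illustration and not Ramble's expansion logic: the ``expand`` helper is
hypothetical, and the ``mpi_command`` value of ``mpirun -n {n_ranks}`` is
assumed for the example.

.. code-block:: python

    def expand(template, variables, max_passes=10):
        """Substitute {name} placeholders until the text stops changing.

        Hypothetical helper for illustration; raises if the text has not
        settled after max_passes substitutions (e.g. mutually recursive
        variable definitions).
        """
        for _ in range(max_passes):
            expanded = template.format(**variables)
            if expanded == template:
                return expanded
            template = expanded
        raise ValueError("variables did not settle; check for circular definitions")


    variables = {
        "n_nodes": "1",
        "n_ranks": "{n_nodes}",  # the override from the executable definition
        "input_arguments": "",  # the workload variable's default
        "mpi_command": "mpirun -n {n_ranks}",  # assumed value for this sketch
    }

    # use_mpi=True is modelled here by prepending the mpi_command placeholder.
    command = expand("{mpi_command} hostname {input_arguments}", variables).strip()
    print(command)  # mpirun -n 1 hostname

Because ``n_ranks`` is defined in terms of ``n_nodes``, changing ``n_nodes``
alone is enough to change the rank count in the rendered command.
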
After setting up the workspace again, the parallel workload's generated
``execute_experiment`` script should contain:

.. code-block:: console

    mpirun -n 1 hostname >> "/tmp/tutorial-workspace/experiments/hostname/parallel/generated/generated.out" 2>&1

The default ``mpi_command`` from the ``user-managed`` workflow manager happens
to be ``mpirun -n {n_ranks}``, which is prepended to our ``hostname``
executable in this line.

**NOTE**: Execution of these experiments will fail if you do not have
``mpirun`` on the system you're running the experiments on. To execute only the
experiments with the local workload, you can use:

.. code-block:: console

    $ ramble on --where '"{workload_name}" == "local"'

Analysis of experiments
-----------------------

Up until this point, we have focused on constructing the execution of
experiments. However, Ramble also handles analysis of the experiments. To do
this, application definitions define figures of merit. A figure of merit is an
arbitrary metric that Ramble should extract and track for each experiment
generated from this application definition. Additionally, success criteria can
be defined to help users know whether their experiment behaved the way it was
expected to or not.

Figure of merit
^^^^^^^^^^^^^^^

To begin with, we will add a figure of merit using the
:py:meth:`ramble.language.shared_language.figure_of_merit` directive. Figures
of merit are extracted using a regular expression match on some file in the
experiment directory. We will use the following definition to track whatever
the output from the experiment is as the possible hostname:

.. code-block:: python

    figure_of_merit(
        "possible hostname",
        fom_regex=r"(?P<hostname>\S+)",
        group_name="hostname",
        units="",
    )

In this directive, the name ``possible hostname`` will show up in the results
file after analysis of a workspace. The ``fom_regex`` argument controls what
regular expression is used to extract this figure of merit.
The ``group_name`` argument controls which named group in the regular
expression (from the ``fom_regex`` argument) holds the value of this specific
figure of merit. The ``units`` argument allows us to define the units of the
resulting figure of merit.

After defining this figure of merit, the following command can be used to
extract figures of merit from our experiments:

.. code-block:: console

    $ ramble workspace analyze

**NOTE**: This figure of merit definition will only extract the last line from
the experiment's output file. In the case of a parallel run, this will not
contain all of the hostnames. To build this list, we would use an in-memory
figure of merit, but we will leave the definition of this for a later tutorial.

Success Criteria
^^^^^^^^^^^^^^^^

To help users know if their experiments worked or not, application developers
can define success criteria. Success criteria can examine several aspects of an
experiment, including checking for the existence (or non-existence) of a
particular string, comparing the value of a figure of merit, or executing
arbitrary Python to determine if the experiment succeeded or failed.

For this tutorial, we will add a basic success criterion that simply checks
that something was written by the ``hostname`` executable. It should look like
the following:

.. code-block:: python

    success_criteria("wrote_anything", mode="string", match=r".*")

Putting it all together
-----------------------

At this point, we should have a fairly complete ``hostname`` application
definition that includes workloads for running locally on a given machine, and
in parallel on many different machines. Users should also get reasonable
figures of merit as output, and their experiments should inform them of failed
runs.

Application Definition
^^^^^^^^^^^^^^^^^^^^^^

Our complete application definition at this point is as follows:

.. code-block:: python

    from ramble.appkit import *


    class Hostname(ExecutableApplication):
        name = 'hostname'

        executable(
            "local-execute",
            "hostname {input_arguments}",
        )

        workload("local", executables=["local-execute"])

        executable(
            "parallel-execute",
            "hostname {input_arguments}",
            use_mpi=True,
            variables={"n_ranks": "{n_nodes}"}
        )

        workload("parallel", executables=["parallel-execute"])

        workload_variable(
            "input_arguments",
            default="",
            description="Arguments for executing `hostname`",
            workloads=["*"]
        )

        figure_of_merit(
            "possible hostname",
            fom_regex=r"(?P<hostname>\S+)",
            group_name="hostname",
            units="",
        )

        success_criteria("wrote_anything", mode="string", match=r".*")

Final Tests
^^^^^^^^^^^

To complete this tutorial, we will test the ``local`` workload to see how
everything works. To begin with, delete the tutorial workspace and recreate it
using:

.. code-block:: console

    $ ramble workspace deactivate
    $ rm -rf tutorial-workspace
    $ ramble workspace create -d tutorial-workspace -a

Now, we can add an experiment to exercise the local workload using:

.. code-block:: console

    $ ramble workspace manage experiments hostname --workload-filter local -v n_ranks=1 -v n_nodes=1

The ``--workload-filter local`` arguments are added here to filter the
workloads so we only use the local workload. Now, to complete the test we can
execute:

.. code-block:: console

    $ ramble workspace setup
    $ ramble on
    $ ramble workspace analyze

The result of these commands should be the creation of a
``results.latest.txt`` file that contains the hostname of your machine.

Summary and Final Cleanup
-------------------------

At this stage, you have created a new application definition to execute the
``hostname`` binary. You have tested it within a workspace, and have
constructed a custom object repository in which to create new definitions.

To clean up your system, make sure to deactivate your workspace before trying
to remove it. These steps can be completed with:

.. code-block:: console

    $ ramble workspace deactivate
    $ rm -rf tutorial-workspace

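As a supplementary illustration of the figure of merit used in this tutorial,
the sketch below applies a named-group regular expression to a sample log with
Python's ``re`` module. It is a simplified model and not Ramble's analysis
code: the ``extract_last_match`` helper is hypothetical, the named group is
assumed to be ``hostname`` (matching ``group_name``), and matching line by line
from the start of each line is an assumption made for the example.

.. code-block:: python

    import re

    # Named group "hostname" captures the first run of non-whitespace characters.
    FOM_REGEX = re.compile(r"(?P<hostname>\S+)")


    def extract_last_match(log_text, group_name="hostname"):
        """Return the value captured on the last matching line, or None.

        Hypothetical helper: later matches overwrite earlier ones, so only
        the final line's value survives.
        """
        value = None
        for line in log_text.splitlines():
            match = FOM_REGEX.match(line)
            if match:
                value = match.group(group_name)
        return value


    sample_log = "node-a\nnode-b\n"  # e.g. output from a two-node parallel run
    print(extract_last_match(sample_log))  # node-b

Keeping only the last match is why a parallel run reports a single hostname
rather than one per node.
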