Chapter 12. RMS plug-ins

Table of Contents

12.1. For Developers
12.1.1. Basic concept
12.1.2. Requirements
12.1.3. API
12.2. SSH
12.2.1. Resource definition
12.2.2. Example usage
12.3. Localhost
12.3.1. Introduction
12.3.2. Example usage
12.4. SLURM
12.4.1. Sandbox directory
12.4.2. User configuration
12.4.3. Usage example

12.1. For Developers

12.1.1. Basic concept

DDS allows external developers to write their own RMS plug-ins.

Conceptually, each RMS plug-in is just an executable, which uses a simple DDS plug-in API and is able to deploy and execute a DDS worker package on a corresponding RMS.

The following is a basic workflow:

  • The user requests deployment of DDS agents on a given RMS using the dds-submit --rms XXXX command, where XXXX is the name of the plug-in the user wants to use.

  • The DDS commander server receives the request, looks for a suitable plug-in (associated with the XXXX name) and starts it. The plug-in has 2 minutes to connect back to the commander and receive the exact details of the submit request.

  • Once the plug-in is started, it should contact the DDS commander server using the DDS API, receive the details and deploy agents on the given RMS. That is all there is to it so far.
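
The lookup in the second step can be pictured as a simple mapping from the XXXX name to an executable path. The following is a minimal sketch, not the commander's actual code: it assumes the default plug-in layout described under Requirements and assumes that the executable has the same name as its folder. The function name pluginExecutablePath is hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the DDS API): maps an RMS name such as
// "slurm" to the assumed default plug-in executable path.
std::string pluginExecutablePath(const std::string& ddsLocation, const std::string& rmsName)
{
    // Plug-in names must be non-empty and must not contain upper case characters.
    const bool hasUpper = std::any_of(rmsName.begin(), rmsName.end(),
                                      [](unsigned char c) { return std::isupper(c) != 0; });
    if (rmsName.empty() || hasUpper)
        throw std::invalid_argument("plug-in name must be non-empty and lower case");

    const std::string pluginName = "dds-submit-" + rmsName;
    // Assumed layout: <DDS_LOCATION>/plugins/dds-submit-XXXX/dds-submit-XXXX
    return ddsLocation + "/plugins/" + pluginName + "/" + pluginName;
}
```

For example, with a DDS location of /opt/dds (an illustrative path) and the name slurm, the function returns /opt/dds/plugins/dds-submit-slurm/dds-submit-slurm.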

12.1.2. Requirements

  • DDS requires each plug-in to be named according to the following format: dds-submit-XXXX, where XXXX is the name of the plug-in (or the name of the RMS it wraps). The name must consist of lower case characters only.

  • A DDS plug-in (executable) and all related files must be sandboxed in a dedicated folder: path/dds-submit-XXXX/. The folder path is provided as a command line argument to all plug-ins. The default location of plug-ins is $DDS_LOCATION/plugins/dds-submit-XXXX/.

  • A DDS plug-in should take two command line arguments

    [--id arg]

    and

    [--path arg]

    DDS will call the plug-in with these command line arguments and will provide a unique ID and a plug-in directory path. The ID must be used whenever the plug-in communicates with the DDS commander server (see "plug-in-id" in the API section for more info). The plug-in's directory path can be used to access related files if needed.

  • Plug-ins are responsible for removing all of their own temporary files on exit. DDS doesn't take ownership of any file created by plug-ins.
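
Reading the two required arguments needs no particular library. The following is a minimal sketch using only the standard library; the PluginArgs struct and the parsePluginArgs function are hypothetical names, and the sketch only handles the "--id value" and "--path value" forms.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the two arguments DDS passes to a plug-in.
struct PluginArgs
{
    std::string id;   // value of --id
    std::string path; // value of --path
};

// Parses "--id <value>" and "--path <value>" from an argument list
// (without the program name). Illustrative only: the "--opt=value" form
// is not handled and unknown arguments are ignored.
PluginArgs parsePluginArgs(const std::vector<std::string>& args)
{
    PluginArgs result;
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        if (args[i] != "--id" && args[i] != "--path")
            continue; // ignore anything this sketch does not know about
        if (i + 1 >= args.size())
            throw std::invalid_argument("missing value for " + args[i]);
        (args[i] == "--id" ? result.id : result.path) = args[i + 1];
        ++i; // skip the value that was just consumed
    }
    if (result.id.empty() || result.path.empty())
        throw std::invalid_argument("both --id and --path are required");
    return result;
}
```

In a plug-in's main function this could be called as parsePluginArgs({ argv + 1, argv + argc }); the resulting id is the value to hand over wherever the API section uses "plug-in-id".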

12.1.3. API

The dds::intercom_api::CRMSPluginProtocol class is a wrapper for the communication between a plug-in and the DDS commander server.

Once started and ready, the plug-in should subscribe to the "submit" and "message" commands from the DDS commander server.

CRMSPluginProtocol prot("plug-in-id");

// Capture prot by reference so that the handler can call stop() on it.
prot.onSubmit([&prot](const SSubmit& _submit) {
	// Implement submit related functionality here.

	// After submit has completed call stop() function.
	prot.stop();
});


prot.onMessage([](const SMessage& _message) {
	// Message from commander received.
	// Implement related functionality here.
});

onSubmit delivers the actual request, dds::intercom_api::SSubmit, to the plug-in. The request can contain either a configuration file (the format of the file is plug-in dependent) or simply a number of agents to deploy. It will always contain the path to the worker package, which the plug-in is supposed to deploy on the RMS and execute. Additionally, developers can use a DDS command line tool to find out the location of the worker package: dds-user-defaults --wrkscript. This is especially useful when plug-ins use shell scripts.

Once the plug-in is ready, let the DDS commander know that it is online and ready for a job:

// Let DDS commander know that we are online and start listening for notifications.
prot.start();

After that the commander will form a submit request and send it back to the plug-in. By default this call blocks the main thread until one of the following conditions is true:

  • 10 minutes timeout,

  • Failed connection to DDS commander or disconnection from DDS commander,

  • Explicit call of the stop() function

If you do not want to block the thread, use:

// "false" means that we do not block the thread
prot.start(false);

If there are no subscribers the thread is not blocked in any case.

Once connected, you can use prot.sendMessage to send messages. Those messages will be displayed to the user while he/she waits for the dds-submit command to finish. Be advised that once the commander receives an error message, it will forward it to the user and close the connection, as an error message means a failed submission.

We strongly recommend wrapping CRMSPluginProtocol calls in a try/catch block, as all of its methods can throw std::exception:

// Declared outside the try block so that it is still in scope in the catch handler.
CRMSPluginProtocol prot("plug-in-id");

try {
	prot.onSubmit([&prot](const SSubmit& _submit) {
		// Implement submit related functionality here.

		// Report something back to the user.
		prot.sendMessage(dds::intercom_api::EMsgSeverity::info, "Text of the info message");

		// After submit has completed call stop() function.
		prot.stop();
	});

	prot.onMessage([](const SMessage& _message) {
		// Message from commander received.
		// Implement related functionality here.
	});

	// Let DDS commander know that we are online and start listening for notifications.
	prot.start();
} catch (const std::exception& _e) {
	// Report error to DDS commander.
	prot.sendMessage(dds::intercom_api::EMsgSeverity::error, _e.what());
}

12.2. SSH

12.2.1. Resource definition

DDS's SSH plug-in can deploy DDS agents on any machine that is reachable via password-less SSH access (public key, ssh agent, etc.). To deploy agents on several computing nodes, define the resources for the SSH plug-in in a comma-separated values (CSV) configuration file. The SSH plug-in can also spawn agents on the local machine only. In this case you don't need a configuration file; just use dds-submit -r ssh -n X, where X is the desired number of agents to spawn. In the configuration file, fields are normally separated by commas. If you want to put a comma in a field, you need to put quotes around the field. Three escape sequences are also supported.

Table 12.1. DDS's SSH plug-in configuration fields

Field   Description
1       id (must be any unique string). This id string is used just to distinguish different DDS workers in the plug-in.
2       a host name with or without a login, in a form: login@host.fqdn
3       additional SSH params (could be empty)
4       a remote working directory
5       a number of agents to spawn


Example 12.1. An example of an SSH plug-in configuration file

r1, anar@lxg0527.gsi.de, -p24, /tmp/test, 10
# this is a comment
r2, user@lxi001.gsi.de,,/home/user/dds,10
125, user2@host, , /tmp/test,
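
To make the field rules concrete, here is a minimal sketch of how one line of such a file could be split into its five fields. It is not the plug-in's actual parser: the function names are hypothetical, lines starting with # are treated as comments as in the example above, and the escape sequences are deliberately left out because they are not specified here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Removes leading and trailing spaces and tabs.
std::string trim(const std::string& s)
{
    const std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos)
        return "";
    const std::size_t last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits one configuration line into trimmed fields. Commas inside double
// quotes do not split a field. Comment and blank lines yield no fields.
// Escape sequences are not handled in this sketch.
std::vector<std::string> splitConfigLine(const std::string& line)
{
    const std::string trimmed = trim(line);
    if (trimmed.empty() || trimmed[0] == '#')
        return {};

    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;
    for (char c : trimmed)
    {
        if (c == '"')
            inQuotes = !inQuotes; // quotes delimit a field, they are not kept
        else if (c == ',' && !inQuotes)
        {
            fields.push_back(trim(current));
            current.clear();
        }
        else
            current += c;
    }
    fields.push_back(trim(current)); // last field (may be empty)
    return fields;
}
```

Applied to the first line of Example 12.1, the function yields the five fields r1, anar@lxg0527.gsi.de, -p24, /tmp/test and 10; the comment line yields no fields at all.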


12.2.2. Example usage

Call using a given configuration file:

dds-submit -r ssh -c your-ssh-Resource-definition-config-file

Call using a local system only to spawn 10 DDS agents on it:

dds-submit -r ssh -n 10

12.3. Localhost

12.3.1. Introduction

DDS's localhost plug-in deploys DDS agents on the local machine. Unlike the SSH plug-in, the localhost plug-in doesn't require password-less access (public key, ssh agent, etc.), and it doesn't need a configuration file. The plug-in spawns one agent with a defined number of task slots on the local machine only. Just use dds-submit -r localhost --slots X, where X is the desired number of task slots.

12.3.2. Example usage

Call using a local system only to spawn 1 DDS agent with 10 task slots:

dds-submit -r localhost --slots 10

12.4. SLURM

12.4.1. Sandbox directory

If your home directory is not shared on the SLURM cluster, you must define a sandbox directory. DDS will use it to store the SLURM job script, and all jobs' working directories will be located there as well. Please note that at the moment DDS doesn't clean up jobs' working directories, so you are responsible for removing them if needed.

To set the sandbox directory, change the DDS global option "server.sandbox_dir" in the DDS configuration file DDS.cfg (default location: $HOME/.DDS/DDS.cfg).

12.4.2. User configuration

Using the dds-submit -c My_SLURM.cfg command, you can provide additional configuration options for DDS SLURM jobs. For example, the following command will submit 10 DDS agents (each with 50 task slots) and will use the additional SLURM configuration options provided in My_SLURM.cfg:

dds-submit -r slurm -n 10 --slots 50 -c My_SLURM.cfg

Caution

The custom SLURM job configuration file can contain any sbatch parameter, except "srun" and "--array".

For example, My_SLURM.cfg can contain:

#SBATCH -A "account"
#SBATCH --time=00:30:00
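
Since "srun" and "--array" are not allowed, it can be handy to check a configuration file before submitting it. The following is a minimal sketch of such a check; it is not something DDS provides, the function name findDisallowedLines is hypothetical, and the plain substring search is an assumption made for brevity.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical pre-flight check (not part of DDS): returns the lines of a
// custom SLURM configuration that mention "srun" or "--array".
// A plain substring search is used, so harmless lines that merely contain
// these strings are flagged too, and other spellings are not detected.
std::vector<std::string> findDisallowedLines(const std::string& configText)
{
    std::vector<std::string> offending;
    std::istringstream input(configText);
    std::string line;
    while (std::getline(input, line))
    {
        if (line.find("srun") != std::string::npos || line.find("--array") != std::string::npos)
            offending.push_back(line);
    }
    return offending;
}
```

For the two-line My_SLURM.cfg shown above the function returns an empty list; a file containing a line such as #SBATCH --array=0-9 would have that line reported.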

12.4.3. Usage example

Submit 10 DDS agents to a SLURM cluster. On the SLURM submitter machine execute:

dds-submit -r slurm -n 10

	dds-submit: Contacting DDS commander on lxbk0200.gsi.de:20001 ...
	dds-submit: Connection established.
	dds-submit: Requesting server to process job submission...
	dds-submit: Server reports: Creating new worker package...
	dds-submit: Server reports: RMS plug-in: /u/manafov/DDS/1.1.61.g474ddc6/plugins/dds-submit-slurm/dds-submit-slurm
	dds-submit: Server reports: Initializing RMS plug-in...
	dds-submit: Server reports: RMS plug-in is online. Startup time: 17ms.
	dds-submit: Server reports: Plug-in: Generating SLURM Job script...
	dds-submit: Server reports: Plug-in: Preparing job submission...
	dds-submit: Server reports: Plug-in: pipe log engine: Submitting DDS Job on the SLURM cluster...

	dds-submit: Server reports: Plug-in: pipe log engine: SLURM: Submitted batch job 9539993

	dds-submit: Server reports: Plug-in: DDS agents have been submitted

Check the status of your SLURM jobs:

scontrol show job 9539993
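
The job ID passed to scontrol is the number from the "Submitted batch job" line of the dds-submit output. If you want to pick it up programmatically, the following minimal sketch shows one way to do it; the function name extractSlurmJobId is hypothetical, and the sketch assumes the message format shown in the log above.

```cpp
#include <cstddef>
#include <string>

// Returns the job ID from a line containing "Submitted batch job <digits>",
// or an empty string if the line has no such marker.
std::string extractSlurmJobId(const std::string& line)
{
    const std::string marker = "Submitted batch job ";
    const std::size_t pos = line.find(marker);
    if (pos == std::string::npos)
        return "";
    const std::size_t begin = pos + marker.size();
    // The ID ends at the first non-digit character, or at the end of the line.
    const std::size_t end = line.find_first_not_of("0123456789", begin);
    return line.substr(begin, end == std::string::npos ? std::string::npos : end - begin);
}
```

Lines without the marker, such as "dds-submit: Connection established.", give an empty string, so the function can be applied to every line of the output in turn.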

Check the status of your DDS agents:

dds-info -ln 

Once agents are online, use DDS as normal.