Main concepts
The goal of the job scheduler is to run interdependent jobs (i.e., one job might need one or several outputs from other jobs).
-
Resources: Any object is a resource, whether it is data, a job, or a server. Resources are located on different hosts that can be specified by a URI. For example, file:///a/b/c denotes a local folder or file, while xpm:token:user@hostname.org corresponds to a token used to limit the number of processes launched on a given computer.
-
Connectors: Connectors specify how a resource can be accessed and how processes can be launched. Single connectors (i.e., localhost and ssh) are built in. Composite connectors (for example, one describing a cluster of computers) can be built from single connectors.
-
Resource state: A resource can be in one of the following states: WAITING (waiting for dependencies to be met), HOLD (waiting for a user action, as a consequence of a dependency being in the ERROR or HOLD state), DONE (completed), or ERROR. For tasks that can be run, two other states are possible: READY (waiting to be run) and RUNNING.
-
Groups: It should be possible to assign a group to a set of experiments. For example, I run several series of experiments and call them "trec.test1" and "trec.test2". I can then operate on all the resources of a specific group, such as "trec" or "trec.test1".
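The group example above suggests that group names are hierarchical, with "." as the separator. The source does not state this rule explicitly, so the following is only a minimal sketch under that assumption. The function name in_group and the group "other.run" are hypothetical and not part of experimaestro.

```python
def in_group(resource_group: str, selector: str) -> bool:
    """Return True if resource_group is the selector itself or nested under it.

    Assumption: "." separates hierarchy levels, as in "trec.test1".
    """
    return resource_group == selector or resource_group.startswith(selector + ".")


# Hypothetical group names attached to some resources.
groups = ["trec.test1", "trec.test2", "other.run"]

# Selecting "trec" picks up both test series; "trec.test1" picks up only the first.
selected_trec = [g for g in groups if in_group(g, "trec")]
selected_test1 = [g for g in groups if in_group(g, "trec.test1")]
```

Matching on the selector followed by "." rather than on a bare string prefix keeps a group such as "trec.test10" from being selected by "trec.test1".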
General architecture
Resources
We have the following types of resources:
-
Data: the output of a job (a job can have several outputs). Some data may already have been generated by an external process (e.g., a data collection) and be declared to experimaestro as read. Data can be in three states: WAITING, HOLD, or DONE.
-
Task:
- Job: a task to be run, that produces a given set of resources.
- Server: a task that needs to be run; however, we do not wait for the server run to complete.
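The states listed under "Main concepts" and the resource types above can be sketched as a small model. This is an illustration, not the scheduler's actual implementation: the names State, DATA_STATES, TASK_STATES and state_from_dependencies are hypothetical, and the rule "a dependency is met when it is DONE" is an assumption (a server dependency, which is not waited on, may well be treated differently).

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"   # waiting for dependencies to be met
    HOLD = "hold"         # waiting for a user action
    READY = "ready"       # tasks only: waiting to be run
    RUNNING = "running"   # tasks only
    DONE = "done"
    ERROR = "error"


# Data can only be in three states; tasks can be in any of the six.
DATA_STATES = {State.WAITING, State.HOLD, State.DONE}
TASK_STATES = set(State)


def state_from_dependencies(dep_states):
    """State of a task that has not started yet, given its dependencies' states.

    Assumption: a dependency counts as "met" only when it is DONE.
    """
    deps = list(dep_states)
    if any(s in (State.ERROR, State.HOLD) for s in deps):
        return State.HOLD      # a failed or held dependency puts the task on hold
    if all(s is State.DONE for s in deps):
        return State.READY     # also covers a task without dependencies
    return State.WAITING
```

For instance, a task whose dependencies are DONE and RUNNING stays WAITING, while a single dependency in ERROR moves it to HOLD until a user intervenes.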
Status
Every resource (see above) has a unique ID, which is a path to a directory containing information about the resource on the host.
Based on the file ${FILE}, several paths are defined (note that not all files might be present):
${FILE}.lock locks write access to the status (it can also be used as an exclusive lock on the resource). This file is used whenever exclusive access is needed.
${FILE}.status contains the PID of the running process (two columns, PID and MODE, separated by a space), where MODE is r or w. This file is used when the resource can be accessed by a single writer and multiple readers.
${FILE}.run corresponds to the script needed to execute the job.
${FILE}.code corresponds to the error code at the end of the execution of the job.
${FILE}.done is created when the job was successfully executed or the data successfully generated.
${FILE}.err contains the error log output (jobs only).
${FILE}.out contains the standard log output (jobs only).
${FILE}.input contains the standard input (jobs only).
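The file layout above can be sketched in a few lines. This is only an illustration: the helper names status_paths and parse_status are hypothetical, and the status parser assumes one "PID MODE" entry per line, which the description suggests but does not state.

```python
from pathlib import Path

# Suffixes listed above; not every file has to exist for a given resource.
SUFFIXES = ("lock", "status", "run", "code", "done", "err", "out", "input")


def status_paths(base):
    """Map each suffix to the path derived from the base path ${FILE}."""
    return {suffix: Path(f"{base}.{suffix}") for suffix in SUFFIXES}


def parse_status(text):
    """Parse the content of a .status file into (pid, mode) pairs.

    Assumption: one entry per line, two space-separated columns PID MODE.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        pid, mode = line.split()
        if mode not in ("r", "w"):
            raise ValueError(f"unknown access mode: {mode!r}")
        entries.append((int(pid), mode))
    return entries


paths = status_paths("/a/b/job")
entries = parse_status("123 r\n456 w\n")
```

With such helpers, checking whether a job completed amounts to testing whether paths["done"] exists on the host.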
Using experimaestro
XML-RPC
The Experimaestro server can be reached through XML-RPC calls.
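As a rough illustration, an XML-RPC call can be issued from Python with the standard xmlrpc.client module. The methods exposed by the Experimaestro server are not listed here, so this sketch only uses system.listMethods, a common XML-RPC introspection method; whether the server supports it is an assumption, and the helper names are hypothetical.

```python
import xmlrpc.client


def build_call(method, *params):
    """Serialize an XML-RPC request body without sending it."""
    return xmlrpc.client.dumps(params, methodname=method)


def list_methods(url):
    """Ask a running server which methods it exposes.

    Assumption: the server implements the system.listMethods introspection call.
    """
    with xmlrpc.client.ServerProxy(url) as proxy:
        return proxy.system.listMethods()


# The request body that would be POSTed to the server for an introspection call.
request_body = build_call("system.listMethods")
```

Calling list_methods with the URL of a running server (the actual host, port and path depend on the server configuration) would return the list of method names, provided introspection is enabled.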