DIAL release 1.30: Submitting and monitoring a job

ROOT interface

The ROOT interface in DIAL 1.30 provides commands that make it easy to submit and monitor a job. Simply type the submit command to submit a job using the application, task and dataset referenced by the global variables papp, ptsk and pdst. Typically these are defined as described in the previous section. The ID assigned to the job is recorded in the global variable jid.

The status of the job may be checked with the command

The job should quickly move from the INITIALIZED to the RUNNING state and will end up in the DONE state when it completes successfully or FAILED if there is an unrecoverable error.
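
The state sequence described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the JobState enum, is_terminal, state_name and wait_for_job are hypothetical names invented here and are not part of the DIAL interface, and only the four states named in the text are modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the four job states named in the text (not the DIAL API).
enum class JobState { Initialized, Running, Done, Failed };

// A job is finished once it reaches DONE or FAILED.
bool is_terminal(JobState s) {
  return s == JobState::Done || s == JobState::Failed;
}

// Map a state to the name used in the text.
std::string state_name(JobState s) {
  switch (s) {
    case JobState::Initialized: return "INITIALIZED";
    case JobState::Running:     return "RUNNING";
    case JobState::Done:        return "DONE";
    case JobState::Failed:      return "FAILED";
  }
  return "UNKNOWN";
}

// Call a caller-supplied status function until the job reaches a terminal
// state or max_polls calls have been made; return the last state seen.
template <typename StatusFn>
JobState wait_for_job(StatusFn get_status, int max_polls) {
  assert(max_polls > 0);
  JobState s = get_status();
  for (int i = 1; i < max_polls && !is_terminal(s); ++i) {
    s = get_status();
  }
  return s;
}
```

In practice the status function would wrap whatever call reports the state of the job identified by jid, and a real polling loop would pause between calls.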

You can exert additional control over how and where the processing is done by specifying non-default analysis service or job preferences.

Web monitor

Find the link to the web page for your analysis service (e.g. from the service page) and follow the link to the "PrimaryJobRepository". Your job should be near the top of this list. Click on the link from the job ID to get a description of the job, including its current status (refresh the page to update it). Follow links to the application, task, input dataset and preferences for descriptions of those items. There is also a link to the output dataset (result) once it is available. Note that links to partial results will not be valid, as these are not stored in the dataset repository.

Each primary job page also provides links to similar descriptions of the subjobs that comprise the job. You can follow these to find information such as where the subjob ran and its status and return code after completion.

Choosing an analysis service

By default, jobs are sent to the interactive analysis service running at BNL. If your job is long-running, or you would like to make use of resources at another site or use a different workload management system, you can specify a different analysis service.

The easiest way to specify the analysis service is to put its URL in the file scheduler in the directory where root is run. The web interface described in the getting started section can be used to select a scheduler and find its URL. Note that the service URL is not the same as the web page address (typically the port number is incremented).
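
As a rough illustration of the scheduler file convention, the sketch below reads a service URL from such a file. It assumes the file holds the URL on a single line, which the text does not state, and the function name read_service_url is hypothetical; this is not how DIAL itself is known to read the file.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Return the first non-blank line of the file, trimmed of surrounding
// whitespace, or an empty string if the file is missing or blank.
// Assumption: the file holds the service URL on a line of its own.
std::string read_service_url(const std::string& path = "scheduler") {
  assert(!path.empty());
  std::ifstream in(path);
  std::string line;
  while (std::getline(in, line)) {
    const auto first = line.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) continue;  // skip blank lines
    const auto last = line.find_last_not_of(" \t\r\n");
    return line.substr(first, last - first + 1);
  }
  return "";
}
```

An empty return value would correspond to the default case, in which no scheduler file is present and the default analysis service is used.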

Job preferences

By default, jobs are submitted with no preferences. If you wish to provide preferences, then the global variable pprf must point to the desired preferences when the job is submitted. The example macro create_preferences.C may be used for this purpose, either as is or after modification. Run it in the usual way:

The variable max_retry specifies the maximum number of times a failed subjob may be resubmitted before the service declares the parent compound job to be failed. The default is zero.
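
The effect of max_retry can be sketched as follows. This is one reading of the rule above, in which a subjob runs once and may then be resubmitted up to max_retry times; the SubjobResult struct and the run_with_retry function are hypothetical and do not come from DIAL.

```cpp
#include <cassert>
#include <functional>

// Hypothetical outcome for one subjob.
struct SubjobResult {
  bool succeeded;
  int attempts;  // total runs, including the first submission
};

// Run a subjob, resubmitting it after a failure at most max_retry times.
// With the default max_retry = 0 a single failure is final.
SubjobResult run_with_retry(const std::function<bool()>& run_once,
                            int max_retry = 0) {
  assert(max_retry >= 0);
  int attempts = 0;
  while (attempts <= max_retry) {
    ++attempts;
    if (run_once()) return {true, attempts};
  }
  return {false, attempts};  // the parent job would now be declared failed
}
```

Under this reading, a subjob that fails twice and then succeeds completes with max_retry set to 2 but causes the parent job to fail with max_retry set to 1.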

Input datasets may be constructed from other datasets in a tree structure. Most Rome AOD datasets have subdatasets with a maximum of fifty files. The splitter parameter dataset_depth specifies the depth in this tree used for splitting (default 1), and min_dataset specifies the minimum number of constituent datasets in a subjob (default 1). The parameter min_event specifies the minimum number of events per job.
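
To make the roles of dataset_depth and min_dataset concrete, the sketch below splits a dataset tree into subjobs. It is a guess at the general idea rather than the DIAL splitter: the DatasetNode type and both functions are hypothetical, the handling of a short final group is an assumption, and min_event is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical dataset tree node (not DIAL's dataset class).
struct DatasetNode {
  std::string name;
  std::vector<DatasetNode> children;
};

// Collect the datasets found `depth` levels below `node`.
// A dataset with no children is kept even if it is reached early.
void collect_at_depth(const DatasetNode& node, int depth,
                      std::vector<std::string>& out) {
  if (depth == 0 || node.children.empty()) {
    out.push_back(node.name);
    return;
  }
  for (const DatasetNode& child : node.children) {
    collect_at_depth(child, depth - 1, out);
  }
}

// Group the datasets at dataset_depth into subjobs of min_dataset each.
// Assumption: a final group smaller than min_dataset is merged into the
// previous one so that every subjob meets the minimum.
std::vector<std::vector<std::string>> split_into_subjobs(
    const DatasetNode& root, int dataset_depth = 1, int min_dataset = 1) {
  assert(dataset_depth >= 0 && min_dataset >= 1);
  std::vector<std::string> parts;
  collect_at_depth(root, dataset_depth, parts);

  const std::size_t n = static_cast<std::size_t>(min_dataset);
  std::vector<std::vector<std::string>> subjobs;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (i % n == 0) subjobs.emplace_back();  // start a new subjob
    subjobs.back().push_back(parts[i]);
  }
  if (subjobs.size() > 1 && subjobs.back().size() < n) {
    const std::vector<std::string> tail = subjobs.back();
    subjobs.pop_back();
    subjobs.back().insert(subjobs.back().end(), tail.begin(), tail.end());
  }
  return subjobs;
}
```

With the defaults (dataset_depth 1, min_dataset 1), a dataset with five subdatasets would yield five subjobs in this sketch; raising min_dataset to 2 would yield two subjobs, of two and three subdatasets.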


DIAL release 1.30: Defining a job, updated 17oct05