DIAL release 1.30: Introduction

DIAL model

DIAL is a framework for distributed analysis, i.e. it provides a connection between distributed computing resources and those who would like to use such resources to analyze existing data or generate new data. It is created and maintained for use in the ATLAS experiment but is sufficiently generic that it can easily be used in other contexts.

DIAL presents its users with a data-centric, job-driven view. A user may examine an existing dataset, or may define a job to create a new dataset and then examine that dataset after the job completes. DIAL enables users to monitor a job to follow its progress, examine partial results and, if desired, stop the job.

A job is defined by selecting an input dataset and specifying a transformation to act on that dataset. The transformation and input dataset are passed to a DIAL scheduler (typically a remote "analysis service"), which is responsible for carrying out the processing, providing status updates, and stopping the job if requested. By definition, a transformation applied to an input dataset produces an output dataset (the result).

Typically a user job is split into many subjobs which are processed in parallel. The splitting is accomplished by splitting the input dataset and then applying the same transformation to each subdataset. The scheduler then merges the results (output datasets) from the subjobs to create the result for the user job. This hierarchical job model may be extended further, i.e. the subjobs themselves may be split and so on.
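
The split, process and merge sequence described above can be sketched in a few lines of Python. This is only an illustration and not DIAL code: a dataset is represented here by a plain list of event numbers, and the names split, merge, run_job and select_even are hypothetical. The subjobs are run one after another for simplicity, whereas a real scheduler would process them in parallel.

```python
from typing import Callable, List

# Assumption for illustration: a dataset is just a list of event numbers.
Dataset = List[int]


def split(dataset: Dataset, n_parts: int) -> List[Dataset]:
    """Split a dataset into at most n_parts contiguous subdatasets."""
    size = max(1, -(-len(dataset) // n_parts))  # ceiling division
    return [dataset[i:i + size] for i in range(0, len(dataset), size)]


def merge(results: List[Dataset]) -> Dataset:
    """Merge the subjob results into the result for the user job."""
    return [event for result in results for event in result]


def run_job(dataset: Dataset,
            transformation: Callable[[Dataset], Dataset],
            n_parts: int) -> Dataset:
    """Apply the same transformation to each subdataset, then merge."""
    sub_results = [transformation(sub) for sub in split(dataset, n_parts)]
    return merge(sub_results)


def select_even(events: Dataset) -> Dataset:
    """Hypothetical transformation: keep only even event numbers."""
    return [e for e in events if e % 2 == 0]


result = run_job(list(range(10)), select_even, n_parts=3)
print(result)
```

For a transformation that acts on each event independently, as in this sketch, the merged result is the same as the result of applying the transformation to the whole dataset in one job.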

Datasets

Input datasets may be the output of earlier DIAL jobs or may be obtained from other sources, e.g. ATLAS production. At present DIAL provides ATLAS users with a collection of datasets consisting mostly of AOD samples generated for the Rome physics meeting last June. The ATLAS data management group has recently begun to deploy a new dataset-based system. It is expected that the next release of DIAL will enable users to use that system for both input and output.

Transformations

A DIAL transformation has two components: an application and a task. The task is a collection of text files where the user provides information that characterizes the processing, e.g. software versions, run time parameters, and possibly code to be compiled and used during processing. The application uses the task to carry out the processing, i.e. to act on an input dataset to produce a result.

Applications and tasks are characterized by their "task interface," i.e. the names and meanings of the files that an application expects to find in the task. A task may only be expected to work with an application if it provides the appropriate interface. An application is further characterized by its "environmental interface," i.e. the environment it expects and the means by which it is run.
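
As a rough illustration of the task interface, the following Python sketch checks whether a task provides every file that an application expects. The file names and the functions missing_files and is_compatible are hypothetical and are not part of DIAL. The check covers only the names of the files; their meanings, which are also part of the interface, cannot be verified this way.

```python
from typing import Iterable, Set

# Hypothetical task interface: the file names an application expects
# to find in a task. These names are made up for illustration.
EXAMPLE_TASK_INTERFACE = {"release.txt", "parameters.txt", "analysis.cxx"}


def missing_files(task_files: Iterable[str], task_interface: Set[str]) -> Set[str]:
    """Return the expected files that the task does not provide."""
    return set(task_interface) - set(task_files)


def is_compatible(task_files: Iterable[str], task_interface: Set[str]) -> bool:
    """A task can only be expected to work if no expected file is missing."""
    return not missing_files(task_files, task_interface)


# Extra files in a task are harmless; missing ones are not.
good_task = ["release.txt", "parameters.txt", "analysis.cxx", "notes.txt"]
bad_task = ["release.txt", "analysis.cxx"]

print(is_compatible(good_task, EXAMPLE_TASK_INTERFACE))
print(missing_files(bad_task, EXAMPLE_TASK_INTERFACE))
```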

A standard DIAL application provides two scripts: build_task and run. The former runs in a directory where the task files are present and uses them to create new files in that directory, e.g. compiling sources into a library. The latter uses the result of the first step along with an input dataset to create an output dataset. Other applications that do not meet this interface are allowed and expected; specialized jobs or schedulers are then required to integrate these into a DIAL processing system.
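
The two-step pattern can be sketched as follows. In DIAL, build_task and run are scripts supplied by the application; the Python functions below are hypothetical stand-ins that only mimic the flow, and the file names threshold.txt and built.txt are invented for the example. The build step derives a new file from the task files in the task directory, and the run step uses that file together with an input dataset to produce an output dataset.

```python
import tempfile
from pathlib import Path
from typing import List


def build_task(task_dir: Path) -> None:
    """Stand-in for build_task: create a new file from the task files."""
    threshold = (task_dir / "threshold.txt").read_text().strip()
    (task_dir / "built.txt").write_text(f"threshold={threshold}\n")


def run(task_dir: Path, input_dataset: List[int]) -> List[int]:
    """Stand-in for run: use the build result and an input dataset."""
    built = (task_dir / "built.txt").read_text().strip()
    threshold = int(built.split("=", 1)[1])
    return [value for value in input_dataset if value >= threshold]


with tempfile.TemporaryDirectory() as tmp:
    task_dir = Path(tmp)
    # The user-provided task file (hypothetical name and content).
    (task_dir / "threshold.txt").write_text("5\n")
    build_task(task_dir)
    output_dataset = run(task_dir, [1, 5, 8, 3, 9])

print(output_dataset)
```

Keeping the build step separate means that its result can be reused for every subdataset processed with the same task.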

A few applications and example tasks are available for ATLAS users, with emphasis on analysis of the Rome AOD datasets. Work is in progress to make the existing ATLAS production transformations available to DIAL users. ATLAS has initiated a project to define standard ATLAS transformations, and it is expected that these will be incorporated into the DIAL interface when they become available.

Catalogs

Applications, tasks, datasets, and jobs are all tagged with unique identifiers and are stored in repositories where they may be accessed with these identifiers. This storage is now done automatically whenever a job is submitted. In addition, selected applications, tasks and datasets are published in selection catalogs indexed by names unique in the context of the catalog.
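
The relationship between repositories and selection catalogs can be illustrated with a minimal Python sketch. The classes Repository and SelectionCatalog are hypothetical and do not reflect the DIAL implementation; in particular, the use of UUIDs as identifiers is an assumption made only to obtain unique values.

```python
import uuid
from typing import Any, Dict


class Repository:
    """Stores objects under unique identifiers (UUIDs assumed here)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Any] = {}

    def store(self, obj: Any) -> str:
        object_id = uuid.uuid4().hex
        self._objects[object_id] = obj
        return object_id

    def fetch(self, object_id: str) -> Any:
        return self._objects[object_id]


class SelectionCatalog:
    """Maps names, unique within this catalog, to identifiers."""

    def __init__(self) -> None:
        self._ids: Dict[str, str] = {}

    def publish(self, name: str, object_id: str) -> None:
        if name in self._ids:
            raise ValueError(f"name already in use in this catalog: {name}")
        self._ids[name] = object_id

    def lookup(self, name: str) -> str:
        return self._ids[name]


repository = Repository()
catalog = SelectionCatalog()

dataset = ["event-1", "event-2"]             # hypothetical dataset content
dataset_id = repository.store(dataset)       # every stored object gets an ID
catalog.publish("example-aod", dataset_id)   # only selected objects get a name

print(repository.fetch(catalog.lookup("example-aod")))
```

Every object in the repository can be retrieved by its identifier, while only published objects can also be found by name.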

Doing some work

It is expected that the typical DIAL user will make use of a DIAL client to connect to a remote analysis service that provides the scheduler services described above. The following sections describe how to start up such a client, define a job, submit this job definition, and then monitor the progress of the job and examine results.

This document describes use of the ROOT-based client aided by web-based monitoring. There is also a Python binding, PyDIAL, with similar capabilities and integration with GANGA for users of that system.


DIAL release 1.30: Introduction, updated 17oct05