DIAL is a framework for distributed analysis, i.e. it connects distributed computing resources with those who would like to use them to analyze existing data or generate new data. It was created and is maintained for use in the ATLAS experiment but is sufficiently generic that it can easily be used in other contexts.
DIAL presents its users with a data-centric, job-driven view. A user may examine an existing dataset or may define a job to create a new dataset and then examine that dataset after the job completes. DIAL enables users to monitor jobs: to follow their progress, examine partial results and, if desired, stop the job.
A job is defined by selecting an input dataset and specifying a transformation to act on that dataset. The transformation and input dataset are passed to a DIAL scheduler (typically a remote "analysis service"), which is responsible for carrying out the processing, providing status updates, and stopping the job if requested. By definition, a transformation applied to an input dataset produces an output dataset (the result).
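The job model above can be illustrated with a small sketch. This is a toy, in-process stand-in and not the DIAL scheduler interface: the names ToyJob, ToyScheduler, submit, run, status, stop and result, as well as the status strings, are assumptions made for illustration, and a dataset is modeled as a plain list of integers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Dataset = List[int]                          # stand-in for a real dataset
Transformation = Callable[[Dataset], Dataset]


@dataclass
class ToyJob:
    """A job is a transformation plus an input dataset (hypothetical model)."""
    transformation: Transformation
    input_dataset: Dataset
    status: str = "submitted"
    result: Dataset = field(default_factory=list)


class ToyScheduler:
    """Hypothetical scheduler: processes jobs, reports status, stops on request."""

    def __init__(self) -> None:
        self._jobs: Dict[int, ToyJob] = {}
        self._next_id = 0

    def submit(self, transformation: Transformation, input_dataset: Dataset) -> int:
        job_id = self._next_id
        self._next_id += 1
        self._jobs[job_id] = ToyJob(transformation, list(input_dataset))
        return job_id

    def run(self, job_id: int) -> None:
        job = self._jobs[job_id]
        if job.status == "stopped":          # a stopped job is never processed
            return
        job.status = "running"
        job.result = job.transformation(job.input_dataset)
        job.status = "done"

    def status(self, job_id: int) -> str:
        return self._jobs[job_id].status

    def stop(self, job_id: int) -> None:
        job = self._jobs[job_id]
        if job.status != "done":             # finished jobs keep their result
            job.status = "stopped"

    def result(self, job_id: int) -> Dataset:
        return self._jobs[job_id].result
```

In this sketch, processing happens synchronously when run is called; a real analysis service would process jobs remotely and asynchronously, so status queries and stop requests could arrive while a job is still running.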
Typically a user job is split into many subjobs which are processed in parallel. The splitting is accomplished by splitting the input dataset and then applying the same transformation to each subdataset. The scheduler then merges the result (output dataset) from each subjob to create the result for the user job. This hierarchical job model may be extended further, i.e. the subjobs may themselves be split, and so on.
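The split, process and merge pattern described above can be sketched as follows. This is a minimal illustration, not DIAL code: the function names split, merge, run_job and select_even are assumptions, a dataset is modeled as a list of integers, and a local thread pool stands in for the distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Dataset = List[int]                          # stand-in for a real dataset


def split(dataset: Dataset, n_parts: int) -> List[Dataset]:
    """Split a dataset into up to n_parts contiguous sub-datasets."""
    n_parts = max(1, min(n_parts, len(dataset)))
    size, extra = divmod(len(dataset), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first 'extra' parts receive one additional element.
        end = start + size + (1 if i < extra else 0)
        parts.append(dataset[start:end])
        start = end
    return parts


def merge(results: List[Dataset]) -> Dataset:
    """Merge subjob results by concatenating them in subjob order."""
    merged: Dataset = []
    for sub_result in results:
        merged.extend(sub_result)
    return merged


def run_job(transformation: Callable[[Dataset], Dataset],
            dataset: Dataset, n_subjobs: int) -> Dataset:
    """Apply the same transformation to every sub-dataset, then merge."""
    subdatasets = split(dataset, n_subjobs)
    with ThreadPoolExecutor(max_workers=len(subdatasets)) as pool:
        # pool.map returns results in the order of its inputs.
        sub_results = list(pool.map(transformation, subdatasets))
    return merge(sub_results)


def select_even(dataset: Dataset) -> Dataset:
    """Example transformation: keep the even entries."""
    return [x for x in dataset if x % 2 == 0]
```

With these definitions, run_job(select_even, list(range(10)), 3) gives the same result as applying select_even to the whole dataset. Concatenation is only a valid merge when the transformation treats each entry independently; other kinds of result, such as histograms, would need a different merge step, for example summation.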
Applications and tasks are characterized by their "task interface," i.e. the names and meanings of the files that an application expects to find in the task. A task may only be expected to work with an application if it provides the appropriate interface. An application is further characterized by its "environmental interface," i.e. the environment it expects and the means by which it is run.
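The idea of a task interface can be illustrated with a simple compatibility check. The sketch below assumes a task is represented as a mapping from file name to file content and an application declares the set of file names it requires; the function names and the example file names are hypothetical and are not taken from DIAL. It checks only the names of the files, not their meanings.

```python
from typing import Dict, FrozenSet, Set


def missing_files(required: FrozenSet[str], task_files: Dict[str, str]) -> Set[str]:
    """Return the file names the application expects but the task lacks."""
    return set(required) - set(task_files)


def is_compatible(required: FrozenSet[str], task_files: Dict[str, str]) -> bool:
    """A task fits an application if it provides every expected file."""
    return not missing_files(required, task_files)


# Hypothetical interface: the application expects these two task files.
REQUIRED = frozenset({"analysis.cxx", "config.txt"})

complete_task = {"analysis.cxx": "// user code", "config.txt": "cut=3"}
partial_task = {"analysis.cxx": "// user code"}

print(is_compatible(REQUIRED, complete_task))   # True
print(missing_files(REQUIRED, partial_task))    # {'config.txt'}
```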
A standard DIAL application provides two scripts: build_task and run. The former runs in a directory where the task files are present and uses them to create new files in that directory, e.g. compiling sources into a library. The latter uses the result of the first step along with an input dataset to create an output dataset. Applications that do not meet this interface are allowed and expected; specialized jobs or schedulers are then required to integrate them into a DIAL processing system.
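The two-step structure can be sketched as follows. Here Python functions named build_task and run stand in for the two scripts; the task file threshold.txt, the generated file built_cut.txt and the helper process are hypothetical names chosen for illustration, and the "build" step simply writes a derived file rather than compiling anything.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def build_task(task_dir: Path) -> None:
    """Stand-in for build_task: use the task files to create new files in place."""
    # Hypothetical task file holding a single integer threshold.
    threshold = int((task_dir / "threshold.txt").read_text())
    # A real application might compile sources into a library at this point.
    (task_dir / "built_cut.txt").write_text(str(threshold))


def run(task_dir: Path, input_dataset: List[int]) -> List[int]:
    """Stand-in for run: combine the build result with an input dataset."""
    cut = int((task_dir / "built_cut.txt").read_text())
    return [x for x in input_dataset if x >= cut]


def process(task_files: Dict[str, str], input_dataset: List[int]) -> List[int]:
    """Place the task files in a working directory, build once, then run."""
    with tempfile.TemporaryDirectory() as tmp:
        task_dir = Path(tmp)
        for name, content in task_files.items():
            (task_dir / name).write_text(content)
        build_task(task_dir)
        return run(task_dir, input_dataset)


print(process({"threshold.txt": "3"}, [1, 2, 3, 4, 5]))   # [3, 4, 5]
```

Separating the two steps means the build result depends only on the task files, so in principle it could be produced once and reused for every dataset or subjob processed with the same task.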
A few applications and example tasks are available for ATLAS users, with an emphasis on analysis of the Rome AOD datasets. Work is in progress to make the existing ATLAS production transformations available to DIAL users. ATLAS has initiated a project to define standard ATLAS transformations, and it is expected that these will be incorporated into the DIAL interface when they become available.
This document describes use of the ROOT-based client aided by web-based monitoring. There is also a Python binding, PyDIAL, with similar capabilities and integration with GANGA for users of that system.