ADA - ATLAS Distributed Analysis 

Contact: David Adams
Mailing list: atlas-analysis-tools


Reorganization
The ATLAS grid tools and services group has been reorganized, and Dietrich Liko has been appointed as the new distributed analysis coordinator.

Latest release
DIAL release 1.20 was made June 8, 2005. It is intended for general ATLAS use. All ATLAS physicists interested in distributed analysis are encouraged to try it out and send comments to the above mailing list. For more information, please see the release page.

Performance
Here is a recent plot showing the performance of the BNL LSF analysis service.
Here is an explanation of that plot.

ATLAS review
Here is an ADA overview for the ATLAS review.
Here is a DIAL overview.

Here is a description of the ADA components requested for the review.

Plans
Here is our wish list.


Introduction
ADA (ATLAS Distributed Analysis) is a project to deliver an end-to-end distributed analysis system for ATLAS. The scope of analysis is taken to be the manipulation and extraction of summary data (e.g. histograms) from any type of event data (TAG, AOD, ESD, ...) and the user-level production of such data. Distributed analysis extends the extraction and production to an environment where the users, data and processing are distributed over the grid.

Strategy
ADA is implemented as a collection of grid services consistent with the model being developed by the LCG ARDA project. Work has begun on a deployment model. Clients are provided for a variety of analysis environments including command line, Python and ROOT. Frequent releases allow rapid response to feedback from users.

The main focus of ADA is to identify and bring together the pieces required to assemble a useful distributed analysis system. We look to other projects to supply the ingredients. These projects include GANGA, DIAL, ARDA, ATLAS production, ATLAS data management and many of the grid projects around the globe.

Model
ADA describes data using datasets. Users can fetch the properties of a dataset (including the location of its data), browse the existing collection of datasets, make selections based on metadata, define transformations to create new datasets, create jobs to carry out these transformations and monitor those jobs. A transformation is expressed in terms of an application, which specifies the executable and shared libraries, and a task, which provides runtime configuration.
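
The relationships among these components can be sketched in a few lines of Python. This is an illustration only: the class names, field names and the select and run_locally helpers are hypothetical and are not the DIAL or AJDL classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dataset:
    """Hypothetical dataset: a name, the locations of its data, and metadata."""
    name: str
    files: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Application:
    """Hypothetical application: the executable and its shared libraries."""
    executable: str
    libraries: List[str] = field(default_factory=list)


@dataclass
class Task:
    """Hypothetical task: runtime configuration for an application."""
    config: Dict[str, str] = field(default_factory=dict)


@dataclass
class Job:
    """A job applies a transformation (application + task) to an input dataset."""
    application: Application
    task: Task
    input: Dataset
    status: str = "created"
    output: Optional[Dataset] = None


def select(catalog: List[Dataset], **criteria: str) -> List[Dataset]:
    """Return the datasets whose metadata matches every criterion."""
    return [
        d for d in catalog
        if all(d.metadata.get(key) == value for key, value in criteria.items())
    ]


def run_locally(job: Job) -> Job:
    """Stand-in for an analysis service: finish the job and attach a new dataset."""
    suffix = job.task.config.get("output", "result")
    job.output = Dataset(
        name=f"{job.input.name}.{suffix}",
        metadata={"type": "histograms", "parent": job.input.name},
    )
    job.status = "done"
    return job
```

A user session in this sketch would select a dataset by metadata, build a Job from an Application and a Task, hand it to a service, and then inspect job.status and job.output.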

The components dataset, application, task and job are described using AJDL (Abstract Job Definition Language). AJDL specifies XML schema for describing the content of the components and provides corresponding classes for accessing this data.
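
As a rough illustration of pairing an XML description with a class that reads and writes it, the following Python sketch round-trips a task through XML. The element names, attribute names and the TaskDescription class are assumptions made for this example; they are not taken from the actual AJDL schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TaskDescription:
    """Hypothetical task component: a name plus runtime parameters."""
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)

    def to_xml(self) -> str:
        """Serialize to an XML string (element names are illustrative)."""
        root = ET.Element("task", {"name": self.name})
        for key, value in sorted(self.parameters.items()):
            ET.SubElement(root, "parameter", {"name": key, "value": value})
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "TaskDescription":
        """Rebuild the object from the XML produced by to_xml."""
        root = ET.fromstring(text)
        parameters = {
            p.get("name"): p.get("value") for p in root.findall("parameter")
        }
        return cls(name=root.get("name"), parameters=parameters)
```

Because the XML form carries the full content of the component, a description written by one client can be read back by any other client or service that understands the same schema.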

ADA identifies a small number of high-level services with relatively simple interfaces. Users, or more precisely user client programs, interact directly with these services. The interfaces for these services are expressed in AJDL. Fixing the interfaces of these services allows us to mix and match clients and services. A service meeting this interface is available to all clients and a new client written to the interface immediately gains access to all existing services.
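
The mix-and-match property can be illustrated with a small Python sketch. The AnalysisService interface, its submit and status methods, and the two toy implementations are hypothetical and do not reproduce the ADA service interfaces; the point is only that the client function depends on the interface and not on any particular implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict


class AnalysisService(ABC):
    """Hypothetical fixed interface offered by every service implementation."""

    @abstractmethod
    def submit(self, application: str, task: str, dataset: str) -> str:
        """Start a job and return its identifier."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Return the current state of a job."""


class LocalService(AnalysisService):
    """Toy implementation that finishes every job immediately."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self, application: str, task: str, dataset: str) -> str:
        job_id = f"local-{len(self._jobs) + 1}"
        self._jobs[job_id] = "done"
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]


class BatchService(AnalysisService):
    """Toy implementation that leaves jobs queued, as a batch system might."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self, application: str, task: str, dataset: str) -> str:
        job_id = f"batch-{len(self._jobs) + 1}"
        self._jobs[job_id] = "queued"
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]


def run_analysis(service: AnalysisService, dataset: str) -> str:
    """Client code: written once against the interface, works with any service."""
    job_id = service.submit("my_application", "my_task", dataset)
    return service.status(job_id)
```

Here run_analysis can be handed a LocalService or a BatchService without change, and a new service class becomes usable by the same client as soon as it implements the two methods.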

On the back end we expect many different implementations of these services to handle different requirements (e.g. interactive analysis vs. large-scale batch production), different resources (farms and grids) and to make use of existing job management systems including batch systems and grid workload management systems.

Similarly, we expect different analysis environments will provide different clients for these services to meet the varying requirements and tastes of the user communities. These environments include Python-based frameworks like GANGA, the C++/CINT-based ROOT, Java-based environments like JAS as well as command line and web page interfaces.

Subprojects
ADA is broken up into the subprojects listed below. Each entry is linked to a page providing more information.
     Authentication and authorization
     Service infrastructure
     AJDL - Abstract Job Definition Language
     Analysis services
     Catalog services
     Data access and movement
     Package management
     Clients for these services
     Transformations
     Datasets
     Deployment
     Monitoring

Releases
Release 1.20 of DIAL is the basis for the current ADA system. For more information, please see the release page.


Last modified 14jul04 by dla