Report for ATLAS Grid and Data Challenge meeting
April 6, 2004
David Adams

The LCG workshop was held the week of March 22. A GANGA-LHCb-ATLAS meeting and an ADA meeting were also held that week.

Plans for ARDA were presented at the LCG workshop. The ARDA effort is divided into two pieces: the ARDA integration team headed by Massimo Lamanna and the EGEE middleware (MW) team headed by Frederic Hemmer.
-- The MW team has delivered its first report (copy accessible from the ADA documents page) and plans to deliver a prototype system based on AliEn by the end of this month.
-- The integration team has the mandate to deliver prototypes for each experiment based on this middleware.

The integration team has two people associated with each experiment; Frederik Orellana and Dietrich Liko are the ATLAS people. At present Frederik is working on ATLAS production (ATCOM) and Dietrich is finishing up tasks on the online system.

I presented the strategy for ATLAS distributed analysis at the GANGA-LHCb-ATLAS workshop with the hope that LHCb would move in that direction and increase the potential sharing of GANGA (and other) effort. However, Phillipe and Pere prefer to stay on their current course, which is to look to GANGA to deliver their distributed analysis system.

The system proposed for ATLAS separates client tools from high-level services, where the latter include tasks such as job splitting, result merging, job tracking and management, and cataloging.
This division facilitates the following:
-- Enabling these services to have a lifetime distinct from that of the client session
-- Dedicating common resources to supporting these services
-- Sharing of produced data
-- Enforcement of combined local and ATLAS-wide allocation policies
-- Recognized development by loosely connected groups
-- Common user interface for existing systems, the "final" system and the intermediaries

One important early task is to identify the ingredients of the interface to the high-level services. AJDL is a first pass at this identification. Recognized components are the dataset, the transformation (decomposed as application plus task) and the job, which describes both the low-level job run as a process on a CPU and the collection of such jobs run for a common purpose, e.g. to process a dataset. These components have generic interfaces so that high-level services can be reused. They are extensible to provide interfaces and implementations specific to particular experiments or tasks.

ADA recognizes a component, presently called the analysis service, that receives a request in the form of a transformation plus an input dataset and returns an output dataset. There may be a lot going on under the covers: job splitting, result merging, submission and tracking of jobs, enforcement of resource allocation policies, etc. These services may be cascaded, i.e. a request may be passed from one to another, possibly after job splitting. The different services may focus on different aspects of the processing or may be dedicated to a particular collection of hardware resources.

The analysis service is especially important in the short term because it provides a means to access different existing and upcoming systems with a common user interface. Within ATLAS, candidates include the production supervisor (i.e. the production DB), each of the executors, DIAL, GANGA services and the soon-to-be-delivered EGEE MW prototype.

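The analysis service idea (transformation plus input dataset in, output dataset out, with optional cascading) can be illustrated with a small Python sketch. This is not the AJDL or DIAL interface: every class name, method name and site label below is hypothetical, and the "processing" is a placeholder that only renames files.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a dataset: a name plus a list of files."""
    name: str
    files: Tuple[str, ...]


@dataclass(frozen=True)
class Transformation:
    """Application (what to run) plus task (how it is configured)."""
    application: str
    task: str


class AnalysisService(Protocol):
    """Common interface: transformation + input dataset -> output dataset."""
    def process(self, tr: Transformation, ds: Dataset) -> Dataset: ...


class LocalService:
    """Leaf service tied to one set of resources (assumed, for illustration).

    It pretends to run one low-level job per input file.
    """
    def __init__(self, site: str) -> None:
        self.site = site

    def process(self, tr: Transformation, ds: Dataset) -> Dataset:
        outputs = tuple(f"{self.site}:{tr.application}:{f}.out" for f in ds.files)
        return Dataset(f"{ds.name}.{tr.task}", outputs)


class SplittingService:
    """Cascading service: splits the input, forwards each part, merges results."""
    def __init__(self, downstream: List[AnalysisService]) -> None:
        self.downstream = downstream

    def process(self, tr: Transformation, ds: Dataset) -> Dataset:
        n = len(self.downstream)
        merged: List[str] = []
        for i, service in enumerate(self.downstream):
            # Round-robin split: every n-th file goes to the i-th service.
            part = Dataset(f"{ds.name}.part{i}", ds.files[i::n])
            if part.files:
                merged.extend(service.process(tr, part).files)
        return Dataset(f"{ds.name}.{tr.task}", tuple(merged))
```

Because the splitting service and the leaf services expose the same `process` call, a client does not need to know whether its request is handled directly or passed down a chain of services.
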
ADA has promised to deliver a system and do a demo at the May software workshop. The plan is to deliver something close to what we have now. The following summarizes what I am confident we can do with current resources (minimal system) and then what we would like to deliver with additional effort.

Contributors
------------
GANGA - David Adams, Karl Harrison
DIAL - David Adams, others
Swiss - Szymon Gadomski, Christian Haeberli
PAT - ATLAS physics analysis tools: Ketevi et al
Spain -
ATPROD -

Minimal system
--------------
Deliver a system that will allow users to fill histograms from AOD and to carry out reconstruction from bytestream. The user interface is a command line in Python or ROOT with syntax close to that of AJDL, i.e. a job is defined by application, task and dataset, monitored with a job interface, and its result is accessed as a dataset.

AJDL - Use something close to that presently used in DIAL.
ROOT command line interface - Delivered by DIAL.
Python command line interface - Delivered by GANGA, most likely as a wrapper around the C++ implementation.
Application - To fill histograms from AOD: delivered by PAT group. Reconstruction: delivered by Swiss group.
Dataset - Need bytestream, POOL event collection and ROOT histogram. Delivered by DIAL?
Analysis service - DIAL analysis service running at BNL.

Desirable
---------
Analysis service at other locations.
Production system implementation of analysis service.
Graphical interface with job submission, task manipulation and job management.
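
The command-line workflow described under "Minimal system" (define a job from application, task and dataset; monitor it; fetch the result as a dataset) might look roughly like the following Python sketch. It is purely illustrative and is not the DIAL or GANGA interface: `submit`, `Job`, `JobState`, `advance` and the example names are all assumptions, and job progress is simulated rather than polled from a service.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class JobState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Job:
    """Hypothetical client-side handle for a submitted job."""
    application: str
    task: str
    dataset: str
    state: JobState = JobState.SUBMITTED
    _result: Optional[str] = None

    def advance(self) -> None:
        """Simulate progress; a real client would query the analysis service."""
        if self.state is JobState.SUBMITTED:
            self.state = JobState.RUNNING
        elif self.state is JobState.RUNNING:
            self.state = JobState.DONE
            # The result is itself named like a dataset (placeholder naming).
            self._result = f"{self.dataset}.{self.task}.hist"

    def result(self) -> str:
        """Return the output dataset name once the job has finished."""
        if self.state is not JobState.DONE:
            raise RuntimeError("job not finished")
        return self._result


def submit(application: str, task: str, dataset: str) -> Job:
    """Define a job from its three ingredients and hand back a handle."""
    return Job(application, task, dataset)
```

A session would then be three steps: `job = submit("aodhisto", "fill_pt", "aod_sample")`, repeated checks of `job.state`, and finally `job.result()` to obtain the output dataset.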