NOVA
Networked Object-Based EnVironment for Analysis

News
02/05 NOVA presentation at CHEP'2000 February 7-11, Padova (ppt)
01/18 NOVA short paper for CHEP'2000 February 7-11, Padova (pdf)
08/09 Development of a State of the Art Object Oriented Analysis Framework (pdf) (doc)
07/20 NOVA presentation at US ATLAS Collaboration Meeting July 21-22, BNL (pdf) (ppt)
07/19 ATLAS Parameters Database
06/16 STAR Parameters Database

Motivation

The new generation of HENP experiments, such as those at RHIC and LHC, must contend with processing and mining unprecedented amounts of data using highly complex analysis software developed and used by large worldwide communities of physicists. Object oriented (OO) programming has been identified and adopted by these communities as an efficient and powerful approach to developing capable, robust, maintainable software in this environment.

Vital to fully realizing the benefits of object oriented technology is careful attention to how this technology is employed and delivered. In our view, object oriented frameworks -- integrated sets of classes providing solutions to problems of large-scale data analysis -- provide the necessary foundation to dramatically improve the software development process for future applications.

The BNL-based core computing group of RHIC's STAR experiment has developed an object oriented analysis framework that itself builds on prior STAR work on the STAF analysis framework and an ongoing program of collaboration with the CERN team developing the ROOT system. The ROOT system is the foundation of STAR's analysis framework. The NOVA (Networked Object-based Environment for Analysis) project seeks to leverage the expertise developed through this existing BNL effort to develop an application-neutral, object oriented analysis framework that incorporates the latest standards and technologies in component software, component middleware and distributed computing to support a new level of distributed object oriented physics analysis.

Several barriers confront the application developer when moving to OO software: the need for retraining and adaptation to a new approach to designing software; the learning curve that must be climbed in order to produce quality OO software designs; and the cost in time and effort of establishing a functional and productive infrastructure capable of supporting OO application development. NOVA will help to solve these problems by providing an object-oriented infrastructure, a consistent application programming model, and data analysis templates which can be used to start building applications. These components will ease the transition to object oriented technology by providing a well-tested baseline of functionality and services. Physicists can design their solutions using a proven programming model instead of developing (reinventing) a unique approach. Finally, the use of a shared architecture will make it easier to integrate solutions from different experiments.

The success of this effort will constitute a first step in establishing an important new computational science component in the research effort at BNL. It will lead to a BNL-supported software product that will provide new capabilities serving BNL and HENP community physicists participating both in BNL-hosted research such as the RHIC program and in worldwide collaborations such as the LHC. It will improve the depth and visibility of the Laboratory's contribution to HENP community software and better position the Laboratory for important roles in computing and software for near and long term projects.

Goals
To develop a set of software tools which can be applied in many varied global computing environments (RHIC, LHC, muon collider...). These tools will be used via implementation-neutral interfaces, with select implementations provided for products of wide application or interest in the community (e.g. ROOT, Objectivity, and the Grand Challenge HENP data access project), to the extent possible with the available manpower.
Approach and Architecture
Many well-developed experiments already have established object oriented frameworks in production or under development. However, the present generation of object oriented analysis frameworks is very limited in several respects. The NOVA project will not reinvent or evolve existing analysis frameworks, but rather will provide new capabilities in these areas.

Analysis frameworks have typically been large monolithic systems requiring full 'buy-in' to the system in order to make use of them, with ROOT being an example of such a "vertically integrated" system. A more recent trend, which we will follow, is to develop modular components providing application-neutral interfaces which can be used in isolation to extend the capability of existing analysis systems. The NOVA framework will consist of small, interoperable components designed for flexibility and ease of reuse. We will provide select implementations of these components using HENP and software community standard tools and will integrate and test them with at least one large "vertical" analysis framework (ROOT).
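
The idea of an application-neutral interface with select implementations can be illustrated with a minimal C++ sketch. This is not NOVA code: the names `ParameterStore`, `InMemoryParameterStore` and `scaledParameter` are hypothetical and chosen only for illustration. It assumes a component that looks up named run parameters, with analysis code depending on the abstract interface alone.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical application-neutral interface (not an actual NOVA class).
// Analysis code depends only on this abstract type.
class ParameterStore {
public:
    virtual ~ParameterStore() = default;
    virtual void set(const std::string& name, double value) = 0;
    // Writes the value to 'out' and returns true if 'name' is known.
    virtual bool get(const std::string& name, double& out) const = 0;
};

// One possible implementation, kept in memory for illustration.
// A database- or file-backed store could implement the same interface
// without changes to client code.
class InMemoryParameterStore : public ParameterStore {
public:
    void set(const std::string& name, double value) override {
        values_[name] = value;
    }
    bool get(const std::string& name, double& out) const override {
        auto it = values_.find(name);
        if (it == values_.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<std::string, double> values_;
};

// Client code written against the interface only: returns the stored
// parameter (or 'fallback' if it is missing) multiplied by 'factor'.
double scaledParameter(const ParameterStore& store, const std::string& name,
                       double factor, double fallback) {
    assert(!name.empty());  // precondition: a parameter name is required
    double value = fallback;
    store.get(name, value);  // leaves 'value' untouched when name is unknown
    return value * factor;
}
```

Because `scaledParameter` sees only `ParameterStore`, such a component could in principle be used in isolation inside an existing analysis system, with the backing implementation swapped without touching the calling code.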

We will focus principally on supporting C++ based analysis. C++ is the analysis software language for all RHIC and LHC experiments and most other large experiments. Other efforts (JAS) are underway to develop distributed OO analysis frameworks based on the Java language.

NOVA will be developed using an iterative process driven by user participation and closely coupled to prototyping in real-world experiments (STAR, ATLAS).

Architecture diagram (and components implemented): April: PDF August: PDF

Tools and technologies

Requirements for tools and technologies employed in NOVA

Existing experience and the evolutionary path of HENP computing suggest requirements which should be met by tools and technologies employed within NOVA. Following these requirements we have identified the following tools for application in the development of NOVA.

ROOT

MySQL

XML

Apache web server modules

CORBA

Developments since our original proposal have led us away from some tools under consideration at that time:
Implementation Approach to Project Goals

Distributed Software Management
Distributed Analysis
Event Data and Distributed Data Access
Software Robustness and Reusability
Project Domains and Components

Bold items are project component deliverables, either fully implemented in the project or third party tools customized or extended for NOVA. Non-bold items are third party tools used by (or with provision for use by) NOVA. Italic items are application components used with NOVA.

Data Management Domain
Analysis Server Domain
Mobile Analysis Domain
Web Middleware Domain
Schedule and Milestones
Consistent with the provided funding level, the dedicated manpower devoted to this project consists of one full-time person for one half year, augmented by contributions from the project leader (Torre Wenaus) and from several existing BNL STAR Computing group members (Valery Fine, Victor Perevoztchikov, Jeff Porter). The dedicated developer (Sasha Vanyashin, a visitor to BNL from the Royal Institute of Technology, Stockholm) arrived April 22. Accordingly, dedicated effort and milestones are clustered in the second half of the year, with limited design work and prototyping taking place in the first half of the year.
Milestone  Activity                    Deliverable                            Schedule
1          Design, prototyping         Status report                          April
2          Implementation, testing     Status report and year two plan        August
3          Documentation, refinement   Delivery, final report, users manual   September


Torre Wenaus, Sasha Vanyashin