For DC1, we only provide the capability to extract histograms from the combined ntuple files that are the endpoint of reconstruction. The relevant datasets describe combined ntuples and HBOOK histograms.
For DC2, we will support a much broader range of activities, including simulation, reconstruction, selection and the extraction of summary data from any data type. Most of the event data will be in the form of POOL event collections, with raw data in bytestream format being one notable exception. We assume that it is now more useful to write ROOT histograms than HBOOK histograms.
The relevant datasets are described below. Unless otherwise noted, these are already implemented in DIAL.
Content describes the data held by the dataset. If the dataset holds event data, the content includes a list of event ID's and a list of content ID's that are type-key pairs (as in StoreGate). For non-event data, only the list of content ID's is present. The content is expressed as a list of content blocks, each with this information. Each content block also carries a label and its dataset type name.
Location tells where the data may be found and is most often a list of logical file names.
If a dataset is composed of other datasets, then the sub-datasets are called the constituents. The content and location may be explicit, i.e. included directly in the dataset description, or implicit, in which case one must examine the content or location of the constituents to determine their values.
See "Datasets for the grid" on the ADA documents page for more details.
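The relationship between content, location and constituents can be illustrated with a minimal Python sketch. This is not the DIAL interface: the class names ContentBlock and DatasetSketch, the methods, the type-key values and the logical file names are all hypothetical and chosen only for illustration. The sketch assumes that an implicit value is simply absent from the description and is gathered from the constituents when needed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# A content ID is a (type, key) pair, as in StoreGate.
ContentId = Tuple[str, str]


@dataclass
class ContentBlock:
    """One content block: label, dataset type name, and the ID lists."""
    label: str
    dataset_type: str
    content_ids: List[ContentId]
    event_ids: Optional[List[int]] = None  # None for non-event data


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for a dataset description (not the DIAL API)."""
    name: str
    content: Optional[List[ContentBlock]] = None  # None means implicit
    location: Optional[List[str]] = None          # logical file names; None means implicit
    constituents: List["DatasetSketch"] = field(default_factory=list)

    def resolved_location(self) -> List[str]:
        """Return the explicit location, or collect it from the constituents."""
        if self.location is not None:
            return list(self.location)
        files: List[str] = []
        for sub in self.constituents:
            files.extend(sub.resolved_location())
        return files

    def resolved_content(self) -> List[ContentBlock]:
        """Return the explicit content, or collect it from the constituents."""
        if self.content is not None:
            return list(self.content)
        blocks: List[ContentBlock] = []
        for sub in self.constituents:
            blocks.extend(sub.resolved_content())
        return blocks


# Two event datasets with explicit content and location (made-up values).
part_a = DatasetSketch(
    name="part_a",
    content=[ContentBlock("events", "AtlasPoolEventDataset",
                          [("ExampleType", "ExampleKey")], [1, 2])],
    location=["lfn:part_a"],
)
part_b = DatasetSketch(
    name="part_b",
    content=[ContentBlock("events", "AtlasPoolEventDataset",
                          [("ExampleType", "ExampleKey")], [3])],
    location=["lfn:part_b"],
)

# A compound dataset: content and location are implicit.
compound = DatasetSketch(name="compound", constituents=[part_a, part_b])
print(compound.resolved_location())
print([b.event_ids for b in compound.resolved_content()])
```

Keeping the implicit case as an absent value means a compound dataset never stores a copy of its constituents' file lists, so it cannot drift out of step with them.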
Dataset
This defines the interface for all datasets.
GenericDataset
This is an implementation of the Dataset interface that is used
to hold the data for all current dataset types. This class
defines a common XML schema that is used to describe all these
dataset types.
SimpleCompoundDataset
A dataset which is made up of a collection of other datasets.
The content and location are implicit.
EventMergeDataset
Collection of event datasets with the same type-keys and different
events.
CbntDataset
This dataset describes a single HBOOK file containing a DC1 combined
ntuple. It provides access to the list of event ID's and the list
of blocks in the ntuple.
HbookDataset
This dataset describes a single HBOOK file containing histograms.
It provides a merge method to append the histogram contents from
another such dataset.
AtlasPoolEventDataset
This event dataset describes an ATLAS-POOL event collection, holding
a single file that is an implicit collection of ATLAS event headers.
The content includes the ATLAS event ID's and the StoreGate type-keys.
RootHistogramDataset
This dataset describes a single ROOT file containing histograms
and/or ntuples. It provides a merge method to append the
contents from another such dataset.
AtlasRaw
This dataset has not been implemented. It will hold an ATLAS
bytestream file.
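The constraint stated for EventMergeDataset (same type-keys, different events) can be sketched as a validation step performed before merging. The following Python is an illustration only and does not reproduce the DIAL classes: the names EventDataset and merge_event_datasets are hypothetical, and the sketch assumes that "different events" means the event ID lists of the constituents must not overlap.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class EventDataset(NamedTuple):
    """Hypothetical, simplified event dataset: event IDs plus type-keys."""
    event_ids: FrozenSet[int]
    type_keys: FrozenSet[Tuple[str, str]]


def merge_event_datasets(parts: List[EventDataset]) -> EventDataset:
    """Merge event datasets that share type-keys and have disjoint events."""
    if not parts:
        raise ValueError("no datasets to merge")
    type_keys = parts[0].type_keys
    merged_ids = set()
    for part in parts:
        # Every constituent must describe the same type-key pairs.
        if part.type_keys != type_keys:
            raise ValueError("constituents have different type-keys")
        # Assumption: no event may appear in more than one constituent.
        overlap = merged_ids & part.event_ids
        if overlap:
            raise ValueError(f"duplicate event IDs: {sorted(overlap)}")
        merged_ids |= part.event_ids
    return EventDataset(frozenset(merged_ids), type_keys)


# Made-up type-key pair used only for this example.
keys = frozenset({("ExampleType", "ExampleKey")})
first = EventDataset(frozenset({1, 2}), keys)
second = EventDataset(frozenset({3, 4}), keys)
merged = merge_event_datasets([first, second])
print(sorted(merged.event_ids))
```

Rejecting overlapping event IDs is a design choice of this sketch; an implementation could equally choose to drop duplicates instead.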
EVGEN (AtlasPoolEventDataset)
Holds data produced by event generators such as pythia and isajet.
HITS (AtlasPoolEventDataset)
Holds hits and truth information produced by a detector simulation
program such as GEANT4.
DIGI (AtlasPoolEventDataset)
Holds the digits (aka digitizations) simulating detector responses
to hits.
RAW (AtlasRaw)
Holds raw data from the detector or a simulation thereof.
ESD (AtlasPoolEventDataset)
Holds event summary data (ESD), a summary of the reconstruction of
raw data or digits.
AOD (AtlasPoolEventDataset)
Holds analysis-oriented data (AOD), a summary of ESD.
TAG (no type yet)
Summary of AOD in a relational table.
NTUP (no type yet)
Ntuples, often with each entry describing an event.
HISTO (RootHistogramDataset)
Histograms.
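The association between data type names and dataset classes listed above can be written down as a simple lookup table. The Python below is a sketch only: the dictionary and the function dataset_class_for are hypothetical names, not part of DIAL, and the entries merely restate the list above, with None standing for "no type yet".

```python
from typing import Dict, Optional

# Data type name -> dataset class name, as listed above.
# None marks data types that have no dataset class assigned yet.
DATA_TYPE_TO_DATASET_CLASS: Dict[str, Optional[str]] = {
    "EVGEN": "AtlasPoolEventDataset",
    "HITS": "AtlasPoolEventDataset",
    "DIGI": "AtlasPoolEventDataset",
    "RAW": "AtlasRaw",  # described above as not yet implemented
    "ESD": "AtlasPoolEventDataset",
    "AOD": "AtlasPoolEventDataset",
    "TAG": None,
    "NTUP": None,
    "HISTO": "RootHistogramDataset",
}


def dataset_class_for(data_type: str) -> str:
    """Return the dataset class name for a data type (hypothetical helper)."""
    if data_type not in DATA_TYPE_TO_DATASET_CLASS:
        raise KeyError(f"unknown data type: {data_type}")
    class_name = DATA_TYPE_TO_DATASET_CLASS[data_type]
    if class_name is None:
        raise LookupError(f"no dataset class assigned yet for {data_type}")
    return class_name


print(dataset_class_for("ESD"))
print(dataset_class_for("HISTO"))
```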