DIAL release 1.20: Examples

David Adams
09jun05


Introduction

Here we run through a few examples that demonstrate the expected typical use of DIAL 1.20 in the ADA (ATLAS distributed analysis) environment. Emphasis is on analysis of the AOD samples produced for the Rome physics workshop.

If you have not already done so, you may wish to run one of the demos before starting your own analysis. See the demo page for information on running the demos. See the getting started page for instructions on installing, setting up and running DIAL. In the examples below, we make use of the root interface.

The catalogs are changing with time: the number and size of Rome datasets are increasing as data is produced, and the transformations are being improved. Any queries you make may give results different from those shown below.


Jobs and catalogs

Distributed analysis is an iterative process where a physicist defines a job, submits it to a processing system, examines the result and then repeats the sequence.

A job is specified by defining a transformation and selecting a dataset to process with this transformation. The transformation is specified by an application and a task. The application carries the scripts that do the processing and the task carries user configuration data.

The application, task and dataset objects may be created from scratch, or archived objects may be extracted from the corresponding repository. The latter requires knowledge of the object ID. Objects of general interest are published in selection catalogs. Entries in these catalogs are identified by a name and include an object ID and metadata to aid in object selection.

The demos identify objects by name, extract the corresponding ID from a selection catalog and use this ID to extract the object from a repository. See any of the demo scripts, e.g. that for demo 6. The following examples go beyond the demos: they use queries to make the selections and then create a new task using the selected task as a starting point. This is expected to be a typical use of the system.
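
The two-step lookup described above (name to ID in a selection catalog, then ID to object in a repository) can be sketched in plain C++. This is only an illustration of the data flow: the types CatalogEntry, StoredObject, SelectionCatalog and Repository and the functions lookup_id and extract are hypothetical stand-ins, not the DIAL classes or their interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the DIAL classes.
struct CatalogEntry {
    std::string id;                               // object ID, e.g. "10201-640"
    std::map<std::string, std::string> metadata;  // e.g. owner, task_interface
};
struct StoredObject {
    std::vector<std::string> files;               // files carried by the object
};

typedef std::map<std::string, CatalogEntry> SelectionCatalog;  // name -> entry
typedef std::map<std::string, StoredObject> Repository;        // ID -> object

// Step 1: resolve a published name to an ID. Returns "" if the name is unknown.
std::string lookup_id(const SelectionCatalog& catalog, const std::string& name) {
    SelectionCatalog::const_iterator it = catalog.find(name);
    return it == catalog.end() ? std::string() : it->second.id;
}

// Step 2: extract the object with that ID. Returns a null pointer if absent.
// An empty ID signals a failed lookup and must be checked by the caller.
const StoredObject* extract(const Repository& repo, const std::string& id) {
    assert(!id.empty());
    Repository::const_iterator it = repo.find(id);
    return it == repo.end() ? nullptr : &it->second;
}
```

The point of the split is that the catalog can be queried on metadata without touching the objects themselves, while the repository needs only the ID.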

A job is created by submitting an application, task and dataset to a scheduler. In the examples below, this scheduler is a remote analysis service. As with the demos, results are available when processing is complete and partial results may be examined during processing. In the current transformations, these results are obtained by merging the histograms and ntuples from the completed subjobs.
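
As a rough picture of how a partial result can be built while a job is still running, the sketch below sums histogram bins over the completed subjobs only. It assumes that every subjob fills a histogram with the same binning; SubjobResult and merge_completed are hypothetical names chosen for this illustration and are not part of DIAL or ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one subjob's output; not a DIAL or ROOT class.
struct SubjobResult {
    bool done;                 // true once the subjob has completed
    std::vector<double> bins;  // histogram bin contents
};

// Sum the bin contents of all completed subjobs. Subjobs that are still
// running are skipped, so the return value is a partial result until
// every subjob is done. All histograms are assumed to share one binning.
std::vector<double> merge_completed(const std::vector<SubjobResult>& subjobs,
                                    std::size_t nbins) {
    std::vector<double> merged(nbins, 0.0);
    for (std::size_t i = 0; i < subjobs.size(); ++i) {
        if (!subjobs[i].done) continue;
        assert(subjobs[i].bins.size() == nbins);  // same binning assumed
        for (std::size_t b = 0; b < nbins; ++b) merged[b] += subjobs[i].bins[b];
    }
    return merged;
}
```

Calling merge_completed again after more subjobs finish gives an updated result, which mirrors the way partial results may be examined during processing.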


First job (atlasopt)

(Skip to the next section if you know your transformation and dataset and just want to submit a job.)

Having started root with the dialroot command, we begin by displaying the status of all catalogs to verify our connection and see the size of each:

root [10] show_catalogs()

   dr - Dataset repository has 206976 entries
  dfc - Dataset file catalog has 113934 entries and 15 columns
  dsc - Dataset selection catalog has 5604 entries and 15 columns
   ar - Application repository has 31 entries
  asc - Application selection catalog has 6 entries and 5 columns
   tr - Task repository has 32 entries
  tsc - Task selection catalog has 23 entries and 5 columns
   jr - JobRepository has 0 entries
We begin by querying for a dataset:
root [27] print(dsc.query("level = 'TOP' and name like 'rome%recov10%SU%AOD-bnl'", 100))
List has 12 entries:
  rome.004401.recov10.SU1_Jimmy_coann.AOD-bnl
  rome.004402.recov10.SU2_Jimmy_focus.AOD-bnl
  rome.004403.recov10.SU3_Jimmy_bulk.AOD-bnl
  rome.004404.recov10.SU6_Jimmy_funnel.AOD-bnl
  rome.004406.recov10.SU4_Jimmy_lowmass.AOD-bnl
  rome.004410.recov10.SU51_Jimmy_scan.AOD-bnl
  rome.004411.recov10.SU52_Jimmy_scan.AOD-bnl
  rome.004412.recov10.SU53_Jimmy_scan.AOD-bnl
  rome.004421.recov10.SU1_Jimmy_coann.AOD-bnl
  rome.004423.recov10.SU3_Jimmy_bulk.AOD-bnl
  rome.004424.recov10.SU6_Jimmy_funnel.AOD-bnl
  rome.004426.recov10.SU4_Jimmy_lowmass.AOD-bnl
Note that we limited the query to 100 results and received 12, so we know we have all matching datasets. The query restricts the selection to TOP level datasets, i.e. complete samples intended for user access, and then uses the name to select Rome samples with v10 reconstruction and SUSY data, using all AOD data available at BNL. Replace AOD-bnl with AOD to get samples available at both CERN and BNL.
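
To make the name pattern and the result limit concrete, here is a small stand-alone sketch of SQL-style LIKE matching applied to a list of names. It is not how the catalog evaluates queries (that is done by its database); like_match, query_names and is_complete are hypothetical helpers, and the case-sensitive comparison is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// SQL-style LIKE match: '%' matches any run of characters (including none),
// '_' matches exactly one character. Comparison is case-sensitive here,
// which is an assumption; the catalog's database may behave differently.
bool like_match(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t last_pct = std::string::npos;  // position of the last '%' seen
    std::size_t retry_t = 0;                   // text position when it was seen
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '%') {
            last_pct = p++;
            retry_t = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '_' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (last_pct != std::string::npos) {
            // Mismatch: let the last '%' absorb one more character and retry.
            p = last_pct + 1;
            t = ++retry_t;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '%') ++p;  // trailing '%'
    return p == pattern.size();
}

// Return at most `limit` matching names, in input order.
std::vector<std::string> query_names(const std::vector<std::string>& names,
                                     const std::string& pattern,
                                     std::size_t limit) {
    assert(limit > 0);
    std::vector<std::string> out;
    for (std::size_t i = 0; i < names.size() && out.size() < limit; ++i)
        if (like_match(pattern, names[i])) out.push_back(names[i]);
    return out;
}

// Fewer results than the limit means nothing was cut off.
bool is_complete(std::size_t returned, std::size_t limit) {
    return returned < limit;
}
```

With the pattern rome%recov10%SU%AOD-bnl, a name such as rome.004421.recov10.SU1_Jimmy_coann.AOD-bnl matches because each % absorbs the text between the fixed pieces. A result count equal to the limit would leave open the possibility that more entries exist.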

We can count datasets matching a query with the query_count method, e.g.

root [28] print(dsc.query_count("level = 'TOP' and name like 'rome%recov10%AOD-bnl'"))
141
We look at the schema and see if we can find a way to further refine our search:
root [30] print(dsc.schema())
List has 15 entries:
  uid
  name
  level
  owner
  type
  virtual
  nevt
  nfile
  nsub
  runmin
  evtmin
  runmax
  evtmax
  update_uid
  modtime
Let's restrict the search to large samples:
root [45] print(dsc.query("level = 'TOP' and name like 'rome%recov10%SU%AOD-bnl' and nevt>100000"))
List has 2 entries:
  rome.004401.recov10.SU1_Jimmy_coann.AOD-bnl
  rome.004421.recov10.SU1_Jimmy_coann.AOD-bnl
Our SUSY friends tell us the 440* samples should not be used, so we list the attributes of the 4421 dataset:
root [46] print(dsc.attributes("rome.004421.recov10.SU1_Jimmy_coann.AOD-bnl"))
Row has 15 entries:
  evtmax = 0
  evtmin = 0
  level = TOP
  modtime = 20050522110923
  name = rome.004421.recov10.SU1_Jimmy_coann.AOD-bnl
  nevt = 146678
  nfile = 3026
  nsub = 80
  owner = rome
  runmax = 0
  runmin = 0
  type = AOD
  uid = 10013-170225
  update_uid = 10013-170225
  virtual = 0
Record the ID and fetch the dataset from the dataset repository:
root [51] did=dsc.id("rome.004421.recov10.SU1_Jimmy_coann.AOD-bnl")
(class DatasetId)(-1222959128)
root [52] print(did)                                               
10013-170225
root [53] pdst = dr.extract(did);
root [54] pprint(pdst)
EventMergeDataset 10013-170225 with no parent is locked and not empty
  Content includes 1 block:
    Dataset content block:
      Dataset type: AtlasPoolEventDataset
      Content label: AOD
      Content ID list has 36 entries:
        type BJetContainer with with key BCandidates
        type ElectronContainer with with key ElectronCollection
        type INavigable4MomentumCollection with with key MuonboyTrackParticles
        type INavigable4MomentumCollection with with key StacoTrackParticles
        type INavigable4MomentumCollection with with key TrackParticleCandidate
        type INavigable4MomentumCollection with with key TrackParticleCandidateXK
        type JetTagContainer with with key BJetCollection
        type McEventCollection with with key GEN_AOD
        type MissingET with with key MET_Cryo
        type MissingET with with key MET_Final
        type MissingET with with key MET_Muon
        type MissingET with with key MET_Topo
        type MissingEtCalo with with key MET_Base
        type MissingEtCalo with with key MET_Calib
        type MissingEtTruth with with key MET_Truth
        type MuonContainer with with key MuonCollection
        type ParticleJetContainer with with key Cone4TowerParticleJets
        type ParticleJetContainer with with key Cone4TruthParticleJets
        type ParticleJetContainer with with key ConeTowerParticleJets
        type ParticleJetContainer with with key ConeTruthParticleJets
        type ParticleJetContainer with with key KtTowerParticleJets
        type ParticleJetContainer with with key KtTruthParticleJets
        type PhotonContainer with with key PhotonCollection
        type Rec::TrackParticleContainer with with key MuidCombTrackParticles
        type Rec::TrackParticleContainer with with key MuidCombTrackParticlesLowPt
        type Rec::TrackParticleContainer with with key MuidMooreTrackParticles
        type Rec::TrackParticleContainer with with key MuidMooreTrackParticlesLowPt
        type Rec::TrackParticleContainer with with key MuidStandAloneTrackParticles
        type Rec::TrackParticleContainer with with key MuidStandAloneTrackParticlesLowPt
        type Rec::TrackParticleContainer with with key MuidiPatTrackParticles
        type Rec::TrackParticleContainer with with key MuidiPatTrackParticlesLowPt
        type TauJetContainer with with key TauJetCollection
        type TrackParticleTruthCollection with with key TrackParticleTruthCollection
        type TrackRecordCollection with with key MuonEntryRecordFilter
        type TruthParticleContainer with with key SpclMC
        type VxContainer with with key VxPrimaryCandidate
      Event count is 146678
  Location has 3026 files:
    lfn://atlas/rome.004421.recov10.SU1_Jimmy_coann._00801.AOD.pool.root
    lfn://atlas/rome.004421.recov10.SU1_Jimmy_coann._00802.AOD.pool.root
    lfn://atlas/rome.004421.recov10.SU1_Jimmy_coann._00803.AOD.pool.root
    lfn://atlas/rome.004421.recov10.SU1_Jimmy_coann._00806.AOD.pool.root
    lfn://atlas/rome.004421.recov10.SU1_Jimmy_coann._00808.AOD.pool.root
    ...
    lfn://atlas/rome.004421.recov10.SU1_Jimmy_coann._04000.AOD.pool.root
  Dataset ID list has 80 entries:
    First ID: 10013-159915
     Last ID: 10013-170223
The last command displays the information stored in the dataset object which is distinct from but has some overlap with the data published in the selection catalog.

Next we select an application and task in a similar way:

root [55] print(asc.query("", 50))
List has 6 entries:
  aodhisto
  aodhisto-old
  atlasdev
  atlasdev-src
  atlasopt
  esd2aod
root [56] print(asc.attributes("atlasopt"))
Row has 5 entries:
  modtime = 20050609104509
  name = atlasopt
  owner = dadams
  task_interface = atlas_job_options
  uid = 10201-640
root [57] print(tsc.query("task_interface = 'atlas_job_options'"))
List has 6 entries:
  atlasopt_example_zll
  demo6
  atlasopt_example_zll-9.0.4
  atlasopt_example_zll-10.0.1
  atlasopt_example_zll-9.0.4-dev
  atlasopt_example_zll-10.0.1-dev
root [58] aid = asc.id("atlasopt");
root [59] print(aid)
10201-640
root [60] papp = ar.extract(aid);
root [61] tid = tsc.id("atlasopt_example_zll");
root [62] print(tid)
10301-25
root [63] ptsk = tr.extract(tid);
root [64] pprint(papp)
Application 10201-640 has 4 files:
  build_task
  readme.txt
  release_notes.txt
  run
root [65] pprint(ptsk)
Task 10301-25 has 2 files:
  atlas_release
  jo.py
It is unlikely that you want to modify the application but very likely that you would like to modify the task. Extract the files from the task:
root [68] ptsk->write_files("mytask", true) 
(const int)0
This extracts the files into the directory mytask. The second argument indicates that this directory may be created if it does not exist.

We exit to a command shell and examine and modify the task files:

root [69] .sh
sh-2.05b$ l mytask
total 8
   4 -rw-rw-r--    1 dladams  dladams         6 Jun  9 16:47 atlas_release
   4 -rw-rw-r--    1 dladams  dladams        54 Jun  9 16:47 jo.py
sh-2.05b$ cat mytask/atlas_release 
9.0.4
sh-2.05b$ cat mytask/jo.py 
include ("AnalysisExamples/ZllExample_jobOptions.py")
sh-2.05b$ vi mytask/atlas_release 
sh-2.05b$ cat mytask/atlas_release 
10.0.1
sh-2.05b$ vi mytask/jo.py 
sh-2.05b$ cat mytask/jo.py 
include ("AnalysisExamples/ZeeZmmOnAODExample_jobOptions.py")
sh-2.05b$ exit
exit
We changed to a more recent version of the ATLAS release and updated the job options accordingly.

See the share directory of the AnalysisExamples package for some example job options. You may include one of these (as in the example) or copy it to jo.py and modify as desired.

Note that, at present, the atlasopt application supports the output of histograms, ntuples or both but does not support the production of event data.

Build a new task from the modified files:

root [71] ptsk = new dial::Task("atlas_release jo.py", "mytask");
root [72] pprint(ptsk)
Task 10301-823 has 2 files:
  atlas_release
  jo.py
The list of files used to construct the task may be replaced with "*" if you want all the files from the directory.

Now that papp, ptsk and pdst are defined appropriately, we can submit a job:

root [25] submit()
Application 10201-640
Task 10301-823
Dataset 10013-170225
*** Submitting job
*** Submitted job status:
CompoundJob 10501-35542 is running
Application: 10201-640
Task 10301-823
Dataset 10013-170225 with 146678 events
Job preferences ID 0-0
     Owner: /DC=org/DC=doegrids/OU=People/CN=David Adams 407137
Credential: /DC=org/DC=doegrids/OU=People/CN=David Adams 407137
Run host: adial01.usatlas.bnl.gov
Job directory: /usatlas/u/dial/local/jobs/MasterScheduler/00/00/29/05/00/00/8a/d6
 create time: 2005 June 09 17:03:02
  start time: 2005 June 09 17:03:27 (25 sec elapsed)
 update time: 2005 June 09 17:03:27 (25 sec elapsed)
There are 80 subjobs
  49 running
  0 done
  0 failed
  0 killed
  0 included in result
Events processed: 0 (0%)
       in result: 0 (0%)
The job does not have a result
(int)0
The job may be monitored using print(msch.job(jid)) or get_results() as described on the demo page.


Job scripts

Of course one will not want to do all the above typing for every job submission. A typical user will create a job definition script that defines the application, task and dataset (variables papp, ptsk and pdst). A sample script is copied into the local directory when the dialroot files are installed (dialroot -i). Here is an example jobdef.C that may be used to run a job.

Edit the top part of this script to specify the application, task and dataset of interest. The remainder of the script is used to extract the corresponding objects and store their pointers in papp, ptsk and pdst.

Start root (command dialroot), run the job definition script and submit and monitor as above:

root [0] .x jobdef.C
root [1] submit()
...
root [2] get_results()
...
The last command is repeated until the job is complete.
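
The repeat-until-complete pattern can be written as a bounded polling loop. The sketch below is generic C++ and makes no DIAL calls: job_is_complete is a hypothetical callback standing in for whatever check the user applies to the printed job status, and poll_until_complete is a name invented for this example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Call `job_is_complete` until it returns true or `max_polls` is reached,
// sleeping `delay` between calls. Returns the number of polls made, or -1
// if the job was still incomplete after the last poll.
// `job_is_complete` is a hypothetical callback; in a DIAL session the user
// would instead rerun get_results() and read the printed status.
int poll_until_complete(const std::function<bool()>& job_is_complete,
                        int max_polls,
                        std::chrono::milliseconds delay) {
    assert(max_polls > 0);
    for (int n = 1; n <= max_polls; ++n) {
        if (job_is_complete()) return n;
        if (n < max_polls) std::this_thread::sleep_for(delay);
    }
    return -1;
}
```

Bounding the number of polls keeps a stuck or failed job from blocking the session indefinitely; a delay of tens of seconds is a reasonable starting point for jobs of the size shown above.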


User code (aodhisto)

The above example demonstrates how a user may supply different job options to configure an analysis job, but serious analysis often requires providing user code. For this, there are two existing transformations: aodhisto and atlasdev. The latter is described in the following section.

Aodhisto (developed by F. Fassi and T. Maeno) allows the user to supply code that is built inside the AnalysisExamples package. For information about that package, please see the tutorials linked from the ATLAS analysis tools page.

Follow the same procedure as in the atlasopt example above, except select the application "aodhisto" and start from one of the aodhisto example tasks. These may be found with the following query:

root [25] print(tsc.query("name like 'aodhisto%'"))       
List has 3 entries:
  aodhisto_big
  aodhisto_zll_aod
  aodhisto_zll_aod-esd
or query for all compatible tasks (i.e. those supplying the atlas_simple_analysis interface expected by aodhisto) with
root [5] print(tsc.query("Task_interface='atlas_simple_analysis'"))
List has 4 entries:
  aodhisto_big
  demo4
  aodhisto_zll_aod-esd
  aodhisto_zll_aod
We hope to have more of these soon.

Here is an example job definition script that can be used to run a job using the example task or a task provided by the user. Execute this script and then use submit() to submit a job. It may take a couple of minutes to get a response after submission because the analysis service is compiling your code.

If the submission returns an invalid job ID, then it is likely that your code did not compile. The scheduler should, but does not yet, provide a means to access the log files for the task build (compilation). If the scheduler is running at BNL, the log files may be found at

   /usatlas/u/dial/local/tasks/AID/TID
where AID is the application ID and TID is the task ID.


Local CMT development (atlasdev)

Many ATLAS developers will want to change other packages or prefer to develop and test in the CMT environment before submitting a distributed job. These activities are supported by the atlasdev transformation (developed by H. Ma). The configuration (task) for this transformation includes the ATLAS release version, a job option file and the name of the directory where the CMT build was done. At present, submitted jobs need direct access to this directory, so the job should be submitted to a local service and the local build should not be changed while jobs are running.

To get started, query existing compatible tasks with

root [1] print(tsc.query("task_interface='atlas_developer_directory'"))
List has 5 entries:
  atlasdev_example_zll-9.0.4
  demo7
  atlasdev_example_zll-9.0.4-dev
  atlasdev_example_zll-nochange-10.0.1
  atlasdev_example_zll-erescale-10.0.1
and select and modify any one as in the previous examples. Here is an example job definition script. Again, modify the script to select one of the above or a local task definition and then, inside root, execute the script and use submit() to submit a job.

Note that, at present, atlasdev only works with development releases and not with release kits. Add the suffix "-dev" to the version in atlas_version to ensure that such a release is used.
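
If the version string is produced by a script rather than edited by hand, the suffix rule can be applied with a small helper. This is a sketch only: with_dev_suffix is a hypothetical function, and it assumes the version has already been stripped of surrounding whitespace such as the trailing newline of the task file.

```cpp
#include <cassert>
#include <string>

// Return `version` with "-dev" appended unless it already ends with it.
// The caller is expected to pass a non-empty, whitespace-free version.
std::string with_dev_suffix(const std::string& version) {
    const std::string suffix = "-dev";
    assert(!version.empty());
    const bool has_suffix =
        version.size() >= suffix.size() &&
        version.compare(version.size() - suffix.size(), suffix.size(),
                        suffix) == 0;
    return has_suffix ? version : version + suffix;
}
```

For example, 10.0.1 becomes 10.0.1-dev, while 9.0.4-dev is returned unchanged, so the helper is safe to apply more than once.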


dladams@bnl.gov