r9 - 25 Mar 2008 - 18:52:30 - NurcanOzturk
Pathena On FDR Data

Introduction

This page describes how to run pathena jobs on the FDR data at the Tier2 analysis queues.

Set up CMT for acas account at BNL

Log in to the US ATLAS acas machines at BNL and create a working directory called "Jamboree". We will use release 13.0.40.
ssh atlasgw.bnl.gov
rterm -i
cd $HOME
mkdir Jamboree
cd Jamboree
mkdir 13.0.40
Create a "requirements" file:
#########################################################
set CMTSITE STANDALONE

   macro PROJ_RELEASE   "latest" \
   13.0.40        "13.0.40"

set SITEROOT /opt/usatlas/kit_rel/${PROJ_RELEASE}
macro ATLAS_DIST_AREA ${SITEROOT}
macro ATLAS_GROUP_AREA "/afs/cern.ch/atlas/groups/PAT/Tutorial/EventViewGroupArea/EVTags-13.0.40.323"
macro ATLAS_TEST_AREA "" \
    13.0.40 "${HOME}/Jamboree/13.0.40"
macro SITE_PROJECT_AREA ${SITEROOT}
macro EXTERNAL_PROJECT_AREA ${SITEROOT}
apply_tag oneTest
apply_tag simpleTest
apply_tag setupCMT
apply_tag noCVSROOT
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
set CVSROOT /afs/usatlas.bnl.gov/software/cvs
macro setup_slc3compat "" \
   gcc323  "/opt/usatlas/kit_rel/SLC3/setup_slc3compat"
setup_script $(setup_slc3compat)
set PATHENA_GRID_SETUP_SH /afs/usatlas.bnl.gov/lcg/current/etc/profile.d/grid_env.sh
#############################################################
Set up CMT:
source /afs/usatlas.bnl.gov/cernsw/contrib/CMT/v1r20p20070208/mgr/setup.sh
cmt config

Set up CMT for lxplus account at CERN

Log in to the lxplus machines at CERN and create a working directory called "Jamboree" under your public area. We will use release 13.0.40.
ssh lxplus.cern.ch
cd public
mkdir Jamboree
cd Jamboree
mkdir 13.0.40
Create a "requirements" file:
#############################################################
set   CMTSITE  CERN
set   SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA ${SITEROOT}/atlas/software/dist
 
macro ATLAS_GROUP_AREA "/afs/cern.ch/atlas/groups/PAT/Tutorial/EventViewGroupArea/EVTags-13.0.40.323"
 
apply_tag simpleTest
apply_tag oneTest
 
macro ATLAS_TEST_AREA "" \
      13.0.40 "${HOME}/public/Jamboree/13.0.40"

use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
#############################################################
Set up CMT:
source /afs/cern.ch/sw/contrib/CMT/v1r20p20070208/mgr/setup.sh
cmt config

Set up athena for release 13.0.40

source setup.sh -tag=13.0.40,32,groupArea
Check out the Tools/Scripts package to set up your work area (an easy way of checking out and compiling multiple packages):
cd 13.0.40
cmt co -r Scripts-00-01-14 Tools/Scripts
Set up the work area and create the run area:
./Tools/Scripts/share/setupWorkArea.py
cd WorkArea/cmt
cmt bro cmt config
cmt bro gmake
source setup.sh
Check out PandaTools for pathena; cd to the 13.0.40 directory:
cd ../.. 
cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools
Check out HighPtView package:
cmt co -r HighPtView-00-01-10 PhysicsAnalysis/HighPtPhys/HighPtView
Check out EventViewConfiguration package:
cmt co -r EventViewConfiguration-00-01-13 PhysicsAnalysis/EventViewBuilder/EventViewConfiguration
Run this every time new packages are checked out:
./Tools/Scripts/share/setupWorkArea.py
It prints:
   WorkAreaMgr : INFO     ################################################################################
   WorkAreaMgr : INFO     Creating a WorkArea CMT package under: [/usatlas/u/nurcan/Jamboree/13.0.40]
   WorkAreaMgr : INFO     Scanning [/usatlas/u/nurcan/Jamboree/13.0.40]
   WorkAreaMgr : INFO     Found 4 packages in WorkArea
   WorkAreaMgr : INFO     => 0 package(s) in suppression list
   WorkAreaMgr : INFO     Generation of WorkArea/cmt/requirements done [OK]
   WorkAreaMgr : INFO     ################################################################################
Compile all packages from WorkArea:
cd WorkArea/cmt
cmt bro cmt config
cmt bro gmake
source setup.sh
Go to the run area and get the jobOption file from the HighPtView package:
cd ../run
get_files HighPtViewNtuple_topOptions.py
Make a jobOption file with the details of the job, called MyJobOptions.py:
#######################################################################
import os
print os.environ["CMTPATH"]
                                                                            
InserterConfiguration={} # Always need this line
InserterConfiguration["Electron"]={} # Need such for every item you will modify
InserterConfiguration["Electron"]["FullReco"]=[{"Name":"ElMedium"}]
                                                            
#DoTrigger=True
TriggerView=True
include("HighPtView/HighPtViewNtuple_topOptions.py")
include("AthenaPoolCnvSvc/ReadAthenaPool_jobOptions.py")
ServiceMgr.PoolSvc.SortReplicas=True
from DBReplicaSvc.DBReplicaSvcConf import DBReplicaSvc
ServiceMgr+=DBReplicaSvc()
ServiceMgr.DBReplicaSvc.UseCOOLSQLite=False  # fix for stream and DPDs by Attila
InserterConfiguration.update({ "CommonParameters":
                         { "DoPreselection":False,
                           "CheckOverlap":False } })
#######################################################################
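The InserterConfiguration block above is plain Python dict manipulation; a minimal standalone sketch (outside Athena, so nothing here touches a real Athena API) of how the nested keys combine:

```python
# Minimal sketch of the InserterConfiguration dict pattern used above,
# run outside Athena just to show how the nested keys combine.
InserterConfiguration = {}                        # always need this line
InserterConfiguration["Electron"] = {}            # one entry per item to modify
InserterConfiguration["Electron"]["FullReco"] = [{"Name": "ElMedium"}]

# .update() adds the CommonParameters block without touching "Electron"
InserterConfiguration.update({"CommonParameters":
                              {"DoPreselection": False,
                               "CheckOverlap": False}})
```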

Set up Grid and DQ2 at BNL

source /afs/usatlas.bnl.gov/lcg/current/etc/profile.d/grid_env.sh
source /afs/usatlas.bnl.gov/Grid/Don-Quijote/dq2_user_client/setup.sh.BNL

Set up Grid and DQ2 at CERN

source /afs/cern.ch/project/gd/LCG-share/current/etc/profile.d/grid_env.sh
source /afs/cern.ch/atlas/offline/external/GRID/ddm/endusers/setup.sh.CERN

Look at available FDR datasets at Tier2s from the Panda monitor

list of FDR datasets at Tier2's

Pick one dataset:

fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1
One can also list the replicas for a given dataset (from BNL and CERN):
source /afs/usatlas.bnl.gov/Grid/Don-Quijote/DQ2_0_3_client/dq2.sh
dq2-list-dataset-replicas fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1
     INCOMPLETE: DESY-ZN
     COMPLETE: BNLXRDHDD1,SARA-MATRIX_DATADISK,RAL-LCG2_DATADISK,IN2P3-CC_DATADISK,
RALPP,SLACXRD,LIP-LISBON,TAIWAN-LCG2_DATADISK,NDGF-T1_DATADISK,IFICDISK,WISC,
TOKYO-LCG2_DATADISK,MWT2_IU,LIV,ICL,PIC_DATADISK,BU_DDM,TIER0TAPE,INFN-T1_DATADISK,
DESY-HH,JINR,CYF,IJST2,TRIUMF-LCG2_DATADISK,FZK-LCG2_DATADISK,TORON,PNPI,AGLT2_SRM,
BNL-OSG2_DATADISK,SWT2_CPB,LNF,TW-FTT,OU,MWT2_UC
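For scripted checks, the replica listing can be parsed with a few lines of Python. This is a hypothetical parser that assumes the exact `INCOMPLETE:`/`COMPLETE:` layout shown above, with wrapped site lists on continuation lines; the DQ2 client's output format may differ between versions:

```python
# Hypothetical parser for dq2-list-dataset-replicas output in the form
# shown above; the exact format may vary between DQ2 client versions.
def parse_replicas(output):
    replicas = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if ":" in line and line.split(":", 1)[0] in ("COMPLETE", "INCOMPLETE"):
            current, sites = line.split(":", 1)
            replicas[current] = [s.strip() for s in sites.split(",") if s.strip()]
        elif current and line:
            # continuation line of a wrapped site list
            replicas[current] += [s.strip() for s in line.split(",") if s.strip()]
    return replicas

# Shortened sample in the layout printed above
sample = """INCOMPLETE: DESY-ZN
COMPLETE: BNLXRDHDD1,SARA-MATRIX_DATADISK,
MWT2_UC"""
replicas = parse_replicas(sample)
```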

List of names of analysis queues at Tier2s

DDM Name     Analysis Queue Name
SWT2_CPB     ANALY_SWT2_CPB
OU           ANALY_OU_OCHEP_SWT2
AGLT2_SRM    ANALY_AGLT2
MWT2_UC      ANALY_MWT2
SLACXRD      ANALY_SLAC
BU_DDM       ANALY_NET2
WISC         ANALY_GLOW-ATLAS
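When scripting submissions, the table above can be kept as a small lookup dict. The queue names are copied from the table; the helper function name is made up for illustration:

```python
# DDM-name -> analysis-queue lookup, taken from the table above.
ANALYSIS_QUEUE = {
    "SWT2_CPB":  "ANALY_SWT2_CPB",
    "OU":        "ANALY_OU_OCHEP_SWT2",
    "AGLT2_SRM": "ANALY_AGLT2",
    "MWT2_UC":   "ANALY_MWT2",
    "SLACXRD":   "ANALY_SLAC",
    "BU_DDM":    "ANALY_NET2",
    "WISC":      "ANALY_GLOW-ATLAS",
}

def site_option(ddm_name):
    """Return the pathena --site argument for a DDM site name (illustrative helper)."""
    return "--site %s" % ANALYSIS_QUEUE[ddm_name]
```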

Submit pathena job

One line command:

pathena -c "Mode=['FullReco'];DetailLevel=['FullStandardAOD'];Branches=['StacoTauRec']" MyJobOptions.py \
  --inDS fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1 \
  --outDS user.NurcanOzturk.HighPtView.StreamEgamma.Jamboree \
  --nfiles 1 --site ANALY_SWT2_CPB
HighPtView options:
Mode=['FullReco'];DetailLevel=['FullStandardAOD'];Branches=['StacoTauRec']
pathena options:
Specify the input dataset with --inDS
Specify the output dataset with --outDS
Specify the number of files to run on with --nfiles 1
Specify the analysis queue name with --site siteName
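The one-line command above can also be assembled programmatically, e.g. when looping over datasets; this is plain string handling with the values from the example, no grid tools required:

```python
# Assemble the pathena command shown above from its pieces
# (purely illustrative string handling).
opts = {
    "inDS":   "fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1",
    "outDS":  "user.NurcanOzturk.HighPtView.StreamEgamma.Jamboree",
    "nfiles": 1,
    "site":   "ANALY_SWT2_CPB",
}
preexec = "Mode=['FullReco'];DetailLevel=['FullStandardAOD'];Branches=['StacoTauRec']"
cmd = ('pathena -c "%s" MyJobOptions.py --inDS %s --outDS %s '
       '--nfiles %d --site %s'
       % (preexec, opts["inDS"], opts["outDS"], opts["nfiles"], opts["site"]))
```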

The following will be printed on the screen:

Your identity: /DC=org/DC=doegrids/OU=People/CN=Nurcan Ozturk 155817
Enter GRID pass phrase for this identity:
Creating proxy ........................................... Done
Your proxy is valid until: Mon Mar 17 21:00:48 2008
extracting run configuration
ConfigExtractor > No Input
ConfigExtractor > Output=AANT EVAANtupleDump0Stream AANT0
archive sources
archive InstallArea
post sources/jobO
query files in dataset:fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1
submit
===================
  JobID  : 8235
  Status : 0
    > build
       PandaID=8485091
    > run
       PandaID=8485092
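The JobID and PandaIDs in the printout above can be extracted with a short regex, e.g. for local bookkeeping; a sketch assuming the exact layout shown (later pathena versions may print differently):

```python
import re

# Pull the JobID and PandaIDs out of a pathena submission printout
# like the one shown above.
sample = """===================
  JobID  : 8235
  Status : 0
    > build
       PandaID=8485091
    > run
       PandaID=8485092
"""
job_id = int(re.search(r"JobID\s*:\s*(\d+)", sample).group(1))
panda_ids = [int(m) for m in re.findall(r"PandaID=(\d+)", sample)]
```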

Monitor pathena job status from Panda monitor

Panda monitor

Copy-paste the PandaID into the Quick search box on the left panel. You can also use the List users link to see all your jobs.

Retrieve output files from pathena job and make plots

Use dq2 client tools to retrieve the output dataset (from BNL do):

export DQ2_COPY_COMMAND='lcg-cp -v --vo atlas'
dq2_get -rv user.NurcanOzturk.HighPtView.StreamEgamma.Jamboree 
This copies the output files:
user.NurcanOzturk.HighPtView.StreamEgamma.Jamboree.AANT0._00002.root
user.NurcanOzturk.HighPtView.StreamEgamma.Jamboree._8485092.log.tgz

Open the file in root and make some plots:

root user.NurcanOzturk.HighPtView.StreamEgamma.Jamboree.AANT0._00002.root
root [1] FullRec0->GetListOfLeaves()->Print(); 
root [2] FullRec0->Draw("El_N", "El_N>0");
root [3] FullRec0->Draw("El_p_T", "El_N>0");
root [4] FullRec0->Draw("Jet_C4_N", "Jet_C4_N>0");
root [5] FullRec0->Draw("Jet_C4_p_T", "Jet_C4_N>0");

How to submit the same pathena job on multiple datasets

Use a python script (from Lashkar Kashif):
#####################################################################
import os
inDSs = ['fdr08_run1.0003070.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003071.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003072.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003073.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003074.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003075.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003076.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003077.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003078.StreamMuon.merge.AOD.o1_r12_t1',
         'fdr08_run1.0003079.StreamMuon.merge.AOD.o1_r12_t1'
        ]
outDS = "user.LashkarKashif.fdr1.StreamMuon"
comFirst = "pathena --outDS %s --inDS %s ../share/z_pt.py --site ANALY_OU_OCHEP_SWT2"
comLater = "pathena --outDS %s --inDS %s --libDS LAST ../share/z_pt.py --site ANALY_OU_OCHEP_SWT2"
for i,inDS in enumerate(inDSs):
    if i==0:
        os.system(comFirst % (outDS,inDS))
    else:
        os.system(comLater % (outDS,inDS))
#####################################################################
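Since the StreamMuon run numbers above are consecutive, the dataset list can also be generated from the run-number range rather than typed out; a small variant of the script above:

```python
# Build the StreamMuon dataset list from the run-number range
# (runs 3070-3079, zero-padded to 7 digits as in the names above).
pattern = "fdr08_run1.%07d.StreamMuon.merge.AOD.o1_r12_t1"
inDSs = [pattern % run for run in range(3070, 3080)]
```

The generated names match the hand-written list line for line.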

How to debug pathena job failures

Please refer to pathena wiki page

In case of problems things to check

  • Look at the status of the analysis queues from the Panda monitor, see the Analysis link. You can see the statistics of jobs (finished/failed, failures due to transformation errors, other errors, etc.) for all sites in the last 24 hours.
  • Look at the Panda shift elog (linked from the main Panda monitor) to see if a reported problem is affecting user analysis jobs (site problems, DDM-related problems, Panda server problems, etc.).
  • See if there is a Savannah bug opened already for the problem you are experiencing, from Panda Savannah page.
  • See if there is any discussion going on for the problem you are experiencing, from Panda/pathena hypernews.
  • Send a message to the person on shift for direct contact (atlas-project-adc-operations@cern.ch).
  • Send a message to the physics analysis support hypernews (HN-PhysicsAnalysisSupport@bnl.gov) in case you need analysis support.

Available HighPtView DPD's on the FDR-1 Data

Alden and Amir at UTA made DPDs using the HighPtView package on all FDR data for the SWT2 physics analysis groups. You can retrieve them with dq2_get if you are interested in looking at them:

dq2_ls user.AldenStradling.fdr08*HPTV_NOR  (overlap removal off)
dq2_ls user.AldenStradling.fdr08*HPTV_OR    (overlap removal on)

Future developments/user requests with pathena

  • Automatic redirection of analysis jobs within a cloud: no need to specify a site - pathena will choose the best site based on data availability and available CPUs (needs a couple of weeks as of writing on 3/17/2008).
  • One could not retrieve the output dataset if the same output dataset was used in an earlier unsuccessful pathena submission; a protection has been put in pathena. (3/21/2008)
  • Deleting user datasets made with pathena: the work has already started, the implementation is in place, and some tests have been done; final requirements/features are still to be added (as of writing on 3/17/2008).

Analysis packages tested at the Tier2 analysis queues with pathena

The list I'm aware of so far:
  • PhysicsAnalysis/SUSYPhys/SUSYValidation
  • PhysicsAnalysis/HighPtPhys/HighPtView
  • PhysicsAnalysis/TopPhys/TopPhysDPDMaker
  • PhysicsAnalysis/AnalysisCommon/AnalysisExamples
  • PhysicsAnalysis/AnalysisCommon/AnalysisExamples/ZeeZmmOnAOD
  • PhysicsAnalysis/AnalysisCommon/UserAnalysis/AnalysisSkeleton_topOptions.py

Major updates:
-- TWikiAdminGroup - 20 Nov 2017
