r1 - 30 Sep 2009 - 10:48:31 - RobertGardner

UserAnalysisTest

Background

  • An ATLAS-wide User Analysis Test (UAT) is under discussion. This twiki page is not the official twiki organizing the ATLAS-wide activity, but a reference for the US ATLAS computing facilities group.
  • Date: 21-23 Oct. (28-30 Oct. as a backup date)
  • Coordinator: J. Shank
  • Announcement
    Dear ADC ops,
      Below are some details emerging as we firm up plans for the next analysis test. I suggest we go over this at the next ADC ops meeting.
    Regards,
                Jim
    
    User Analysis Test (UAT)
    Date: 21-23 Oct. (28-30 Oct. as a backup date)
    Coordinator: J. Shank
    
    Scope: to get as many user analysis jobs as possible running over the world-wide resources. Users are meant to run their normal analysis jobs during this time. We will have instructions for including (trivially!) a metric-gathering option for both Ganga and pAthena. We are distributing a few large datasets that we encourage people to run over. Details of the datasets are below.
    
    Note: This is NOT an analysis Jamboree--a phrase we use for more of a tutorial-type meeting. This is a follow-on test to the STEP09 exercise and our last one before data taking.
    
    Pretest (starts now): replication of datasets to as many analysis sites as possible; set up HammerCloud tests to verify sites.
    Action Item: Kaushik: specify disk space required.
    
    Details of the test:
     Users will run analysis jobs on a set of large AOD datasets for the first two days; the third day is for copying output to T3s or local disks.
     Participants should have some experience running analysis jobs on the grid (Ganga or pAthena). This is not the exercise to learn how to use the grid--read the Physics Analysis Workbook, take (or follow on the web) tutorials, go to jamborees for this type of introduction.
    
    The plan is to have 5 large containers (say 100M events each of AOD), distribute two to each T2, and then give each (expert) user a list of three to run over (so that no one user has jobs which run only on a single T2). We are of course flexible on this, and I suspect a more realistic test would be to give users a set of 2-3 containers to run on which are on two T2s that are not necessarily in the same cloud.
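The distribution constraint above can be sketched in code. A minimal illustration (the container and site names are hypothetical, and the round-robin placement is an assumed example, not the actual replication plan):

```python
from itertools import combinations

# Hypothetical names for illustration: 5 large AOD containers, 5 Tier-2 sites.
containers = [f"container_{i}" for i in range(1, 6)]
tier2s = [f"T2_{c}" for c in "ABCDE"]

# Assumed placement: each T2 hosts two containers, assigned round-robin
# around a ring, so every container is replicated at exactly two sites.
hosting = {t2: {containers[i], containers[(i + 1) % 5]}
           for i, t2 in enumerate(tier2s)}

def sites_needed(triple):
    """Minimum number of T2s whose hosted sets jointly cover the triple."""
    want = set(triple)
    for k in range(1, len(tier2s) + 1):
        for combo in combinations(tier2s, k):
            if want <= set().union(*(hosting[t] for t in combo)):
                return k
    return None

# Every 3-container user list forces jobs onto at least two different T2s,
# since no single site hosts more than two containers.
for triple in combinations(containers, 3):
    assert sites_needed(triple) >= 2
```

Under this placement no user's three containers can all be served by one site, which is the point of handing out lists of three.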
    
    In keeping with the requests from physics coordination to not disrupt users any more than necessary, the proposal is for the users to run whatever their usual jobs are over these big samples (for those who run over RAW, ESD, and cosmics, we plan to have those distributed as well, but not at the hundred-million-event scale of course). This has the added advantage of including potential "problem jobs" that might be missed in a more controlled test. The plan is to have experienced (power or group) users run the majority of jobs over AODs and produce ntuples (DPDs). We will then have users (including less experienced users) dq2_get the resulting ntuples to their local storage (T3s, campus clusters, or desktops).
    
    The large datasets are AOD and should not in general require db access, but this could depend on the particular user's analysis. The current plan is instead to provide smaller RAW, ESD, and cosmic datasets which do need various kinds of db access, and replicate them to the various T2s. Since they are smaller, this should not be a major issue.
    
    Action Item: Jim Cochran will set up a Twiki with details of the large dataset and instructions on how to include HammerCloud-type metrics.
    Action Item: Jim Cochran will identify other datasets that need to be replicated to analysis sites.
    Action Item: Jim Shank and Massimo will push to get more users involved.
    Action Item: Massimo and Jim S. will set up a generic UAT Twiki for this.
    
    Goals:
    The aim is to get a measurement of the "efficiency", time to ntuple, etc.--essentially the same as HammerCloud, but with users actively involved, including a large amount of file movement with dq2_get.
    
    Large dataset:
    
    Event type in the dataset:
    Essentially JF35, which is primarily multijet but with appropriate amounts of W, Z, J/Psi, DY, ttbar, etc., that satisfy the JF35 cut. It was noted that most W->munu (and Z->mumu as well) events will be lost, since the 35 GeV "jet cut" doesn't include muons. One of the initial containers (27M events) is actually JF17; it should still be useful to include in the test.
    
    The first two containers are:
    
    # step09.00000011.jetStream_medcut.recon.AOD.a84/
     * jet pt > 35 GeV, estimated total size 14900 GB, 9769 files, 97.69M events
     * cross section: 75,075 nb, filter efficiency = 0.1385 -> 10,398 nb. 97.69M events -> 9.4 pb-1 integrated luminosity
    # step09.00000011.jetStream_lowcut.recon.AOD.a84/
     * jet pt > 17 GeV, estimated total size 3674 GB, 2749 files, 27.49M events
     * cross section: 1,453,600 nb, filter efficiency = 0.0706 -> 102,624 nb. 27.49M events -> 0.26 pb-1 integrated luminosity
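The integrated-luminosity figures quoted above follow from L = N / (sigma x filter efficiency), with cross sections in nb and 1 nb^-1 = 10^-3 pb^-1. A small sketch reproducing the quoted numbers:

```python
# Check the integrated-luminosity figures quoted above: L = N / (sigma * eps).
# Cross sections below are in nb; 1 nb^-1 = 1e-3 pb^-1.

def int_lumi_pb(n_events, sigma_nb, filter_eff):
    """Integrated luminosity in pb^-1 for n_events at effective cross section sigma_nb * filter_eff."""
    sigma_eff_nb = sigma_nb * filter_eff   # effective cross section, nb
    lumi_nb_inv = n_events / sigma_eff_nb  # L = N / sigma_eff, in nb^-1
    return lumi_nb_inv * 1e-3              # convert nb^-1 -> pb^-1

# medcut (jet pt > 35 GeV): 75,075 nb, eps = 0.1385, 97.69M events
print(round(int_lumi_pb(97.69e6, 75075, 0.1385), 1))    # -> 9.4 (pb^-1, as quoted)
# lowcut (jet pt > 17 GeV): 1,453,600 nb, eps = 0.0706, 27.49M events
print(round(int_lumi_pb(27.49e6, 1453600, 0.0706), 2))  # -> 0.27 (pb^-1; quoted above as 0.26)
```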
    
    The two containers above have been fully replicated to all US Tier 2s. Nurcan has run validation tests on them. They are ready for other users.
    
     In addition, I have made the following new containers. Their names still contain the filter used; this can easily be masked by creating a new container name, if we want to. The following containers are only available at BNL and have not been replicated to Tier 2s yet. Note that there is some overlap in events between the containers below and the containers above (with step09* names):
    
    groupmc08.105807.JF35_pythia_jet_filter.merge.AOD.e418_a84_t53/
    199.88M events, ~30 TB, ~19 pb-1
    
    groupmc08.105807.JF35_pythia_jet_filter.merge.AOD.e359_a84_t53/
    22.81M events
    
     Finally, we have some unmerged containers. It would take a couple of days to merge them, if we decide to use them. A small fraction of these events are also in the step09* containers:
    
    groupmc08.105807.JF35_pythia_jet_filter.recon.AOD.e449_a84/
    109.85M events
    
    groupmc08.105802.JF17_pythia_jet_filter.recon.AOD.e347_a84/
    99.94M events
    
     So the total is ~500M events, though the containers are not yet fully non-overlapping and their naming is not yet consistent.
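As a rough check of the ~500M figure, summing the per-container event counts listed above (before removing the noted overlaps):

```python
# Per-container event counts listed above, in millions of events.
counts_m = {
    "step09 medcut (JF35)":            97.69,
    "step09 lowcut (JF17)":            27.49,
    "JF35 merge e418_a84_t53":        199.88,
    "JF35 merge e359_a84_t53":         22.81,
    "JF35 recon e449_a84 (unmerged)": 109.85,
    "JF17 recon e347_a84 (unmerged)":  99.94,
}
total = sum(counts_m.values())
print(f"{total:.2f}M events")  # -> 557.66M events before removing the noted overlaps
```

The raw sum is ~558M, consistent with ~500M once the overlapping events are counted only once.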
    
    Action Item: Kaushik will merge and make new containers.

References


-- RobertGardner - 30 Sep 2009
