
Physics Support and Computing Supplementary Materials for the US ATLAS NSF Cooperative Agreement Proposal 2010


Tier 2 proposals for 2012-2016

Great Lakes Tier 2 (AGLT2)

AGLT2 is NSF funded and is included in the NSF Cooperative Agreement Proposal. Their 2012-2016 proposal is attached below.

Midwest Tier 2 (MWT2)

MWT2 is NSF funded and is included in the NSF Cooperative Agreement Proposal. Their 2012-2016 proposal is attached below.

Northeast Tier 2 (NET2)

NET2 is NSF funded and is included in the NSF Cooperative Agreement Proposal. Their 2012-2016 proposal is attached below.

Southwest Tier 2 (SWT2)

SWT2 is NSF funded and is included in the NSF Cooperative Agreement Proposal. Their 2012-2016 proposal is attached below.

Western Tier 2 (WT2)

WT2 is DOE funded and is not included in the NSF Cooperative Agreement Proposal. Their 2012-2016 proposal is attached below.

University of Illinois/NCSA proposal for participation as a US ATLAS Tier 2 site 2012-2016

University of Illinois/NCSA submitted a proposal, attached below, for participation in 2012-2016 as a US ATLAS Tier 2 site within an existing Tier 2 consortium. MWT2 and UI/NCSA are currently developing an integration plan for incorporation of UI/NCSA into MWT2.

US ATLAS Tier 2 computing requirements estimates 2011-2016

The baseline for the resource projections we are using is provided by ATLAS Computing Management at this site.

Calculations of resource needs are made according to the Computing Model described in this document and the Computing Technical Design Report.

Recent ATLAS parameters for trigger rate, anticipated LHC machine lifetime, numbers of real and simulated events, event sizes, processing times per event, etc. were taken from the following ATLAS document (page 2, table 1) and combined with the data placement policy described there (page 3, table 2).

Since the above document was published by ATLAS in March 2010, we have incorporated the most recent developments applied to these parameters, most notably the updated split of simulation between the Tier-1 and Tier-2 centers (previously 60% at the Tier-1s vs. 40% at the Tier-2s); our resource projections now use a ratio of 25% at the Tier-1s and 75% at the Tier-2s. We have also folded in the officially agreed efficiency factors when calculating the resource needs: 85% for CPU and 70% for disk.
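
The short Python sketch below is a minimal illustration, using placeholder requirement values rather than actual ATLAS figures, of how the agreed efficiency factors and the revised Tier-1/Tier-2 simulation split enter the resource projections.

# Illustrative only: placeholder requirement values, not ATLAS figures.
CPU_EFFICIENCY = 0.85       # officially agreed CPU efficiency factor
DISK_EFFICIENCY = 0.70      # officially agreed disk efficiency factor
TIER2_SIM_FRACTION = 0.75   # 75% of simulation at the Tier-2s (was 40%)

def provisioned_cpu(raw_cpu_need_hs06):
    """CPU capacity (HS06) to provision for a raw requirement, after the efficiency factor."""
    return raw_cpu_need_hs06 / CPU_EFFICIENCY

def provisioned_disk(raw_disk_need_tb):
    """Disk capacity (TB) to provision for a raw requirement, after the efficiency factor."""
    return raw_disk_need_tb / DISK_EFFICIENCY

def tier2_simulation_share(total_sim_cpu_hs06):
    """Portion of the total simulation CPU assigned to the Tier-2 centers."""
    return total_sim_cpu_hs06 * TIER2_SIM_FRACTION

# Placeholder totals, for illustration only.
print(provisioned_cpu(100_000.0))        # ~117,600 HS06 to be provisioned
print(provisioned_disk(5_000.0))         # ~7,140 TB to be provisioned
print(tier2_simulation_share(100_000.0)) # 75,000 HS06 of simulation at the Tier-2s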

We summarize details as to parameters related to simulation below.

Resources for Simulation

Browsing the ATLAS documents on resource usage in 2010, we find that there are no firm numbers for planned full/fast simulated event production per year.

For this year’s production we found:

  • 2 Geant4 simulation campaigns:
    • In January and February, the extension with the 15.3.1.X and 15.6.1.X production caches at 900 GeV and 2.36 TeV: 95M events total
    • In March (still being extended), MC09 at 900 GeV, 2.36 TeV and 7 TeV: 900M events in total (200M single particles)
  • 3 Digitization and reconstruction campaigns:
    • In March, campaign with 15.6.6.5 at 900 GeV, 2.36 TeV and 7 TeV: 400M events total
    • In April, campaign with 15.6.8.7 at 900 GeV, 2.36 TeV and 7 TeV: 450M events total
    • In May, campaign (still being extended) with 15.6.9.8 at 900 GeV, 2.36 TeV and 7 TeV: 900M events total
  • Pileup digitization and reconstruction validated with 15.6.12.1: 90M events so far

This means that the total number of simulated G4 events in 2010 is O(1B) so far (note that in the resource estimate of March 2010, ATLAS was planning to simulate 600M G4 events).

In a second document associated with resource usage in 2010 one can find:

“Simulation: We have fully simulated more events than originally planned. The need to simulate more events was driven in part by the need to simulate at 3 energies (0.9, 2.36 and 7 TeV). We were able to simulate more events than projected because we had more CPU cycles available, largely from opportunistic resources at T2 and because of the lower number of events recorded (thus quicker re-processing).”

And (in the same document)

“Tier-2 CPU usage: To date, CPU usage at T2 has been dominated by simulation; however, we have seen a steady increase in group and user usage. The ATLAS accounting, from the ATLAS production database, shows that averaged over the year to date, the T2’s spent 82% of the CPU usage on simulation and the rest on group and user analysis. We expect group and user usage to increase dramatically as we get larger derived data sets from higher luminosity running and as need for ESD diminishes.”

Our interpretation of the above is:

  • Given the rapidly increasing need for analysis CPU due to the rapidly growing real data sample, the CPU power available for simulation at the Tier-2s will shrink significantly from 2011 on. As stated in the document, the processing power devoted to analysis at the Tier-2s was underutilized in the first half of 2010; utilization has lately shifted much higher due to the increasing data sample and dynamic data brokerage. The Computing Model would therefore suggest a division of resources of 50% for production and 50% for analysis. We will, however, not use this split, since US ATLAS Computing Management explicitly requested that the simulation fraction at the Tier-2 centers be increased to 75%.
  • The ATLAS resource requirements document of March 17th, 2010 gives 600M fully simulated events to be produced in 2011 and in 2012. This is probably a conservative estimate, which we have used for the low bracket, while we have used 1B events to calculate the high bracket.
  • According to the March resource requirements document, the division of resources for simulation is 60% at the Tier-1s and 40% at the Tier-2s in 2010, 2011 and 2012. ATLAS current practice is that 75% of simulation is done at the Tier-2s; this is what we have folded in when calculating the new low and high brackets.

Note that resources needed to simulate events at 8 TeV are not incorporated in our calculations. As this mode of machine operation is likely in 2011, more resources for simulation may be needed beyond what we have planned for.
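
As a rough illustration of how the low and high brackets (600M and 1B fully simulated events, 75% of which are done at the Tier-2s) translate into a Tier-2 CPU requirement, the sketch below uses a placeholder Geant4 cost per event; that number is an assumption for illustration, not an ATLAS parameter.

# Rough bracket calculation for Tier-2 simulation CPU. The Geant4 cost per
# event (HS06 seconds) is a placeholder assumption, not an ATLAS parameter.
SECONDS_PER_YEAR = 3.15e7        # wall-clock seconds available per year
CPU_EFFICIENCY = 0.85            # agreed CPU efficiency factor
TIER2_SIM_FRACTION = 0.75        # 75% of full simulation done at the Tier-2s
G4_HS06_SEC_PER_EVENT = 2000.0   # placeholder full-simulation cost per event

def tier2_sim_cpu_hs06(n_events_per_year):
    """Average HS06 capacity the Tier-2s must provide to simulate n_events_per_year."""
    hs06_seconds = n_events_per_year * G4_HS06_SEC_PER_EVENT * TIER2_SIM_FRACTION
    return hs06_seconds / (SECONDS_PER_YEAR * CPU_EFFICIENCY)

low_bracket = tier2_sim_cpu_hs06(600e6)   # 600M fully simulated events
high_bracket = tier2_sim_cpu_hs06(1e9)    # 1B fully simulated events
print(f"low bracket:  {low_bracket:,.0f} HS06")
print(f"high bracket: {high_bracket:,.0f} HS06")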

Resources for Analysis at the Tier-2 centers

User analysis

The algorithm used is: for 2010, 1000 users want to analyze 10% of the sum of all existing AOD data (real, MC and atlfast) every week, using 0.2 HS seconds per event. The U.S. is assumed to carry a 23% share of the total analysis load. For 2011, 2012 and beyond we assume the same, with 50% of the previous year’s simulated data being analyzed as well.
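
A minimal Python sketch of this user-analysis estimate follows; the total AOD event count passed in is a placeholder input, the real value being taken from the data volume projections.

# User-analysis CPU estimate following the algorithm above. The total AOD
# event count is a placeholder input for illustration.
SECONDS_PER_WEEK = 7 * 24 * 3600
CPU_EFFICIENCY = 0.85   # agreed CPU efficiency factor
US_SHARE = 0.23         # U.S. share of the total analysis load

def user_analysis_cpu_hs06(total_aod_events, n_users=1000,
                           fraction_per_pass=0.10, hs_sec_per_event=0.2):
    """Average HS06 capacity needed if n_users each analyze fraction_per_pass
    of the AOD sample every week; returns the U.S. portion."""
    hs_sec_per_week = n_users * fraction_per_pass * total_aod_events * hs_sec_per_event
    world_cpu = hs_sec_per_week / (SECONDS_PER_WEEK * CPU_EFFICIENCY)
    return world_cpu * US_SHARE

# Placeholder AOD sample of 5 billion events (real + MC + atlfast).
print(f"{user_analysis_cpu_hs06(5e9):,.0f} HS06 for U.S. user analysis")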

Group analysis

The algorithm used is: for 2010, 20 groups want to analyze 10% of the sum of all existing AOD data (real, MC and atlfast) every month, using 20 HS seconds per event. The U.S. is assumed to carry a 23% share of the total analysis load. For 2011, 2012 and beyond we assume the same, with 50% of the previous year’s simulated data being analyzed as well.
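
The group-analysis estimate follows the same pattern, with a monthly cycle, 20 groups, and 20 HS seconds per event; a corresponding sketch, again with a placeholder AOD event count:

# Group-analysis variant of the same estimate: 20 groups, a monthly pass over
# 10% of the AOD sample, 20 HS06 seconds per event (placeholder AOD count again).
SECONDS_PER_MONTH = 30 * 24 * 3600
CPU_EFFICIENCY = 0.85   # same agreed efficiency factor as above
US_SHARE = 0.23         # same U.S. share as above

def group_analysis_cpu_hs06(total_aod_events, n_groups=20,
                            fraction_per_pass=0.10, hs_sec_per_event=20.0):
    """Average HS06 capacity for group analysis; returns the U.S. portion."""
    hs_sec_per_month = n_groups * fraction_per_pass * total_aod_events * hs_sec_per_event
    world_cpu = hs_sec_per_month / (SECONDS_PER_MONTH * CPU_EFFICIENCY)
    return world_cpu * US_SHARE

print(f"{group_analysis_cpu_hs06(5e9):,.0f} HS06 for U.S. group analysis")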

LHC machine “lifetime”

Resource needs depend heavily on the LHC machine’s ability to provide “stable beams” to the experiments, as well as on the beam intensity and the parameters relevant to beam quality. For intensity and beam quality there are large uncertainties for 2011, in addition to the currently ongoing discussion about an extension of the run into 2012. Details on the run conditions in 2011 can be found on slides 3 – 6 of the following presentation that was recently made to the LHCC, found here.

On slide 4 “reasonable” (achievable?) numbers are presented. The estimated integrated luminosity of 2.2 fb^-1 over a period of 200 days is already more than twice what was anticipated earlier this year. Slide 5 even mentions a scenario (“ultimate reach”) that could deliver an integrated luminosity as high as 7.6 fb^-1.

For our resource projections we have incorporated a scenario based on a factor of ~2 with respect to the integrated luminosity in 2011 (note that the ATLAS estimate of the 2011 lifetime is the same as for 2010, 6.55 Msec per year, which is far too pessimistic) and in 2014-2016 for the high bracket. Our calculations for 2012 and 2013 were made under the assumption that there is a 15 month shutdown in 2012/2013 and a much reduced machine lifetime in 2013 due to a slow ramp-up of a new machine operating at 14 TeV center of mass. As indicated above, the latter may change; a decision is expected after the Chamonix meeting in late February 2011 or a few months later. The consequence of running in 2012 is that, while under the currently foreseen budget guidance we may be able to provision the resources needed given a shutdown in 2012, we will have to be innovative in order to respond appropriately to the computing needs if the run is extended until Q3 2012 or later.
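
To make the live-time sensitivity concrete, the sketch below scales the recorded-event count with the assumed machine live time; the trigger rate is a placeholder value, not taken from the ATLAS document, and the factor of ~2 is applied to live time as a simple proxy for the luminosity scaling.

# Sensitivity of the recorded-event count to the assumed LHC live time.
# The trigger rate is a placeholder value, not from the ATLAS document.
TRIGGER_RATE_HZ = 200.0        # placeholder prompt-physics trigger rate
ATLAS_LIVE_TIME_SEC = 6.55e6   # ATLAS 2011 live-time assumption (same as 2010)

def recorded_events(live_time_sec, trigger_rate_hz=TRIGGER_RATE_HZ):
    """Raw events recorded for a given machine live time and trigger rate."""
    return live_time_sec * trigger_rate_hz

baseline = recorded_events(ATLAS_LIVE_TIME_SEC)        # ATLAS assumption
high = recorded_events(2.0 * ATLAS_LIVE_TIME_SEC)      # our factor of ~2 scenario
print(f"baseline: {baseline/1e9:.2f}B events, high bracket: {high/1e9:.2f}B events")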



Major updates: -- TorreWenaus - 23 Nov 2010



Attachments


NET2_Renewal_Proposal.pdf (1035.3K) | SaulYoussef, 23 Nov 2010 - 10:27 | NET2 renewal proposal
AGLT2_Renewal_Proposal.pdf (226.7K) | ShawnMckee, 29 Nov 2010 - 15:51 | AGLT2 renewal proposal as of November 29, 2010
mwt2-illinois-v4.pdf (274.1K) | RobertGardner, 01 Dec 2010 - 05:50 | Integrated MWT2 proposal
mwt2-nextgen-v4.pdf (442.8K) | RobertGardner, 01 Dec 2010 - 05:52 | Original MWT2 proposal (UChicago+Indiana)
WERC_ACC_Server_Room_BOD_Rev_20101115.pdf (1028.1K) | RobertGardner, 01 Dec 2010 - 08:20 | Preliminary basis of design for new server room at UChicago
 