r4 - 25 Jun 2007 - 04:44:21 - RobertGardner

US ATLAS Facility Plan Requirements Discussion


These are notes taken from the facilities planning discussion at the Tier3/Tier2 Workshop at Indiana, http://indico.cern.ch/conferenceDisplay.py?confId=15523.

General production goals

  • Kaushik notes that we need to triple our production capacity by December 2007. This might be achieved by doubling the installed capacity of cores, since we are already producing 1/3 more than the US share of the overall ATLAS quota.
  • We also need to maximize our efficiency to meet this goal, within the context of the Integration Program. Our target in 2007 is ~3000 cores (see Michael's table, assuming a 1.3 factor between core and Si2K).
  • Kaushik notes that about 1200 jobs are running on average. For storage, the goal is 1 PB of capacity at the Tier2s by the end of 2007.
  • Beyond capacity, performance issues such as I/O and disk access are a concern. These may depend on the access methods, e.g., TAG-based analysis into files for data skimming.
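
The capacity figures above can be checked with a little arithmetic. The sketch below is one possible reading of the notes, not an official planning calculation. It assumes that the 1.3 factor means roughly 1.3 kSI2K per core, that "1/3 more than the US share" means current output is 4/3 of the share, and that output scales linearly with installed cores. These interpretations and all function names are assumptions made for illustration.

```python
from fractions import Fraction

# Assumption: the "1.3 factor" is read as roughly 1.3 kSI2K per core.
KSI2K_PER_CORE = 1.3

def cores_to_ksi2k(cores, factor=KSI2K_PER_CORE):
    """Convert a core count to kSI2K under the assumed per-core factor."""
    return cores * factor

def output_relative_to_share(current_ratio, capacity_multiplier):
    """Output relative to the US share after scaling installed capacity,
    assuming output scales linearly with cores (constant efficiency)."""
    return current_ratio * capacity_multiplier

# 2007 target of ~3000 cores, expressed in kSI2K.
target_ksi2k = cores_to_ksi2k(3000)

# Assumption: current output is 4/3 of the US share; cores are doubled.
after_doubling = output_relative_to_share(Fraction(4, 3), 2)

print(f"3000 cores ~ {target_ksi2k:.0f} kSI2K")
print(f"Doubling cores gives {float(after_doubling):.2f}x the US share")
```

Under this reading, doubling the cores brings output to 8/3 of the US share. Whether that matches the intended tripling depends on the baseline, which the notes do not specify.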

Scaling problems

  • Scaling issues arising from the increase in CPUs over the past few months have mostly involved data transfers. No Panda scaling issues have been observed.
  • Large numbers of files are backlogged at the Tier2 centers. Will DQ2 0.3 solve these problems?
  • We need to understand better which pieces of the infrastructure are introducing latencies; for example the SRM, especially for small files.
  • Regarding storage at BNL, within a month there will be about 1.5 PB of storage available.
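
To illustrate why per-transfer latency matters most for small files, here is a minimal sketch of a fixed-overhead model. It is not a measurement of any SRM or DQ2 component: the bandwidth, the per-file overhead, and the function name are all hypothetical values chosen for illustration.

```python
def effective_throughput(file_size_mb, bandwidth_mb_s, per_file_overhead_s):
    """Effective MB/s for one file when each transfer pays a fixed overhead."""
    transfer_time = per_file_overhead_s + file_size_mb / bandwidth_mb_s
    return file_size_mb / transfer_time

# Hypothetical numbers, for illustration only.
BANDWIDTH = 50.0   # MB/s available to a single transfer
OVERHEAD = 5.0     # seconds of per-file negotiation before data flows

for size in (1, 100, 10000):
    rate = effective_throughput(size, BANDWIDTH, OVERHEAD)
    print(f"{size:>6} MB file: {rate:6.2f} MB/s effective")
```

In this model the fixed overhead dominates for small files, so the effective rate stays far below the link bandwidth until files become large. Measuring the actual overhead of each component would be needed to say which piece is responsible.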

Storage management at Tier2s

  • For scalability, we need a managed entry point into the backend storage systems at the Tier2s.
  • We thus need an SRM at each Tier2, with load balancing between the Gridftp servers behind it. We will not be able to live with a single Gridftp door at a site.
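
The idea of one managed entry point spreading transfers over several Gridftp servers can be sketched as follows. This is a toy least-loaded selector written for illustration; it does not reflect how any particular SRM implementation schedules transfers, and the class name and server names are hypothetical.

```python
class DoorBalancer:
    """Toy least-loaded selector over a pool of transfer servers."""

    def __init__(self, servers):
        # Track active transfers per server; server names are placeholders.
        self.active = {name: 0 for name in servers}

    def acquire(self):
        """Pick the server with the fewest active transfers."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Mark one transfer on the named server as finished."""
        if self.active[name] > 0:
            self.active[name] -= 1

balancer = DoorBalancer(["gridftp1", "gridftp2", "gridftp3"])
assigned = [balancer.acquire() for _ in range(6)]
print(assigned)
```

With a single door, every transfer would land on the same server. Here the six requests are spread evenly across the three servers in the pool.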

DDM issues

  • Alexei reported on a single transfer with high latency that was being investigated; the cause turned out to be a dcache problem. Wei: tests with a 1000-file subscription to SLAC.
  • Data distribution issues: schedule and timeline? Alexei would like to do the functional tests as soon as possible.
  • By next Wednesday all sites should be upgraded to DQ2 0.3 so that functional tests and AOD replication may begin.

Analysis requirements and decisions

  • Complete copies of the AODs are to be replicated at each Tier2.
  • Separate analysis queues need to be set up at the sites. Kaushik will provide the recipe in the integration page. (See AnalysisQueueP1)
  • Produced DPDs are to be replicated back to BNL and CERN.

-- RobertGardner - 22 Jun 2007
