r2 - 23 May 2011 - 15:05:22 - RichardMount

RAC Minutes, March 25, 2011


Members (*=present, #=apologies)

*Richard Mount, Kevin Black (Physics Forum Chair), Jim Cochran (Analysis Support Manager), Alexei Klimentov (ATLAS ADC), Ian Hinchliffe (Physics Advisor), Rik Yoshida (Tier3 Coordinator), *Michael Ernst (U.S. Facilities Manager), *Rob Gardner (Integration Coordinator), *Kaushik De (U.S. Operations Manager), *Armen Vartapetian (U.S. Operations Deputy)

Ex-Officio: #Torre Wenaus, Stephane Willocq, Mike Tuts, Howard Gordon

US ATLAS *Saul Youssef

Approval or Correction of Minutes

The minutes of March 4 were approved.

Operations Report (Kaushik)

With a few exceptions, the US T2 complex was completely out of production jobs to run. Some other clouds still had a substantial amount of reconstruction production in their queues. The MWT2 was running jobs on behalf of the German cloud and was currently full of reconstruction. Monte-Carlo production was sporadic and was currently halted.

Analysis had also dropped by about a factor of two. Space at T2s was not a limiting issue - storage was 65% to 75% full.

Michael: the T1s will be getting a lot of mu-reco jobs in about two weeks.

Discussion:

  1. A communications campaign (email, Jim/Ric, physics management) was needed to encourage production and analysis that may be suppressed or on hold due to perceived resource limitations.
  2. The current situation illustrates the desirability of opening up ATLAS OSG resources to opportunistic use by other OSG members. Rob was interested in coordinating this effort, targeting a limited number of OSG VOs initially to maximize the benefit at minimal and acceptable support cost. There was general support for this idea. It was agreed that this needed to be discussed at the L2/L3 Computing Management meeting.

WLCG/OSG reporting versus reality (status of investigations)

The saga continues. Rob said that a short-term fix was available. There are two open tickets in OSG to address the issue:

  1. Verify the WLCG normalization constants for ATLAS sites ISSUE=10127 PROJ=71; https://ticket.grid.iu.edu/goc/viewer?id=10127
  2. Incorporating Hyperthreading into WLCG normalization constant ISSUE=10129 PROJ=71; https://ticket.grid.iu.edu/goc/viewer?id=10129

Policy and Process managing LocalGroupDisk

Michael: A lot of analysis runs at BNL where people archive their data. BNL has 50-100 TB of LocalGroupDisk (Kaushik says he sees 100 TB).

Kaushik: There is a lot of variation at the US T2s; for example, MWT2 has 141 TB and NET2 has 9 TB. The existing "policy" is not to advertise the availability of LocalGroupDisk, but also not to turn down any requests for space. Larger requests are moderated by Kaushik et al., but never turned down. There are now several individual users, perhaps working on behalf of analysis groups, who each have 20-50 TB.

Discussion:

  1. LocalGroupDisk allocations are outside the pledge.
  2. A target of ~50 TB of LocalGroupDisk at each US T2 seemed a good initial idea. This should be considered acceptable provided the total of the US T2s and the T1 meets or exceeds the US disk pledge.
  3. We needed to establish this initial availability of LocalGroupDisk, publicize it and encourage its use in order to understand its value in supporting physics analysis and begin to expose and understand the resource management issues.
  4. This was not just a US issue. It should be raised at the CREM, aiming at an ATLAS-wide approach to understanding the role of LocalGroupDisk.

AOB

Michael asked about the status of the cavern background production that should run at SLAC. Richard admitted ignorance but will find out.

Action Items

  1. 3/25/2011: Richard: Email US ATLAS about the availability of production and analysis resources; ask Jim/Ric why there is so little analysis activity.
  2. 3/25/2011: Richard: Raise the "opportunistic use by other VOs" issue at the L2/L3 Management Meeting.
  3. 3/25/2011: Richard: Inquire about the status of the cavern background production and report back.
  4. 9/24/2020: Kaushik: Organize the nature and timing of the effort to create an automated mechanism to give US-requested additional production priority access to US non-pledged resources. On hold: the manual system works just fine for now.
