r3 - 29 Mar 2011 - 19:24:15 - RichardMount

RAC Minutes, March 4, 2011


Members (*=present, #=apologies)

*Richard Mount, Kevin Black (Physics Forum Chair), *Jim Cochran (Analysis Support Manager), Alexei Klimentov (ATLAS ADC), Ian Hinchliffe (Physics Advisor), Rik Yoshida (Tier3 Coordinator), *Michael Ernst (U.S. Facilities Manager), #Rob Gardner (Integration Coordinator), Kaushik De (U.S. Operations Manager), *Armen Vartapetian (U.S. Operations Deputy)

Ex-Officio: *Torre Wenaus, Stephane Willocq, Mike Tuts, Howard Gordon

US ATLAS: *Saul Youssef

Approval or Correction of Minutes

The minutes of January 7 and January 28 were approved.

Operations Report (Armen)

Production was generally stable. The first heavy ion reprocessing jobs had arrived.

Half of the transfer activity in the US was the migration of MCDisk to DataDisk at BNL; the corresponding migration at the Tier 2s was already complete.

Apart from the migration, there was not much movement of data, but central deletion continued to be an issue. Central deletion became saturated during UserDisk cleanups, stretching the most recent cleanup out over almost a month. The next cleanup was scheduled for mid-March.

Michael pointed out that BNL was seeing about 300 MB/s of transfers from the T0.

WLCG/OSG reporting versus reality (status of investigations)

Richard described the origin of the investigation (evidence that the WLCG-reported WT2 numbers were high by 20% or more) and some of the subsequent work. WT2 had found that the real CPU-seconds it reported to OSG were correct to better than 1%, so the problem must lie in how these real seconds were translated into reported HS06. The investigation continued very actively in several email threads that also involved CMS. It was hoped that the presence of many experts at next week's OSG meeting would lead to a resolution.

Saul noted that he had analyzed Panda database data from Torre's dumps for all T2s. He found that NET2 reporting was low by a factor 2 because the Harvard CPUs were not being included.

Policy, Process and Technology for managing GroupDisk and LocalGroupDisk

Armen presented tables reporting on GroupDisk and LocalGroupDisk usage in the US and other clouds.

GroupDisk

Richard commented that neither he nor Wei Yang, who manages the WT2, had been aware that SLAC had been signed up to host space for six analysis groups at 27.5 TB per group now and 55 TB per group later this year. As a consequence, WT2 had set the quota on GroupDisk far too low, and Wei was alarmed to see the GroupDisk space expand up to the quota whenever he raised it. Wei also noted that GroupDisk, while occupying only about 5% of the space, contained 75% of the files at WT2.

Michael commented that this was typical of communications issues to be addressed at the Tuesday Facilities Meeting.

How WT2 had been assigned six analysis groups was a minor mystery (but WT2 was not objecting).

Discussion centered on whether the processes and tools to manage GroupDisk were in reasonable shape. There was no immediate disaster, but Armen believed that the development of tools to make it easier for groups to manage their space should get a higher priority within ADC. Nobody dissented.

LocalGroupDisk

Saul gave the URL of plots describing US LocalGroupDisk usage (http://atlas.bu.edu/~youssef/atlas/localgroupdisk/by-site.html). The plots were not discussed during the meeting.

Michael noted that LocalGroupDisk was not part of the pledged resource. When the pledges were adjusted upward in April, some sites would have zero or negative capacity available to offer LocalGroupDisk (at least in principle). It would not make sense to abolish LocalGroupDisk at a site and then reinstate it when more disk was installed. It would make more sense to look at the total available-minus-pledged space in the US and keep the total LocalGroupDisk within this bound.

Richard raised the question of how to allocate LocalGroupDisk: a) at the discretion of site administrators, or b) as a RAC responsibility. Site administrators should not be faced with deciding physics priorities. Assuming most of the allocation were to be a RAC responsibility, Richard favored requiring physics group approval (for validity as opposed to priority) before the RAC allocated significant space to an activity. This would parallel the approach used in allocating above-pledge US CPU resources. There was no dissent, but the issue merited continued consideration.

AOB

Richard noted the need to have a RAC agenda to which material could be attached. After brief discussion it was agreed to use Indico in future.

Action Items

  1. 1/28/2011: Armen: report on data in GroupDisk and LocalGroupDisk within the US. It would also be valuable to find out about other clouds. Complete (thanks Armen!)
  2. 9/24/2020: Kaushik: Organize the nature and timing of the effort to create an automated mechanism to give US-requested additional production priority access to US non-pledged resources. On hold - manual system works just fine for now
