r2 - 12 Dec 2010 - 19:45:44 - RichardMount

RAC Core Group Minutes, November 19, 2010

Core Group Members Present

Richard Mount, Jim Cochran, Michael Ernst, Kaushik De

Other RAC Members Present

Armen Vartapetian

Status of US Additional Production (Kaushik)

The work already in the queue (track jets for the SM group) was close to finishing. All the CPU-intensive simulation was done. Some reco remained. The total elapsed time to execute this request (~100M events) was around 6 weeks.

Kaushik advised putting out a solicitation for additional production to be in the queue ready to run in any gaps that appeared over the holidays.

Status of Official Production (Kaushik)

The simulation campaign was approaching its end, but 200M events still remained. Nevertheless, spare capacity could appear at any time, as it had during a production glitch the previous day.

Other Production Issues (Kaushik)

Space was not currently a problem at the T2s thanks to PD2P. The T1 was a different story. Heavy Ion raw data and ESDs were swamping BNL – currently about 90% full with much more to come soon (Alexei estimates a total of 1.5 PB for the HI ESD.) Michael had been forced to put some retired disk space back into service (temporarily). Should we consider deleting some things we thought we had to keep?

Current US cleaning practice had anticipated the recent CREM decisions (apart from raw data), so minimal further benefit would arise from these decisions. Rapid cleaning of the unmerged datasets from recent reprocessing would help; this should be done by ADC, whose central deletion had been working well recently.

The relative priority of HI and proton data with respect to DOE funding was discussed; relatively little DOE-funded analysis of HI data was expected. The RAC concluded that the use case for both raw and ESD HI data was access for organized production, so BNL could meet ATLAS needs by moving most of these data to tape and retrieving them as needed.

Taking all the above into account, the RAC believed that BNL disk space could last until January.

There was serious discussion of implementing PD2P for T1-T1 data distribution in the new year. Currently there were 5-6 copies of ESDs, 20 copies of AODs and 10 copies of DESDs. After implementing PD2P, these might fall as low as 1, 2 and 2 copies. Initially PD2P trials between T1s would focus on reducing the copies of the ESD.

Status of the Large-Memory Pilot Production at SLAC

This was stalled (no jobs run yet) for no obvious reason. Richard proposed holding a meeting with Charlie Young, Borut, Kaushik and Wei Yang to remove the roadblocks. This proposal was accepted.



Action Items

  1. 11/19/2010, Richard: organize a meeting to unblock the stalled large-memory pilot production. Done after the meeting.
  2. 11/19/2010, Richard: solicit US additional production for the end-of-year period. Done after the meeting.
  3. 9/24/2010, Kaushik: organize the nature and timing of the effort to create an automated mechanism to give US-requested additional production priority access to US non-pledged resources. On hold; the manual system works fine for now.
  4. 8/27/2010, Kaushik and Torre: investigate the current state of the technology to route large-memory jobs to the sites/queues prepared to execute them. Reported at this meeting to be ready but untested. Overtaken by the first action item.
