r26 - 31 May 2012 - WeiYang - Admins Web > VmemSitesSurvey

VmemSitesSurvey

Introduction

From Ueda,
Begin forwarded message:

From: I Ueda 
Date: March 27, 2012 9:53:06 AM CDT
To: "atlas-adc-cloud-all (contact for all the ATLAS cloud supports)" 
Subject: Survey on vmem situation

Dear clouds,

As presented at the ADC weekly today, can we ask each cloud for a table of sites, vmem limits (no limit, >=4GB, >=3.5GB, or <3.5GB), CPU capacity, and timescale?
https://indico.cern.ch/materialDisplay.py?contribId=12&materialId=slides&confId=183608

This is to understand how much CPU capacity we have for running reconstruction jobs consuming >3.5 GB vmem, and, if many sites have problems, to understand the feasibility and time scale of reaching this target.

Please note that there is no problem for sites with 2 GB physical memory per slot and no vmem limit.

An item has been prepared on the next weekly agenda for uploading reports:
https://indico.cern.ch/conferenceDisplay.py?confId=183609

regards,      ueda
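For site admins filling in the survey, the effective per-process vmem limit seen by a batch job can be checked with standard POSIX tools. A minimal sketch (the GB buckets are the ones named in the email above; "unlimited" corresponds to the "no limit" entries in the table):

```shell
#!/bin/sh
# Report the per-process virtual memory (vmem) limit as a batch job sees it.
vmem_kb=$(ulimit -v)

if [ "$vmem_kb" = "unlimited" ]; then
    echo "vmem limit: no limit"
else
    # ulimit -v reports kilobytes; convert to whole GB for a rough
    # comparison against the >=4GB / >=3.5GB / <3.5GB survey buckets.
    vmem_gb=$(( vmem_kb / 1024 / 1024 ))
    echo "vmem limit: ${vmem_kb} kB (~${vmem_gb} GB)"
fi
```

Running this inside a job slot (rather than on a login node) gives the limit actually applied by the batch system.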

Table

| Site | Job Slots | HS06 | RAM/Slot | VMem/Slot | Kill on overuse | Comments |
| Tier1 | 2824 | 40715 | 3.0 GB | no limit | No | No killing of ATLAS jobs based on memory usage |
| Tier1 | 9496 | 80715 | 2.0 GB | no limit | No | Ditto; plenty of swap is available for large-memory jobs if necessary |
| AGLT2 | 4200 | 35700 | 2 GB | 4 GB | Yes | All AGLT2 WN allow up to 4 GB/job |
| MWT2 | 5760 | 51816 | 2 GB | no limit | No | Swap is available; it has rarely been used to date |
| MWT2 (UIUC) | 384 | 2624 | 2 GB | 2.7 GB | Yes | 4 GB vmem could be provided. Jobs are killed if over 8 GB |
| NET2 (BU) | 1344 | 11200 | 2 GB | 2.3 GB | No | Could increase RAM/thread by decreasing job slots per node |
| NET2 (BU) | 448 | 3700 | 2 GB | 2.6 GB | No | Could increase RAM/thread by decreasing job slots per node |
| NET2 (HU) | 1500 | 14200 | 2 GB | 2.6 GB | No | Could increase RAM/thread by decreasing job slots per node |
| NET2 (HU) | 480 | 4000 | 2 GB | 2.6 GB | No | Could increase RAM/thread by decreasing job slots per node |
| SWT2 (UTA_SWT2) | 1180 | 10852 | 2.0 GB | no limit | No | Nodes are provisioned so that slots = RAM/2GB; swap is set to RAM; no limits in Torque |
| SWT2 (SWT2_CPB) | 1412 | 11994 | 2.0 GB | no limit | No | Nodes are provisioned so that slots = RAM/2GB; swap is set to RAM; no limits in Torque |
| SWT2 (OU) | 844 | 7614 | 2 GB | 3.5 GB | No | We don't kill jobs, but would appreciate it if no jobs over 3.5 GB were sent, since they can crash our worker nodes |
| WT2 | 2760 | 29376 | 2 GB | 4 GB | No (4 GB RAM) / Yes (4 GB VMem) | SLACXRD_LMEM and ANALY_SLAC_LMEM can schedule jobs with up to 4 GB physical memory |
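Several rows above refer to vmem limits enforced (or deliberately left unset) in Torque. For reference, a site wanting to enforce the >=3.5 GB bucket could cap per-job vmem on the server side; a hedged sketch, where the queue name `atlas` is a hypothetical placeholder (config fragment, not tied to any specific site above):

```shell
# Hypothetical Torque/PBS server configuration via qmgr: cap per-job vmem
# at 3.5 GB (3584 MB) on a queue named "atlas" (the queue name is an assumption).
qmgr -c "set queue atlas resources_max.vmem = 3584mb"

# A submitter can also request an explicit vmem limit per job:
qsub -l vmem=3584mb job.sh
```

With `resources_max.vmem` set, Torque rejects or kills jobs exceeding the limit, which matches the "Yes" entries in the kill-on-overuse column.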


-- RobertGardner - 27 Mar 2012
