SummaryReportP1 - r24 - 08 Aug 2007 - 13:51:46 - HorstSeverini



This is the first report (Phase 1) of the newly organized IntegrationProgram, which began with discussions in May 2007 and was formally launched in the first week of June 2007. Meetings during this period:

Summary of milestones achieved

  • WBS 1.1: ATLAS releases through 13.0.10 on all sites, see AtlasReleasesP1.
  • WBS 1.2: Major new release of DQ2 0.3 (site services component) deployed on all sites, see: DQ2SiteServicesP1.
  • WBS 1.3: OSG 0.6 deployed and validated on all sites except one (OU) which was being relocated, see OSGservicesP1.
  • WBS 1.4: SRM 1.1 storage elements have been deployed on only three sites in the US ATLAS infrastructure: BNL, MWT2_UC and MWT2_IU.
  • WBS 1.5: Monitoring services implemented through Panda (job execution) and Gratia (accounting). Network monitoring via the Internet2 monitoring toolkit NDT deployed on 3 sites. Tier1-based Nagios monitors of US ATLAS services (Globus gatekeepers, DQ2 services) implemented for all sites. LRC MySQL catalog monitors for all DQ2 site services in place for US ATLAS. See further MonitoringServicesP1.
  • WBS 1.6: Logging infrastructure, based on syslog-ng, in place for 6 sites. VDT-based packaging used. Central logging host deployed, with Splunk server.
  • WBS 1.7: SiteCertificationP1 was established as a deployment and validation coordination device for both dedicated and closely affiliated, leveraged facilities.
  • WBS 1.8: The LoadTestsP1 program was established: initial requirements were developed, and first steps toward an execution control and monitoring framework were made. A development testbed at BNL and one Tier2 (MWT2_UC) was created.
  • WBS 1.9: This report: completed 8/6/07.

Procurement reports and capacity status

  • T1: 1000 TB storage (400 TB distributed disk-heavy worker nodes and 600 TB storage arrays)
  • AGLT2: Added 45 job slots from our committed "Tier-3" equipment as per our original proposal
  • MWT2_IU: 85.4 TB usable storage added. 500 GB scratch disks added for compute nodes (18).
  • MWT2_UC: 72.1 TB usable storage added. 500 GB scratch disks added for compute nodes (15).
  • NET2: 52 TB usable storage added. Cisco 6509 installed.
  • SWT2-UTA: 200 cores, 60 TB, being installed, to be available ~Aug, 2007.
  • SWT2-OU: 184 cores, 15 TB, mostly delivered, available ~Sep, 2007.
  • WT2: 51 TB usable server-based storage added. 450 GB scratch disk added to each node (78 nodes).

Capacity status: (dedicated processing cores, usable storage)

  • T1: 1600 cores, 1000 TB
  • AGLT2: 200 cores, 40 TB
  • NET2: 392 cores, 144 TB
  • MWT2_IU: 128 cores, 110 TB
  • MWT2_UC: 136 cores, 102 TB
  • SWT2-UTA: 380 cores, 30 TB
  • SWT2-OU: 80 cores, 4 TB
  • WT2: 312 cores, 51 TB
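
As a cross-check on the figures above, the facility-wide totals can be computed directly from the per-site list. This is only a minimal sketch: the dictionary and function names are illustrative and not part of any US ATLAS tool, and the numbers are copied verbatim from the capacity list.

```python
# Capacity figures copied from the list above: site -> (dedicated cores, usable TB).
# The names CAPACITY, totals and tier2_totals are illustrative only.
CAPACITY = {
    "T1": (1600, 1000),
    "AGLT2": (200, 40),
    "NET2": (392, 144),
    "MWT2_IU": (128, 110),
    "MWT2_UC": (136, 102),
    "SWT2-UTA": (380, 30),
    "SWT2-OU": (80, 4),
    "WT2": (312, 51),
}


def totals(capacity):
    """Return (total cores, total usable TB) over all sites in the mapping."""
    cores = sum(c for c, _ in capacity.values())
    storage_tb = sum(tb for _, tb in capacity.values())
    return cores, storage_tb


def tier2_totals(capacity, tier1="T1"):
    """Return the same totals with the Tier1 entry excluded."""
    tier2 = {site: v for site, v in capacity.items() if site != tier1}
    return totals(tier2)


if __name__ == "__main__":
    all_cores, all_tb = totals(CAPACITY)
    t2_cores, t2_tb = tier2_totals(CAPACITY)
    print(f"All sites:  {all_cores} cores, {all_tb} TB")  # 3228 cores, 1481 TB
    print(f"Tier2 only: {t2_cores} cores, {t2_tb} TB")    # 1628 cores, 481 TB
```

With the numbers as listed, the Tier1 accounts for roughly half of the dedicated cores and about two thirds of the usable storage.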

Summary of failures and problem areas

  • Recovering from the DQ2 0.3 deployment was difficult: operational startup difficulties with the central catalogs and insufficient time for Panda integration created long down-time periods.
  • AOD replication continues to interfere with production, causing high load on DQ2 site services and a slow-down in the distribution of production datasets.

Carryover issues to next Phase

  • Establish DQ2 integration testbed for testing and integrating updates to DQ2.
  • Make progress on load testing execution and monitoring framework, and load-test module development.
  • Discuss NDT security requirements with US ATLAS and OSG security officers and develop policy for US ATLAS software deployment.
  • Establish distributed analysis infrastructure throughout the facility by setting up AnalysisQueueP1.

-- RobertGardner - 10 May 2007
