r6 - 27 Oct 2008 - 13:42:13 - RobertGardner

SummaryReportP6

Introduction

This report covers Phase 6 of the IntegrationProgram, which spans July 1 - Sep 30, 2008. Meetings during this period:

Summary of milestones achieved and carryover tasks

ATLAS release installation via Pacballs & DDM

  • 05-Sep: Integrate Alessandro De Salvo's installation script into a panda job.
    • Test installation script locally at BNL DONE
    • Suggest installation script changes needed for OSG environment to Alessandro DONE
    • Test revised installation script on OSG and LCG DONE
    • Request any changes needed in Panda DONE
      • deliver pacball via DQ2 to Tier1s DONE
      • make panda job script calling AD's script DONE

Carryover tasks:

  • Make the job definition script to submit the panda install jobs.
  • 19-Sep: Test panda install jobs within panda production framework.
    • Set up US submit host to submit pilots using "software" role proxy (usatlas2)
    • Send test jobs to all US sites and verify that the installation works properly
    • Test any additional changes required in panda.
  • 30-Sep: Have a fully functioning system.

Space management

Storage capacity recommendations/guidance for the Facility (320 TB capacity, from Kaushik's model on MinutesJune11).

Token        Sep 1    Oct 15
PRODDISK     20 TB    20 TB
MCDISK       60 TB    66 TB
DATADISK     20 TB    168 TB
USERDISK     10 TB    35 TB
GROUPDISK    10 TB    10 TB

These numbers do not include the 64 TB for US regional quota, which will most likely be distributed among USER, GROUP and LOCALUSER tokens.
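
As a quick cross-check of the token targets, the following sketch sums the per-token values for each date and adds the 64 TB regional quota. The dictionary layout and the names TARGETS_TB, REGIONAL_QUOTA_TB and total_for are illustrative only; the numbers are copied from the table above.

```python
# Space token targets in TB, copied from the table above.
TARGETS_TB = {
    "PRODDISK": {"Sep 1": 20, "Oct 15": 20},
    "MCDISK": {"Sep 1": 60, "Oct 15": 66},
    "DATADISK": {"Sep 1": 20, "Oct 15": 168},
    "USERDISK": {"Sep 1": 10, "Oct 15": 35},
    "GROUPDISK": {"Sep 1": 10, "Oct 15": 10},
}

# US regional quota, not included in the token targets.
REGIONAL_QUOTA_TB = 64


def total_for(date):
    """Sum the targets of all tokens for one target date."""
    return sum(row[date] for row in TARGETS_TB.values())


for date in ("Sep 1", "Oct 15"):
    total = total_for(date)
    print(f"{date}: {total} TB in tokens, "
          f"{total + REGIONAL_QUOTA_TB} TB including regional quota")
```

The token sums come to 120 TB for September 1 and 299 TB for October 15, which can be compared against the 320 TB guidance figure.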

Milestone

  • September 1 reservation groupings in place at all Tier 2 facilities, as per SiteCertificationP6.

Carryover

  • The October 15 targets with new storage procurements.

Network Monitoring

Deployment (procurement, installation, testing, integration) of network monitoring infrastructure throughout the US ATLAS Facility. Includes:
  • Procurement of two network monitoring hosts
  • Installation of US-LHC network monitoring OS and toolkits

Milestone

File Catalogs

  • See FileCatalog for the full program of work during this period.

Milestones

  • LFC server deployed at BNL DONE
  • LFC client, client-interfaces, and server packaged in VDT, initial tests: Aug 20 - Charles, Marco DONE
  • LFC-enabled OSG worker-node client prepared and tested: Aug 27 - Marco
    • Development worker-node client package with VDT-based LFC clients now available and in use. DONE
    • Production version of new worker-node client package (with VDT-based LFC clients) Sep 22 - Alain(VDT), Marco
  • LFC deployment process defined: Aug 27 - Rob
    • See InstallLFConOSG DONE; install of LFC server DONE; MySQL backend for Tier 2 DONE
    • LRC migration script and process for migration defined - Hiro, private script DONE
  • Pilot candidate based on existing (ad-hoc) assembled packages and testbed at BNL: Aug 27 Paul, Facility-contact (Xin) DONE
  • Pilot candidate based on new OSG worker-node client at BNL: Sep 3 Paul, Facility-contact (Xin) DONE
    • Integration complete, validating at BNL with tens of jobs. DONE
  • LFC-based utilities package: Aug 27 - Charles, Patrick
    • Assess existing LFC utilities, map legacy US tools
      • checkse, cleanse - Patrick, Charles: Sep 30
      • pool-cleaner for dcache sites - Charles Sep 30
      • User dataset deletion package (Charles or LCG sites already) Sep 30
      • First LFC utilities package created (if necessary): Sep 30 - Charles, Patrick; AtlasLFCUtilities
  • Create python-based www interface for reading content (Jean-Philip)
  • BNL LFC fully deployed with LRC retirement (or leave as read-only): Sep 30
  • LFC deployed at two Tier 2 sites for Panda testing:
    • UTA - Sep 24 - Patrick DONE
    • UC - Sep 10 - Charles DONE
  • LFC deployment fully defined and readied for Facility: Sep 30 DONE
    • Conversion and full-scale operations on Tier 1, Tier 2 test sites (Panda, DDM)

Carryover

  • LFC fully deployed at US Tier 2s: Oct 15 - all Tier 2 facilities DONE
    • Site-by-site migration

Load testing

Most activity during this period focused on site upgrades and configuration changes.

Milestones

  • 200 MB/s, 400 MB/s benchmarks achieved at some Tier 2 facilities, as per SiteCertificationP6.

Analysis milestones

Analysis benchmarks of various types, and at various scales. The benchmarks defined here are performed on each Tier 2 facility and noted in SiteCertificationP6.

  • "standard" means a standard pathena job defined by a specific release, application, and input dataset.
    • D3PD making jobs with release 14.2.20
    • Run on an FDR2 container dataset, jamboree08_run2.0052280.physics_Egamma.merge.AOD.o3_f8_m10/
  • "suite" means a package of templated jobs of various types, used to validate functionality of the analysis queues.
    • D3PD making jobs
    • AODtoDPD making jobs
    • TAG selection jobs
    • ARA jobs
  • Analysis benchmarks 1-3: 100/200/400 jobs.

US ATLAS facility client support

This task covers the provisioning of various ATLAS and OSG client tools for use in the US ATLAS Facility by site administrators and physicist-users. This includes:
  • Providing a package of OSG client middleware components that can be used in conjunction with dq2-client tools released by DDM to successfully and efficiently access US ATLAS storage elements. See further WlcgClient.
  • A worker node client component which includes LFC client utilities packaged by VDT for use with the standard OSG 1.0 worker-node client package. This provides Panda pilots with the LFC client programs they need to access LFC catalogs.

Milestones

  • Pre-release WlcgClient for testing: June 30 DONE
  • First release of WlcgClient for wider adoption: July 31 DONE
  • Report of SE testing from client tools: Aug 13 (MinutesAug13) DONE
  • First production release of WlcgClient: Sep 22 DONE
  • New production release of WN-Client (including LFC support): Sep 22 DONE
  • Facility-wide client throughput performance report: Sep 30 DONE
  • For LFC related milestones, see also FileCatalog.

Procurement reports and capacity status

There is also the CapacitySummary in which we compare pledge and deployed capacities for each phase of the integration program.

Procurements during Phase 6 (Jul 1 - Sep 30, 2008):

  • T1: 5 GridFTP dCache door nodes, 10 GE attached
  • AGLT2:
    • 2xM1000e Dell Blade Chassis, 16 M600 blade servers/chassis each with: dual E5440 processors, 8x2GB RAM, 2x146GB SCSI 10K RAID0
    • 52xPE1950 1U servers, dual E5440 processors, 4x4GB RAM, 2x250GB SATA disks RAID0, Energy Smart P/S
    • 6x(PE2950, 4xMD1000) storage node.
      • PE2950 has either dual X5460(3) or dual E5450(3), 8x4GB RAM, DRAC5, Myricom 10GE, 2xPerc6/E w 512MB cache, 2x250GB RAID1, Redundant P/S
      • MD1000 has 15 1TB disks, dual enclosure controller, redundant P/S
    • Compute: Adding 672 cores, 2.15 MSI2K
    • Storage: Adding 312 TB of RAID6 pools
  • MWT2_IU: none
  • MWT2_UC: none
  • NET2: none
  • SWT2-UTA: none
  • SWT2-OU: none
  • WT2: none
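
The AGLT2 totals quoted above (672 cores, 312 TB of RAID6 pools) can be reproduced from the hardware list with a short calculation. This is a sketch under two assumptions not stated in the list: each E5440 is counted as a quad-core processor, and each MD1000 shelf is configured as a single 15-disk RAID6 array, which gives up two disks to parity. All variable names are illustrative.

```python
# Compute: dual-socket servers with E5440 processors.
CORES_PER_E5440 = 4       # assumption: E5440 counted as quad-core
SOCKETS_PER_SERVER = 2    # "dual E5440" in the hardware list

blade_servers = 2 * 16    # 2 M1000e chassis, 16 M600 blades per chassis
rack_servers = 52         # PE1950 1U servers
new_cores = (blade_servers + rack_servers) * SOCKETS_PER_SERVER * CORES_PER_E5440

# Storage: 6 storage nodes, each a PE2950 with 4 MD1000 shelves.
storage_nodes = 6
shelves_per_node = 4
disks_per_shelf = 15
disk_tb = 1
raid6_parity_disks = 2    # assumption: one RAID6 array per MD1000 shelf

shelves = storage_nodes * shelves_per_node
raw_tb = shelves * disks_per_shelf * disk_tb
usable_tb = shelves * (disks_per_shelf - raid6_parity_disks) * disk_tb

print(f"new cores: {new_cores}")
print(f"raw storage: {raw_tb} TB, usable after RAID6 parity: {usable_tb} TB")
```

Under these assumptions the calculation matches both quoted figures, with 360 TB raw reducing to 312 TB usable.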

Capacity status: (dedicated processing cores, usable storage) as of September 30, 2008

  • T1: 4000 cores, 2.1 PB
  • AGLT2: 924 cores, 400 TB plus 170 TB in dCache. NOTE: Doesn't reflect acquisitions purchased above (they likely won't be online by September 30th)
  • NET2: 570 cores, 170 TB
  • MWT2_IU: 608 cores, 110 TB
  • MWT2_UC: 996 cores, 102 TB
  • SWT2-UTA: 520 cores, 81 TB
  • SWT2-OU: 260 cores, 16 TB
  • WT2: 33% of 1852 cores, 211 TB
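
For a Facility-wide view, the per-site figures above can be totalled as in the sketch below. It makes three assumptions that the list does not confirm: the AGLT2 "400 TB plus 170 TB in dCache" is counted as 570 TB, the WT2 share is 33% of 1852 cores rounded to the nearest core, and 1 PB is taken as 1000 TB. The names CAPACITY, tier2_cores and tier2_tb are illustrative.

```python
# (cores, usable storage in TB) per site, as of September 30, 2008.
CAPACITY = {
    "T1": (4000, 2100),                      # 2.1 PB taken as 2100 TB
    "AGLT2": (924, 400 + 170),               # assumption: dCache space is additional
    "NET2": (570, 170),
    "MWT2_IU": (608, 110),
    "MWT2_UC": (996, 102),
    "SWT2-UTA": (520, 81),
    "SWT2-OU": (260, 16),
    "WT2": (round(0.33 * 1852), 211),        # 33% share of 1852 cores
}

tier2_sites = [name for name in CAPACITY if name != "T1"]
tier2_cores = sum(CAPACITY[name][0] for name in tier2_sites)
tier2_tb = sum(CAPACITY[name][1] for name in tier2_sites)

total_cores = sum(cores for cores, _ in CAPACITY.values())
total_tb = sum(tb for _, tb in CAPACITY.values())

print(f"Tier 2 total: {tier2_cores} cores, {tier2_tb} TB")
print(f"Facility total: {total_cores} cores, {total_tb} TB")
```

These totals are only as good as the assumptions listed, and they do not include the AGLT2 procurements that were not yet online.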

Other carryover issues to Phase 7

  • See individual items noted above.
  • High level goals in Integration Phase 7 (from BNL workshop):
    • Pilot integration with space tokens
    • LFC deployed and commissioned: DDM, Panda-Mover, Panda fully integrated
    • Transition to /atlas/Role=Production proxy for production
    • Storage
      • Procurements - keep to schedule
      • Space management and replication
    • Network and Throughput
      • Monitoring infrastructure and new gridftp server deployed
      • Throughput targets reached
    • Analysis
      • New benchmarks for analysis jobs coming from Nurcan
    • Upcoming Jamborees
    • Probably will hold another US ATLAS Tier 2/Tier3 meeting, Winter/Early Spring
    • OSG site admins meeting coming up: https://twiki.grid.iu.edu/bin/view/SiteCoordination/SiteAdminsWorkshop2008


-- RobertGardner - 20 Aug 2008
