Results from previous phases:

General load testing activity in this Phase 4 (Jan 1 - Mar 31, 2008)

  • Continue setting up sites and performing tests in LoadTestsP3

Load test framework deliverables

  • Optimize scheduling for measurement tests to provide meaningful results of throughput performance.
  • Continue plotting adjustments (selection of quantities, scale, labels) in Monalisa to provide the best views for facility performance monitoring.
  • Provide ability to adjust number of streams, payloads (file size, number of files), concurrent transfers.
  • Provide an easily viewable web display of selected Monalisa plots/views
  • Produce a table showing each storage system endpoint for each Tier-2. It should include Host, Directory, Network Link, Read-Speed, Write-Speed.

Storage Endpoint Table

This should be filled out by EACH Tier-2 site. List all storage endpoints in use and their relevant information. For the Read and Write columns, provide a link to the measurement details: either a web page describing the tests and results, or a figure showing the results. Make sure the "Big" tests use files larger than 2x the physical memory, so that caching in RAM does not inflate the measured rates.

Wenjing, can you produce a web page that describes how to run the Read/Write tests with 'time dd' and IOZone, and link it here?

Site Host Directory Size(TB) ReadBig WriteBig   Read(132MB) Write(132MB) Notes
BNL dc002.usatlas.bnl.gov /data/ 16TB 570MB/s 523MB/s   NA NA Representative Thumper, RAID60 (3x9+2, 1x10+2)
BNL acas0013.usatlas.bnl.gov /data/test 800GB 88MB/s NA   473MB/s NA 2004 representative read farm node
BNL acas0173.usatlas.bnl.gov /data/test 650GB 34MB/s NA   795MB/s NA 2005 representative read farm node
BNL acas0023.usatlas.bnl.gov /data/test 4TB 31.69MB/s NA   737MB/s NA 2006 representative read farm node
BNL acas0513.usatlas.bnl.gov /data/test 4TB 222MB/s NA   709.7MB/s NA 2007 representative read farm node
AGLT2_MSU msufs01.aglt2.org /export/vdisk_{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_MSU msufs02.aglt2.org /export/vdisk_{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_MSU msufs03.aglt2.org /export/vdisk_{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_MSU msufs04.aglt2.org /export/vdisk_{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_MSU msufs05.aglt2.org /export/vdisk_{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_UM umfs01.grid.umich.edu /data07 4.4TB 303MB/s 312MB/s   682MB/s 2471MB/s  
AGLT2_UM umfs02.grid.umich.edu /atlas/data08 11TB 129MB/s 250MB/s   338MB/s 872MB/s  
AGLT2_UM umfs03.aglt2.org /atlas/data13 16TB 498MB/s 653MB/s   681MB/s 1645MB/s  
AGLT2_UM umfs04.aglt2.org /atlas/data14 16TB 430MB/s 487MB/s   605MB/s 1267MB/s
AGLT2_UM umfs05.aglt2.org /atlas/data16 20TB 271MB/s 688MB/s   873MB/s 2407MB/s  
AGLT2_UM umfs06.aglt2.org /atlas/data17{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_UM umfs07.aglt2.org /atlas/data17{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_UM umfs08.aglt2.org /atlas/data19{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_UM umfs09.aglt2.org /atlas/data22{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_UM umfs10.aglt2.org /atlas/data22{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
AGLT2_UM umfs11.aglt2.org /atlas/data22{a,b} 20TB 390MB/s 800MB/s   973MB/s 2658MB/s  
MWT2_IU iut2-c001.mwt2.org /dcache/ 1.8TB 55MB/s 17MB/s   790MB/s 670MB/s LSI RAID card
MWT2_IU iut2-c019.mwt2.org /dcache/ 1.5TB 96MB/s 20MB/s   880MB/s 800MB/s 3ware RAID card
OU_OCHEP_SWT2 tier2-01.ochep.ou.edu /ibrix/data/dq2-cache/test 12TB 200MB/s 100MB/s   200MB/s 100MB/s  
OU_OCHEP_SWT2 tier2-02.ochep.ou.edu /ibrix/data/dq2-cache/test 12TB 200MB/s 100MB/s   200MB/s 100MB/s  
NET2 atlas.bu.edu /gpfs1/loadtest 60TB 800MB/s 500MB/s   NA NA i/o from 9 workers to gpfs
WISC higgs02.cs.wisc.edu /atlas/xrootd/test 4TB 77MB/s 82MB/s   NA NA
WISC higgs04.cs.wisc.edu /atlas/xrootd/test 4TB 75MB/s 91MB/s   NA NA
WISC higgs05.cs.wisc.edu /atlas/xrootd/test 4TB 77MB/s 91MB/s   NA NA
WISC higgs06.cs.wisc.edu /atlas/xrootd/test 4TB 75MB/s 95MB/s   NA NA
WISC higgs07.cs.wisc.edu /atlas/xrootd/test 4TB 75MB/s 87MB/s   NA NA
WISC higgs08.cs.wisc.edu /atlas/xrootd/test 4TB 66MB/s 87MB/s   NA NA
WISC glow-s007.cs.wisc.edu /atlas/xrootd/test 4TB 130MB/s 97MB/s   NA NA
WISC glow-s009.cs.wisc.edu /atlas/xrootd/test 4TB 66MB/s 93MB/s   NA NA
WISC glow-s010.cs.wisc.edu /atlas/xrootd/test 4TB 126MB/s 78MB/s   NA NA
WT2 atl-xrdr.slac.stanford.edu /xrootd/atlas/dq2 48TB 500MB/s 410MB/s   NA NA IO on individual Thumper.
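
To compare endpoints against a rate target, the rows above can be read mechanically. The following is only a minimal sketch: it assumes each row is whitespace-separated with no spaces inside a value, that everything after the eighth field is the Notes column, and it uses three rows copied from the table as sample input. The function names and the 200MB/s threshold used in the filter are illustrative choices, not part of the load test framework.

```python
import re

KEYS = ["site", "host", "directory", "size",
        "read_big", "write_big", "read_132mb", "write_132mb"]

def parse_rate(text):
    """Convert a value such as '570MB/s' to a float in MB/s; 'NA' becomes None."""
    match = re.fullmatch(r"([\d.]+)\s*MB/s", text)
    return float(match.group(1)) if match else None

def parse_row(line):
    """Split one table row into a dict; text after the 8th field is the note."""
    parts = line.split(None, 8)
    row = dict(zip(KEYS, parts[:8]))
    for key in KEYS[4:]:
        row[key] = parse_rate(row[key])
    row["notes"] = parts[8].strip() if len(parts) > 8 else ""
    return row

# Three rows copied from the table above.
SAMPLE = """\
BNL dc002.usatlas.bnl.gov /data/ 16TB 570MB/s 523MB/s   NA NA Representative Thumper, RAID60 (3x9+2, 1x10+2)
BNL acas0013.usatlas.bnl.gov /data/test 800GB 88MB/s NA   473MB/s NA 2004 representative read farm node
WISC higgs02.cs.wisc.edu /atlas/xrootd/test 4TB 77MB/s 82MB/s   NA NA
"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]

# Hosts whose ReadBig rate is known and below the illustrative 200 MB/s threshold.
slow_readers = [r["host"] for r in rows
                if r["read_big"] is not None and r["read_big"] < 200]
print(slow_readers)
```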

IO Test Instructions

See here for instructions to benchmark IO performance of local disks.
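
As a rough illustration of the 'time dd' style of test mentioned above, the sketch below sizes a test file to be larger than twice the physical memory and builds the corresponding write and read command lines. It is not the site procedure: the test file path, the 1 MiB block size and the helper names are assumptions made for the example, and only the basic `dd` operands `if=`, `of=`, `bs=` and `count=` are used. Note that writing zeros can overstate performance on storage that compresses data.

```python
import os
import shlex

def physical_memory_bytes():
    """Return installed physical memory in bytes (Linux/Unix sysconf values)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")

def dd_commands(test_file, mem_bytes, factor=2.0, block_mib=1):
    """Build 'time dd' write and read commands for a file larger than factor x RAM."""
    block_bytes = block_mib * 1024 * 1024
    # One block more than factor x RAM, so the file is strictly larger.
    count = int(mem_bytes * factor) // block_bytes + 1
    path = shlex.quote(test_file)
    write_cmd = f"time dd if=/dev/zero of={path} bs={block_mib}M count={count}"
    read_cmd = f"time dd if={path} of=/dev/null bs={block_mib}M"
    return write_cmd, read_cmd

if __name__ == "__main__":
    # Hypothetical test file location; substitute the endpoint directory under test.
    write_cmd, read_cmd = dd_commands("/data/test/ddtest.bin", physical_memory_bytes())
    print(write_cmd)
    print(read_cmd)
```

Run the write command first, then the read command against the same file; `dd` reports the elapsed time and transfer rate itself, and `time` gives an independent wall-clock check.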

Target benchmarks

Establish the following benchmarks for transfers between BNL and Tier-2 centers:
  • Memory-to-memory (iperf): > 950 Mbps (1 Gbps links), > 8000 Mbps (10 Gbps links)
  • Writing (standard "dataset" payload-type): > 110 MB/s sustained over 10 minutes (72 GB) for a 1 Gbps link
  • Reading (standard "dataset" payload-type): > 100 MB/s sustained over 10 minutes for a 1 Gbps link

  • Each site 200MB/s? (or best possible value)
  • 10GE sites 400MB/s?
  • Long-term (24+ hours) of 500MB/sec BNL->Tier-2s?
  • Demonstration of BNL->ALL_Tier-2s at 200MB/s (or best possible) EACH (1GB/sec) for long period?
  • Measurement of “maximum” burst mode bandwidth for each site (20-60 minute period?)
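
The arithmetic behind these targets can be checked with a few lines. This is a minimal sketch using decimal units (1 GB = 1000 MB, 8 bits per byte) and ignoring protocol overhead; the function names are illustrative. It shows that a 1 Gbps link carries at most 125 MB/s, so the 110 MB/s write target is 88% of line rate. It also shows that 110 MB/s for 10 minutes moves 66 GB, whereas 72 GB in 10 minutes corresponds to 120 MB/s, and that 1 GB/s aggregate at 200 MB/s each implies five sites.

```python
def link_capacity_mb_per_s(gbps):
    """Raw line rate of a link in MB/s (decimal units, no protocol overhead)."""
    return gbps * 1000 / 8

def volume_gb(rate_mb_per_s, seconds):
    """Data volume in GB moved at a sustained rate."""
    return rate_mb_per_s * seconds / 1000

def rate_for_volume(gigabytes, seconds):
    """Sustained rate in MB/s needed to move a volume in the given time."""
    return gigabytes * 1000 / seconds

print(link_capacity_mb_per_s(1))            # 125.0 MB/s on a 1 Gbps link
print(110 / link_capacity_mb_per_s(1))      # 0.88 of line rate for the write target
print(volume_gb(110, 10 * 60))              # 66.0 GB in 10 minutes at 110 MB/s
print(rate_for_volume(72, 10 * 60))         # 120.0 MB/s needed for 72 GB in 10 minutes
print(volume_gb(500, 24 * 3600) / 1000)     # 43.2 TB in 24 hours at 500 MB/s
print(1000 / 200)                           # 5.0 sites at 200 MB/s for 1 GB/s aggregate
```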

Goal Status Table

200MB/sec > 2 hrs YES 100 YES 80 Not done Not done YES YES
400MB/sec (10GE) > 2 hrs YES NO YES NO Not done Not done NO NO
500MB/sec (BNL->Multi-Tier2) > 24 hrs YES
1GB/sec BNL->All Tier-2s NO
Max-rate to Tier-2 (30 minute avg)                

  • MWT2-IU IO testing

Update April 7, 2008

  • AGLT2 - upgrading endpoints to run behind SRM, keeping some as stand-alone gridftp servers.
  • MWT2 - a fair amount of trouble with dCache related to pnfs fragility. Backlog of pnfs register requests. Some kind of race condition noticed in the gridftp logs: transfers start and seem to complete, but subsequent requests for the file fail. Hiro suggests splitting the directories up (each directory gets a separate database in pnfs).
  • NET2 - still waiting for the gatekeeper to be set up. Working on device drivers for the 10G NIC and fiber channel. 32 GB RAM, 8 cores. Expect to have it running later this week. (Notify Jay and Hiro.)
  • OU - still waiting for 10G equipment. Expect later this week.
  • WT2 - no report.
  • UTA - no report.
  • BNL - no activity at present. Still need to set up space tokens. Hiro suggests making a new DQ2 site if you want to set up space tokens.

