This page is part of the NetworkMonitoring activity in US ATLAS, Phase 16 (FY11Q2); cf. SiteCertificationP16.

For this quarter there are two primary certification tasks:

  • Getting each site's perfSONAR instances properly configured and updated.
  • Remeasuring the BNL->Site throughput limit via a Loadtest.

perfSONAR Configuration Goal

  • All sites should be running perfSONAR v3.2 and should review and implement the recommendations in Jason Zurawski's perfSONAR maintenance guide.
  • Sites should also consider (re)installing perfSONAR to disk, rather than burning and booting from CD-ROM (see http://psps.perfsonar.net/toolkit/FAQs.html#Q34).
  • The Nagios server at BNL tests each of our USATLAS Tier-1/Tier-2 perfSONAR instances, and its results will be used to determine when a site has met this goal.
  • Each site should have both the Latency and Throughput matrices completely "Green".
  • For example, the matrices as of March 6, 2011 are shown here:



  • Based upon these results, only BNL (row/column 2) has all-green rows and columns in the Latency matrix.
  • A number of other sites are close but have at least one non-green box to resolve (which may not even be their own site's problem). For the Throughput matrix, only AGLT2_UM (row/column 1) qualifies, though again a number of other sites are close.
  • Summarizing: to certify your site in the NetperfSONAR table you need to have all green rows AND columns in both the Latency and Throughput Nagios matrices.
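The certification rule above can be sketched in a few lines of Python. This is only an illustration: the matrix layout (a square list of lists), the status strings, the use of `None` for cells with no test defined (such as a site against itself), and the function names are all assumptions, not the format the BNL Nagios server actually exposes.

```python
# Hypothetical representation: matrix[i][j] is the Nagios status of the
# test from site i to site j. None marks a cell with no test defined
# (assumed here for the diagonal) and is ignored.

def row_and_column_green(matrix, site):
    """True if every defined cell in the site's row and column is green."""
    n = len(matrix)
    row = [matrix[site][j] for j in range(n)]
    column = [matrix[i][site] for i in range(n)]
    return all(cell == "green" for cell in row + column if cell is not None)


def site_certified(latency, throughput, site):
    """A site qualifies only if both matrices are green for its row AND column."""
    return (row_and_column_green(latency, site)
            and row_and_column_green(throughput, site))


# Made-up three-site example (not the March 6, 2011 data).
latency = [
    [None, "green", "red"],
    ["green", None, "green"],
    ["green", "green", None],
]
throughput = [
    [None, "green", "green"],
    ["green", None, "green"],
    ["green", "green", None],
]

for site in range(3):
    print(site, site_certified(latency, throughput, site))
```

In this made-up example the single red box (site 0 to site 2) disqualifies both site 0 and site 2, which mirrors the remark above that a non-green box may not be the affected site's own problem.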

Remeasure Throughput Baseline for each Tier-2

  • Each Tier-2 should contact Hiro and schedule a 1-hour Loadtest.
  • The goal is to achieve the maximum possible throughput from BNL to each site, which indicates the expected upper bound on transfers. Each site listed below should document its test results here. Our goal is an average of 400 MB/sec (for 10GE-connected sites).
  • Once a site has completed the tests and posted the results here, it can check this off on the SiteCertificationP16 table.
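To put the 400 MB/sec goal in context, the short calculation below converts it to a line rate and to the data volume moved in a 1-hour Loadtest. It is a rough sketch that assumes decimal units (1 MB = 10^6 bytes, 1 Gbit = 10^9 bits) and ignores protocol overhead; the constant and function names are illustrative only.

```python
GOAL_MB_PER_SEC = 400      # average throughput goal from the text
LINK_GBIT_PER_SEC = 10     # nominal 10GE link speed
TEST_SECONDS = 3600        # 1-hour Loadtest


def mbytes_to_gbits(mb_per_sec):
    """Convert MB/sec to Gbit/sec (decimal units assumed)."""
    return mb_per_sec * 8 / 1000


def volume_tb(mb_per_sec, seconds):
    """Data moved at a steady rate, in TB (decimal units assumed)."""
    return mb_per_sec * seconds / 1_000_000


goal_gbps = mbytes_to_gbits(GOAL_MB_PER_SEC)
link_fraction = goal_gbps / LINK_GBIT_PER_SEC
hour_volume = volume_tb(GOAL_MB_PER_SEC, TEST_SECONDS)

print(f"{goal_gbps:.1f} Gbit/sec, {link_fraction:.0%} of the link, "
      f"{hour_volume:.2f} TB per hour")
```

Under these assumptions the goal corresponds to about 3.2 Gbit/sec, roughly a third of a 10GE link, and about 1.44 TB transferred over the hour.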


AGLT2 requested 4 sets of throughput tests spanning March 3 through March 4, 2011. The final test results, from Friday, March 4, 2011, are shown here. The first graphic collects a number of Cacti graphs of network and storage node activity, with red arrows marking the start of the loadtest. On the upper-right plot I marked the approximate incoming traffic on our dCache storage nodes; almost all of it is from the loadtest. The average over the last hour is approximately 1 GByte/sec.


The next plot shows Hiro's FTSmon results during the test. We started with 45 concurrent transfers (AGLT2 normally runs 30) and ramped up to 100 by the end of the test. The impact of changing the number of concurrent transfers is also visible in the plots above.


In summary, AGLT2 retest results are 1 GB/sec, completed on March 4, 2011.


This first plot shows the throughput on our link to campus, which includes all traffic in and out of our site:


This second plot shows the throughput of a single s-node:

Single S-node throughput at MWT2_UC from BNL

Finally, this is the BNL report for the MWT2_UC channel:


In summary, MWT2 retest results are 1 GB/sec, completed on March 18, 2011.


This plot shows the throughput from BNL into Indiana.

  • Screen_shot_2011-04-14_at_3.13.11_PM.png:

MWT2 IllinoisHEP

This plot shows the throughput to IllinoisHEP from BNL:





Test performed on April 14, 2011.


(UTA has a 2x1GE limit and OU should have 10GE, so we may want to test each separately.)


The plots were not kept. Throughput was stable at ~450 MB/s (BNL to SLAC).

