r12 - 11 May 2007 - 13:32:03 - JohnDeStefano

Draft Tier-2 Networking Planning and Milestones

During the Tier-2 meeting at UChicago we held a breakout session on Clusters and Networking. The planning, questions, and milestones from this meeting are captured here.

The following list includes the general network-related actions we would like to see happen, in priority order:

Networking questions

  • Need to map out details of existing networking connections.
  • Need to locate the bottlenecks.
  • Need to gather plans for upgrades.
  • Monitoring tools important: which tools to deploy?
  • Important: instructions for tuning and optimizing network use.

LAN

  • WANs much faster these days.
  • Need to understand the LAN details at each site, and confirm that each Tier-2 has the required infrastructure.

WAN status

  • Can MWT2 work with a regional provider to obtain a single subnet block between the sites?

Network configuration considerations

  • Hosts, NIC, processor; LAN, cabling, switches.
  • TCP packet loss - how to measure?
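
One rough way to approach the packet-loss question on a Linux host is to compare TCP retransmitted segments with segments sent, using the kernel counters in /proc/net/snmp. The sketch below is an illustration only: the function names are made up for this example, and retransmissions are a proxy for loss (they also include spurious timeouts), so the ratio is a hint rather than a measurement.

```python
def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp-style text as a dict."""
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) < 2:
        raise ValueError("expected a Tcp header line and a Tcp value line")
    names, values = tcp_lines[0][1:], tcp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def read_snapshot(path="/proc/net/snmp"):
    """Read the current counters (Linux only; the path is the usual default)."""
    with open(path) as handle:
        return parse_tcp_counters(handle.read())


def retransmit_ratio(before, after):
    """Retransmitted segments per segment sent between two snapshots."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0
```

Taking one snapshot before and one after a test transfer, then calling retransmit_ratio on the pair, gives a per-transfer figure that can be compared between hosts or before and after tuning.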

Infrastructure considerations

  • Links too small; congestion, scenic routing; broken equipment; admin restrictions
  • Host problems - cpu utilization, memory limitations, i/o bus speed, disk
  • App problems - chatty protocol

Tuning

  • Receive window size.
  • TCP max buffer size -- Linux 2.6 kernel. Depends on largest round trip time.
  • Q: which kernel should be used for high I/O edge servers?
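
The dependence of the maximum TCP buffer size on the largest round trip time comes from the bandwidth-delay product: a sender must keep roughly bandwidth times RTT bytes in flight to fill the path. The sketch below works this out with integer arithmetic. The site names and RTT values are hypothetical, and the sysctl lines are one plausible way to apply the result on Linux (net.core.rmem_max and net.core.wmem_max), not a recommended configuration.

```python
def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes (bits/s * ms / 8000)."""
    return bandwidth_bps * rtt_ms // 8000


def suggested_max_buffer(bandwidth_bps, rtt_ms_by_site):
    """Size the buffer for the largest RTT among the peer sites."""
    return bdp_bytes(bandwidth_bps, max(rtt_ms_by_site.values()))


def sysctl_lines(max_buffer):
    """Format the result as Linux sysctl settings (assumed target knobs)."""
    return [
        f"net.core.rmem_max = {max_buffer}",
        f"net.core.wmem_max = {max_buffer}",
    ]


# Hypothetical round-trip times in milliseconds, for illustration only.
EXAMPLE_RTTS_MS = {"site_a": 12, "site_b": 35, "site_c": 70}

if __name__ == "__main__":
    size = suggested_max_buffer(1_000_000_000, EXAMPLE_RTTS_MS)  # 1 Gbit/s path
    print("\n".join(sysctl_lines(size)))
```

For a 1 Gbit/s path with a 70 ms round trip this comes to 8,750,000 bytes, far above typical default socket buffer limits, which is why the largest RTT drives the setting.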

Diagnostics

  • NDT (Network Diagnostic Tool) from Internet2, which uses a server with a Web100 kernel. Finds things like duplex mismatches.
  • Use iperf to see achievable bandwidth.
  • Lots of tools. Need to identify most useful ones and deploy on Tier2.
  • Need to examine each piece of software to check for internal buffer and window size limitations, e.g., scp as per Shawn's example.
  • Need to examine the stack of DQ2 tools and Panda pilot software in play.
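
The reason internal buffer and window limits matter is that a single TCP stream cannot move more than one window of data per round trip, whatever the link speed. The sketch below computes that ceiling. The 64 KiB window and 70 ms RTT are hypothetical numbers chosen for illustration, not measured values for scp or any other tool.

```python
def window_limited_bps(window_bytes, rtt_ms):
    """Upper bound on throughput in bits/s: one window per round trip."""
    return window_bytes * 8 * 1000 // rtt_ms


def achievable_bps(link_bps, window_bytes, rtt_ms):
    """The stream is capped by the slower of the link and the window limit."""
    return min(link_bps, window_limited_bps(window_bytes, rtt_ms))


if __name__ == "__main__":
    link = 1_000_000_000  # 1 Gbit/s link (hypothetical)
    for window in (64 * 1024, 16 * 1024 * 1024):
        rate = achievable_bps(link, window, rtt_ms=70)
        print(f"window {window} bytes -> at most {rate / 1e6:.1f} Mbit/s")
```

With these numbers a 64 KiB window caps the stream at about 7.5 Mbit/s on a 1 Gbit/s link, so an application with a small fixed internal buffer stays slow even on a well-tuned host.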

Network Research Projects

  • Will these produce useful components for Tier2 soon?
  • What tools from Ultralight?

Milestones for Tier-2 Networking

  • Two weeks (Sept 1) - Identify hardware to run NDT service at each major resource location and run it (CDROM Image or install) (send email with details to Shawn at smckee@umich.edu)
  • One month (Sept 15) - Create US ATLAS Network page at BNL to aggregate diagrams, site network details, monitoring, etc.
  • One month (Sept 15) - Register/document all NDT services
    • All sites should "register" their NDT info on the common BNL website so that all NDT servers are easily located
    • Each site should add the NDT info to their local pages as well as the "common" info from the BNL site (easy to find each site's NDT servers from all other sites).
  • ~One month (October 1) - produce detailed network diagrams for each Tier-2 site (send to Shawn at smckee@umich.edu)
    • Provide WAN diagram including (Example Tier2 WAN diagram):
      • IP subnet information for EACH site involved
      • Contact info and date on drawing
      • Provide locations and IP info for major switches/routers
      • Show type of connections involved (FastEthernet, Gig, 10Gig, ATM, etc.)
    • List "typical" path from compute node (CE) to site egress
    • List "typical" path from storage (SE) to site egress (could be same as CE above)
    • Include switch/router details, primary uplink port information
  • Two months (October 15) - Document current network performance between Tier-2 sites
    • Memory to memory performance (Iperf or pathchar)
      • Out of the box (current config)
      • Tuned and optimized
    • Disk to disk (bbcp or gridftp)
      • Out of the box (current config)
      • Tuned and optimized
  • Two months (October 30) - Deploy initial Tier-2 network monitoring
    • MonALISA (add US ATLAS Networking ML "group", determine viability to "share" ML install from OSG)
    • IEPM "client" (Must confirm with SLAC/Les/Connie, determine functionality we can expect)
    • PerfSONAR measurement point (Work with PerfSONAR group; help in installing?)
  • Three months (???) - Packaging and deployment of network related tools at Tier-2s (Iperf, Thrulay, etc.)
  • Four months (???) - Demonstrate WAN disk-to-disk transfers utilizing 90% of site's bottleneck bandwidth between two Tier-2 sites
    • Document tuning and optimization at both sites necessary to achieve this
  • Six months (???) - Deploy "beta" end-host agent (LISA or descendant) on selected edge servers
  • Six months (???) - Update/tune network monitoring system(s). Review usefulness of various components.
  • Nine months (???) - Review Tier-2 site network status. Review tool usefulness. Plan for upgrades. Update maps and tools.
  • Ongoing (as required) - Integrate, test and deploy network research "products" at our sites (QoS, light-path management, etc.)
  • Ongoing - Update site network details as they change. Document network and changes. Provide site info pages to "harvest" and organize monitoring and measuring information.
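
The four-month milestone (disk-to-disk transfers using 90% of the bottleneck bandwidth) can be checked with simple arithmetic once a transfer has been timed. The sketch below shows one way to do that. The transfer size, duration, and 1 Gbit/s bottleneck are hypothetical, and the function names are made up for this example.

```python
def utilization(bytes_moved, elapsed_s, bottleneck_bps):
    """Fraction of the bottleneck bandwidth a timed transfer achieved."""
    achieved_bps = bytes_moved * 8 / elapsed_s
    return achieved_bps / bottleneck_bps


def meets_target(bytes_moved, elapsed_s, bottleneck_bps, target=0.90):
    """True when the transfer reached the target fraction of the bottleneck."""
    return utilization(bytes_moved, elapsed_s, bottleneck_bps) >= target


if __name__ == "__main__":
    moved = 100 * 10**9          # 100 GB transferred (hypothetical)
    bottleneck = 1_000_000_000   # 1 Gbit/s bottleneck link (hypothetical)
    for seconds in (900, 880):
        frac = utilization(moved, seconds, bottleneck)
        print(f"{seconds} s: {frac:.1%}, target met: {meets_target(moved, seconds, bottleneck)}")
```

Recording the same figure for the out-of-the-box and tuned configurations gives a direct before-and-after comparison for the two-month performance documentation as well.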

US ATLAS Tier-2 Networking Information for Tier-2 Meeting

There are a number of areas in networking that Tier-2 centers should be aware of:

  • Mapping existing networking infrastructure
  • Host related details and tuning
  • Monitoring and network measurements
  • Diagnostics and debugging

My talk will cover a bit about all of these topics. We can flesh out action items and milestone planning during the meeting.

Network Talks (Diagnosis, Bottleneck Determination)

Below are references of interest covering various aspects of networking relevant to the Tier-2 sites.

A link for Les Cottrell's talk on Diagnosing Network problems (for non-networkers) is at:
http://www.slac.stanford.edu/grp/scs/net/talk05/nfnn2-jun05.ppt

Rich Carlson has a nice set of presentations on network bottleneck determination at:
http://people.internet2.edu/~rcarlson/presentations/

Russ Hobby (Internet2) pointed out an interesting talk on networks to support moving football video from the Internet2 Spring 2006 meeting: http://www.internet2.edu/presentations/spring06/20060425-pac10-thomas.ppt

Network Research Relevant to Tier-2

UltraLight: http://www.ultralight.org
Terapaths: http://www.atlasgrid.bnl.gov/terapaths
Lambda Station: http://www.lambdastation.org
OSCARS: http://www.es.net/oscars/index.html
UltraScienceNet: http://www.csm.ornl.gov
HOPI: http://networks.internet2.edu/hopi
Web100: http://www.web100.org

HENP Network Related URLs

HENP Internet2 Sponsored Interest Group: http://henp.internet2.edu
International Committee on Future Accelerators - Standing Committee on Interregional Connectivity: http://icfa-scic.web.cern.ch/ICFA-SCIC/
The LHC Optical Private Network (LHC-OPN): http://lhcopn.web.cern.ch/lhcopn/
DISUN (Data Intensive Science University Network): http://www.disun.org

Network Tuning/Optimization URLs

PSC Tuning Page: http://www.psc.edu/networking/projects/tcptune
LBL Tuning Page: http://www-didc.lbl.gov/TCP-tuning/TCP-tuning.html

Network Tools

The Internet2 End-to-End Initiative tracks lots of tools: http://e2epi.internet2.edu/
MonALISA and LISA: http://monalisa.cern.ch
NDT: http://e2epi.internet2.edu/ndt/
NDT Test site: http://miranda.ctd.anl.gov:7123
There is a comprehensive list of network monitoring tools at: http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html

Network Examples and Case Studies

Some network case studies are at:
http://www.slac.stanford.edu/grp/scs/net/case/html/

SLAC has a web page that is supposed to tell users what to do in case of WAN problems. It is at:
http://www.slac.stanford.edu/comp/net/problem-reporting.html

Network Backbones of Relevance

USLHCNet: http://www.uslhcnet.org/
Abilene: http://abilene.internet2.edu/
ESNet: http://es.net
GEANT: http://www.geant.net/
NLR: http://www.nlr.net

-- ShawnMckee - 08 May 2006

-- JohnDeStefano - 11 May 2007: Fixed broken links to WAN status images.

Attachments


jpg ATLAS-BNL-300copy.jpg (1454.5K) | ShawnMckee, 17 Oct 2006 - 09:55 | Example WAN Tier2 diagram
jpg southWest.jpg (50.2K) | DantongYu, 17 Oct 2006 - 09:55 | SouthWest Tier 2 Clusters
jpg NorthEastTier2Cluster.jpg (31.4K) | DantongYu, 17 Oct 2006 - 09:55 | NorthEast Tier 2
jpg ATLAS-BNL-300copy.JPG (617.7K) | ShawnMckee, 17 Oct 2006 - 09:55 | Example WAN Tier2 diagram
 