Draft Tier-2 Networking Planning and Milestones
During the Tier-2 meeting at UChicago we held a breakout session on Clusters and Networking. The planning, questions, and milestones from this meeting are captured here.
The following list gives the general network-related actions we would like to see happen, in priority order:
Networking questions
- Need to map out details of existing networking connections.
- Need to locate the bottlenecks.
- Need to gather plans for upgrades.
- Monitoring tools important: which tools to deploy?
- Important: instructions for tuning and optimizing network use.
LAN
- WANs are much faster these days.
- Need to understand the details here and confirm that each Tier-2 has the required infrastructure.
WAN status
- Can MWT2 work with a regional provider to obtain a single subnet block between the sites?
Network configuration considerations
- Hosts, NIC, processor; LAN, cabling, switches.
- TCP packet loss - how to measure?
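One rough way to approach the packet loss question on a Linux host is to watch the kernel's TCP retransmission counters. The sketch below parses the text of /proc/net/snmp and compares two snapshots. It rests on two assumptions that are not from the meeting: retransmitted segments are only a proxy for loss (spurious retransmits inflate the number), and the counters are host-wide rather than per path. The function names are illustrative.

```python
def tcp_counters(snmp_text):
    """Parse the two 'Tcp:' lines (names, then values) of /proc/net/snmp."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp: header/value pair found")
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def retransmit_ratio(before, after):
    """Fraction of segments sent between two snapshots that were retransmits."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0
```

On a test host, take one snapshot with `tcp_counters(open("/proc/net/snmp").read())`, run a transfer, take a second snapshot, and pass both to `retransmit_ratio`. Keeping other traffic off the host during the test makes the ratio easier to interpret.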
Infrastructure considerations
- Links too small; congestion, scenic routing; broken equipment; admin restrictions
- Host problems - CPU utilization, memory limitations, I/O bus speed, disk
- App problems - chatty protocols
Tuning
- Receive window size.
- TCP maximum buffer size (Linux 2.6 kernel). The required value depends on the largest round-trip time.
- Q: which kernel should be used for high I/O edge servers?
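The dependence of the maximum buffer size on round-trip time is usually expressed as the bandwidth-delay product: the buffer must hold at least bandwidth times RTT to keep the path full. A minimal sketch follows; the 1 Gbit/s bottleneck and the list of RTTs are hypothetical numbers chosen for illustration, not measurements from any Tier-2 site.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes (integer inputs give an integer)."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def max_buffer_bytes(bandwidth_bits_per_s, rtts_ms):
    """Buffer needed to fill the link on the longest (largest RTT) path."""
    return bdp_bytes(bandwidth_bits_per_s, max(rtts_ms))


def sysctl_lines(buffer_bytes):
    """Format the Linux socket-buffer ceilings for a given size."""
    return [
        f"net.core.rmem_max = {buffer_bytes}",
        f"net.core.wmem_max = {buffer_bytes}",
    ]


# Hypothetical example: 1 Gbit/s bottleneck, RTTs of 12, 45 and 70 ms.
size = max_buffer_bytes(1_000_000_000, [12, 45, 70])
print(size)  # 8750000
for line in sysctl_lines(size):
    print(line)
```

The maximum (third) value of net.ipv4.tcp_rmem and net.ipv4.tcp_wmem usually needs to be raised to the same size, since those settings bound the kernel's per-connection TCP buffers.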
Diagnostics
- NDT (Network Diagnostic Tool) from Internet2 runs on a server with a Web100 kernel. It finds problems such as duplex mismatches.
- Use iperf to see achievable bandwidth.
- Lots of tools exist. Need to identify the most useful ones and deploy them at the Tier-2 sites.
- Need to examine each piece of software to check for internal buffer and window size limitations, e.g., scp as per Shawn's example.
- Need to examine the stack of DQ2 tools and Panda pilot software in play.
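When checking software for internal buffer or window limits, it helps to know the throughput ceiling such a limit implies: a sender can have at most one window of data in flight per round trip, so throughput cannot exceed window / RTT regardless of link speed. The sketch below computes that ceiling. The 64 KiB window and 70 ms RTT are hypothetical values for illustration and are not taken from any particular tool.

```python
def window_limited_mbps(window_bytes, rtt_ms):
    """Upper bound on throughput (Mbit/s) for a fixed window and RTT."""
    bits_per_round_trip = window_bytes * 8
    round_trips_per_s = 1000 / rtt_ms
    return bits_per_round_trip * round_trips_per_s / 1e6


# Hypothetical example: an application with a fixed 64 KiB internal window
# on a 70 ms path is capped near 7.5 Mbit/s, even over a 10 Gbit/s link.
print(round(window_limited_mbps(64 * 1024, 70), 2))
```

If a measured transfer rate sits close to this ceiling for a plausible window size, the application's internal buffering is a likely suspect and host or network tuning alone will not help.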
Network Research Projects
- Will these produce useful components for Tier-2 sites soon?
- Which tools from UltraLight?
Milestones for Tier-2 Networking
- Two weeks (Sept 1) - Identify hardware to run NDT service at each major resource location and run it (CDROM Image or install) (send email with details to Shawn at smckee@umich.edu)
- One month (Sept 15) - Create US ATLAS Network page at BNL to aggregate diagrams, site network details, monitoring, etc.
- One month (Sept 15) - Register/document all NDT services
  - All sites should "register" their NDT info on the common BNL website so that all NDT servers are easily located
  - Each site should add the NDT info to their local pages as well as the "common" info from the BNL site (easy to find each site's NDT servers from all other sites).
- ~One month (October 1) - produce detailed network diagrams for each Tier-2 site (send to Shawn at smckee@umich.edu)
  - Provide WAN diagram including (Example Tier2 WAN diagram):
    - IP subnet information for EACH site involved
    - Contact info and date on drawing
    - Provide locations and IP info for major switches/routers
    - Show type of connections involved (FastEthernet, Gig, 10Gig, ATM, etc.)
    - List "typical" path from compute node (CE) to site egress
    - List "typical" path from storage (SE) to site egress (could be same as CE above)
    - Include switch/router details, primary uplink port information
- Two months (October 15) - Document current network performance between Tier-2 sites
  - Memory to memory performance (Iperf or pathchar)
    - Out of the box (current config)
    - Tuned and optimized
  - Disk to disk (bbcp or gridftp)
    - Out of the box (current config)
    - Tuned and optimized
- Two months (October 30) - Deploy initial Tier-2 network monitoring
  - MonALISA (add US ATLAS Networking ML "group", determine viability to "share" ML install from OSG)
  - IEPM "client" (Must confirm with SLAC/Les/Connie, determine functionality we can expect)
  - PerfSONAR measurement point (Work with the PerfSONAR group; can they help with installing?)
- Three months (???) - Packaging and deployment of network-related tools at Tier-2s (Iperf, Thrulay, etc.)
- Four months (???) - Demonstrate WAN disk-to-disk transfers utilizing 90% of site's bottleneck bandwidth between two Tier-2 sites
  - Document tuning and optimization at both sites necessary to achieve this
- Six months (???) - Deploy "beta" end-host agent (LISA or descendant) on selected edge servers
- Six months (???) - Update/tune network monitoring system(s). Review usefulness of various components.
- Nine months (???) - Review Tier-2 site network status. Review tool usefulness. Plan for upgrades. Update maps and tools.
- Ongoing (as required) - Integrate, test and deploy network research "products" at our sites (QoS, light-path management, etc.)
- Ongoing - Update site network details as they change. Document network and changes. Provide site info pages to "harvest" and organize monitoring and measuring information.
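The four-month milestone above sets a 90% utilization target for disk-to-disk transfers. One way to check a transfer against it is to compare the achieved rate with the bottleneck bandwidth, as sketched below. The transfer size, duration, and 1 Gbit/s bottleneck are hypothetical, and the helper names are illustrative.

```python
def utilization(bytes_transferred, seconds, bottleneck_bits_per_s):
    """Achieved rate as a fraction of the bottleneck bandwidth."""
    achieved_bits_per_s = bytes_transferred * 8 / seconds
    return achieved_bits_per_s / bottleneck_bits_per_s


def meets_target(bytes_transferred, seconds, bottleneck_bits_per_s,
                 target=0.90):
    """True if the transfer reached the target share of the bottleneck."""
    return utilization(bytes_transferred, seconds,
                       bottleneck_bits_per_s) >= target


# Hypothetical example: 100 GB moved over a 1 Gbit/s bottleneck.
GB_100 = 100_000_000_000
print(meets_target(GB_100, 880, 1_000_000_000))  # True  (about 91%)
print(meets_target(GB_100, 900, 1_000_000_000))  # False (about 89%)
```

Recording the same figure for the "out of the box" and "tuned and optimized" runs gives a direct before-and-after comparison for each pair of sites.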
US ATLAS Tier-2 Networking Information for Tier-2 Meeting
There are a number of areas in networking that Tier-2 centers should be aware of:
- Mapping existing networking infrastructure
- Host related details and tuning
- Monitoring and network measurements
- Diagnostics and debugging
My talk will cover a bit about all of these topics. We can flesh out action items and milestone planning during the meeting.
Network Talks (Diagnosis, Bottleneck Determination)
Below are references of interest covering various aspects of networking relevant to the Tier-2 sites.
A link for Les Cottrell's talk on Diagnosing Network problems (for non-networkers) is at:
http://www.slac.stanford.edu/grp/scs/net/talk05/nfnn2-jun05.ppt
Rich Carlson has a nice set of presentations on network bottleneck determination at:
http://people.internet2.edu/~rcarlson/presentations/
Russ Hobby (Internet2) pointed out an interesting talk from the Internet2 Spring 2006 meeting on networks to support moving football video:
http://www.internet2.edu/presentations/spring06/20060425-pac10-thomas.ppt
Network Research Relevant to Tier-2
UltraLight:
http://www.ultralight.org
Terapaths:
http://www.atlasgrid.bnl.gov/terapaths
Lambda Station:
http://www.lambdastation.org
OSCARS:
http://www.es.net/oscars/index.html
UltraScienceNet:
http://www.csm.ornl.gov
HOPI:
http://networks.internet2.edu/hopi
Web100:
http://www.web100.org
HENP Network Related URLs
HENP Internet2 Sponsored Interest Group:
http://henp.internet2.edu
International Committee on Future Accelerators - Standing Committee on Interregional Connectivity:
http://icfa-scic.web.cern.ch/ICFA-SCIC/
The LHC Optical Private Network (LHC-OPN):
http://lhcopn.web.cern.ch/lhcopn/
DISUN (Data Intensive Science University Network):
http://www.disun.org
Network Tuning/Optimization URLs
PSC Tuning Page:
http://www.psc.edu/networking/projects/tcptune
LBL Tuning Page:
http://www-didc.lbl.gov/TCP-tuning/TCP-tuning.html
Network Tools
The Internet2 End-to-End Initiative tracks lots of tools:
http://e2epi.internet2.edu/
MonALISA and LISA:
http://monalisa.cern.ch
NDT:
http://e2epi.internet2.edu/ndt/
NDT test site:
http://miranda.ctd.anl.gov:7123
There is a comprehensive list of network monitoring tools at:
http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html
Network Examples and Case Studies
Some network case studies are at:
http://www.slac.stanford.edu/grp/scs/net/case/html/
SLAC has a web page that is supposed to tell users what to do in case of WAN problems. It is at:
http://www.slac.stanford.edu/comp/net/problem-reporting.html
Network Backbones of Relevance
USLHCNet:
http://www.uslhcnet.org/
Abilene:
http://abilene.internet2.edu/
ESNet:
http://es.net
GEANT:
http://www.geant.net/
NLR:
http://www.nlr.net
-- ShawnMckee - 08 May 2006
-- JohnDeStefano - 11 May 2007: Fixed broken links to WAN status images.
Attachments
ATLAS-BNL-300copy.jpg (1454.5K) | ShawnMckee, 17 Oct 2006 - 09:55 | Example WAN Tier2 diagram
southWest.jpg (50.2K) | DantongYu, 17 Oct 2006 - 09:55 | SouthWest Tier 2 Clusters
NorthEastTier2Cluster.jpg (31.4K) | DantongYu, 17 Oct 2006 - 09:55 | NorthEast Tier 2
ATLAS-BNL-300copy.JPG (617.7K) | ShawnMckee, 17 Oct 2006 - 09:55 | Example WAN Tier2 diagram