[root@uct2-dc1 ~]# tracepath dct00.usatlas.bnl.gov
1: uct2-dc1.uchicago.edu (128.135.158.238) 0.083ms pmtu 9000
1: v624router.uchicago.edu (128.135.158.193) 0.991ms
2: MREN-IWIRE-10G-router.uchicago.edu (128.135.247.122) 14.105ms
3: chi-gev124-mren.es.net (198.125.140.93) 1.752ms
4: chislsdn1-chislmr1.es.net (134.55.219.25) asymm 5 1.344ms
5: chiccr1-chislsdn1.es.net (134.55.207.33) asymm 6 1.495ms
6: aofacr1-chicsdn1.es.net (134.55.218.94) asymm 7 28.348ms
7: bnlmr1-aoacr1.es.net (134.55.217.57) 30.755ms
8: bnlsite-bnlmr1.es.net (198.124.216.178) 30.828ms
9: bnlsite-bnlmr1.es.net (198.124.216.178) asymm 8 30.909ms pmtu 1500
10: dct00.usatlas.bnl.gov (192.12.15.8) asymm 9 30.030ms reached
Resume: pmtu 1500 hops 10 back 9
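The tracepath output above puts the round-trip time to dct00 at roughly 30 ms. That figure matters for TCP buffer sizing, because a single stream can move at most one window of data per round trip. The following is a minimal sketch of that arithmetic, not part of the original tuning session. The function names are illustrative, the 30 ms value is rounded from the tracepath output, and the model ignores packet loss, congestion control, and kernel buffer overhead.

```python
# Sketch: relate TCP window size, round-trip time, and throughput.
# Function names are illustrative; they do not come from any tool on this page.

def bdp_bytes(rate_bps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return rate_bps * rtt_ms / 1000 / 8


def window_limited_bps(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on single-stream throughput: one window per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms


if __name__ == "__main__":
    rtt_ms = 30  # assumption: rounded from the tracepath output above

    # Buffer needed to keep a 1 Gbit/s or 10 Gbit/s path full at this RTT.
    for rate_bps in (1e9, 10e9):
        needed_mb = bdp_bytes(rate_bps, rtt_ms) / 1e6
        print(f"{rate_bps / 1e9:.0f} Gbit/s needs ~{needed_mb:.2f} MB in flight")

    # Ceiling for the 256 KByte window that the first iperf run below reports.
    window = 256 * 1024
    ceiling_mbps = window_limited_bps(window, rtt_ms) / 1e6
    print(f"256 KByte window caps at ~{ceiling_mbps:.1f} Mbit/s")
```

With these inputs, filling a 1 Gbit/s path takes about 3.75 MB in flight, while a 256 KByte window tops out near 70 Mbit/s, the same order of magnitude as the first iperf result further down.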
Configuration of the 10G Chelsio (cxgb) NIC we're using:
[root@uct2-dc1 ~]# ifconfig eth3
eth3 Link encap:Ethernet HWaddr 00:07:43:01:18:D0
inet addr:128.135.158.238 Bcast:128.135.158.255 Mask:255.255.255.192
inet6 addr: fe80::207:43ff:fe01:18d0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:3485382813 errors:0 dropped:0 overruns:0 frame:0
TX packets:1677980538 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2582111292 (2.4 GiB) TX bytes:2794471488 (2.6 GiB)
Interrupt:217 Memory:fc4ff000-fc4fffff
ethtool reports for the NIC:
[root@uct2-dc1 ~]# ethtool -k eth3
Offload parameters for eth3:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
[root@uct2-dc1 ~]# ethtool -i eth3
driver: cxgb
version: 2.1.4a
firmware-version: N/A
bus-info: 0000:01:03.0
[root@uct2-dc1 ~]# ethtool -g eth3
Ring parameters for eth3:
Pre-set maximums:
RX: 16384
RX Mini: 0
RX Jumbo: 16384
TX: 16384
Current hardware settings:
RX: 4096
RX Mini: 0
RX Jumbo: 512
TX: 1024
First test:
[root@uct2-dc1 ~]# iperf -c dct00.usatlas.bnl.gov -w4M -i2 -t60
------------------------------------------------------------
Client connecting to dct00.usatlas.bnl.gov, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 4.00 MByte)
------------------------------------------------------------
[ 3] local 128.135.158.238 port 34433 connected with 192.12.15.8 port 5001
[ 3] 0.0- 2.0 sec 12.0 MBytes 50.2 Mbits/sec
[ 3] 2.0- 4.0 sec 13.4 MBytes 56.0 Mbits/sec
[ 3] 4.0- 6.0 sec 13.4 MBytes 56.0 Mbits/sec
[ 3] 6.0- 8.0 sec 13.5 MBytes 56.7 Mbits/sec
[ 3] 8.0-10.0 sec 13.4 MBytes 56.3 Mbits/sec
[ 3] 10.0-12.0 sec 13.5 MBytes 56.6 Mbits/sec
[ 3] 12.0-14.0 sec 13.3 MBytes 55.9 Mbits/sec
[ 3] 14.0-16.0 sec 13.2 MBytes 55.5 Mbits/sec
[ 3] 16.0-18.0 sec 13.4 MBytes 56.0 Mbits/sec
[ 3] 18.0-20.0 sec 13.3 MBytes 55.8 Mbits/sec
[ 3] 20.0-22.0 sec 13.6 MBytes 57.0 Mbits/sec
[ 3] 22.0-24.0 sec 13.6 MBytes 57.0 Mbits/sec
[ 3] 24.0-26.0 sec 13.4 MBytes 56.2 Mbits/sec
[ 3] 26.0-28.0 sec 13.4 MBytes 56.0 Mbits/sec
[ 3] 28.0-30.0 sec 13.5 MBytes 56.5 Mbits/sec
[ 3] 30.0-32.0 sec 13.2 MBytes 55.5 Mbits/sec
[ 3] 32.0-34.0 sec 13.3 MBytes 55.8 Mbits/sec
[ 3] 34.0-36.0 sec 13.6 MBytes 57.0 Mbits/sec
[ 3] 36.0-38.0 sec 13.5 MBytes 56.7 Mbits/sec
[ 3] 38.0-40.0 sec 13.4 MBytes 56.1 Mbits/sec
[ 3] 40.0-42.0 sec 13.6 MBytes 57.2 Mbits/sec
[ 3] 42.0-44.0 sec 13.8 MBytes 57.9 Mbits/sec
[ 3] 44.0-46.0 sec 13.7 MBytes 57.4 Mbits/sec
[ 3] 46.0-48.0 sec 13.9 MBytes 58.3 Mbits/sec
[ 3] 48.0-50.0 sec 13.8 MBytes 58.1 Mbits/sec
[ 3] 50.0-52.0 sec 13.6 MBytes 57.2 Mbits/sec
[ 3] 52.0-54.0 sec 13.8 MBytes 58.1 Mbits/sec
[ 3] 54.0-56.0 sec 14.2 MBytes 59.6 Mbits/sec
[ 3] 56.0-58.0 sec 13.8 MBytes 57.8 Mbits/sec
[ 3] 58.0-60.0 sec 13.5 MBytes 56.5 Mbits/sec
[ 3] 0.0-60.0 sec 405 MBytes 56.5 Mbits/sec
Looks like a window size problem. Taking a look at our TCP kernel parameters:
[root@uct2-dc1 ~]# sysctl -a | grep tcp | grep mem
net.ipv4.tcp_rmem = 4096 87380 174760
net.ipv4.tcp_wmem = 4096 16384 131072
net.ipv4.tcp_mem = 786432 1048576 1572864
Changing our window size via editing /etc/sysctl.conf:
# Tuning with Shawn McKee 2007-09-21
net.ipv4.tcp_rmem = 4096 87380 20000000
net.ipv4.tcp_wmem = 4096 87380 20000000
# maximum receive socket buffer size, default 131071
net.core.rmem_max = 20000000
# maximum send socket buffer size, default 131071
net.core.wmem_max = 20000000
...and enable those changes with sysctl -p. Rerunning the test:
[root@uct2-dc1 ~]# iperf -c dct00.usatlas.bnl.gov -w4M -i2 -t60
------------------------------------------------------------
Client connecting to dct00.usatlas.bnl.gov, TCP port 5001
TCP window size: 8.00 MByte (WARNING: requested 4.00 MByte)
------------------------------------------------------------
[ 3] local 128.135.158.238 port 34814 connected with 192.12.15.8 port 5001
[ 3] 0.0- 2.0 sec 152 MBytes 636 Mbits/sec
[ 3] 2.0- 4.0 sec 221 MBytes 925 Mbits/sec
[ 3] 4.0- 6.0 sec 85.3 MBytes 358 Mbits/sec
[ 3] 6.0- 8.0 sec 194 MBytes 814 Mbits/sec
[ 3] 8.0-10.0 sec 141 MBytes 591 Mbits/sec
[ 3] 10.0-12.0 sec 223 MBytes 935 Mbits/sec
[ 3] 12.0-14.0 sec 225 MBytes 945 Mbits/sec
[ 3] 14.0-16.0 sec 223 MBytes 936 Mbits/sec
[ 3] 16.0-18.0 sec 119 MBytes 499 Mbits/sec
[ 3] 18.0-20.0 sec 211 MBytes 884 Mbits/sec
[ 3] 20.0-22.0 sec 158 MBytes 661 Mbits/sec
[ 3] 22.0-24.0 sec 208 MBytes 871 Mbits/sec
[ 3] 24.0-26.0 sec 225 MBytes 945 Mbits/sec
[ 3] 26.0-28.0 sec 224 MBytes 939 Mbits/sec
[ 3] 28.0-30.0 sec 166 MBytes 697 Mbits/sec
[ 3] 30.0-32.0 sec 167 MBytes 699 Mbits/sec
[ 3] 32.0-34.0 sec 224 MBytes 939 Mbits/sec
[ 3] 34.0-36.0 sec 224 MBytes 941 Mbits/sec
[ 3] 36.0-38.0 sec 142 MBytes 594 Mbits/sec
[ 3] 38.0-40.0 sec 223 MBytes 935 Mbits/sec
[ 3] 40.0-42.0 sec 225 MBytes 944 Mbits/sec
[ 3] 42.0-44.0 sec 225 MBytes 944 Mbits/sec
[ 3] 44.0-46.0 sec 111 MBytes 466 Mbits/sec
[ 3] 46.0-48.0 sec 224 MBytes 938 Mbits/sec
[ 3] 48.0-50.0 sec 85.2 MBytes 357 Mbits/sec
[ 3] 50.0-52.0 sec 192 MBytes 806 Mbits/sec
[ 3] 52.0-54.0 sec 195 MBytes 817 Mbits/sec
[ 3] 54.0-56.0 sec 197 MBytes 827 Mbits/sec
[ 3] 56.0-58.0 sec 222 MBytes 930 Mbits/sec
[ 3] 58.0-60.0 sec 89.2 MBytes 374 Mbits/sec
[ 3] 0.0-60.1 sec 5.39 GBytes 770 Mbits/sec
A diff of ethtool statistics for the NIC taken before and after the adjustments:
[root@uct2-dc1 ~]# diff after_tuning_00_eth3_stats.log initial_eth3_stats.log
2c2
< TxOctetsOK: 812456491889
---
> TxOctetsOK: 805960749892
4c4
< TxUnicastFramesOK: 1683359034
---
> TxUnicastFramesOK: 1678109675
6c6
< TxBroadcastFramesOK: 708
---
> TxBroadcastFramesOK: 706
19c19
< RxOctetsOK: 4721266899279
---
> RxOctetsOK: 4718950957998
21,23c21,23
< RxUnicastFramesOK: 3486758917
< RxMulticastFramesOK: 1081152
< RxBroadcastFramesOK: 746883
---
> RxUnicastFramesOK: 3483897061
> RxMulticastFramesOK: 1079493
> RxBroadcastFramesOK: 746241
38c38
< TSO: 39174502
---
> TSO: 39151913
41,42c41,42
< RxCsumGood: 3486755677
< TxCsumOffload: 321193185
---
> RxCsumGood: 3483893866
> TxCsumOffload: 317506335
52,54c52,54
< tx_reg_pkts: 1494008784
< tx_lso_pkts: 39174502
< tx_do_cksum: 321193161
---
> tx_reg_pkts: 1489378895
> tx_lso_pkts: 39151913
> tx_do_cksum: 317506311
From BNL:
[root@uct2-dc1 ~]# iperf -s -w4M -i5
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 8.00 MByte (WARNING: requested 4.00 MByte)
------------------------------------------------------------
[ 4] local 128.135.158.238 port 5001 connected with 192.12.15.8 port 40401
[ 4] 0.0- 5.0 sec 360 MBytes 604 Mbits/sec
[ 4] 5.0-10.0 sec 561 MBytes 942 Mbits/sec
[ 4] 10.0-15.0 sec 561 MBytes 942 Mbits/sec
[ 4] 15.0-20.0 sec 561 MBytes 942 Mbits/sec
[ 4] 20.0-25.0 sec 561 MBytes 942 Mbits/sec
[ 4] 25.0-30.0 sec 561 MBytes 941 Mbits/sec
[ 4] 30.0-35.0 sec 561 MBytes 942 Mbits/sec
[ 4] 35.0-40.0 sec 561 MBytes 942 Mbits/sec
[ 4] 40.0-45.0 sec 561 MBytes 942 Mbits/sec
[ 4] 45.0-50.0 sec 561 MBytes 941 Mbits/sec
[ 4] 50.0-55.0 sec 561 MBytes 941 Mbits/sec
[ 4] 55.0-60.0 sec 561 MBytes 942 Mbits/sec
[ 4] 0.0-60.0 sec 6.38 GBytes 913 Mbits/sec
-- Main.jau - 21 Sep 2007