r21 - 07 Nov 2014 - 14:11:10 - HorstSeverini

perfSONAR for LHC Use

This page documents tools and tips regarding perfSONAR, perfSONAR-PS and perfSONAR-MDM use for LHC.

Installation or Upgrade to v3.4

Please see notes at https://twiki.opensciencegrid.org/bin/view/Documentation/InstallUpdatePS

Current Issues for v3.3.2

Since the release of 3.3.2 on February 3, 2014 we have found a few issues that may need addressing for particular sites.

  • The perfSONARBuoy MA logs some of its debugging output at the INFO level; a patch (psb_logging.patch) changes that output to actual DEBUG output. You can apply it by doing:
             cd /opt/perfsonar_ps/perfsonarbuoy_ma
             patch -p2 -i /path/to/psb_logging.patch
       If you changed the logging level, you can change it back to INFO and restart the service. That should turn the debugging output there into actual DEBUG output instead of INFO.
  • The PingER tests seem to trigger DNS lookup issues for particular sites. The error generates emails to host and mesh administrators, as shown below. This is under investigation and we are waiting for a solution from the developers.
Mesh Error:
  Mesh: WLCG sites
  Host: ps1.ochep.ou.edu
  Error: Problem adding PingER tests: Problem adding test Ping Test Between WLCG Latency Hosts: Problem looking up address: ps-latency.atlas.unimelb.edu.au at /opt/perfsonar_ps/mesh_config/bin/../lib/perfSONAR_PS/MeshConfig/Generators/PingER.pm line 139.
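To diagnose this class of error locally, you can check both forward and reverse DNS for the endpoint named in the error. A sketch, assuming the standard BIND `host` utility is installed; the hostname is taken from the example error above:

```shell
# Check forward DNS for the endpoint named in the mesh error above.
HOST=ps-latency.atlas.unimelb.edu.au
host "$HOST" || echo "forward lookup FAILED for $HOST"

# Check reverse DNS for each address the forward lookup returned
# (the awk filter pulls the IPs out of host's "has address" lines).
for ip in $(host "$HOST" | awk '/has address/ {print $NF}'); do
    host "$ip" || echo "reverse lookup FAILED for $ip"
done
```

If the forward or reverse lookup fails from your perfSONAR host but succeeds elsewhere, the problem is likely local resolver configuration rather than the remote site.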
  • The performance of the Latency instances can decrease over time, due to the large amount of data in the OWAMP MySQL database. To address this, Dave Lesny/UIUC found that running mysqlcheck -o -A on the MySQL databases can significantly improve performance. Sites may want to run this if they are experiencing slow plotting times for OWAMP results on their hosts.
  • There was an issue with MySQL on CentOS that could leave a stale socket file on the system at /var/lib/mysql/mysql.sock . Rebooting the system or trying to restart MySQL won't fix the issue. Sites will see their PerfSONARBuoy_MA on port 8085 reported as "down", which can generate "UNKNOWN" (orange) results on the MaDDash dashboard. To fix this, site administrators need (as 'root') to stop MySQL, verify no MySQL processes are left running, delete the /var/lib/mysql/mysql.sock socket file if it exists, then restart MySQL and verify it starts OK. The relevant commands:
    • /sbin/service  mysqld  stop
    • ps auxww  |  grep mysql (To verify no processes are still running)
    • rm  -f /var/lib/mysql/mysql.sock
    • /sbin/service  mysqld  start
  • The MySQL issue above should be addressed by an update in the YUM repo. Sites should ensure they are up-to-date via 'yum update' even if they recently installed 3.3.2. If updates are installed, it is recommended to reboot the node to make sure they take effect.
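The socket clean-up steps above can be combined into one script. This is a sketch, to be run as root; the '[m]ysql' grep pattern keeps the grep process itself out of the match:

```shell
#!/bin/sh
# Recover from a stale MySQL socket file (run as root).
/sbin/service mysqld stop

# Abort if any mysqld processes survived the stop.
if ps auxww | grep '[m]ysql' > /dev/null; then
    echo "MySQL processes still running; not removing socket" >&2
    exit 1
fi

# Remove the stale socket and restart.
rm -f /var/lib/mysql/mysql.sock
/sbin/service mysqld start
```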

Maintenance Tips for LHC perfSONAR-PS Installations

Jason Zurawski has provided a PDF file which documents some basic maintenance, troubleshooting and repair steps for perfSONAR-PS. The document is a little dated now for v3.3.2 but still contains useful information. Have a look at 20120204-USATLAS-pSPT.pdf. NOTE: All LHC testing sites need to make sure they have provided a sufficient number of ports for testing; see Section 6 in the PDF file.

There have been a few issues noticed when we utilize perfSONAR-PS at a scale larger than it was tested at. One example is the amount of local disk allowed for keeping current test results. For latency tests with a mesh of about 10 sites, we can exceed the default storage limit of 1GB of test results within a day. If your limit within perfSONAR-PS is set at 1GB, new tests will fail once you reach it. There are automatic cleaning scripts which clean up old files every day, but insufficient space can still cause testing failures during the day. The recommendation is to increase the allowed storage space to 3GB (assuming you are not pressed for local disk space). You should do this on your latency nodes:

  • Login via the gui https://your_latency_node/toolkit/admin/owamp/ (or click "External OWAMP Limits" from the left-side of your latency node web interface)
  • For the "Unprivileged Clients" box, click the "Edit Group Limits" URL
  • Set 3GB (or something larger than 1GB) in the pop-up box:
    set_OWAMP_disk_limit.png
  • Click "Save" at the bottom of the screen

You can check on your current usage of temporary owamp result files with `du -sh /var/lib/owamp`

NOTE: Current version 3.3.2 should have appropriate limits already

Setup 'cron' Tasks for DB Cleanup

During normal operations at LHC scale we accumulate a lot of data. Dave Lesny/Illinois created some cleanup scripts (thanks!). We recommend implementing regular clean-up scripts for those perfSONAR-PS DB instances installed on a local disk (the 'netinstall' variants). Here are two scripts (one for OWAMP nodes and one for BWCTL nodes) that should be set executable and added to /etc/cron.weekly:

The script details: cleanupdb_owamp.sh

#!/bin/sh

# Cleanup the DB first
/opt/perfsonar_ps/perfsonarbuoy_ma/bin/check_pSB_db --dbtype=owamp --verbose

# Backup 
/opt/perfsonar_ps/perfsonarbuoy_ma/bin/clean_pSB_db.pl --mysqldump-opts="--skip-lock-tables" --dbtype=owamp --maxdays=45 --owmesh-dir=/opt/perfsonar_ps/perfsonarbuoy_ma/etc/ --dumpdir=/var/log/BACKUP/owamp

And the corresponding file for the BWCTL 'netinstall' node: cleanupdb_bwctl.sh

#!/bin/sh

# Cleanup the DB first
/opt/perfsonar_ps/perfsonarbuoy_ma/bin/check_pSB_db --dbtype=bwctl --verbose

# Backup 
/opt/perfsonar_ps/perfsonarbuoy_ma/bin/clean_pSB_db.pl --mysqldump-opts="--skip-lock-tables" --dbtype=bwctl --maxmonths=6 --owmesh-dir=/opt/perfsonar_ps/perfsonarbuoy_ma/etc/ --dumpdir=/var/log/BACKUP/bwctl

Remember to setup the right mode and ownership:

chmod 755 /etc/cron.weekly/cleanupdb_owamp.sh
chown root:root /etc/cron.weekly/cleanupdb_owamp.sh
chmod 755 /etc/cron.weekly/cleanupdb_bwctl.sh
chown root:root /etc/cron.weekly/cleanupdb_bwctl.sh
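One extra step worth doing before the first weekly run: make sure the --dumpdir targets exist. This is an assumption on our part that clean_pSB_db.pl does not create its dump directory itself; creating the directories up front is harmless either way.

```shell
# Create the backup dump directories used by the cleanup scripts above.
# (Assumption: clean_pSB_db.pl does not create --dumpdir itself.)
mkdir -p /var/log/BACKUP/owamp /var/log/BACKUP/bwctl 2>/dev/null \
    || echo "run as root on the perfSONAR node"
```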

perfSONAR-PS Node Tuning

The perfSONAR-PS installation provides a default installation which is designed for sites supporting a few tests. For LHC use we typically configure significantly more scheduled tests than the default configuration provided by perfSONAR-PS was designed for. Because of this we need to make some host and perfSONAR-PS changes to better support LHC-scale use-cases.

Configure perfSONAR-PS Limits and Ports

The initial install of perfSONAR-PS assigns a limited number of ports for testing. If a site runs only a few tests this is OK. However, many of the LHC sites test with a large number of other sites, and ports used in tests are not always quickly released for reuse. Philippe Laurens/MSU tracked this down (thanks!). The result is that tests may fail to run because there are insufficient free ports available.

BWCTL Port Configuration

For BWCTL (Throughput) nodes you need to increase the number of ports available. BWCTL works by making a control connection before iperf is run, to synchronize the two testers, and then a second connection for the iperf test itself. If you make the change through the GUI, it splits the port range into two equal parts: the first range ('peer_port') is for the control connection; the second range ('iperf_port') is for the iperf connection. We recommend providing 500 ports for BWCTL's use: 5001-5500. If you want to edit the file manually, it is /etc/bwctld/bwctld.conf on your throughput node. Change it to look something like:

group   bwctl
iperf_port      5251-5500
user    bwctl
peer_port       5001-5250
facility        local5
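After editing, a quick sanity check (a sketch): confirm the ranges are in the file and compare them against the kernel's ephemeral port range, which should not overlap 5001-5500.

```shell
# Show the configured BWCTL port ranges.
grep -E '^(peer_port|iperf_port)' /etc/bwctld/bwctld.conf

# Kernel ephemeral port range for comparison; it should not
# overlap the 5001-5500 range used by BWCTL.
cat /proc/sys/net/ipv4/ip_local_port_range
```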

NOTE: Current 3.3.2 version should have appropriate configuration already

Increase Disk Space Limit for OWAMP (Latency) Host

There have also been a few issues noticed with the OWAMP hosts as we utilize them at a scale larger than they were tested at. One example is the amount of local disk allowed for keeping current test results. For latency tests with a mesh of about 10 sites, we can exceed the default storage limit of 1GB of test results within a day. If your limit within perfSONAR-PS is set at 1GB, new tests will fail once you reach it. Philippe Laurens/MSU found this (thanks!). There are automatic cleaning scripts which clean up old files every day, but insufficient space can still cause testing failures during the day. The recommendation is to increase the allowed storage space to 3GB (assuming you are not pressed for local disk space). You should do this on your latency nodes:

  • Login via the gui https://your_latency_node/toolkit/admin/owamp/ (or click "External OWAMP Limits" from the left-side of your latency node web interface)
  • For the "Unprivileged Clients" box, click the "Edit Group Limits" URL
  • Set 3GB (or something larger than 1GB) in the pop-up box:
    set_OWAMP_disk_limit.png
  • Click "Save" at the bottom of the screen

Configure 'syslog' Not to 'sync' Every Line

The syslog logging on the perfSONAR-PS nodes (especially the OWAMP node) can put a lot of I/O stress on the system disk. This was noted by Garhan Attebury/Nebraska (thanks!). We recommend a syslog tweak which removes the requirement to 'sync' after every log line for specific files. The downside is the possibility of losing some logging records in a crash. To do this, edit the /etc/syslog.conf file and prepend a '-' to any log file path so that syslog won't 'sync' every output line. Here is an example of the changes made on an OWAMP host:

*.info;mail.none;authpriv.none;cron.none                -/var/log/messages
mail.*                                                  -/var/log/maillog
local5.*                                                -/var/log/perfsonar/owamp_bwctl.log

Make corresponding changes on the BWCTL host. You need to run service syslog restart after the change is made.
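The edit can be scripted (a sketch; the sed pattern targets only the perfSONAR log line, and a backup copy is kept so the substitution can be undone):

```shell
# Keep a backup, then prepend '-' to the perfSONAR log file entry so
# syslog stops syncing after every line written to it.
cp /etc/syslog.conf /etc/syslog.conf.bak
sed -i 's|[[:space:]]/var/log/perfsonar/owamp_bwctl.log| -/var/log/perfsonar/owamp_bwctl.log|' /etc/syslog.conf
service syslog restart
```

The pattern requires whitespace immediately before the path, so running the command a second time makes no further change.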

MySQL Tuning Recommendations

You might find that the performance of a latency node is improved by some MySQL tuning. This is accomplished by making changes in /etc/my.cnf and restarting the mysqld service.

As the number of test endpoints grows, so does the underlying database size. A fully loaded latency node testing against 50 WLCG endpoints might generate as many as 2.5M records per day. This results in very slow access to the database when generating service graphs.

To help speed up access, the following types of changes could be made to /etc/my.cnf. These are extreme values and most likely can be tuned down for normal systems.


[mysqld]

skip-innodb
#NOTE from Shawn...this failed on my v3.3 RC3 latency host with "[ERROR] /usr/libexec/mysqld: unknown option '--skip-bdb'"
#skip-bdb  

max_connections=1000

key_buffer = 384M
max_allowed_packet = 1M
table_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M

# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 4
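To follow the CPUs*2 suggestion in the comment above, you can compute the value for your host (a sketch using /proc/cpuinfo, so Linux-specific):

```shell
# Count CPUs and print the suggested thread_concurrency value.
ncpu=$(grep -c '^processor' /proc/cpuinfo)
echo "thread_concurrency = $((ncpu * 2))"
```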

Modifications For 10GE Enabled Bandwidth/Throughput Nodes

Overruns when a 10Gb node tests against a 1Gb node

When a 10Gb node performs a throughput test against a 1Gb node, a serious overrun is very likely. There is no flow control or throttling in throughput testing, so the 1Gb node will likely drop most of the packets sent to it. This results in an erroneous report falsely indicating a network problem between the two nodes. The problem appears to be worse the more hops there are between the two nodes. For example, a test path which would normally test at greater than 900Mb might appear to be less than 200Mb.

It is possible to compensate for the speed mismatch by dual homing the 10Gb host with a second 1Gb connection. By testing 10Gb to 10Gb and 1Gb to 1Gb, the overrun will be avoided. The attached two scripts are wrapper scripts which replace bwctl and iperf. They allow for table driven static routes to be created on a dual homed node. By default, all test hosts configured on the node will use the 1Gb interface unless the FQDN of the remote node is listed in the "ten-g-hosts.txt" file. Incoming tests are controlled by the interface used by the remote host to initiate the connection. A static route is created on the fly to make certain the reverse test uses the same interface as the incoming test.

The dual homed throughput node should be configured with two interfaces, 10Gb and 1Gb, connected to the same subnet but with different IPs. Each IP needs to have a FQDN associated with it, such as uct2-net02.uchicago.edu (10Gb) and uct2-net02-1g.uchicago.edu (1Gb).

Perfsonar should be configured to use the 10Gb interface by default. Edit the file "/opt/perfsonar_ps/toolkit/etc/discover_external_address.conf" and add an entry such as

external_interface      10GNIC

where 10GNIC is the interface name of the 10Gb interface, such as "eth2", "eth3.634", etc. After making this change, the node will need to be rebooted for this change to take effect.
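If you are unsure of the interface name, listing the link-layer devices will show the candidates (a sketch; the awk field split assumes the usual `ip -o link` output format):

```shell
# List interface names (one per line) to find the 10Gb device.
/sbin/ip -o link show | awk -F': ' '{print $2}'
```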

The following three files should be downloaded into /usr/local/perfsonar on the 10Gb throughput node: bwctl.sh, iperf.sh and ten-g-hosts.txt (see the attachments below).

Each script contains documentation on other steps which need to be taken. The following are some of those steps.

Edit each script and make changes to the site-specific configuration parameters. These include "DEFAULT_GW", "TEN_G_FQDN", "TEN_G_IP", "ONE_G_FQDN", "ONE_G_IP", "TEN_G_DEV" and "ONE_G_DEV".

In /etc/sudoers, comment out (or remove) the line "Defaults requiretty" and add the following two lines

#Defaults requiretty
perfsonar    ALL=(ALL) NOPASSWD: /sbin/route
bwctl        ALL=(ALL) NOPASSWD: /sbin/route
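After editing, validate the file before closing your root session; a syntax error in /etc/sudoers can lock sudo out entirely (a sketch):

```shell
# Check /etc/sudoers syntax without applying anything.
visudo -c

# Confirm the two new entries are present.
grep -E '^(perfsonar|bwctl)[[:space:]]' /etc/sudoers
```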

Optionally you can perform some additional logging by adding the following to /etc/syslog.conf

local5.*        /var/log/perfsonar/bwctl.log
local6.*        /var/log/perfsonar/iperf.log

Edit the "ten-g-hosts.txt" file and add/remove any remote 10Gb nodes which will be used in the testing. These entries must be the same FQDN as used in "Scheduled Tests".

To enable these scripts, the existing bwctl and iperf are renamed to bwctl.bin and iperf.bin, and these scripts are inserted in their place.

cd /usr/bin
mv bwctl bwctl.bin
mv iperf iperf.bin
ln -s /usr/local/perfsonar/bwctl.sh bwctl
ln -s /usr/local/perfsonar/iperf.sh iperf

The easiest way to force the creation of the static routes is to use the perfSONAR web interface on the node. Go to the "Enabled Services" page and click "Save". This will cause perfSONAR to restart its services, which in turn invokes "bwctl" for each node configured in "Scheduled Tests". The bwctl.sh script will then create the appropriate static routes for each test node. You can use the "route" or "ip route" commands to verify each remote host has a static route configured to the appropriate interface.
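For example, to see only the routes bound to your two devices, filter the routing table on the device names (eth1 and eth2 here are hypothetical; substitute your ONE_G_DEV and TEN_G_DEV values):

```shell
# Show routes bound to the 1Gb and 10Gb interfaces.
# eth1/eth2 are placeholders for ONE_G_DEV/TEN_G_DEV.
/sbin/ip route show | grep -E 'dev (eth1|eth2)'
```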

Considerations Regarding ``Jumbo-Frames'' (MTU>1500)

In general the LHCOPN has recommended all their instances be configured with jumbo frames (9000 bytes). Many of the other WLCG sites are using the standard MTU. As long as Path MTU Discovery is functioning properly, sites shouldn't see any issues testing between endpoints with different MTU values. However, if there are firewalls along the path that inappropriately filter ICMP packets, the larger MTU packets can be dropped or fragmented, giving poor test results.

Jumbo frames can help improve throughput, especially on WAN paths.
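One way to check whether a jumbo-frame path is healthy (a sketch; remote.host.example is a placeholder): ping with the "don't fragment" bit set and a payload sized for a 9000-byte MTU. 9000 minus 20 bytes of IP header and 8 bytes of ICMP header gives an 8972-byte payload.

```shell
# Send three unfragmentable pings sized for a 9000-byte MTU.
# If these fail while a default-size ping works, something on the
# path is dropping jumbo frames or the ICMP "fragmentation needed"
# replies that Path MTU Discovery depends on.
ping -M do -s 8972 -c 3 remote.host.example
```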

Draft Section on "Mesh" Configuration

We are working with the perfSONAR-PS developers to test and use the so-called "mesh" configuration for our WLCG instances. A page was set up to document the update procedure for USATLAS sites at http://www.usatlas.bnl.gov/twiki/bin/view/Projects/PerfSONAR_PS_Mesh

More Documentation

Other useful installation and configuration notes on the CERN TWiki are LCG/PerfsonarDeployment and LHCONE/SiteList

Miscellaneous Notes

Please send along any comments or suggestions about this information and planning. You can also directly edit the TWiki, but please send Shawn McKee (smckee@umich.edu) a brief note when you do, so I can keep everyone informed.

-- ShawnMckee - 12 February 2014


Attachments


pdf 20120204-USATLAS-pSPT.pdf (322.3K) | ShawnMckee, 31 May 2012 - 10:38 | Jason Zurawski's perfSONAR-PS maintenance/troubleshooting document
png set_OWAMP_disk_limit.png (9.3K) | ShawnMckee, 31 May 2012 - 10:43 | Pop-up box to set OWAMP disk limit
sh bwctl.sh (3.2K) | DavidLesny, 10 Sep 2012 - 16:32 |
sh iperf.sh (3.3K) | DavidLesny, 10 Sep 2012 - 16:32 |
txt ten-g-hosts.txt (0.3K) | DavidLesny, 10 Sep 2012 - 16:32 |
 