DIAL debugging

D. Adams

DIAL is primarily written in C++. A number of tools are available at BNL to aid in debugging, including debuggers and memory checkers.

The following comments are relevant to the acas machines on the ATLAS computing farm at BNL. The setup files referenced below may be found in ~dladams/bin.

GCC 2.96 is installed at /usr. This is ancient and should be avoided where possible.
GCC 3.2 is installed at /usr/local/gcc-alt-3.2. This is the ATLAS default. If you are an ATLAS developer, you likely get this by default.
GCC 3.2.3 is installed at ~dladams/apps/gcc/3.2.3/rh73. It appears to be compatible with 3.2. I set this up with

> . setup_gcc.sh 3.2.3/rh73

GDB 5.2.2 is installed in /usr/bin. If you want a more recent version, try 6.1 in ~dladams/apps/gdb. I set this up with

> . setup_gdb.sh

The totalview debugger can be started with the command

> totalview myexe

Valgrind can be used to check for memory errors and leaks. I have installed version 2.1.1 at ~/apps/valgrind/2.1.1. I set up and run with

> . setup_valgrind.sh
> memcheck myexe
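
The memcheck command above is whatever the setup script provides. As a rough sketch only, and assuming instead that the valgrind launcher itself is on the path, a direct invocation of the memcheck tool with leak checking might look like the following. The /bin/ls target and the vg_status variable are stand-ins chosen for illustration, not part of the BNL setup.

```shell
# Stand-in executable so the sketch runs anywhere; replace with your own program.
exe=/bin/ls

vg_status=0
if command -v valgrind >/dev/null 2>&1; then
    # Memcheck reports invalid reads and writes; --leak-check also lists leaked blocks.
    valgrind --tool=memcheck --leak-check=yes "$exe" >/dev/null || vg_status=$?
else
    echo "valgrind not found on PATH; source the setup script first" >&2
    vg_status=127
fi
echo "valgrind exit status: $vg_status"
```

Valgrind writes its report to stderr, so redirecting stdout as above keeps the program's own output out of the way.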

Insure is a licensed commercial memory checker. It is installed at /afs/usatlas.bnl.gov/i386_redhat72/app/insure-6.1. I set up with

> . setup_insure.sh

Insure expects you to compile your sources with insure. If you add ~dladams/apps/g++insure to the front of your path, insure will be used in place of g++ for compiling and linking.
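
Putting the wrapper directory at the front of the path can be sketched as follows, assuming a Bourne-style shell (as the setup scripts suggest) and assuming the directory holds a wrapper named g++.

```shell
# Put the wrapper directory first so its g++ shadows the real compiler.
export PATH=~dladams/apps/g++insure:$PATH

# Confirm which g++ the shell now resolves.
command -v g++ || echo "no g++ found on PATH" >&2
```

If the second command prints a path outside the wrapper directory, the path change did not take effect in the current shell.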

Insure provides a memory checker called Chaperon that may be used on an executable not built with insure:

> Chaperon myexe


Codewizard is installed at /afs/usatlas.bnl.gov/i386_redhat72/app/codewizard-4.3. I have not used it.

System tools
To examine libraries use ldd and nm.
Set LD_DEBUG to get information as libraries are loaded. Setting it to the value help prints a description of the options.
Set MALLOC_CHECK_ to use a version of malloc that prints messages or aborts upon memory errors.
Use "netstat -lp" to see which processes are listening on which ports.
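
The tools above can be tried out roughly as follows. This is a sketch assuming a glibc-based Linux system; /bin/ls is only a stand-in for your own executable, and the variable names are illustrative.

```shell
# Stand-in executable so the sketch runs anywhere; replace with your own program.
exe=/bin/ls

# Shared libraries the executable will load, and where each one resolves.
ldd "$exe"

# Undefined dynamic symbols (those a library is expected to provide), demangled.
nm -D -C "$exe" | grep ' U '

# Ask the loader to describe the LD_DEBUG options; it prints the list and exits.
help_text=$(LD_DEBUG=help "$exe" 2>&1)
echo "$help_text"

# Trace the library search as the program starts (the trace goes to stderr).
LD_DEBUG=libs "$exe" >/dev/null

# Value 3 asks for both a diagnostic message and an abort on a detected heap error.
MALLOC_CHECK_=3 "$exe" >/dev/null
```

LD_DEBUG and MALLOC_CHECK_ are set here for a single command only, which avoids flooding every later command in the session with loader output.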

C++ standard

For viewing only. If you want your own copy, please order from techstreet.

Difficult problems
We have run across a couple of difficult problems.

In order for threads to work properly, the main program must be linked with pthread (-lpthread). The distributed ROOT 3.10.xx was built without this option and had to be rebuilt to avoid crashes when the DIAL thread-enabled libraries were loaded.
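
One way to check an executable for this problem is to see whether libpthread is among the libraries it loads at startup. The helper below is a hypothetical sketch (the name links_pthread is made up here), with /bin/ls as a stand-in target. It applies to systems of the kind described on this page; on recent glibc versions pthread is part of libc itself and the check is not meaningful.

```shell
# Succeeds if libpthread appears among the libraries loaded at program startup.
links_pthread() {
    ldd "$1" 2>/dev/null | grep -q 'libpthread'
}

# Stand-in executable; replace with the main program you want to check.
exe=/bin/ls
if links_pthread "$exe"; then
    echo "$exe: libpthread is loaded at startup"
else
    echo "$exe: libpthread is not loaded at startup"
fi
```

For ROOT the target would presumably be the root.exe binary under $ROOTSYS/bin, which is an assumption about the local installation.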

Globus provides its own SSL library, and conflicts with the system version cause mysterious crashes. The solution is to make sure all libraries requiring SSL are linked against the Globus version.
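
A quick way to look for a mix of SSL libraries is to list what each executable or shared library resolves to. The helper below is a hypothetical sketch (the name ssl_libs is made up here) and assumes the library file names contain libssl or libcrypto. /bin/ls is a stand-in; in practice the loop would run over the main program and each library it loads.

```shell
# Print the SSL-related libraries that an executable or shared library resolves to.
ssl_libs() {
    ldd "$1" 2>/dev/null | grep -E 'libssl|libcrypto'
}

# Stand-in list; replace with your executable and the libraries it uses.
for f in /bin/ls; do
    echo "== $f"
    ssl_libs "$f" || echo "(no SSL libraries)"
done
```

If some entries resolve under the Globus installation and others under a system directory such as /usr/lib, that is the mixed situation described above.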