DIAL is primarily written in C++. A number of tools are available at BNL to aid in debugging, including debuggers and memory checkers.
The following comments are relevant to the acas machines on the ATLAS computing farm at BNL. The setup files referenced below may be found in ~dladams/bin.
GCC 2.96 is installed at /usr. This is ancient and should be avoided where possible.
GCC 3.2 is installed at /usr/local/gcc-alt-3.2. This is the ATLAS default. If you are an ATLAS developer, you likely get this by default.
GCC 3.2.3 is installed at ~dladams/apps/gcc/3.2.3/rh73. It appears to be compatible with 3.2. I set this up with
> . setup_gcc.sh 3.2.3/rh73
GDB 5.2.2 is installed in /usr/bin. If you want a more recent version, try 6.1 in ~dladams/apps/gdb. I set this up with
The totalview debugger can be started with the command
> totalview myexe
Valgrind can be used to check for memory errors and leaks. I have installed version 2.1.1 at ~/apps/valgrind/2.1.1. I set up and run with
> . setup_valgrind.sh
> memcheck myexe
Insure is a licensed commercial memory checker. It is installed at /afs/usatlas.bnl.gov/i386_redhat72/app/insure-6.1. I set up with
> . setup_insure.sh
Insure expects you to compile your sources with insure. If you add ~dladams/apps/g++insure to the front of your path, insure will be used in place of g++ for compiling and linking.
Insure provides a memory checker called Chaperon that may be used on an executable not built with insure:
> Chaperon myexe
Codewizard is installed at /afs/usatlas.bnl.gov/i386_redhat72/app/codewizard-4.3. I have not used it.
To examine libraries, use ldd and nm.
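As a rough sketch of how nm can help find which library provides a symbol, the shell function below loops over candidate libraries and prints those that define it. The name find_symbol is made up for illustration. It assumes GNU nm (for the -D and --defined-only options) and takes the mangled name for C++ symbols.

```shell
# find_symbol SYMBOL LIB...
# Print each shared library whose dynamic symbol table defines SYMBOL.
# Hypothetical helper; assumes GNU nm. For C++ symbols, SYMBOL must be
# the mangled name because nm is run without -C here.
find_symbol() {
    sym=$1
    shift
    for lib in "$@"; do
        # The symbol name is the last field of each nm line; newer nm
        # versions may append "@VERSION", which is stripped before comparing.
        if nm -D --defined-only "$lib" 2>/dev/null |
            awk -v s="$sym" '{ n = $NF; sub(/@.*/, "", n); if (n == s) found = 1 }
                             END { exit !found }'; then
            echo "$lib"
        fi
    done
}

# Example (library names are illustrative only):
#   find_symbol pthread_create libA.so libB.so
```

To see which libraries an executable resolves at load time, "ldd myexe" gives the complementary view.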
Set LD_DEBUG to get information as libraries are loaded. The value help will describe the options.
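One way to keep the loader trace out of the program's own output is to send it to a file. The wrapper below is only a sketch: trace_libs is a hypothetical name, and it assumes the glibc loader, which honors LD_DEBUG_OUTPUT and appends the process ID to the given file prefix.

```shell
# trace_libs LOGPREFIX COMMAND [ARGS...]
# Run COMMAND with the loader's library-search trace enabled.
# Hypothetical helper; the trace is written to LOGPREFIX.<pid>.
trace_libs() {
    prefix=$1
    shift
    # "libs" selects the library search path category of LD_DEBUG.
    LD_DEBUG=libs LD_DEBUG_OUTPUT="$prefix" "$@"
}

# Example:
#   trace_libs /tmp/ld_trace myexe
#   less /tmp/ld_trace.*
```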
Set MALLOC_CHECK_ to use a version of malloc that prints messages or crashes upon memory errors.
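A minimal sketch of setting the variable for a single run is shown below. The wrapper name run_malloc_checked is hypothetical. The value 3 follows the glibc documentation, where 1 prints a diagnostic, 2 aborts, and 3 does both. Behavior may differ between glibc versions.

```shell
# run_malloc_checked COMMAND [ARGS...]
# Run COMMAND with glibc's malloc consistency checks requested.
# Hypothetical helper; the variable is set only for this one command.
run_malloc_checked() {
    MALLOC_CHECK_=3 "$@"
}

# Example:
#   run_malloc_checked myexe
```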
Use "netstat -lp" to see which processes are listening on which ports.
We have run across a couple of difficult problems.
In order for threads to work properly, the main program must be linked with pthread (-lpthread). The distributed ROOT 3.10.xx was built without this option and had to be rebuilt to avoid crashes when the DIAL thread-enabled libraries were loaded.
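A quick way to check an existing executable for this problem is to look for libpthread in its ldd output. The filter below is a sketch with a made-up name, has_pthread. It only inspects the text that ldd prints and says nothing about how the program uses threads.

```shell
# has_pthread: read ldd output on stdin and succeed if libpthread is listed.
# Hypothetical helper for illustration.
has_pthread() {
    grep -q 'libpthread\.so'
}

# Example:
#   ldd myexe | has_pthread || echo "myexe is not linked with -lpthread"
```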
Globus provides its own SSL library, and conflicts with the system version cause mysterious crashes. The solution is to make sure all libraries requiring SSL are linked against the Globus version.
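To see which SSL library each executable or shared library actually resolves, the ldd output can be filtered as sketched below. The function name ssl_origins is hypothetical, and the pattern assumes the SSL and crypto library names begin with libssl or libcrypto. Adjust it if the local names differ.

```shell
# ssl_origins: read ldd output on stdin and print "name path" for every
# SSL or crypto library the loader resolved.
# Hypothetical helper for illustration.
ssl_origins() {
    awk '$1 ~ /^lib(ssl|crypto)/ && $2 == "=>" { print $1, $3 }'
}

# Example: compare the paths reported for each file that needs SSL.
#   for f in myexe libmylib.so; do echo "$f:"; ldd "$f" | ssl_origins; done
```

If some files report a system path and others a Globus path, that mix is the kind of conflict described above.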