Overview
TAU Performance System® is a portable profiling and tracing toolkit
for performance analysis of parallel programs written in Fortran, C, C++,
Java, and Python.
TAU (Tuning and Analysis Utilities) is capable of gathering
performance information through instrumentation of functions, methods,
basic blocks, and statements.
TAU's profile visualization tool, paraprof, provides graphical displays
of the performance analysis results to help the user explore
the collected data.
NCSA-specific information
How to load the software (obtain the proper environment variables)
TAU is installed on Lincoln (version 2.19)
and Forge (version 2.20.3)
under /usr/apps/tools/tau/.
- On Lincoln, use the softenv key "+tau" by issuing:
soft add +tau
- On Forge, load the module "tau" by issuing:
module load tau
or, to select a specific version such as 2.20.3:
module load tau/2.20.3
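To confirm that the environment was set up as expected, one can check that TAU's commands are on the PATH. The following is a minimal sketch: the function name "check_tools" is hypothetical and not part of TAU, and it assumes that the softenv key or module adds TAU's bin directory to the PATH.

```shell
# Report whether each named command is on the PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, after loading TAU, "check_tools pprof paraprof" should report both commands as found if the environment is set up correctly.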
OpenMP support in TAU
- Lincoln
The following TAU makefiles contain OpenMP support:
Makefile.tau-icpc-openmp-opari
Makefile.tau-icpc-pdt-openmp-opari
Makefile.tau-icpc-mpi-pdt-openmp-opari
They all require the Intel compiler version 11.1.038
instead of the default 10.1.017. To switch compilers, issue:
soft delete +intel-10.1.017
soft add +intel-11.1.038
Then add the following two lines to your makefile:
TAUROOTDIR = /usr/apps/tools/tau/current/
include $(TAUROOTDIR)/x86_64/lib/<one_of_the_above_Makefiles>
and use the variables defined in the TAU makefile to compile and link.
- Forge
The following TAU makefiles contain OpenMP support:
Makefile.tau-icpc-openmp-opari
Makefile.tau-icpc-pdt-openmp-opari
Makefile.tau-icpc-mpi-pdt-openmp-opari
To use them, add the following two lines to your makefile:
TAUROOTDIR = /usr/apps/tools/tau/<version>/
include $(TAUROOTDIR)/x86_64/lib/<one_of_the_above_Makefiles>
Replace "<version>" with a specific version string,
such as "2.20.3-forge",
and use the variables defined in the TAU makefile to compile and link.
The file "Makefile" in the MPI Pi computation example in
/usr/apps/tools/tau/2.20.3-forge/examples/pi/ shows how this is done.
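To see which variables a particular TAU makefile offers for your compile and link rules, it can help to list them. The helper below is a sketch only: the name "list_tau_vars" is hypothetical (not part of TAU), and it assumes the variables of interest start with "TAU_" and are assigned at the beginning of a line.

```shell
# Print the names of the TAU_* variables assigned in a makefile,
# one per line, sorted and without duplicates.
# Handles the assignment operators =, :=, += and ?=.
list_tau_vars() {
    sed -n 's/^\(TAU_[A-Za-z0-9_]*\)[[:space:]]*[:+?]\{0,1\}=.*/\1/p' "$1" | sort -u
}
```

On Forge, this could be run as "list_tau_vars /usr/apps/tools/tau/2.20.3-forge/x86_64/lib/Makefile.tau-icpc-openmp-opari". The names it prints are the ones to reference in your own makefile after the "include" line.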
GPU support in TAU on Forge
Recent versions of TAU support profiling and tracing of CUDA and CUPTI calls.
About 18 TAU configurations with CUDA and CUPTI support have been
installed on Forge. To use them, first run "module load tau",
then use:
tau_exec -T <execution_type> -XrunTAU-<tau_makefile_options> <tracking_option> <path_to_my_program>
to profile or trace the CUDA or CUPTI calls, where
execution_type:
"serial" for non-MPI programs, or "mpi" for MPI programs
tracking_option:
"-cuda" to track CUDA calls, or "-cupti" to track CUPTI calls
tau_makefile_options:
the part of a TAU makefile name ($TAUROOT/x86_64/lib/Makefile.tau-*)
that follows "Makefile.tau-", e.g., use "cupti-icpc" for Makefile.tau-cupti-icpc
For example, to profile CUDA calls with a serial (non-MPI) program, use:
tau_exec -T serial -XrunTAU-cupti-icpc -cuda ./my_program
and to profile CUPTI calls with an MPI program, use:
tau_exec -T mpi -XrunTAU-cupti-icpc-mpi -cupti ./my_program
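Because the command line has several parts that are easy to mistype, a small dry-run helper can be useful. The sketch below only prints a tau_exec command in the form described above and checks the two options whose allowed values are listed. The function name "tau_cmd" is hypothetical and not part of TAU; it does not verify that the named TAU configuration is actually installed.

```shell
# Print (do not run) a tau_exec command line.
# Arguments: <execution_type> <tau_makefile_options> <tracking_option> <program>
tau_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: tau_cmd <serial|mpi> <tau_makefile_options> <-cuda|-cupti> <program>" >&2
        return 2
    fi
    # execution_type must be one of the two documented values
    case $1 in
        serial|mpi) ;;
        *) echo "execution_type must be serial or mpi" >&2; return 2 ;;
    esac
    # tracking_option must be one of the two documented values
    case $3 in
        -cuda|-cupti) ;;
        *) echo "tracking_option must be -cuda or -cupti" >&2; return 2 ;;
    esac
    printf 'tau_exec -T %s -XrunTAU-%s %s %s\n' "$1" "$2" "$3" "$4"
}
```

For instance, "tau_cmd serial cupti-icpc -cuda ./my_program" prints the first example command shown above, which can then be reviewed and run by hand.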
To perform tracing instead of profiling, set the environment variable TAU_TRACE to 1
before running "tau_exec".
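One way to do this without changing the rest of your shell session is to set the variable for a single command only. The wrapper below is a minimal sketch; the name "run_traced" is hypothetical and not part of TAU.

```shell
# Run a command with TAU_TRACE=1 set only for that command,
# leaving the calling shell's environment unchanged.
run_traced() {
    env TAU_TRACE=1 "$@"
}
```

Used as "run_traced tau_exec -T serial -XrunTAU-cupti-icpc -cuda ./my_program", this traces one run, and later runs in the same shell go back to profiling. Alternatively, "export TAU_TRACE=1" enables tracing for every subsequent run in the session.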
Courtesy of Galen Arnold of NCSA, some example results of profiling CUDA calls
in HPL are available on Forge in
/uf/ncsa/arnoldg/hpl-2.0_FERMI_v13/bin/CUDA/ .
The file "README.gpu" in TAU's source code package contains additional
information, especially regarding tracing, CUPTI, and TAU's requirement
that the program call the function cudaDeviceReset() or cudaThreadExit()
at the end of its execution. README.gpu is also available
on Forge in the "/usr/apps/tools/tau/" directory.
For more information
General usage information can be found on TAU's documentation web page:
http://www.cs.uoregon.edu/research/tau/docs.php
If you have questions about or need assistance with TAU at NCSA,
please send email to:
consult (at) ncsa.illinois.edu