About the TotalView CUDA Debugger
The TotalView CUDA debugger is an integrated debugging tool that can simultaneously debug CUDA code running on the Linux-x86_64 host and on the NVIDIA® GPU. CUDA support is an extension to the standard Linux-x86_64 version of TotalView and can debug 64-bit CUDA programs on Linux-x86_64. Debugging 32-bit CUDA programs is not currently supported.
Supported major features:
*Debug CUDA applications running directly on GPU hardware
*Set breakpoints, pause execution, and single step in GPU code
*View GPU variables in PTX registers, local, parameter, global, or shared memory
*Access runtime variables such as threadIdx, blockIdx, and blockDim
*Debug multiple GPU devices per process
*Support for the CUDA MemoryChecker
*Debug on remote, distributed, and clustered systems
*Support for directive-based programming languages
*All Linux-x86_64 host debugging features are supported, except for ReplayEngine
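To make the memory spaces and runtime variables listed above concrete, here is a minimal sketch of a CUDA program in which each of them appears. The kernel `scale`, the host helper `scale_on_gpu`, and the macro `THREADS_PER_BLOCK` are hypothetical names chosen for illustration; they are not part of TotalView or of the CUDA SDK. The comments mark where each variable would typically live, although the compiler is free to place values differently, especially in optimized builds.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

#define THREADS_PER_BLOCK 256   // illustrative block size (assumption)

// Hypothetical kernel: multiplies every element of data by factor.
// The arguments data, n, and factor are kernel parameters.
__global__ void scale(float *data, int n, float factor)
{
    // Shared memory: one copy of this array per thread block.
    __shared__ float tile[THREADS_PER_BLOCK];

    // threadIdx, blockIdx, and blockDim are built-in runtime variables.
    // i is a per-thread local variable, typically held in a register.
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n) {
        // data points into global memory allocated by the host.
        tile[threadIdx.x] = data[i];
        data[i] = tile[threadIdx.x] * factor;
    }
}

// Hypothetical host helper: copies the vector to the GPU, runs the
// kernel, and copies the result back. Returns false on any CUDA error.
bool scale_on_gpu(std::vector<float> &host, float factor)
{
    int n = static_cast<int>(host.size());
    if (n == 0) return true;

    std::size_t bytes = n * sizeof(float);
    float *dev = NULL;
    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, &host[0], bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
        scale<<<blocks, THREADS_PER_BLOCK>>>(dev, n, factor);
        ok = cudaGetLastError() == cudaSuccess;   // launch errors
    }
    // This blocking copy waits for the kernel to finish and also
    // reports errors that occurred while the kernel was running.
    ok = ok && cudaMemcpy(&host[0], dev, bytes,
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dev);
    return ok;
}
```

Calling `scale_on_gpu` on a vector from `main` is enough to exercise the kernel. Assuming the file is saved as `scale.cu`, building it with `nvcc -g -G scale.cu` asks nvcc to generate debug information for both host and device code, which a debugger generally needs in order to map GPU instructions back to source lines and variables.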
Requirements:
*CUDA SDK 3.0, 3.1, 3.2, 4.0, 4.1 or 4.2
*With SDK 3.0, TotalView version 8.9.0 or 8.9.1
*With SDK 3.1 or 3.2, TotalView version 8.9.1 or higher
*With SDK 4.0, TotalView 8.9.2 or higher
*With SDK 4.1, TotalView 8.10 or higher
*With SDK 4.2, TotalView 8.11 or higher
*Tesla or Fermi hardware supported by NVIDIA
*A Linux-x86_64 distribution supported by NVIDIA