Debugging CUDA Programs
The TotalView CUDA debugger is an integrated debugging tool that can simultaneously debug CUDA code running on the Linux-x86_64 host and on the NVIDIA® GPU. CUDA support is an extension to the standard Linux-x86_64 version of TotalView and supports debugging 64-bit CUDA programs on Linux-x86_64. Debugging 32-bit CUDA programs is not currently supported.
Supported major features:
*Debug CUDA applications running directly on GPU hardware
*Set breakpoints, pause execution, and single step in GPU code
*View GPU variables in PTX registers and in local, parameter, global, or shared memory
*Access runtime variables such as threadIdx, blockIdx, and blockDim
*Debug multiple GPU devices per process
*Use the CUDA MemoryChecker
*Debug remote, distributed and clustered systems
*All Linux-x86_64 host debugging features are supported, except for ReplayEngine
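To make the features above concrete, the following is a minimal sketch of a CUDA program that could serve as a debugging target. It is not taken from the TotalView documentation: the kernel name fill_indices, the array size, and the launch configuration are all hypothetical choices made for illustration. The kernel reads the built-in runtime variables threadIdx, blockIdx, and blockDim, so the marked line is a natural place to set a breakpoint and inspect them.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example kernel: each thread derives its global index from
// the built-in runtime variables and stores it in the output array.
__global__ void fill_indices(int *out, int n)
{
    // Candidate breakpoint line: threadIdx, blockIdx, and blockDim
    // are all live here.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;  // write to global memory
    }
}

int main()
{
    const int n = 64;                  // illustrative problem size
    const int threads_per_block = 32;  // illustrative launch configuration
    const int blocks = (n + threads_per_block - 1) / threads_per_block;

    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    fill_indices<<<blocks, threads_per_block>>>(d_out, n);

    // Wait for the kernel to finish and report any launch or runtime error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    // Copy the results back to the host and print the last element.
    int h_out[n];
    err = cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("h_out[%d] = %d\n", n - 1, h_out[n - 1]);
    return 0;
}
```

For device-side debugging, a program like this is typically compiled with nvcc using -g (host debug information) and -G (device debug information). See the TotalView User Guide for the exact build and launch steps that apply to your installation.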
See More
*On using the CUDA debugger: “About the CUDA Debugger” in the TotalView User Guide
*On the CLI dcuda command: dcuda in the TotalView Reference Guide