Chapter 26 About the TotalView CUDA Debugger
The TotalView CUDA debugger is an integrated debugging tool that can simultaneously debug CUDA code running on the host system and on the NVIDIA® GPU. CUDA support is an extension to the standard version of TotalView and handles 64-bit CUDA programs. Debugging 32-bit CUDA programs is not currently supported.
Supported major features:
Debug CUDA applications running directly on GPU hardware
Set breakpoints, pause execution, and single step in GPU code
View GPU variables held in PTX registers or in local, parameter, global, or shared memory
Access runtime variables such as threadIdx, blockIdx, and blockDim
Debug multiple GPU devices per process
Support for the CUDA MemoryChecker
Debug on remote, distributed, and clustered systems
Support for directive-based programming languages
Support for host debugging features
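When inspecting the runtime variables listed above, it can help to know how a kernel typically combines them into an element index. The following is a minimal host-side C++ sketch of that arithmetic, not CUDA device code and not part of TotalView. The `Index3` struct and the function names are hypothetical stand-ins for CUDA's built-in three-component values, and the formula assumes the common convention that a one-dimensional kernel computes `blockIdx.x * blockDim.x + threadIdx.x`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side stand-in for CUDA's three-component built-ins
// (threadIdx, blockIdx, blockDim). Not a CUDA or TotalView type.
struct Index3 {
    unsigned x, y, z;
};

// Global x index as commonly computed at the top of a 1D kernel:
//   blockIdx.x * blockDim.x + threadIdx.x
inline std::size_t global_x(const Index3& threadIdx,
                            const Index3& blockIdx,
                            const Index3& blockDim) {
    return static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
}

// Linear position of a thread inside its own block, with x varying
// fastest, then y, then z.
inline std::size_t thread_in_block(const Index3& threadIdx,
                                   const Index3& blockDim) {
    return threadIdx.x
         + static_cast<std::size_t>(threadIdx.y) * blockDim.x
         + static_cast<std::size_t>(threadIdx.z) * blockDim.x * blockDim.y;
}

// Worked example: thread 3 of block 2, with 256 threads per block,
// maps to element 2 * 256 + 3 = 515.
inline void check_worked_example() {
    const Index3 threadIdx{3, 0, 0};
    const Index3 blockIdx{2, 0, 0};
    const Index3 blockDim{256, 1, 1};
    assert(global_x(threadIdx, blockIdx, blockDim) == 515);
}
```

For instance, if the debugger shows threadIdx.x = 3, blockIdx.x = 2, and blockDim.x = 256 for a stopped GPU thread, a kernel using this convention would be working on element 515 of its array.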
Requirements:
CUDA SDK 7.5, 8.0, or 9.0
With SDK 7.5, TotalView 8.15.10 through TotalView for HPC 2016.06
With SDK 8.0, TotalView for HPC 2016.07
With SDK 9.0, TotalView for HPC 2017.3
Tesla, Fermi, Kepler, or Pascal hardware supported by NVIDIA
A host distribution supported by NVIDIA. For a list of supported hosts, please see the TotalView for HPC Supported Platforms guide.