Implementation

The IMSL C Numerical Library uses many Basic Linear Algebra Subprograms (BLAS) throughout the product. These functions are named using IMSL conventions and are used internally; they are not directly accessible to the user.

NVIDIA Corp. implemented certain Level 1, 2 and 3 BLAS in the NVIDIA CUDA Toolkit. The NVIDIA external names and argument protocols differ from those used by the IMSL C Numerical Library. Wrappers have been written to allow the IMSL C Numerical Library to access selected routines in the NVIDIA CUDA Toolkit.

Table 12.9 documents an enumeration of the BLAS for which a CUDA Toolkit implementation is provided in the IMSL C Numerical Library. Each enumeration name is the name of the BLAS function prefixed with ‘IMSL_CUDA_’.
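The naming convention can be illustrated with a short sketch. The helper cuda_enum_name below is hypothetical and is not part of the IMSL C Numerical Library; it only shows how an enumeration name is formed from a BLAS function name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an IMSL function.
 * Writes "IMSL_CUDA_" followed by the upper-cased BLAS name into out.
 * Returns 0 on success, or -1 if out is too small to hold the result. */
int cuda_enum_name(const char *blas_name, char *out, size_t out_size)
{
    static const char prefix[] = "IMSL_CUDA_";
    const size_t plen = sizeof prefix - 1;   /* length without the '\0' */
    size_t nlen;
    size_t i;

    assert(blas_name != NULL && out != NULL);

    nlen = strlen(blas_name);
    if (plen + nlen + 1 > out_size)
        return -1;

    memcpy(out, prefix, plen);
    for (i = 0; i < nlen; i++)
        out[plen + i] = (char)toupper((unsigned char)blas_name[i]);
    out[plen + nlen] = '\0';
    return 0;
}
```

For example, passing "dgemm" with a sufficiently large buffer yields the string IMSL_CUDA_DGEMM, which matches an entry in Table 12.9.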

NVIDIA CUDA Toolkit implementations of complex two-dimensional FFT (Fast Fourier Transform) functions can be accessed when using functions imsl_c_fft_2d_complex and imsl_z_fft_2d_complex. The enumerations that enable the user to manipulate the parameters used by these functions are documented in Table 12.9.

Three utility functions are provided in the IMSL C Math Library to help manage the use of the NVIDIA CUDA Toolkit. These utilities appear in Table 12.10 and are described in more detail in their corresponding function descriptions.

Note: Some NVIDIA hardware does not provide double precision arithmetic. Since the double precision functions are included in the NVIDIA CUDA Toolkit library, those functions will appear to execute correctly even though they do not return correct results. When the IMSL software detects that the correct results are not returned, a warning error message will be printed and the IMSL equivalent of the function which does not use the GPU will be used. The user can eliminate this error by using function imsl_cuda_set to set the threshold value to zero.
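The fallback behavior described in the note can be sketched as follows. This is a minimal model under stated assumptions, not the IMSL implementation: cuda_threshold, mock_gpu_dot, gpu_double_ok and dispatch_dot are hypothetical names, the GPU is simulated by a routine that accumulates in single precision, the precision probe is one possible detection method, and a threshold of zero is assumed to mean that the GPU path is never attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical state. In IMSL the threshold is managed with imsl_cuda_set. */
static int cuda_threshold = 64;   /* assumed: 0 means never try the GPU path */
static int warnings_issued = 0;   /* counts fallback warnings, for illustration */

/* Host (non-GPU) reference: dot product in double precision. */
static double cpu_dot(size_t n, const double *x, const double *y)
{
    double s = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Simulated device without double precision: accumulates in float.
 * volatile forces each partial sum to be rounded to float. */
static double mock_gpu_dot(size_t n, const double *x, const double *y)
{
    volatile float s = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        s = s + (float)x[i] * (float)y[i];
    return (double)s;
}

/* Probe: 1 + 2^-30 is exact in double but rounds to 1 in float,
 * so a device lacking double precision fails this comparison. */
static int gpu_double_ok(void)
{
    const double x[2] = { 1.0, 1.0 / 1073741824.0 };
    const double y[2] = { 1.0, 1.0 };
    return mock_gpu_dot(2, x, y) == cpu_dot(2, x, y);
}

/* Uses the simulated GPU only when the problem size reaches the threshold
 * and the probe passes; otherwise warns (if the GPU was tried) and falls
 * back to the host routine. */
double dispatch_dot(size_t n, const double *x, const double *y)
{
    assert(x != NULL && y != NULL);

    if (cuda_threshold > 0 && n >= (size_t)cuda_threshold) {
        if (gpu_double_ok())
            return mock_gpu_dot(n, x, y);
        fprintf(stderr, "warning: GPU double precision results are "
                        "incorrect; using the non-GPU routine\n");
        warnings_issued++;
    }
    return cpu_dot(n, x, y);
}
```

In this model, a call with n at or above the threshold prints the warning and still returns the host result. Setting cuda_threshold to zero skips the GPU attempt entirely, so no warning is issued, which mirrors the remedy described in the note.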

Table 12.9 — Enumerations of NVIDIA Toolkit-Enabled Functions
IMSL_CUDA_SGEMV	IMSL_CUDA_DGER	IMSL_CUDA_STRSM
IMSL_CUDA_SGER	IMSL_CUDA_DSYR	IMSL_CUDA_DTRSM
IMSL_CUDA_SSYR	IMSL_CUDA_DGEMM	IMSL_CUDA_C_FFT_2D_COMPLEX
IMSL_CUDA_SGEMM	IMSL_CUDA_SGBMV	IMSL_CUDA_Z_FFT_2D_COMPLEX
IMSL_CUDA_DGEMV	IMSL_CUDA_DGBMV

Table 12.10 — NVIDIA CUDA Toolkit Utilities
imsl_cuda_get
imsl_cuda_set
imsl_cuda_free

Portions of the NVIDIA SGEMM and DGEMM library routines were written by Vasily Volkov and are subject to the Modified Berkeley Software Distribution License as follows:

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. (See CUDA Toolkit 4.0, CUBLAS Library, April, 2011, for these remaining conditions.)