Abstract:
Conjugate Gradient (CG) and Biconjugate Gradient Stabilized (BiCGSTAB) are two classical and efficient iterative methods for solving sparse linear systems, widely used in scientific computing and engineering applications. Although GPUs and other parallel processors have enhanced the parallelism of these methods, the computing power of the latest computing units, Tensor Cores, has not yet been fully exploited for them. This work proposes Tensor Core-accelerated CG and BiCGSTAB solvers that leverage Tensor Cores for the key components of both methods, such as sparse matrix-vector multiplication (SpMV) and dot product computation, thereby exploiting the computational capability of Tensor Cores to improve overall performance. Experimental results on NVIDIA A100 and H100 GPUs demonstrate that both approaches proposed in this work achieve significant speedups over baseline versions built on the official CUDA libraries across a variety of sparse matrices.