Properly handle cuda arch for unsupported function #1853
For example, bfloat16 is natively supported only from compute capability 8.0 onwards; on older architectures we need to either throw an exception or avoid a failing configuration. This is mostly handled by a CMake option when the user compiles for only one CUDA arch.
CUDA, however, also allows compiling a library for several architectures at once. In that case `__CUDA_ARCH__` does not help: it is only defined in device code, not host code, so guarding with this macro on the host side actually has no effect. There is a host-side macro (`__CUDA_ARCH_LIST__`), but it only expands to the entire list of architectures the binary was built for.
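A small illustration of that behaviour (not part of this PR, just the pattern being described):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void report_arch() {
  // Each device compilation pass sees its own __CUDA_ARCH__ value.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 800
  printf("this cubin was compiled for CC >= 8.0\n");
#else
  printf("this cubin was compiled for CC < 8.0\n");
#endif
}

int main() {
  // The host pass never defines __CUDA_ARCH__, so a preprocessor guard here
  // cannot decide whether the GPU we will run on supports bfloat16.
#if defined(__CUDA_ARCH__)
  printf("never compiled into the host binary\n");
#endif
  report_arch<<<1, 1>>>();
  cudaDeviceSynchronize();
  return 0;
}
```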
In this case we can only rely on runtime dispatch on the compute capability and throw an exception when the required features are not available.
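A rough sketch of that host-side check, assuming a helper such as `require_bf16_support` (the name and error message are illustrative, not the PR's actual code):

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <string>

// Query the compute capability at runtime and refuse to launch bf16 kernels
// on devices older than CC 8.0.
inline void require_bf16_support(int device) {
  cudaDeviceProp prop{};
  cudaGetDeviceProperties(&prop, device);
  if (prop.major < 8) {
    throw std::runtime_error(
        "bfloat16 atomics require compute capability >= 8.0, but device " +
        std::to_string(device) + " is " + std::to_string(prop.major) + "." +
        std::to_string(prop.minor));
  }
}
```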
To achieve that, we also need to provide a version of the bfloat16 atomic add that exists just so compilation succeeds on the older architectures.
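For example, a guarded wrapper along these lines (the wrapper name is hypothetical; the stub body only has to satisfy the compiler, since the runtime check above means it is never executed on CC < 8.0):

```cuda
#include <cuda_bf16.h>

// Hypothetical wrapper around the bf16 atomicAdd overload.
__device__ inline __nv_bfloat16 atomic_add_bf16(__nv_bfloat16* addr,
                                                __nv_bfloat16 val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 800
  // The native bf16 atomicAdd overload exists only on CC >= 8.0.
  return atomicAdd(addr, val);
#else
  // Dummy body so the kernel still compiles for older architectures.
  // Unreachable in practice: the host-side check throws before any launch.
  __trap();
  return __float2bfloat16(0.f);
#endif
}
```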
Side note: there is still an issue where compiling a bfloat16 kernel inside a templated lambda with CUDA 12.2 or later, for an architecture without native bfloat16 support, leads to an "unknown device kernel" error at runtime, whereas duplicating the kernel with a full specialization works. This requires further investigation.