From 1f22a2416d2423772d4aecf4bbab8461e623083d Mon Sep 17 00:00:00 2001
From: diskdog
Date: Fri, 5 Jun 2026 09:19:53 +0100
Subject: [PATCH] alge: fix SIGSEGV in cs_sles_solve_ccc_fv with extended matrix columns

When n_cols_ext > n_cells_with_ghosts, cs_sles_solve_ccc_fv allocates
extended _vx and _rhs buffers using cs_alloc_mode_device, which resolves
to CS_ALLOC_DEVICE (device-only) in a standard CUDA build. Those pointers
are then passed into cs_sles_solve, which reads the residual on the host
during convergence checking. The result is a SIGSEGV.

The fix is to use CS_ALLOC_HOST_DEVICE_SHARED for these two buffers.
They are not pure-device scratch; the solver needs host-readable
convergence data from them. The GPU dispatch is unaffected: ctx still
runs on the GPU via set_use_gpu(true), and the unified-memory backing is
fast enough on all tested sm_7x+ devices.

The existing workaround (CS_CUDA_ALLOC_DEVICE_UVM=1) happens to fix this
by globally remapping cs_alloc_mode_device, but the global remap affects
unrelated allocations and masks the root cause here.

Tested on sm_75, CUDA 13.1, channel-flow case with CS_MATRIX_NATIVE.
---
 src/alge/cs_sles_default.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/alge/cs_sles_default.cpp b/src/alge/cs_sles_default.cpp
index 50a5342cbe..fcbcfb69de 100644
--- a/src/alge/cs_sles_default.cpp
+++ b/src/alge/cs_sles_default.cpp
@@ -1022,7 +1022,7 @@ cs_sles_solve_ccc_fv(cs_sles_t *sc,
   cs_dispatch_context ctx;
   if (amode >= CS_ALLOC_HOST_DEVICE_SHARED) {
-    amode = cs_alloc_mode_device;
+    amode = CS_ALLOC_HOST_DEVICE_SHARED;
     ctx.set_use_gpu(true);
   }