From 1f22a2416d2423772d4aecf4bbab8461e623083d Mon Sep 17 00:00:00 2001
From: diskdog <diskdog@protonmail.com>
Date: Fri, 5 Jun 2026 09:19:53 +0100
Subject: [PATCH] alge: fix SIGSEGV in cs_sles_solve_ccc_fv with extended
 matrix columns

When n_cols_ext > n_cells_with_ghosts, cs_sles_solve_ccc_fv allocates
extended _vx and _rhs buffers using cs_alloc_mode_device, which resolves
to CS_ALLOC_DEVICE (device-only) in a standard CUDA build. Those pointers
are then passed into cs_sles_solve, which reads the residual on the host
during convergence checking. That host-side read dereferences
device-only memory, hence the SIGSEGV.

The fix is to allocate these two buffers with
CS_ALLOC_HOST_DEVICE_SHARED. They are not pure-device scratch: the
solver needs host-readable convergence data from them. GPU dispatch is
unaffected: ctx still runs on the GPU via set_use_gpu(true), and the
unified-memory backing is fast enough on all tested sm_7x+ devices.

The existing workaround (CS_CUDA_ALLOC_DEVICE_UVM=1) avoids the crash
only as a side effect: it remaps cs_alloc_mode_device globally, which
also changes unrelated allocations and masks the root cause here.

Tested on sm_75, CUDA 13.1, channel-flow case with CS_MATRIX_NATIVE.
---
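Note for reviewers (not part of the commit message): the mode selection
described above can be sketched in isolation. This is a minimal model,
not the code_saturne implementation. The enum, its ordering, and the
names host_readable, choose_before and choose_after are hypothetical
stand-ins introduced only for illustration; device_mode plays the role
of the global cs_alloc_mode_device.

```cpp
#include <cassert>

// Hypothetical stand-in for the allocation modes. The ordering is an
// assumption chosen so that ">= shared" matches the test in the hunk.
enum alloc_mode_t {
  ALLOC_HOST,
  ALLOC_HOST_DEVICE,
  ALLOC_HOST_DEVICE_PINNED,
  ALLOC_HOST_DEVICE_SHARED,
  ALLOC_DEVICE
};

// True if memory allocated with this mode may be dereferenced on the host.
inline bool host_readable(alloc_mode_t m)
{
  return m != ALLOC_DEVICE;
}

// Mode used for the extended buffers, plus whether the GPU is used.
struct buffer_choice {
  alloc_mode_t mode;
  bool         use_gpu;
};

// Before the patch: the buffers follow the global device mode, which
// may be device-only.
inline buffer_choice choose_before(alloc_mode_t amode,
                                   alloc_mode_t device_mode)
{
  if (amode >= ALLOC_HOST_DEVICE_SHARED)
    return {device_mode, true};
  return {amode, false};
}

// After the patch: the buffers are always shared, so the host can read
// them, while the GPU is still selected for dispatch.
inline buffer_choice choose_after(alloc_mode_t amode)
{
  if (amode >= ALLOC_HOST_DEVICE_SHARED)
    return {ALLOC_HOST_DEVICE_SHARED, true};
  return {amode, false};
}
```

In this model, choose_before(ALLOC_HOST_DEVICE_SHARED, ALLOC_DEVICE)
yields a mode that is not host-readable, which is the crashing case.
Passing ALLOC_HOST_DEVICE_SHARED as device_mode mimics the global
workaround, and choose_after gives the same result without it.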
 src/alge/cs_sles_default.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/alge/cs_sles_default.cpp b/src/alge/cs_sles_default.cpp
index 50a5342cbe..fcbcfb69de 100644
--- a/src/alge/cs_sles_default.cpp
+++ b/src/alge/cs_sles_default.cpp
@@ -1022,7 +1022,7 @@ cs_sles_solve_ccc_fv(cs_sles_t           *sc,
 
     cs_dispatch_context ctx;
     if (amode >= CS_ALLOC_HOST_DEVICE_SHARED) {
-      amode = cs_alloc_mode_device;
+      amode = CS_ALLOC_HOST_DEVICE_SHARED;
       ctx.set_use_gpu(true);
     }