vulkan: fix 32-bit integer overflow in CEIL_DIV #25245

Open
hokanosekai wants to merge 1 commit into ggml-org:master from hokanosekai:fix/vulkan-ceil-div-overflow

Conversation

@hokanosekai

Overview

Fixes #23057.

Mobile Vulkan drivers (Mali, Adreno, CIX) report maxComputeWorkGroupCount = UINT32_MAX. CEIL_DIV's numerator (M + N - 1) then wraps in 32-bit arithmetic and the macro yields 0. As a result, ggml_vk_matmul requests 0 descriptor sets for batched matmuls while still dispatching one, which trips GGML_ASSERT(descriptor_set_idx < descriptor_sets.size()) at model load.
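For readers who want to see the wrap in isolation, here is a minimal sketch. It assumes the classic (M + N - 1) / N form evaluated on uint32_t operands. The function names ceil_div_u32 and ceil_div_ref are hypothetical and not taken from the ggml sources.

```cpp
#include <cstdint>

// Hypothetical stand-in for the macro: the numerator m + n - 1 is computed
// in 32 bits, so it wraps modulo 2^32 when n is close to UINT32_MAX.
constexpr uint32_t ceil_div_u32(uint32_t m, uint32_t n) {
    return (m + n - 1) / n;
}

// Same formula with operands widened to 64 bits, used only as a reference.
constexpr uint64_t ceil_div_ref(uint64_t m, uint64_t n) {
    return (m + n - 1) / n;
}
```

With m = 2 and n = UINT32_MAX, the 32-bit numerator wraps to 0 and the quotient is 0, while the 64-bit reference gives the expected 1. Any m from 2 up to UINT32_MAX hits the same wrap for that divisor.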

This rewrites the macro with division-based math that has no overflowing intermediate, as suggested by @jeffbolznv in the issue.
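As an illustration only, one common overflow-free way to write a ceiling division is quotient plus a remainder check. This is a sketch of the general technique, not necessarily the exact expression used in this PR, and ceil_div_safe is a hypothetical name.

```cpp
#include <cstdint>

// Division-based ceiling division: neither m / n nor m % n can exceed m,
// so no intermediate value overflows. Assumes n != 0.
constexpr uint32_t ceil_div_safe(uint32_t m, uint32_t n) {
    return m / n + (m % n != 0 ? 1u : 0u);
}
```

For m = 2 and n = UINT32_MAX this returns 1. Writing it as a function rather than a macro also avoids evaluating the arguments more than once, though that is a design choice of this sketch and not something the PR states.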

Tested on a Mali-G68 (MediaTek Dimensity 900): Gemma 3n E2B Q4_K_M with full Vulkan offload went from a 100% reproducible abort at warmup to generating normally, with the same throughput as a 64-bit promotion variant.

Additional information

Full root cause analysis and instrumentation logs are in #23057.

Requirements

@hokanosekai hokanosekai requested a review from a team as a code owner July 2, 2026 17:23
@github-actions github-actions Bot added the labels Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) Jul 2, 2026
@0cc4m

0cc4m commented Jul 3, 2026

Contributor

Thank you!

Labels

ggml: changes relating to the ggml tensor library for machine learning
Vulkan: Issues specific to the Vulkan backend

Development

Successfully merging this pull request may close these issues.

Vulkan: GGML_ASSERT(descriptor_set_idx < descriptor_sets.size()) crash on ARM UMA (Mali-G720-Immortalis, CIX CP8180)

3 participants