
[SYCL] support op col2im_1d#25264

Open
arthw wants to merge 2 commits into ggml-org:master from arthw:support_col2im_1d


arthw commented Jul 3, 2026

Contributor

Support op col2im_1d.
All related unit-test (UT) cases pass.

arthw requested a review from a team as a code owner July 3, 2026 09:54
github-actions bot added the documentation, ggml, and SYCL labels Jul 3, 2026
ServeurpersoCom commented Jul 3, 2026

Contributor

Hello. Thanks for the port, the kernel math matches the CPU/CUDA/Metal/Vulkan implementations (gather formulation, tight bounds, f32 accumulator, same layouts).
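For readers unfamiliar with the terms, here is a minimal CPU sketch of a gather-style col2im_1d with tight loop bounds and an f32 accumulator. It is not the kernel from this PR: the function name, the layouts (`col[(c*K + k)*L_in + l]`, `dst[c*L_out + t]`), the parameter names, and the lack of dilation are all assumptions for illustration, and ggml's real layout may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a ggml function.
// Assumed layouts: col[(c*K + k)*L_in + l] and dst[c*L_out + t].
// Each output element gathers its own contributions, so no atomics
// are needed when every output is handled by one thread.
std::vector<float> col2im_1d_gather(const std::vector<float>& col,
                                    int C, int K, int L_in,
                                    int stride, int pad) {
    const int L_out = (L_in - 1) * stride + K - 2 * pad;
    std::vector<float> dst(static_cast<std::size_t>(C) * L_out, 0.0f);
    for (int c = 0; c < C; ++c) {
        for (int t = 0; t < L_out; ++t) {
            const int tp = t + pad;  // position in the padded signal
            // Tight bounds: only columns l with 0 <= tp - l*stride < K contribute.
            const int l_hi = std::min(tp / stride, L_in - 1);
            const int l_lo = tp >= K ? (tp - K) / stride + 1 : 0;
            float acc = 0.0f;  // f32 accumulator
            for (int l = l_lo; l <= l_hi; ++l) {
                const int k = tp - l * stride;  // kernel tap, in [0, K)
                acc += col[(static_cast<std::size_t>(c) * K + k) * L_in + l];
            }
            dst[static_cast<std::size_t>(c) * L_out + t] = acc;
        }
    }
    return dst;
}
```

A scatter formulation would instead loop over every (k, l) pair and add into `dst`, which requires atomics or serialization on a GPU; the gather form trades that for a short inner loop per output.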

Missing PR template? The description fields were removed, including the mandatory AI usage disclosure from the contributing guidelines.

I'm reviewing this from my experience implementing this op on the other backends, to save other reviewers from repeating the same points. I don't have the hardware for this backend (yet) so I can't validate it locally.

  • supports_op should also check ggml_is_contiguous(op): the kernel writes dst assuming it's contiguous. The CPU backend asserts it, CUDA and Vulkan check it in supports_op.

  • supports_op advertises BF16 but the kernel only handles it under #ifdef GGML_SYCL_HAS_BF16, so a build without the macro would abort at runtime. Gating the BF16 arm with the same ifdef keeps supports_op honest.

  • 32-bit indices with fast_div_modulo (already in common.hpp, same helpers the CUDA version uses) would avoid the per-thread 64-bit div/mod.

  • Not sure: docs/ops/SYCL.csv has two stray "zjy 2" lines. There is a printf("zjy 2\n") in ggml-sycl.cpp that came with "[SYCL] support MUL_MAT and OUT_PROD with Q1_0" (#24721) and looks like leftover debug output; its stdout probably ends up in the CSV during generation.
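To illustrate the 32-bit index point, here is a self-contained sketch of division by a precomputed invariant divisor (multiply-high plus shift). This is a generic version of the technique, not the helpers in common.hpp: the names `fastdiv_t`, `make_fastdiv`, `fast_div`, and `fast_div_mod` are hypothetical, and the real helpers may have different signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; not the helpers from common.hpp.
struct fastdiv_t {
    uint32_t mp;  // magic multiplier
    uint32_t L;   // shift amount, ceil(log2(d))
    uint32_t d;   // original divisor, kept for the modulo
};

// Host side: precompute once per divisor (e.g. per tensor dimension).
inline fastdiv_t make_fastdiv(uint32_t d) {
    assert(d != 0);
    uint32_t L = 0;
    while (L < 32 && (uint64_t{1} << L) < d) {
        ++L;
    }
    const uint32_t mp = static_cast<uint32_t>(
        ((uint64_t{1} << 32) * ((uint64_t{1} << L) - d)) / d + 1);
    return {mp, L, d};
}

// Kernel side: one multiply-high, one add, one shift instead of a divide.
// The sum is done in 64 bits here so it cannot overflow for any 32-bit n.
inline uint32_t fast_div(uint32_t n, const fastdiv_t& f) {
    const uint64_t hi = (uint64_t{n} * f.mp) >> 32;
    return static_cast<uint32_t>((hi + n) >> f.L);
}

// Quotient and remainder together, as needed when unflattening an index.
inline void fast_div_mod(uint32_t n, const fastdiv_t& f,
                         uint32_t& q, uint32_t& r) {
    q = fast_div(n, f);
    r = n - q * f.d;
}
```

In a kernel, the flat work-item index would be split into per-dimension coordinates with `fast_div_mod` using values precomputed on the host, which is only valid while the element count fits in 32 bits.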
