Pull requests: ggml-org/llama.cpp
llama-batch: enable parallel sequences for partial rollback
model
Model specific
#25278
opened Jul 3, 2026 by
am17an
Contributor
common : resolve non-positive --threads to the number of math cores
documentation
Improvements or additions to documentation
server
testing
Everything test related
#25277
opened Jul 3, 2026 by
samagameditation-byte
Draft
1 task
ggml-backend-meta: abort if we see a multi buffer
ggml
changes relating to the ggml tensor library for machine learning
#25276
opened Jul 3, 2026 by
netrunnereve
Collaborator
CANN: Refactor #ifdef blocks to avoid unreachable code after return
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#25273
opened Jul 3, 2026 by
rauletorresc
Contributor
ggml, server: add ggml_backend_dev_reset() for sleep mode
AMD ZenDNN
Issues related to the AMD ZenDNN backend
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
Ascend NPU
issues specific to Ascend NPUs
CUDA
Related to the CUDA backend
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
IBM zDNN
issues specific to IBM zDNN Accelerator
OpenCL
Issues specific to the OpenCL backend
OpenVINO
server
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
Vulkan
Issues specific to the Vulkan backend
WebGPU
#25271
opened Jul 3, 2026 by
ngxson
Collaborator
[SYCL] support OP OPT_STEP_ADAMW, OPT_STEP_SGD
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25268
opened Jul 3, 2026 by
arthw
Contributor
[SYCL] support op get_rows_back, only support fp32/fp16
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25266
opened Jul 3, 2026 by
arthw
Contributor
[SYCL] support op col2im_1d
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25264
opened Jul 3, 2026 by
arthw
Contributor
Kmoren/add penalties cu backend
testing
Everything test related
#25262
opened Jul 3, 2026 by
kmorennv
feat: add --threads-all option to llama-bench
examples
#25261
opened Jul 3, 2026 by
xiaobai0529
ggml : fix broken CPU concat implementation for quantized types
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
testing
Everything test related
#25247
opened Jul 2, 2026 by
fairydreaming
Collaborator
vulkan: fix 32-bit integer overflow in CEIL_DIV
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#25245
opened Jul 2, 2026 by
hokanosekai
vulkan: for small AMD GPUs, reduce submission threshold based on CU count
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#25240
opened Jul 2, 2026 by
0cc4m
Contributor
chat: trim messages sent to StepFun parser (fixes long reasoning loops)
testing
Everything test related
#25238
opened Jul 2, 2026 by
pwilkin
Member
common: Set optimal default thread count for ppc (Linux as well as AIX)
#25237
opened Jul 2, 2026 by
shalinib-ibm
Contributor
[SYCL] support OP cross_entropy_loss, cross_entropy_loss_back
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25236
opened Jul 2, 2026 by
arthw
Contributor
common,server : fix custom preset dedup against cached models
server
#25235
opened Jul 2, 2026 by
angt
Member
[UT] enhance UT to show all real unsupported backends
testing
Everything test related
#25234
opened Jul 2, 2026 by
arthw
Contributor
llama : clear error when MTP draft shares KV cache across backends
#25232
opened Jul 2, 2026 by
liminfei-amd
Contributor
1 task done
[SYCL] fix unsupported UT cases of CONT & CPY
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25231
opened Jul 2, 2026 by
arthw
Contributor
Ensure unique node names and add org_src to track the org tensor for OpenVINO backend
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#25230
opened Jul 2, 2026 by
zhaixuejun1993
Contributor
vulkan: when using transfer queue for async copies, sync on event_wait to avoid race
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#25229
opened Jul 2, 2026 by
0cc4m
Contributor
CUDA: Support CUDA Virtual Devices
CUDA
Related to the CUDA backend
ggml
changes relating to the ggml tensor library for machine learning
#25228
opened Jul 2, 2026 by
anavp-nvidia
Contributor
server : don't list cached models when a preset is used
server
#25226
opened Jul 2, 2026 by
angt
Member