Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[https://nvbugs/6074943][fix] Disable new aiperf server metrics to stop hang.
#13311
opened Apr 22, 2026 by
dominicshanshan
Collaborator
1 task done
[https://nvbugs/6084447][fix] Fix MoE DeepGEMM workspace size with attention_dp
#13310
opened Apr 22, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6094066][fix] Handle OOM gracefully during warmup and KV cache estimation
#13309
opened Apr 22, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[TRI-978][fix] Fix streaming=None crash and CI test failures in L0_backend_trtllm
#13308
opened Apr 22, 2026 by
mc-nv
[https://nvbugs/6093713][fix] GLM-4.7-Flash bf16 model requires ~55.79 GB for weights, exceeding L40S GPU capa
#13305
opened Apr 22, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][test] Waive 2 failed cases for main in QA CI
#13304
opened Apr 22, 2026 by
xinhe-nv
Collaborator
[https://nvbugs/6094071][fix] Fix PP warmup shutdown deadlock and restore torch_compile guard
#13303
opened Apr 22, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][feat] WideEP FT: add EPGroupHealth thread-safe rank mask
#13302
opened Apr 22, 2026 by
chienchunhung
Collaborator
Draft
1 task
[https://nvbugs/6093820][fix] ** Three independent issues: (1) rsync binary missing from CI Docker container
#13300
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6094070][fix] Integration test conftest.py was missing ray-skip logic that the unittest confte
#13299
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[AutoDeploy][feat]: Basic LoRA
#13297
opened Apr 21, 2026 by
govind-ramnarayan
Collaborator
Draft
1 task
[https://nvbugs/6094107][fix] Exclude PP send/recv from piecewise CUDA graph capture
#13296
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6094072][fix] When load_format="dummy" is used, load_quant_scales (which transforms MXFP4
#13295
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[#7091][fix] remove INT8 from trtllm-bench --quantization choices
Community want to contribute
#13292
opened Apr 21, 2026 by
Brumbelow
1 task done
[TRTLLMINF-45][infra] Upload CI agent failure analysis to SwiftStack
#13291
opened Apr 21, 2026 by
dpitman-nvda
Collaborator
1 task done
[None][feat] Add BaseLlmArgs.force_deterministic field
#13290
opened Apr 21, 2026 by
achartier
Collaborator
1 task done
[https://nvbugs/6094208][fix] Nemotron-3-Nano-30B-A3B bf16 model weights require ~58.82 GB, exceeding L40S GPU
#13288
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6094112][fix] resolve_moe_backend() resolves AUTO to CUTLASS for non-GptOss MoE models o
#13287
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][chore] Unwaive disagg tests for transformers upgrade
#13286
opened Apr 21, 2026 by
brb-nv
Collaborator
1 task done
[TRTLLM-11767][feat] LTX2 pipeline refactor part1
#13285
opened Apr 21, 2026 by
yibinl-nvidia
Collaborator
1 task
[https://nvbugs/6050481][chore] Unwaive passing GPT-OSS ep tests
#13284
opened Apr 21, 2026 by
dongfengy
Collaborator
1 task done
[https://nvbugs/6098442][fix] Update fmha attention cubins and libs
#13282
opened Apr 21, 2026 by
heyuhhh
Collaborator
1 task done
[https://nvbugs/5973199][fix] Add NCCL fallback for AutoDeploy MoE alltoall when MNNVL is unavailable
#13281
opened Apr 21, 2026 by
tensorrt-cicd
Collaborator
2 tasks done