
[None][feat] Merge back DeepSeek V4 cache manager updates #15633

Draft
jiaganc wants to merge 27 commits into NVIDIA:main from jiaganc:main-dsv4-cache-manager-mergeback-round2


jiaganc (Collaborator) commented Jun 25, 2026

@coderabbitai summary

Description

Merge back the PR1 DeepSeek V4 cache-manager work onto current main.

This includes DeepSeek V4 sparse cache-manager updates, KVCacheManagerV2
adaptations after its migration from resource_manager.py to
kv_cache_manager_v2.py, disaggregated KV-transfer/cache-reuse updates,
iteration stats reporting, OpenAI serving metrics updates, and rebase
compatibility fixes needed on top of current main.

This PR is opened as a draft: selected changed tests were intentionally
skipped by request, and broader CI/build validation can still run before review.

Test Coverage

Remote unit tests were run on umb-b300-dp-186 inside container
trtllm-main-dsv4-cache-manager-mergeback-round2-jenkins-jiaganc at SHA
e188f8076c.

Passed:

python3 -m pytest -q \
  tests/unittest/llmapi/apps/test_disagg_perf_metrics_collector.py \
  tests/unittest/llmapi/apps/test_openai_server_iteration_stats.py \
  tests/unittest/disaggregated/test_openai_disagg_service.py \
  tests/unittest/disaggregated/test_cache_reuse_adapter.py \
  tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_stats_behavior.py \
  tests/unittest/llmapi/test_llm_args.py

Result: 283 passed, 3 skipped, 4 warnings in 129.98s

python3 -m pytest -q \
  tests/unittest/_torch/attention/sparse/deepseek_v4/test_compressor_module.py \
  tests/unittest/_torch/attention/sparse/deepseek_v4/test_deepseek_v4_cache_manager.py

Result: 149 passed, 4 warnings in 71.21s

python3 -m pytest -q \
  tests/unittest/disaggregated/region/test_page.py \
  tests/unittest/disaggregated/region/test_region.py \
  tests/unittest/disaggregated/test_cache_transceiver_single_process.py \
  tests/unittest/disaggregated/test_deepseek_v4_kv_transfer.py \
  tests/unittest/disaggregated/test_extractor.py \
  tests/unittest/disaggregated/test_kv_transfer.py \
  tests/unittest/disaggregated/test_peer.py \
  tests/unittest/disaggregated/test_pool_matching.py

Result: 392 passed, 66 skipped, 4 warnings in 146.77s

python3 -m pytest -q \
  tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py

Result: 63 passed, 12 skipped, 5 warnings in 338.84s
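
For a quick overall picture, the four result lines above can be tallied with a short script. This is only a reviewer-side sketch: the helper names `parse_summary` and `aggregate` are hypothetical and not part of this PR, and the summary strings are copied verbatim from the runs above.

```python
import re

# Summary lines copied verbatim from the four pytest runs reported above.
SUMMARIES = [
    "283 passed, 3 skipped, 4 warnings in 129.98s",
    "149 passed, 4 warnings in 71.21s",
    "392 passed, 66 skipped, 4 warnings in 146.77s",
    "63 passed, 12 skipped, 5 warnings in 338.84s",
]


def parse_summary(line):
    """Return ({kind: count}, seconds) for one pytest summary line."""
    counts = {
        kind: int(n)
        for n, kind in re.findall(r"(\d+) (passed|skipped|warnings)", line)
    }
    seconds = float(re.search(r"in ([\d.]+)s", line).group(1))
    return counts, seconds


def aggregate(lines):
    """Sum the counts and wall-clock time across several summary lines."""
    totals = {"passed": 0, "skipped": 0, "warnings": 0}
    elapsed = 0.0
    for line in lines:
        counts, seconds = parse_summary(line)
        for kind, n in counts.items():
            totals[kind] += n
        elapsed += seconds
    return totals, round(elapsed, 2)


totals, elapsed = aggregate(SUMMARIES)
print(totals, elapsed)
```

The totals cover only the runs listed in this description; they say nothing about the tests listed as skipped by request.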

Skipped by request:

  • executor tests
  • tests/unittest/_torch/modeling/test_modeling_deepseekv4.py
  • tests/unittest/_torch/attention/sparse/deepseek_v4/test_deepseek_v4_indices_transform.py
  • tests/unittest/_torch/attention/sparse/deepseek_v4/test_deepseek_v4_sparse_mla.py
  • tests/unittest/_torch/speculative/test_eagle3.py
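
One way a reviewer could reproduce a run that leaves out specific files is pytest's `--ignore=<path>` option. The sketch below only builds and prints such a command for the `deepseek_v4` sparse-attention test directory. It is an illustration, not necessarily how the runs above were invoked; `build_pytest_command` is a hypothetical helper, and the directory may contain files other than those named in this description.

```python
import shlex

# Directory holding both the executed and the skipped sparse-attention tests.
SUITE_DIR = "tests/unittest/_torch/attention/sparse/deepseek_v4"

# Files from the "Skipped by request" list that live under SUITE_DIR.
SKIPPED = [
    f"{SUITE_DIR}/test_deepseek_v4_indices_transform.py",
    f"{SUITE_DIR}/test_deepseek_v4_sparse_mla.py",
]


def build_pytest_command(target, ignored):
    """Build a pytest argv that collects `target` but ignores the given paths."""
    cmd = ["python3", "-m", "pytest", "-q", target]
    cmd += [f"--ignore={path}" for path in ignored]
    return cmd


cmd = build_pytest_command(SUITE_DIR, SKIPPED)
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch safe to try outside the test container.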

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

jiaganc added 27 commits June 25, 2026 08:22
Source-Commit: a7728aa
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…ndexMapper release (NVIDIA#14423)

Source-Commit: 02ac906
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: 7a4b113
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
NVIDIA#14308)

Source-Commit: 2de7acb
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…ength (NVIDIA#14627)

Source-Commit: eeb09d1
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: 69e7acc
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…euse' + evictable gauges (NVIDIA#14544)

Source-Commit: 772f750
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…s for safe order (NVIDIA#14674)

Source-Commit: 808b24f
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…_disagg_gen_init (NVIDIA#14695)

Source-Commit: be4aa75
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…4734)

Source-Commit: c5399fc
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…uests (+ probe hang-detector fix) (NVIDIA#14912)

Source-Commit: 49f1ce6ce5d6847af7217e3cf5ba3ff9f5e2fdb2
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: 95cd755
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: 337b4a5
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: 4f9d819
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: c185066
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: 0e22041
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…NVIDIA#15134)

Source-Commit: ec3da87a9b688016f8f76cc43be9cf30c92c868e
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
…or (NVIDIA#15211)

Source-Commit: c4cd713
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Source-Commit: d15f21c
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Reason: rebase adaptation for PR1 final validation build failure.
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>