[None][test] Add Kimi-K2.5 disaggregated GSM8K accuracy test #15617
Shixiaowei02 wants to merge 1 commit into
Add TestKimiK25.test_nvfp4 to test_disaggregated_serving.py: a disaggregated (ctx-TP4 + gen-TP4) GSM8K accuracy test for Kimi-K2.5 (NVFP4), modeled on the sibling TestKimiK2 disagg test.

GSM8K runs the text-only DeepSeek-V3 MLA path; trust_remote_code is set on both servers since the checkpoint ships custom HF modeling code via auto_map.

The disagg harness resolves kv_cache_quant_algo=None, so add a bare (NVFP4) GSM8K reference entry for moonshotai/Kimi-K2.5 alongside the existing (NVFP4, FP8-KV) one (same accuracy), mirroring Kimi-K2-Thinking.

Register the test in qa/llm_function_core.txt.

Signed-off-by: Xiaowei Shi <39303645+Shixiaowei02@users.noreply.github.com>
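The reason for the extra reference entry can be illustrated with a minimal sketch. It assumes the accuracy harness selects a reference by exact match on both quant_algo and kv_cache_quant_algo; the real lookup logic may differ. The function name find_reference, the dictionary layout, and the placeholder accuracy value are all hypothetical and are not taken from gsm8k.yaml.

```python
from typing import Optional

# Placeholder only: the real accuracy lives in gsm8k.yaml and is not reproduced here.
PLACEHOLDER_ACCURACY = 0.0

# Hypothetical in-memory view of the reference entries for one model.
REFERENCES = {
    "moonshotai/Kimi-K2.5": [
        # Bare (NVFP4) entry: no kv_cache_quant_algo key at all.
        {"quant_algo": "NVFP4", "accuracy": PLACEHOLDER_ACCURACY},
        # Existing (NVFP4, FP8-KV) entry with the same accuracy.
        {
            "quant_algo": "NVFP4",
            "kv_cache_quant_algo": "FP8",
            "accuracy": PLACEHOLDER_ACCURACY,
        },
    ]
}


def find_reference(
    model: str,
    quant_algo: Optional[str],
    kv_cache_quant_algo: Optional[str] = None,
    references: dict = REFERENCES,
) -> Optional[dict]:
    """Return the first entry whose two quantization keys both match exactly.

    Assumption: a missing key in an entry is treated as None.
    """
    for entry in references.get(model, []):
        if (
            entry.get("quant_algo") == quant_algo
            and entry.get("kv_cache_quant_algo") == kv_cache_quant_algo
        ):
            return entry
    return None


# With kv_cache_quant_algo resolved to None, only the bare entry matches.
bare = find_reference("moonshotai/Kimi-K2.5", "NVFP4", None)

# Without the bare entry, the same query would find nothing.
only_fp8_kv = {"moonshotai/Kimi-K2.5": REFERENCES["moonshotai/Kimi-K2.5"][1:]}
missing = find_reference("moonshotai/Kimi-K2.5", "NVFP4", None, only_fp8_kv)
```

Under this assumed matching rule, a query with kv_cache_quant_algo=None never reaches the (NVFP4, FP8-KV) entry, which is why a second entry with the same accuracy is needed.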
/bot run --add-multi-gpu-test --disable-fail-fast
No actionable comments were generated in the recent review.
PR_Github #55727 [ run ] triggered by Bot. Commit:
PR_Github #55727 [ run ] completed with state
Description
This pull request adds a new integration test for the Kimi-K2.5 model with NVFP4 quantization, updates the GSM8K accuracy reference data, and ensures the new test is included in the test suite. The main changes are grouped below:
New integration test for Kimi-K2.5:
Added a TestKimiK25 class in test_disaggregated_serving.py to test GSM8K accuracy for the moonshotai/Kimi-K2.5 model with NVFP4 quantization in a disaggregated server setup. This includes configuration for context and generation servers, and uses the launch_disaggregated_llm utility.
Test suite updates:
Registered TestKimiK25::test_nvfp4 in the llm_function_core.txt test list to ensure it runs as part of the integration test suite.
Accuracy reference data update:
Updated gsm8k.yaml to add an entry for moonshotai/Kimi-K2.5 with NVFP4 quantization and the corresponding accuracy, aligning the reference data with the new test.
Test Coverage
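As a rough illustration of the setup the new test covers, the sketch below builds one configuration for the context server and one for the generation server, each with tensor parallelism 4 and trust_remote_code enabled. It is a plain-Python sketch, not the actual test code: the function build_disagg_configs, the "role" field, and the exact key names are hypothetical, and it assumes one instance of each server. The real test passes its configuration to launch_disaggregated_llm, whose signature is not shown in this PR page and is therefore not called here.

```python
MODEL_NAME = "moonshotai/Kimi-K2.5"


def build_disagg_configs(tp_size: int = 4):
    """Build hypothetical context and generation server settings.

    Both servers get trust_remote_code=True because the checkpoint ships
    custom HF modeling code via auto_map.
    """
    shared = {
        "model": MODEL_NAME,
        "tensor_parallel_size": tp_size,
        "trust_remote_code": True,
    }
    # Copy the shared settings so each server can be adjusted independently.
    ctx_server_config = dict(shared, role="context")
    gen_server_config = dict(shared, role="generation")
    return ctx_server_config, gen_server_config


def total_gpus(*server_configs: dict) -> int:
    """Sum the tensor-parallel sizes, assuming one instance per server."""
    return sum(cfg["tensor_parallel_size"] for cfg in server_configs)


ctx_cfg, gen_cfg = build_disagg_configs()
gpus_needed = total_gpus(ctx_cfg, gen_cfg)
```

Under the one-instance-per-server assumption, ctx-TP4 plus gen-TP4 needs 8 GPUs in total, which is consistent with the CI run being triggered with --add-multi-gpu-test.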
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment /bot help.
Summary by CodeRabbit
New Features
Bug Fixes