Ck by lizexu123 · Pull Request #7869 · PaddlePaddle/FastDeploy

lizexu123 · 2026-05-20T11:09:20Z

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

💡 如若此PR是Cherry Pick，PR标题需遵循格式，在最开始加上[Cherry-Pick]标签，以及最后面加上原PR ID，例如[Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

…into kc_d

…into kkc

paddle-bot · 2026-05-20T11:09:29Z

Thanks for your contribution!

…into ck

CLAassistant · 2026-05-20T11:09:41Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ lonelygsh
✅ lizexu123
❌ root

root seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-20 19:17:48

📋 Review 摘要

PR 概述：为 benchmark-compare skill 新增 vLLM 作为第三对比框架，支持 fd/sg/vllm 三框架中任意两个的灵活对比，允许单框架结果缺省。
变更范围：.claude/skills/benchmark-compare/scripts/（benchmark 工具脚本）
影响面 Tag：[Benchmark]

问题

级别	文件	概述
🟡 建议	`extract_metrics.py`	`baseline` 不在 active 框架中时，`diff_pct` 静默为空，无任何警告日志
🟡 建议	`generate_report.py:929`	`html.replace("FastDeploy", ...)` 全文替换可能误替换 HTML 注释、JS 变量名等非展示内容
❓ 疑问	`launch_service.sh:310`	vLLM `--data-parallel-size` 参数跨版本兼容性需确认
📝 PR 规范	—	标题 "Ck" 缺少 Tag 且无实质描述；Motivation / Modifications / Usage 节均为空

📝 PR 规范检查

标题 Ck 不符合 [Tag] 标题描述 格式；PR 描述中 Motivation、Modifications、Usage or Command 三节均为空，仅剩模板注释。

标题建议（可直接复制）：

[Benchmark] Add vLLM as third benchmark framework for comparison

PR 描述建议（可直接复制，完整结构）：

## Motivation

benchmark-compare skill 原仅支持 FastDeploy 与 SGLang 两框架对比，无法将 vLLM 纳入对比场景。本 PR 扩展为三框架（fd/sg/vllm）支持，任意框架的结果均可缺省，提升工具灵活性。

## Modifications

- `extract_metrics.py`：
  - 新增模块级常量 `HIGHER_IS_BETTER` / `LOWER_IS_BETTER`，取代原函数内部局部集合；
  - `compute_comparison` 接口改为 `(all_metrics, baseline="sg")`，支持任意框架子集对比；新增 `--baseline` CLI 参数控制 diff_pct 基准框架；
  - `--fd-result` / `--sg-result` 由 `required=True` 改为 `default=None`，新增 `--vllm-result` / `--vllm-config` 参数；至少提供一个结果文件才允许运行。
- `generate_report.py`：
  - `scan_log_dir` 文件名正则扩展，支持 `_vllm.txt` 后缀识别；
  - 新增 `--frameworks`（默认 `fd,sg`）支持三选二对比，新增 `--vllm-attention` / `--vllm-version` 参数；
  - 通过 left/right 框架映射把数据绑定到 HTML 模板 fd/sg 槽位，并在非默认框架组合时替换 HTML 显示文本标签。
- `launch_service.sh`：
  - 新增 `launch_vllm()` 函数，支持 vLLM 服务启动（TP/DP/EP/量化映射）；
  - `--framework` 参数扩展支持 `vllm` 选项。

## Usage or Command

```bash
# 提取 FD 与 vLLM 对比指标（以 vLLM 为 baseline）
python3 extract_metrics.py \
    --fd-result fd_result.txt --vllm-result vllm_result.txt \
    --model-path /path/to/model --baseline vllm --output metrics.json

# 生成 vLLM vs SGLang 对比报告
python3 generate_report.py \
    --log-dir ./logs --frameworks vllm,sg \
    --model-name DeepSeek-R1 --output report.html

# 单卡启动 vLLM 服务
bash launch_service.sh --framework vllm --model /path/to/model \
    --port 8380 --gpus 2 --venv /path/to/vllm_env/.venv
```

## Accuracy Tests

N/A（工具脚本变更，不涉及模型推理精度）

## Checklist

- [ ] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [x] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

总体评价

工具脚本逻辑清晰，vLLM 三框架扩展思路合理，代码结构优化（常量提升、多框架循环）值得肯定。建议关注 baseline 缺省时的静默处理和 HTML 字符串替换边界问题，PR 标题和描述需按规范补全后合入。

PaddlePaddle-bot · 2026-05-20T11:24:17Z

+    left_name = framework_display[left_fw]
+    right_name = framework_display[right_fw]
+    if left_name != "FastDeploy" or right_name != "SGLang":
+        # 用 placeholder 中转避免 FastDeploy→X 后再被 SGLang 替换误伤


🟡 建议 html.replace("FastDeploy", "__LEFT_FW__") 对整个 HTML 字符串做全文替换，可能误替换 HTML 注释、JavaScript 变量名或其他非显示文本中的 "FastDeploy"/"SGLang" 字符串。

虽然当前三个框架名称在 CSS 类名（.fd/.sg）中不含全名，实际误触发风险较低，但建议改用更精确的正则替换或在 generate_html 阶段通过参数传入框架名，避免后处理字符串替换的脆弱性。

建议修复方式：在 generate_html(valid_data, config) 的 config 中传入 left_name/right_name，在模板渲染阶段用 {left_name} 占位符替换，彻底消除后处理替换。

PaddlePaddle-bot · 2026-05-20T11:24:17Z

+
+    # DP (data parallelism)
+    if [[ "$DP" -gt 1 ]]; then
+        CMD+=" --data-parallel-size $DP"


❓ 疑问 vLLM 的 --data-parallel-size 参数是较新版本才支持的特性（部分版本通过 VLLM_DP_SIZE 环境变量或多进程方式实现），如果目标环境 vLLM 版本不支持该参数，启动时会报错。

建议在脚本或文档中注明支持此参数的最低 vLLM 版本要求，或添加版本检测：

# 示例：检测 vLLM 是否支持 --data-parallel-size python -c "import vllm; assert vllm.__version__ >= '0.6.0'" 2>/dev/null || echo '[WARN] vLLM 版本可能不支持 --data-parallel-size'

PaddlePaddle-bot · 2026-05-20T11:41:55Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-21 12:48:43

CI报告基于以下代码生成（30分钟更新一次）:

PR commit: 9655daa
Merge base: f254649 (branch: develop)
查看完整 Diff
CI 详情

1 任务总览

Required 任务当前 9/10 通过，仍有 1 个 required 任务失败：Approval。该失败为人工审批类阻塞项，需完成审批后再继续观察 CI。Optional 任务有 4 个失败，不阻塞合并。

总执行（rerun次数）	总任务	✅ 通过	❌ 失败	⏳ 运行中	⏸️ 等待中	跳过
42(0)	42	37	5	0	0	0

2 任务状态汇总

日志列说明：失败任务直接使用 CI 日志链接；运行中任务使用 Job 链接。

2.1 Required任务 : 9/10 通过

必选任务阻塞合并，失败需优先处理。

状态	任务	耗时	根因	修复建议	日志	重跑
❌	`Approval`	7s	需要 Approval	请通过人工审批	Job	-
✅	其余 9 个必选任务通过	-	-	-	-	-

2.2 可选任务 — 28/32 通过

可选任务不阻塞合并，失败仅供参考。

状态	任务	耗时	日志	重跑
❌	`Run iluvatar Tests / run_iluvatar_cases`	15m8s	Job	-
❌	`Check PR Template`	13s	Job	-
❌	`CI_HPU`	1h47m	Job	-
❌	`Trigger Jenkins for PR`	2m39s	Job	-
✅	其余 28 个可选任务通过	-	-	-

3 失败详情（仅 required）

Approval — 需要 Approval（置信度: 高）

该 Job 需要人工 Approval，完成审批后 CI 才会继续执行。

根因摘要：需要人工审批。
修复建议摘要：请通过人工审批。

codecov-commenter · 2026-05-20T11:52:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@f254649). Learn more about missing BASE report.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7869   +/-   ##
==========================================
  Coverage           ?   63.69%           
==========================================
  Files              ?      462           
  Lines              ?    64435           
  Branches           ?     9881           
==========================================
  Hits               ?    41039           
  Misses             ?    20601           
  Partials           ?     2795

Flag	Coverage Δ
GPU	`72.84% <ø> (?)`
XPU	`7.12% <ø> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

lonelygsh and others added 30 commits April 15, 2026 15:15

support eb5 fp4 cuda_graph

9f6c3c0

update

55d1a05

merge develop

3509714

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

deebd2a

…into kc_d

Support FP4 communication quantization

dd4118d

fix

3fdbc08

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

1226b27

…into kc_d

update

6c3cc4b

fix

e89dff7

support mix_quant and nvfp4

24d07c6

support prefill cuda_graph

19a7019

support fp4 communication quantization

842feba

support

141ac55

fix

4c076ce

fix

b643683

add test

2f4151c

merge develop

22ac5a0

update develop

8443d62

fix

d7f98f0

fix

264dbd8

delete

eae9a55

update

9205ac7

update

3d64926

add document

d2545c1

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

ca03742

…into kkc

fix

7a58d12

revert helper.h to develop

aad041d

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

fb82351

…into kkc

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

e282850

…into kkc

make_scale_interleaved

a96f2a5

lizexu123 added 7 commits May 19, 2026 17:11

fix

e20676d

fix

5284879

Merge remote-tracking branch 'origin/develop' into kkc

3a80481

fix

c22d3ea

fix

7eb643b

fix

9654004

update skills

47bee0a

lizexu123 had a problem deploying to Metax_ci May 20, 2026 11:09 — with GitHub Actions Error

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

9655daa

…into ck

lizexu123 had a problem deploying to Metax_ci May 20, 2026 11:10 — with GitHub Actions Failure

PaddlePaddle-bot reviewed May 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ck#7869

Ck#7869
lizexu123 wants to merge 38 commits into
PaddlePaddle:developfrom
lizexu123:ck

lizexu123 commented May 20, 2026

Uh oh!

paddle-bot Bot commented May 20, 2026

Uh oh!

CLAassistant commented May 20, 2026

Uh oh!

PaddlePaddle-bot left a comment

Uh oh!

PaddlePaddle-bot May 20, 2026

Uh oh!

PaddlePaddle-bot May 20, 2026

Uh oh!

PaddlePaddle-bot commented May 20, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented May 20, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

lizexu123 commented May 20, 2026

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

Uh oh!

paddle-bot Bot commented May 20, 2026

Uh oh!

CLAassistant commented May 20, 2026

Uh oh!

PaddlePaddle-bot left a comment

Choose a reason for hiding this comment

📋 Review 摘要

问题

📝 PR 规范检查

总体评价

Uh oh!

PaddlePaddle-bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

PaddlePaddle-bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

PaddlePaddle-bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1 任务总览

2 任务状态汇总

2.1 Required任务 : 9/10 通过

2.2 可选任务 — 28/32 通过

3 失败详情（仅 required）

Uh oh!

codecov-commenter commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

PaddlePaddle-bot commented May 20, 2026 •

edited

Loading

codecov-commenter commented May 20, 2026 •

edited

Loading