bugfix AS block leaks by zccjjj · Pull Request #7890 · PaddlePaddle/FastDeploy

zccjjj · 2026-05-22T03:09:25Z

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

- Add at least a tag in the PR title.
  - Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
  - You can add new tags based on the PR content, but the semantics must be clear.
- Format your code, run pre-commit before commit.
- Add unit tests. Please write the reason in this PR if no unit tests.
- Provide accuracy results.
- If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-05-22T03:09:32Z

Thanks for your contribution!

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-22 11:16:02

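📋 Review Summary

PR overview: Fixes an AS block leak on the enable_prefix_caching + hierarchical KV Cache path, where can_allocate_gpu_blocks was called without first computing the reserved-block threshold via _get_can_schedule_prefill_threshold_block.
Scope of change: fastdeploy/engine/sched/resource_manager_v1.py (scheduler resource management)
Affected tags: [Scheduler] [KVCache]

Issues

Level	File	Summary
📝 PR convention	-	Title lacks an official tag; every section of the description template is empty
❓ Question	`resource_manager_v1.py:1205`	Typo in the newly added comment: `Warnig` → `Warning`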

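📝 PR Convention Check

The PR title "bugfix AS block leaks" uses the non-official lowercase form bugfix and should be normalized to [BugFix]; the Motivation, Modifications, and other sections of the description template are placeholders or blank.

Suggested title (can be copied directly):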

[BugFix] Fix AS block leaks when enable_prefix_caching with hierarchical kvcache
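
The title rule above can be sketched as a small check. This is only an illustration, not the logic of the `Check PR Template` CI job: the function names `leading_tags` and `has_official_tag` are hypothetical, and the tag set is copied from the checklist's tag list. The template also allows new tags with clear semantics, so a strict membership test like this one is stricter than the written rule.

```python
import re
from typing import List

# Tag names copied from the PR template's tag list.
OFFICIAL_TAGS = {
    "FDConfig", "APIServer", "Engine", "Scheduler", "PD Disaggregation",
    "Executor", "Graph Optimization", "Speculative Decoding", "RL", "Models",
    "Quantization", "Loader", "OP", "KVCache", "DataProcessor", "BugFix",
    "Docs", "CI", "Optimization", "Feature", "Benchmark", "Others",
    "XPU", "HPU", "GCU", "DCU", "Iluvatar", "Metax",
}


def leading_tags(title: str) -> List[str]:
    """Return the bracketed tags at the start of a title, in order."""
    tags = []
    rest = title.lstrip()
    while True:
        match = re.match(r"\[([^\[\]]+)\]\s*", rest)
        if match is None:
            break
        tags.append(match.group(1))
        rest = rest[match.end():]
    return tags


def has_official_tag(title: str) -> bool:
    """True if at least one leading tag is in the official list (case-sensitive)."""
    return any(tag in OFFICIAL_TAGS for tag in leading_tags(title))


original_ok = has_official_tag("bugfix AS block leaks")
suggested_ok = has_official_tag(
    "[BugFix] Fix AS block leaks when enable_prefix_caching with hierarchical kvcache"
)
print(original_ok, suggested_ok)
```

Under these assumptions the original title fails because it has no bracketed tag at all, while the suggested title passes on `[BugFix]`.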

Suggested PR description (can be copied directly; it must reproduce the full structure of the checklist §D2 template):

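## Motivation
With `enable_prefix_caching` on and a hierarchical KV Cache configured (`num_cpu_blocks > 0` or `kvcache_storage_backend`), the scheduler's GPU block availability check before admitting a new prefill request used the raw block count (`need_prefill_tokens / block_size`) directly, without calling `_get_can_schedule_prefill_threshold_block` to compute a threshold that includes the blocks reserved for running requests. As a result, when the threshold is too low, the hierarchical cache allocates blocks to the request for prefix matching, but the later real allocation runs short of GPU blocks, causing an AS block leak (storage blocks leak).

## Modifications
- `fastdeploy/engine/sched/resource_manager_v1.py`: in the two `can_allocate_gpu_blocks` checks on the `enable_prefix_caching` + hierarchical KV Cache path, first call `_get_can_schedule_prefill_threshold_block` to compute the threshold including reserved blocks, then pass that threshold to `can_allocate_gpu_blocks`, consistent with the other call sites in the file. Also add a comment before `_free_blocks` describing the potential storage block leak risk.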

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

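Overall Assessment

The fix is logically correct: the two can_allocate_gpu_blocks calls that had been missed now compute their threshold through _get_can_schedule_prefill_threshold_block, consistent with the other call sites in the file, which effectively prevents deadlocks and block leaks in the hierarchical cache scenario. Please complete the PR description, fix the title format, and correct the comment typo Warnig → Warning.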
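
To make the difference between the two admission checks concrete, here is a minimal toy sketch. It does not reproduce FastDeploy's code: `ToyBlockPool`, `raw_block_need`, `threshold_block_need`, and the per-request reservation formula are all assumptions for illustration, and the real `_get_can_schedule_prefill_threshold_block` may compute its threshold differently. Rounding up with `ceil` is also an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyBlockPool:
    """Toy stand-in for a GPU block pool; not FastDeploy's cache manager."""

    free_gpu_blocks: int

    def can_allocate_gpu_blocks(self, num_blocks: int) -> bool:
        return self.free_gpu_blocks >= num_blocks


def raw_block_need(need_prefill_tokens: int, block_size: int) -> int:
    # Pre-fix check as described in the review: only the new request's own blocks.
    return math.ceil(need_prefill_tokens / block_size)


def threshold_block_need(
    need_prefill_tokens: int,
    block_size: int,
    num_running: int,
    reserve_per_running: int,
) -> int:
    # Post-fix idea: also keep headroom for requests that are already running.
    # The reservation formula is hypothetical, chosen only for this example.
    own = raw_block_need(need_prefill_tokens, block_size)
    return own + num_running * reserve_per_running


pool = ToyBlockPool(free_gpu_blocks=10)
raw = raw_block_need(need_prefill_tokens=512, block_size=64)
threshold = threshold_block_need(512, 64, num_running=3, reserve_per_running=2)

admit_with_raw = pool.can_allocate_gpu_blocks(raw)
admit_with_threshold = pool.can_allocate_gpu_blocks(threshold)
print(raw, threshold, admit_with_raw, admit_with_threshold)
```

In this toy setup the raw check admits the request (8 blocks needed, 10 free) while the threshold check rejects it (14 needed), which is the kind of over-admission the review says can leave prefix-matched storage blocks allocated when the later GPU allocation fails.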

PaddlePaddle-bot · 2026-05-22T03:35:15Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-22 15:35:45

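This CI report was generated from the following code (refreshed every 30 minutes):

PR commit: 84fd8c2
Merge base: 3ec5011 (branch: develop)
View full diff
CI details

1 Task overview

✅ All Required tasks have passed, so CI currently does not block merging; 3 Optional tasks failed and 1 Optional task is pending, for reference only.

Total runs (reruns)	Total tasks	✅ Passed	❌ Failed	⏳ Running	⏸️ Pending	Skipped
54(13)	41	37	3	0	1	0

2 Task status summary

2.1 Required tasks: 10/10 passed

Required tasks block merging; failures should be handled first.

Status	Task	Duration	Root cause	Suggested fix	Log	Rerun
✅	All 10 required tasks passed	-	-	-	-	-

2.2 Optional tasks: 27/31 passed

Optional tasks do not block merging; failures and pending tasks are for reference only.

Status	Task	Duration	Log	Rerun
❌	`Run iluvatar Tests / run_iluvatar_cases`	15m51s	Job	-
❌	`Check PR Template`	15s	Job	-
❌	`Trigger Jenkins for PR`	7m22s	Job	-
⏸️	`CI_HPU`	-	-	-
✅	The remaining 27 optional tasks passed	-	-	-

3 Failure details (required only)

No required tasks failed.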

codecov-commenter · 2026-05-22T04:00:44Z

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@3ec5011). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/engine/sched/resource_manager_v1.py	50.00%	0 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7890   +/-   ##
==========================================
  Coverage           ?   63.98%           
==========================================
  Files              ?      462           
  Lines              ?    64482           
  Branches           ?     9880           
==========================================
  Hits               ?    41256           
  Misses             ?    20450           
  Partials           ?     2776

Flag	Coverage Δ
GPU	`73.04% <50.00%> (?)`
XPU	`16.01% <0.00%> (?)`

bugfix AS block leaks

84fd8c2

zccjjj had a problem deploying to Metax_ci May 22, 2026 03:09 — with GitHub Actions Failure

PaddlePaddle-bot reviewed May 22, 2026

zccjjj wants to merge 1 commit into PaddlePaddle:develop from zccjjj:ASBlockLeaks
