[OP] support deepgeem for sm103 by BingooYang · Pull Request #7073 · PaddlePaddle/FastDeploy

BingooYang · 2026-03-30T05:48:10Z

Motivation

Add DeepGEMM support for sm103.

Modifications

DeepGEMM adaptation.

Usage or Command

none

Accuracy Tests

none

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-03-30T05:48:18Z

Thanks for your contribution!

codecov-commenter · 2026-03-30T07:27:34Z

Codecov Report

❌ Patch coverage is 33.33333% with 4 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@1a1d048). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
...del_executor/layers/quantization/block_wise_fp8.py	25.00%	1 Missing and 2 partials ⚠️
...oy/model_executor/layers/quantization/fp8_utils.py	50.00%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7073   +/-   ##
==========================================
  Coverage           ?   73.39%           
==========================================
  Files              ?      402           
  Lines              ?    56435           
  Branches           ?     8904           
==========================================
  Hits               ?    41422           
  Misses             ?    12081           
  Partials           ?     2932

Flag	Coverage Δ
GPU	`73.39% <33.33%> (?)`

qingqing01 · 2026-03-31T01:44:13Z

fastdeploy/model_executor/layers/quantization/block_wise_fp8.py

This line of code differs from the 2.5 branch; see https://github.com/PaddlePaddle/FastDeploy/pull/7081/changes

zoooo0820 · 2026-03-31T03:35:27Z

fastdeploy/model_executor/layers/quantization/block_wise_fp8.py

+    sm_version = get_sm_version()
+    if sm_version >= 100:
+        assert x_scale_tensor.dtype == paddle.uint8, "For sm100+, x_scale_tensor must be uint8 dtype."
+    if sm_version == 100 and current_platform.is_cuda():


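The `>= 100` and `== 100` branches here could be merged as well, right? As written, wouldn't anything > 100 fall through to the sm90 branch?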
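
The concern above can be illustrated with a small, self-contained sketch. This is not the FastDeploy code: `FakeTensor`, `dispatch_as_reviewed`, `dispatch_merged`, and the returned path labels are hypothetical names used only to model the control flow quoted in the diff, and the assumption that everything not matching `== 100` lands on the sm90 path is taken from the reviewer's remark rather than from the full source.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a paddle.Tensor; only the dtype matters here."""
    dtype: str


def dispatch_as_reviewed(sm_version: int, is_cuda: bool, x_scale: FakeTensor) -> str:
    """Models the two separate conditions quoted in the review diff."""
    if sm_version >= 100:
        assert x_scale.dtype == "uint8", "For sm100+, x_scale_tensor must be uint8 dtype."
    if sm_version == 100 and is_cuda:
        return "sm100_path"
    # Assumption: every other case, including sm103, falls through to here.
    return "sm90_path"


def dispatch_merged(sm_version: int, is_cuda: bool, x_scale: FakeTensor) -> str:
    """One possible merged form: a single `>= 100` branch holds check and dispatch."""
    if sm_version >= 100 and is_cuda:
        assert x_scale.dtype == "uint8", "For sm100+, x_scale_tensor must be uint8 dtype."
        return "sm100_path"
    return "sm90_path"


scale = FakeTensor(dtype="uint8")
reviewed_103 = dispatch_as_reviewed(103, True, scale)
merged_103 = dispatch_merged(103, True, scale)
print(reviewed_103, merged_103)
```

In this toy model, sm103 passes the dtype assertion in the first function but is still routed to the sm90 path, which is the mismatch the comment points out; the merged variant keeps the check and the dispatch under the same condition.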

zoooo0820 · 2026-03-31T03:41:11Z

fastdeploy/model_executor/layers/quantization/block_wise_fp8.py

 ):
-    if get_sm_version() == 100 and current_platform.is_cuda():
+    sm_version = get_sm_version()
+    if sm_version >= 100:


Could this check also be added at the group GEMM call in fastdeploy/model_executor/layers/moe/fused_moe_deepgemm_backend.py?

support deepgeem for sm103

d8fdf9d

BingooYang temporarily deployed to Metax_ci March 30, 2026 05:48 — with GitHub Actions Inactive

BingooYang temporarily deployed to Metax_ci March 30, 2026 09:07 — with GitHub Actions Inactive

BingooYang changed the title ~~support deepgeem for sm103~~ [OP] support deepgeem for sm103 Mar 30, 2026

add assert

703c828

BingooYang had a problem deploying to Metax_ci March 30, 2026 13:08 — with GitHub Actions Error

modify code style

5a2b811

BingooYang temporarily deployed to Metax_ci March 30, 2026 13:15 — with GitHub Actions Inactive

qingqing01 reviewed Mar 31, 2026

zoooo0820 reviewed Mar 31, 2026

add assert

a188b9a

BingooYang temporarily deployed to Metax_ci March 31, 2026 08:06 — with GitHub Actions Inactive

modify sm version condition

5d32f41

BingooYang temporarily deployed to Metax_ci March 31, 2026 11:55 — with GitHub Actions Inactive

BingooYang wants to merge 5 commits into PaddlePaddle:develop from BingooYang:deepgemm_sm103
