
support AR format FP8 in vLLM #1798

Open

Zhenzhong1 wants to merge 5 commits into main from zhenzhong/arformat_fp8

Conversation

@Zhenzhong1 (Contributor) commented May 11, 2026

Related Issues

#1536

Type of Change

New feature

Test

auto-round --model /models/Llama-3.1-8B --scheme FP8_BLOCK --iters 0 --format auto_round

Output Model:
(screenshot of the exported model)
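
Since the auto_round FP8 export targets vLLM, here is a minimal inference sketch; the model path below is a hypothetical export directory, not a path produced by this PR's test:

# Minimal vLLM loading sketch; the path is an assumed export directory.
from vllm import LLM, SamplingParams

llm = LLM(model="/models/Llama-3.1-8B-FP8_BLOCK")  # hypothetical output dir
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)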

Copilot AI review requested due to automatic review settings May 11, 2026 05:48
Copilot AI left a comment


Pull request overview

This PR aims to enable exporting block-wise FP8 quantization using the auto_round / auto_round:fp8 format (targeting vLLM compatibility), rather than forcing users to export with the standalone fp8 format.

Changes:

  • Removes the format-compatibility rewrite that previously replaced auto_round with fp8 for block-wise FP8 configs.
  • Updates FP8 export to always emit weight_block_size for tuple group_size, and additionally emits modules_to_not_convert alongside ignored_layers for layers kept in high precision (see the config sketch below).
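
A sketch of the relevant quantization_config keys after such an export; the key names come from this PR's description, while the concrete values (block size, layer names) are illustrative assumptions:

# Illustrative only: the keys this PR's FP8 export emits.
# The values below (128x128 blocks, lm_head) are assumptions.
quantization_config = {
    "weight_block_size": [128, 128],        # from a tuple group_size such as (128, 128)
    "ignored_layers": ["lm_head"],          # layers kept in high precision
    "modules_to_not_convert": ["lm_head"],  # emitted alongside ignored_layers
}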

Comment thread: auto_round/formats.py (Outdated)
formats = tmp_format_name.split(",")
if isinstance(ar.group_size, tuple) and any(["auto_round" in f.lower() for f in formats]):
    logger.warning(
        "`auto_round` format can't be used for deploying block-wise fp8 quantization now, use `fp8` instead."
    )
@wenhuach21 (Contributor) commented May 12, 2026


Please keep the warning and change it to: auto_round:fp8 format only supports vLLM inference for now. We recommend using the FP8 format via --format fp8 instead.

@Zhenzhong1 (Contributor, Author) replied:


Fixed in 33c5923.

Comment thread: auto_round/formats.py (Outdated)

if isinstance(ar.group_size, tuple) and any(["auto_round" in f.lower() for f in formats]):
    logger.warning(
        "auto_round:fp8 format only supports vLLM inference for now. We recommend using the FP8 format via --format fp8 instead."
    )
Contributor commented:


Add backticks around `--format fp8` in the warning message.

@Zhenzhong1 (Contributor, Author) replied:


OK, fixed in c167289.

@yiliu30 (Contributor) left a comment


Please add some unit tests (UTs); otherwise LGTM. A minimal sketch of one such test follows below.
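
A minimal pytest-style sketch of such a unit test; emit_fp8_quant_config below is a hypothetical stand-in that mirrors the config-emission behavior described in this PR, not auto-round's actual exporter API:

# Hypothetical stand-in mirroring the behavior described in this PR;
# not auto-round's real exporter entry point.
def emit_fp8_quant_config(group_size, high_precision_layers):
    cfg = {}
    if isinstance(group_size, tuple):
        # Tuple group_size means block-wise FP8: always emit weight_block_size.
        cfg["weight_block_size"] = list(group_size)
    # Layers kept in high precision appear under both keys.
    cfg["ignored_layers"] = list(high_precision_layers)
    cfg["modules_to_not_convert"] = list(high_precision_layers)
    return cfg

def test_block_fp8_emits_block_size_and_ignore_lists():
    cfg = emit_fp8_quant_config((128, 128), ["lm_head"])
    assert cfg["weight_block_size"] == [128, 128]
    assert cfg["modules_to_not_convert"] == cfg["ignored_layers"] == ["lm_head"]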
