
gguf better support for transformers 5.0 and fix bug of Qwen3Next #1474

Merged
n1ck-guo merged 5 commits into main from hengguo/gguf_transformers5.0 on Mar 3, 2026
Conversation

@n1ck-guo
Contributor

n1ck-guo commented on Feb 27, 2026

Description

gguf better support for transformers5.0 and fix bug of Qwen3Next

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #1454

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

…oder-Next

Signed-off-by: n1ck-guo <heng.guo@intel.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR aims to improve GGUF export compatibility with Transformers v5.0 and address a Qwen3Next GGUF export failure reported in #1454.

Changes:

  • Re-enable GGUF-related CUDA/CPU tests for Transformers ≥ 5.0 by removing version-based skips.
  • Adjust the GGUF export conversion logic with Qwen3Next-specific tensor handling and tweak memory-clearing behavior.
  • Improve test fixture model saving by copying tokenizer.model when present, and reduce repeated calibration warnings via warning_once.
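The tokenizer.model copy mentioned above can be sketched as follows. This is a minimal illustration of the described behavior, not the actual test/helpers.py code; the function name and directory arguments are hypothetical.

```python
import os
import shutil


def copy_tokenizer_model(model_dir: str, save_dir: str) -> None:
    """Sketch: copy tokenizer.model into the saved tiny-model directory.

    GGUF conversion expects the SentencePiece tokenizer.model file alongside
    the checkpoint, so we copy it over only when the source model provides one.
    (Illustrative only; the real helper in test/helpers.py may differ.)
    """
    os.makedirs(save_dir, exist_ok=True)
    src = os.path.join(model_dir, "tokenizer.model")
    if os.path.isfile(src):
        shutil.copy(src, os.path.join(save_dir, "tokenizer.model"))
```

The existence check keeps the helper safe for models that use a fast tokenizer without a SentencePiece file.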

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Summary per file:

  • test/test_cuda/export/test_gguf.py: Removes the Transformers v5 skip so GGUF export tests run on v5+.
  • test/test_cuda/advanced/test_fp8_input.py: Removes the Transformers v5 skip for GGUF-related FP8 tests.
  • test/test_cpu/export/test_gguf_format.py: Removes the Transformers v5 skip for CPU GGUF format tests.
  • test/helpers.py: Copies tokenizer.model into the saved tiny model directory for GGUF/tokenizer compatibility.
  • auto_round/export/export_to_gguf/convert.py: Adds a Qwen3Next-specific tensor modification hook and changes memory-clearing device selection.
  • auto_round/compressors/utils.py: Wraps gguf-py architecture detection to provide a clearer "upgrade gguf-py" error path.
  • auto_round/compressors/base.py: Switches an "insufficient samples" warning to logger.warning_once.
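The warning_once switch in base.py deduplicates a warning that previously fired on every calibration pass. A minimal self-contained sketch of the pattern (not auto_round's actual logger implementation; the logger name and message are illustrative):

```python
import logging
from functools import lru_cache

logger = logging.getLogger("auto_round_sketch")


@lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    """Emit each distinct warning message at most once per process.

    lru_cache memoizes calls by message text, so a repeated call with the
    same message returns the cached None without logging again. This mirrors
    the logger.warning_once behavior described above (illustrative sketch).
    """
    logger.warning(message)
```

With this in place, a calibration loop that hits the "insufficient samples" condition many times produces a single log line instead of one per iteration.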

Comment thread on auto_round/export/export_to_gguf/convert.py (outdated)
n1ck-guo and others added 4 commits February 27, 2026 15:03
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@n1ck-guo n1ck-guo merged commit ab14698 into main Mar 3, 2026
29 checks passed
@n1ck-guo n1ck-guo deleted the hengguo/gguf_transformers5.0 branch March 3, 2026 07:43
lvliang-intel pushed a commit that referenced this pull request Mar 3, 2026
WeiweiZhang1 pushed a commit that referenced this pull request Mar 4, 2026
