gguf better support for transformers5.0 and fix bug of Qwen3Next #1474
Merged
…oder-Next Signed-off-by: n1ck-guo <heng.guo@intel.com>
Pull request overview
This PR aims to improve GGUF export compatibility with Transformers v5.0 and address a Qwen3Next GGUF export failure reported in #1454.
Changes:
- Re-enable GGUF-related CUDA/CPU tests for Transformers ≥ 5.0 by removing version-based skips.
- Adjust GGUF export conversion logic for Qwen3Next tensors and tweak memory-clearing behavior.
- Improve test fixture model saving by copying `tokenizer.model` when present, and reduce repeated calibration warnings via `warning_once`.
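The "copy `tokenizer.model` when present" change can be illustrated with a minimal sketch. The function name `copy_tokenizer_model` and its arguments are hypothetical and are not taken from `test/helpers.py`; the sketch only shows the general pattern of copying the file if the source model directory has one and skipping it otherwise.

```python
import shutil
from pathlib import Path


def copy_tokenizer_model(src_dir: str, dst_dir: str) -> bool:
    """Copy tokenizer.model from src_dir into dst_dir if the file exists.

    Hypothetical helper for illustration only.
    Returns True when the file was copied, False when the source has none.
    """
    src = Path(src_dir) / "tokenizer.model"
    if not src.is_file():
        # Many tokenizers ship only tokenizer.json, so a missing file is not an error.
        return False
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst / "tokenizer.model")
    return True
```

Returning a boolean rather than raising keeps the helper usable for models that have no SentencePiece file at all.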
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/test_cuda/export/test_gguf.py | Removes Transformers v5 skip to run GGUF export tests on v5+. |
| test/test_cuda/advanced/test_fp8_input.py | Removes Transformers v5 skip for GGUF-related FP8 tests. |
| test/test_cpu/export/test_gguf_format.py | Removes Transformers v5 skip for CPU GGUF format tests. |
| test/helpers.py | Copies tokenizer.model into the saved tiny model directory for GGUF/tokenizer compatibility. |
| auto_round/export/export_to_gguf/convert.py | Adds Qwen3Next-specific tensor modification hook and changes memory clearing device selection. |
| auto_round/compressors/utils.py | Wraps gguf-py architecture detection to provide a clearer “upgrade gguf-py” error path. |
| auto_round/compressors/base.py | Switches an “insufficient samples” warning to logger.warning_once. |
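For readers unfamiliar with the warn-once pattern, here is a minimal sketch of how repeated identical warnings can be suppressed. It assumes a cache keyed on the message text; the logger name `demo` and this standalone `warning_once` function are illustrative and are not the implementation used by auto-round's logger.

```python
import functools
import logging

# Illustrative logger; the real project uses its own logger object.
logger = logging.getLogger("demo")


@functools.lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    """Emit a warning the first time a given message is seen.

    lru_cache remembers each distinct message, so later calls with the
    same text return from the cache without logging again.
    """
    logger.warning(message)
```

Calling `warning_once("insufficient samples")` inside a calibration loop would then log a single line instead of one per iteration, while a different message would still be logged.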
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…el/auto-round into hengguo/gguf_transformers5.0
xin3he approved these changes Mar 3, 2026
lvliang-intel pushed a commit that referenced this pull request Mar 3, 2026
Signed-off-by: n1ck-guo <heng.guo@intel.com>
WeiweiZhang1 pushed a commit that referenced this pull request Mar 4, 2026
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Description
Improve GGUF support for Transformers 5.0 and fix a Qwen3Next GGUF export bug.
Type of Change
Related Issues
Fixes or relates to #1454
Checklist Before Submitting