fix save_quantized log conflict #1845
Signed-off-by: WeiweiZhang1 <weiwei1.zhang@intel.com>
for more information, see https://pre-commit.ci
Pull request overview
This PR addresses the noisy “output_dir already exists, this may cause model conflict” warning that occurs when immediate saving (ShardWriter) creates the output directory during quantization, making the subsequent export step think it’s an overwrite scenario.
Changes:
- Adds an “immediate saving mode” detection helper and propagates an immediate_saving flag into export/save paths.
- Suppresses “already exists” conflict warnings when the directory existence is expected due to immediate saving.
- Adds a CPU test to validate immediate-saving exports don’t emit the spurious warning and produce loadable output artifacts.
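
The gating described in the list above might look roughly like the following. This is a minimal sketch under stated assumptions, not the PR's actual code: `warn_if_output_dir_exists` and the logger name are hypothetical, and only the `immediate_saving` flag name and the warning wording come from the overview.

```python
import logging
import os
import tempfile

logger = logging.getLogger("export_sketch")  # hypothetical logger name


def warn_if_output_dir_exists(output_dir: str, immediate_saving: bool = False) -> bool:
    """Warn about a pre-existing output_dir unless immediate saving created it.

    Returns True if a warning was emitted, False otherwise.
    """
    # With immediate saving, the directory is expected to exist already,
    # so its presence does not indicate an overwrite.
    if os.path.isdir(output_dir) and not immediate_saving:
        logger.warning("%s already exists, this may cause model conflict", output_dir)
        return True
    return False
```

Returning a boolean keeps the sketch easy to check in a test without capturing log output; a real exporter would more likely just log.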
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/test_cpu/export/test_export.py | Adds a test for immediate-saving export behavior and log noise. |
| auto_round/export/utils.py | Introduces immediate-saving detection and extends save_model() with an immediate_saving option. |
| auto_round/export/export_to_llmcompressor/export.py | Passes immediate-saving status into save_model(). |
| auto_round/export/export_to_llmcompressor/export_to_static_fp.py | Suppresses “already exists” warning when immediate saving is active; passes flag to save_model(). |
| auto_round/export/export_to_llmcompressor/export_to_fp.py | Suppresses “already exists” warning when immediate saving is active; passes flag to save_model(). |
| auto_round/export/export_to_awq/export.py | Suppresses “already exists” warning when immediate saving is active; passes flag to save_model(). |
| auto_round/export/export_to_autoround/export.py | Adds conflict warning gated by immediate saving; passes flag to save_model(). |
| auto_round/export/export_to_autoround/export_to_nvfp_mx.py | Suppresses “already exists” warning when immediate saving is active; passes flag to save_model(). |
| auto_round/export/export_to_autoround/export_to_fp8.py | Passes immediate-saving status into save_model(). |
| auto_round/export/export_to_autogptq/export.py | Adds immediate-saving detection and passes flag into save_model(). |
| auto_round/compressors/base.py | Includes is_immediate_saving in serialization_dict so exporters can reliably detect immediate-saving mode. |

/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
Signed-off-by: WeiweiZhang1 <weiwei1.zhang@intel.com>
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
from auto_round.version import __version__

serialization_dict["autoround_version"] = __version__
serialization_dict["is_immediate_saving"] = getattr(self.compress_context, "is_immediate_saving", False)
We'd better not put is_immediate_saving into serialization_dict. serialization_dict should only hold args that may affect accuracy or format, since it is dumped to config.json.
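
One way to follow this suggestion is to pass the flag as an explicit argument instead of storing it in the dict. The sketch below is illustrative only: `export_sketch` is a hypothetical function, the version string is a placeholder, and it does not reflect how the exporters in this repository are actually structured.

```python
import json
import os
import tempfile


def export_sketch(serialization_dict: dict, output_dir: str, immediate_saving: bool = False) -> bool:
    """Dump serialization_dict to config.json; return True if a conflict warning fired.

    immediate_saving is a runtime-only hint passed as an argument,
    so it never reaches config.json.
    """
    warned = os.path.isdir(output_dir) and not immediate_saving
    if warned:
        print(f"{output_dir} already exists, this may cause model conflict")
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "config.json"), "w", encoding="utf-8") as f:
        json.dump(serialization_dict, f, indent=2)
    return warned
```

With this shape, config.json contains only the keys that were placed in serialization_dict, and the immediate-saving state stays a call-time detail.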
AutoScheme automatically generates adaptive mixed-bit and mixed-data-type quantization recipes. For accuracy results, see [AutoScheme Accuracy Report](./auto_scheme_acc.md).

**Please note that mixed data types are supported during tuning, but cannot be exported to real models at this time..**
**Note:** Mixed data-type recipes (e.g., MXFP4/MXFP8, W2/W4/W8) are supported for both tuning and export.
Better to revert or refine this change, as the meaning is different.
#### CLI Usage
use `iters=200`for tuning.

Use `--iters 0` for RTN-based scheme search (fastest). Add `--iters 200` if you want tuning-aware scheme selection.
Please refine this; it is not easy to follow.
#1841
Description
Please briefly describe your main changes, the motivation.
Type of Change
Bug fix
Related Issues
Fixes or relates to #
Checklist Before Submitting
/azp run Unit-Test-CUDA-AutoRound.