fix: clamp dynamic NVFP4 FP8 scale export #1465
Signed-off-by: ShawRong <shawnrong1213@gmail.com>
Walkthrough
NVFP4 quantization no longer produces NaN when converting very large per-block scales to FP8: values are clamped to 448.0 before the conversion. A regression test verifies the fix by checking that exported FP8 scale bytes contain no NaN values and saturate correctly.
What does this PR do?
Type of change: Bug fix
Clamp the dynamic NVFP4 per-block FP8 scale before casting it to `torch.float8_e4m3fn` during export.

The static quantizer-derived path already clamps the exported FP8 scale, but the dynamic path can still produce scale values larger than the finite FP8 E4M3 range. Since `torch.float8_e4m3fn` has no Inf representation, casting values such as `480` or `1000` produces NaN payloads in the exported scale tensor. This patch applies the same finite-range clamp to the dynamic path so exported dynamic NVFP4 scales remain finite.

Usage
Testing
Result: 1 passed.

Also ran pre-commit on the touched files.
Result: all applicable hooks passed.
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
CONTRIBUTING.md: N/A

Additional Information
This complements the existing static NVFP4 scale clamp by applying the same safety guard to the dynamic scale export path.
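
The overflow behavior described above can be illustrated without PyTorch. The following is a minimal pure-Python sketch, not the code from this patch: `decode_e4m3fn`, `encode_e4m3fn`, and `export_scale_byte` are hypothetical helpers, and the encoder assumes round-to-nearest-even with non-saturating overflow for non-negative inputs, which is the behavior this description attributes to the `torch.float8_e4m3fn` cast.

```python
import math

E4M3FN_MAX = 448.0  # largest finite E4M3FN value, encoded as byte 0x7E
NAN_BYTE = 0x7F     # E4M3FN has no Inf; this bit pattern is NaN


def decode_e4m3fn(code: int) -> float:
    """Decode a non-negative E4M3FN byte (exponent bias 7, 3 mantissa bits)."""
    exp, man = code >> 3, code & 0x7
    if exp == 0:  # subnormal range: no implicit leading 1
        return (man / 8.0) * 2.0 ** -6
    return (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def encode_e4m3fn(x: float) -> int:
    """Hypothetical non-saturating cast of x >= 0 (assumed round-to-nearest-even)."""
    if math.isnan(x) or x >= 480.0:
        return NAN_BYTE
    # Brute force over all 128 non-negative bytes. Byte 0x7F decodes to 480
    # under the formula above, so anything that rounds up past 448 lands on NaN.
    # Ties go to the byte with an even mantissa (lowest bit 0).
    return min(range(0x80), key=lambda c: (abs(decode_e4m3fn(c) - x), c & 1))


def export_scale_byte(scale: float, clamp: bool = True) -> int:
    """Hypothetical export step: optionally clamp to the finite range, then cast."""
    if clamp:
        scale = min(scale, E4M3FN_MAX)
    return encode_e4m3fn(scale)


for scale in (1.0, 448.0, 480.0, 1000.0):
    raw = export_scale_byte(scale, clamp=False)
    safe = export_scale_byte(scale, clamp=True)
    print(f"{scale:7.1f}  unclamped=0x{raw:02X}  clamped=0x{safe:02X}")
```

With the clamp disabled, `480` and `1000` both map to the NaN byte `0x7F`; with it enabled they saturate to `0x7E`, the encoding of 448. In-range scales are unaffected by the clamp.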