Fix ref average pooling divisor for count_include_pad with asymmetric padding#5007
Fix ref average pooling divisor for count_include_pad with asymmetric padding#5007HamzaIkhurram wants to merge 4 commits into
Conversation
|
Thank you for your contribution! Since this is an external pull request, a maintainer must review PR and add the "ok-to-test" label if it is approved for testing. |
… padding The reference (CPU) implementation of average pooling with count_include_pad=true clipped the window against the leading pad (padding_vals[d_2]) when a window reached into the trailing pad. For asymmetric padding where the trailing pad is larger than the leading pad, this shrank the window and divided by a too-small count, producing an incorrect result while the GPU result was correct. count_include_pad always divides by the full kernel window; out-of-bounds input elements are already skipped from the sum but still counted in pool_size. Clip the window against the trailing pad of the axis instead, which only trims a genuine ceil_mode overflow beyond the declared padding. Adds a ref golden-value test reproducing the issue and a verify test pinning ref == gpu for count_include_pad with asymmetric padding. Closes ROCm#4350
9bfe5e8 to
6dd9c7d
Compare
| const bool padding_is_asymmetric = padding_vals.size() == 2 * num_spatial_dims; | ||
|
|
||
| // padding_vals is either one value per axis or a {begin..., end...} list. | ||
| auto trailing_padding = [&](std::size_t axis) { |
There was a problem hiding this comment.
Make this a member function.
There was a problem hiding this comment.
Pull request overview
This PR fixes the reference (CPU/ref target) average-pooling divisor computation for count_include_pad=true when padding is asymmetric, aligning ref results with GPU behavior and unblocking verification for the reported case (#4350).
Changes:
- Update
calc_poolingin the pooling op to clip against the trailing pad (not leading) forcount_include_pad=trueaverage pooling. - Add a ref golden-value regression test reproducing the asymmetric-padding include-pad case.
- Add a verify test intended to pin
ref == gpufor the missing asymmetric-padding include-pad matrix cell, plus a changelog entry.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/include/migraphx/op/pooling.hpp |
Fixes the ref pooling window end limit for include-pad average pooling with asymmetric padding. |
test/ref/pooling.cpp |
Adds a ref-only golden-value regression test for count_include_pad=true with asymmetric padding. |
test/verify/test_avg_pooling_include_pad_asym.cpp |
Adds a verify test intended to assert ref == gpu for asymmetric padding + include-pad average pooling. |
CHANGELOG.md |
Documents the resolved issue in the changelog. |
| mm->add_instruction(migraphx::make_op("pooling", | ||
| {{"mode", migraphx::op::pooling_mode::average}, | ||
| {"padding", {1, 1, 2, 2}}, | ||
| {"stride", {1, 1}}, | ||
| {"lengths", {3, 3}}, | ||
| {"count_include_pad", true}}), | ||
| input); | ||
| return p; |
| // 1D case with asymmetric padding and count_include_pad=true (issue #4350). | ||
| // count_include_pad must always divide by the full window size; the window | ||
| // must not be shrunk by the (smaller) leading pad when it reaches into the | ||
| // trailing pad. The ref result must match the gpu here. |
…o test - Hoist the trailing_padding lambda in calc_pooling into a static member function on the pooling struct (per review feedback). - Add an explicit add_return for the pooling result in the test_avg_pooling_include_pad_asym verify program. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes the licensing CI check, which requires the current year in the copyright stamp of any file modified by the PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
eddieliao
left a comment
There was a problem hiding this comment.
Minor nitpicks, mostly on comments. Good work.
| end = std::min(start + dilated_kernel_dim, in_lens[dim]); | ||
| } | ||
| // count_include_pad divides by the full window, so it keeps the declared | ||
| // trailing padding (issue #4350); every other case stops at the input edge. |
There was a problem hiding this comment.
No need to reference back to the original issue number.
| // trailing padding (issue #4350); every other case stops at the input edge. | |
| // trailing padding; every other case stops at the input edge. |
| // 1D case with asymmetric padding and count_include_pad=true (issue #4350). | ||
| // count_include_pad must always divide by the full window size; the window | ||
| // must not be shrunk by the (smaller) leading pad when it reaches into the | ||
| // trailing pad. The ref result must match the gpu here. |
There was a problem hiding this comment.
| // 1D case with asymmetric padding and count_include_pad=true (issue #4350). | |
| // count_include_pad must always divide by the full window size; the window | |
| // must not be shrunk by the (smaller) leading pad when it reaches into the | |
| // trailing pad. The ref result must match the gpu here. | |
| // 1D case with asymmetric padding and count_include_pad=true. | |
| // count_include_pad must always divide by the full window size; the window | |
| // must not be shrunk by the (smaller) leading pad when it reaches into the | |
| // trailing pad. |
| // count_include_pad=true combined with asymmetric padding: the ref backend used | ||
| // to shrink the window against the leading pad, producing a wrong divisor where | ||
| // the window reached into the (larger) trailing pad. This pins ref == gpu for | ||
| // that cell (issue #4350). |
There was a problem hiding this comment.
Not really needed, test name is descriptive enough.
|
|
||
| ### Resolved issues | ||
|
|
||
| * Fixed an incorrect reference (CPU) result for average pooling with `count_include_pad=true` and asymmetric padding, where the window was clipped against the leading pad and produced a too-small divisor (#4350). |
There was a problem hiding this comment.
This should reference the PR, not the original issue.
| * Fixed an incorrect reference (CPU) result for average pooling with `count_include_pad=true` and asymmetric padding, where the window was clipped against the leading pad and produced a too-small divisor (#4350). | |
| * Fixed an incorrect reference (CPU) result for average pooling with `count_include_pad=true` and asymmetric padding, where the window was clipped against the leading pad and produced a too-small divisor (#5007). |
| {"lengths", {3, 3}}, | ||
| {"count_include_pad", true}}), | ||
| input); | ||
| mm->add_return({r}); |
There was a problem hiding this comment.
Pretty sure this isn't actually needed; MIGraphX should take automatically treat the output of the last instruction as program output if a return isn't specified.
Motivation
The reference (CPU) implementation of average pooling returns an incorrect result for
count_include_pad=truewith asymmetric padding. The GPU produces the correct value, so verification fails for this case. Reported in #4350.Minimal repro (from the issue):
Technical Details
calc_pooling(insrc/include/migraphx/op/pooling.hpp) clipped the pooling window end againstin_lens[dim] + padding_vals[d_2], i.e. the leading pad. When a window reaches into the trailing pad and that pad is larger than the leading pad (asymmetric padding), the window was shrunk andpool_sizebecame too small, so the average divided by the wrong count. This was introduced in #1823, which was validated only with symmetric padding (where leading == trailing, hiding the bug).count_include_pad=truemust always divide by the full kernel window. Out-of-bounds input elements are already excluded from the sum and, withcount_include_pad=true, still counted inpool_size. The only legitimate reason to clip the window is a genuineceil_modeoverflow past the declared padding, so the fix clips against the trailing pad of the axis (padding[d_2 + nspatial]for asymmetric padding, falling back topadding[d_2]for symmetric), matching the begin/end padding convention already used incalc_spatial_dim_out.Tests:
test/ref/pooling.cpp:avgpool_rank3_count_include_pad_asym_pad_test— golden-value test reproducing Incorrect ref version for pooling with count_include_pad=True #4350 on the ref backend (fails before the fix, passes after).test/verify/test_avg_pooling_include_pad_asym.cpp— verify test pinningref == gpuforcount_include_padwith asymmetric padding (the missing matrix cell that let the regression through).The
count_include_pad=falseand max/lpnorm paths are unchanged.Changelog Category
Closes #4350