Skip to content

Fix ref average pooling divisor for count_include_pad with asymmetric padding#5007

Open
HamzaIkhurram wants to merge 4 commits into
ROCm:developfrom
HamzaIkhurram:fix-pooling-count-include-pad-asym-4350
Open

Fix ref average pooling divisor for count_include_pad with asymmetric padding#5007
HamzaIkhurram wants to merge 4 commits into
ROCm:developfrom
HamzaIkhurram:fix-pooling-count-include-pad-asym-4350

Conversation

@HamzaIkhurram

Copy link
Copy Markdown

Motivation

The reference (CPU) implementation of average pooling returns an incorrect result for count_include_pad=true with asymmetric padding. The GPU produces the correct value, so verification fails for this case. Reported in #4350.

Minimal repro (from the issue):

p = migraphx.program()
m = p.get_main_module()
p_x = m.add_parameter("x", migraphx.shape(type="float_type", lens=[1, 1, 2]))
x_1 = m.add_instruction(migraphx.op("pooling", padding=[1,2], stride=[1], lengths=[3], dilations=[1], count_include_pad=True), [p_x])
m.add_return([x_1])

Technical Details

calc_pooling (in src/include/migraphx/op/pooling.hpp) clipped the pooling window end against in_lens[dim] + padding_vals[d_2], i.e. the leading pad. When a window reaches into the trailing pad and that pad is larger than the leading pad (asymmetric padding), the window was shrunk and pool_size became too small, so the average divided by the wrong count. This was introduced in #1823, which was validated only with symmetric padding (where leading == trailing, hiding the bug).

count_include_pad=true must always divide by the full kernel window. Out-of-bounds input elements are already excluded from the sum and, with count_include_pad=true, still counted in pool_size. The only legitimate reason to clip the window is a genuine ceil_mode overflow past the declared padding, so the fix clips against the trailing pad of the axis (padding[d_2 + nspatial] for asymmetric padding, falling back to padding[d_2] for symmetric), matching the begin/end padding convention already used in calc_spatial_dim_out.

Tests:

  • test/ref/pooling.cpp: avgpool_rank3_count_include_pad_asym_pad_test — golden-value test reproducing Incorrect ref version for pooling with count_include_pad=True #4350 on the ref backend (fails before the fix, passes after).
  • test/verify/test_avg_pooling_include_pad_asym.cpp — verify test pinning ref == gpu for count_include_pad with asymmetric padding (the missing matrix cell that let the regression through).

The count_include_pad=false and max/lpnorm paths are unchanged.

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

Closes #4350

@HamzaIkhurram HamzaIkhurram requested review from a team and causten as code owners June 24, 2026 01:28
@github-actions

Copy link
Copy Markdown
Contributor

Thank you for your contribution! Since this is an external pull request, a maintainer must review PR and add the "ok-to-test" label if it is approved for testing.

… padding

The reference (CPU) implementation of average pooling with
count_include_pad=true clipped the window against the leading pad
(padding_vals[d_2]) when a window reached into the trailing pad. For
asymmetric padding where the trailing pad is larger than the leading
pad, this shrank the window and divided by a too-small count, producing
an incorrect result while the GPU result was correct.

count_include_pad always divides by the full kernel window;
out-of-bounds input elements are already skipped from the sum but still
counted in pool_size. Clip the window against the trailing pad of the
axis instead, which only trims a genuine ceil_mode overflow beyond the
declared padding.

Adds a ref golden-value test reproducing the issue and a verify test
pinning ref == gpu for count_include_pad with asymmetric padding.

Closes ROCm#4350
@HamzaIkhurram HamzaIkhurram force-pushed the fix-pooling-count-include-pad-asym-4350 branch from 9bfe5e8 to 6dd9c7d Compare June 24, 2026 01:31
@causten causten requested review from kahmed10 and pfultz2 June 26, 2026 02:44
Comment thread src/include/migraphx/op/pooling.hpp Outdated
const bool padding_is_asymmetric = padding_vals.size() == 2 * num_spatial_dims;

// padding_vals is either one value per axis or a {begin..., end...} list.
auto trailing_padding = [&](std::size_t axis) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a member function.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes the reference (CPU/ref target) average-pooling divisor computation for count_include_pad=true when padding is asymmetric, aligning ref results with GPU behavior and unblocking verification for the reported case (#4350).

Changes:

  • Update calc_pooling in the pooling op to clip against the trailing pad (not leading) for count_include_pad=true average pooling.
  • Add a ref golden-value regression test reproducing the asymmetric-padding include-pad case.
  • Add a verify test intended to pin ref == gpu for the missing asymmetric-padding include-pad matrix cell, plus a changelog entry.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/include/migraphx/op/pooling.hpp Fixes the ref pooling window end limit for include-pad average pooling with asymmetric padding.
test/ref/pooling.cpp Adds a ref-only golden-value regression test for count_include_pad=true with asymmetric padding.
test/verify/test_avg_pooling_include_pad_asym.cpp Adds a verify test intended to assert ref == gpu for asymmetric padding + include-pad average pooling.
CHANGELOG.md Documents the resolved issue in the changelog.

Comment on lines +44 to +51
mm->add_instruction(migraphx::make_op("pooling",
{{"mode", migraphx::op::pooling_mode::average},
{"padding", {1, 1, 2, 2}},
{"stride", {1, 1}},
{"lengths", {3, 3}},
{"count_include_pad", true}}),
input);
return p;
Comment thread test/ref/pooling.cpp
Comment on lines +157 to +160
// 1D case with asymmetric padding and count_include_pad=true (issue #4350).
// count_include_pad must always divide by the full window size; the window
// must not be shrunk by the (smaller) leading pad when it reaches into the
// trailing pad. The ref result must match the gpu here.
HamzaIkhurram and others added 2 commits June 26, 2026 13:09
…o test

- Hoist the trailing_padding lambda in calc_pooling into a static member
  function on the pooling struct (per review feedback).
- Add an explicit add_return for the pooling result in the
  test_avg_pooling_include_pad_asym verify program.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes the licensing CI check, which requires the current year in the
copyright stamp of any file modified by the PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@eddieliao eddieliao left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nitpicks, mostly on comments. Good work.

end = std::min(start + dilated_kernel_dim, in_lens[dim]);
}
// count_include_pad divides by the full window, so it keeps the declared
// trailing padding (issue #4350); every other case stops at the input edge.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to reference back to the original issue number.

Suggested change
// trailing padding (issue #4350); every other case stops at the input edge.
// trailing padding; every other case stops at the input edge.

Comment thread test/ref/pooling.cpp
Comment on lines +157 to +160
// 1D case with asymmetric padding and count_include_pad=true (issue #4350).
// count_include_pad must always divide by the full window size; the window
// must not be shrunk by the (smaller) leading pad when it reaches into the
// trailing pad. The ref result must match the gpu here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// 1D case with asymmetric padding and count_include_pad=true (issue #4350).
// count_include_pad must always divide by the full window size; the window
// must not be shrunk by the (smaller) leading pad when it reaches into the
// trailing pad. The ref result must match the gpu here.
// 1D case with asymmetric padding and count_include_pad=true.
// count_include_pad must always divide by the full window size; the window
// must not be shrunk by the (smaller) leading pad when it reaches into the
// trailing pad.

Comment on lines +31 to +34
// count_include_pad=true combined with asymmetric padding: the ref backend used
// to shrink the window against the leading pad, producing a wrong divisor where
// the window reached into the (larger) trailing pad. This pins ref == gpu for
// that cell (issue #4350).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really needed, test name is descriptive enough.

Comment thread CHANGELOG.md

### Resolved issues

* Fixed an incorrect reference (CPU) result for average pooling with `count_include_pad=true` and asymmetric padding, where the window was clipped against the leading pad and produced a too-small divisor (#4350).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should reference the PR, not the original issue.

Suggested change
* Fixed an incorrect reference (CPU) result for average pooling with `count_include_pad=true` and asymmetric padding, where the window was clipped against the leading pad and produced a too-small divisor (#4350).
* Fixed an incorrect reference (CPU) result for average pooling with `count_include_pad=true` and asymmetric padding, where the window was clipped against the leading pad and produced a too-small divisor (#5007).

{"lengths", {3, 3}},
{"count_include_pad", true}}),
input);
mm->add_return({r});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this isn't actually needed; MIGraphX should take automatically treat the output of the last instruction as program output if a return isn't specified.

@eddieliao eddieliao added bugfix Fixes a bug found in the code. Changelog: Resolved Issues Known issues from a previous version that have been resolved. labels Jun 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Fixes a bug found in the code. Changelog: Resolved Issues Known issues from a previous version that have been resolved.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect ref version for pooling with count_include_pad=True

4 participants