Skip to content

Mark wave intrinsics as convergent to prevent loop miscompilation#8204

Open
JoeCitizen wants to merge 3 commits intomicrosoft:mainfrom
JoeCitizen:user/jackell/prefix-sum-bug
Open

Mark wave intrinsics as convergent to prevent loop miscompilation#8204
JoeCitizen wants to merge 3 commits intomicrosoft:mainfrom
JoeCitizen:user/jackell/prefix-sum-bug

Conversation

@JoeCitizen
Copy link
Collaborator

The JumpThreading pass could restructure loops containing wave intrinsics by threading edges through the loop latch block. This moved wave ops (WaveReadLaneFirst, WaveActiveCountBits, etc.) from inside the loop to after it, changing the set of active lanes at the call site. On SIMT hardware, this produced incorrect results because all lanes reconverge at the post-loop point rather than only the matching subset.

Fix:

  • DxilConvergentMark: Mark wave-sensitive HL functions with the Attribute::Convergent attribute before optimizer passes run.
  • JumpThreading: In ThreadEdge, walk backward from the latch to the loop header to identify loop body blocks. If any contains a convergent call, prevent threading through the latch.
  • DxilOperations::GetOpFunc: Mark DXIL wave op functions with Attribute::Convergent for post-dxilgen optimizer passes.

Fixes a bug where a material binning pattern using WaveReadLaneFirst + WaveActiveCountBits in a while-loop produced correct results at -Od but incorrect results with optimizations enabled.

The JumpThreading pass could restructure loops containing wave intrinsics
by threading edges through the loop latch block. This moved wave ops
(WaveReadLaneFirst, WaveActiveCountBits, etc.) from inside the loop to
after it, changing the set of active lanes at the call site. On SIMT
hardware, this produced incorrect results because all lanes reconverge
at the post-loop point rather than only the matching subset.

Fix:
- DxilConvergentMark: Mark wave-sensitive HL functions with the
  Attribute::Convergent attribute before optimizer passes run.
- JumpThreading: In ThreadEdge, walk backward from the latch to the
  loop header to identify loop body blocks. If any contains a convergent
  call, prevent threading through the latch.
- DxilOperations::GetOpFunc: Mark DXIL wave op functions with
  Attribute::Convergent for post-dxilgen optimizer passes.

Fixes a bug where a material binning pattern using WaveReadLaneFirst +
WaveActiveCountBits in a while-loop produced correct results at -Od but
incorrect results with optimizations enabled.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 26, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

if (existF->getFunctionType() != pFT)
return nullptr;
F = existF;
// HLSL Change Begin - ensure attributes are set on existing functions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These HLSL Change markers are also not needed.

@JoeCitizen JoeCitizen marked this pull request as ready for review February 26, 2026 21:47
- Rename bUpdated -> Updated (drop Hungarian prefix per reviewer feedback)
- Remove // HLSL Change Begin/End markers in DxilConvergent.cpp since
  the file is already HLSL-specific

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JoeCitizen JoeCitizen force-pushed the user/jackell/prefix-sum-bug branch from fa67aa2 to 8dc2b90 Compare March 2, 2026 19:32
Copy link
Contributor

@tex3d tex3d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have several problems with this change in its current form.

  1. Change impacts final DXIL
  2. Test JumpThreading pass change with a pass test under DXC/Passes
  3. Let's not add tests under HLSLFileCheck if we can avoid it.
  4. It looks like JumpThreading is run before DxilGeneration, so how could convergent attr on the DXIL op make a difference during this pass when you only have the HL ops during this pass?

F->addFnAttr(OpProps.FuncAttr);
// Mark wave ops as convergent since they depend on the active lane set.
if (IsDxilOpWave(opCode) && !F->hasFnAttribute(Attribute::Convergent))
F->addFnAttr(Attribute::Convergent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change modifies attributes that show up in final DXIL. We should be cautious with changes like that, since it could break drivers not expecting it. We need to decide whether we want to expose the convergent attribute in next DXIL version and probably filter attributes on DXIL ops for prior shader models in DxilFinalizeModule (defined in DxilPreparePasses.cpp).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a pass test (under DXC/Passes) if you're testing the JumpThreading pass. Plus this test is under the older TAEF test suite, and isn't run as a lit shell test. It would be better to keep additions under the lit shell test locations. For a test compiling HLSL to DXIL, I think that would be under CodeGenDXIL.

@github-project-automation github-project-automation bot moved this from New to In progress in HLSL Roadmap Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

3 participants