Add invert_permute operator with SYCL implementation for XPU by aagalleg · Pull Request #71 · intel/torchlib-xpu

aagalleg · 2026-06-18T23:20:20Z

This PR introduces the invert_permute operator to fbgemm-xpu, enabling inverse permutations on Intel XPU devices.

Changes

Core Implementation

SYCL Kernel (invert_permute_kernel.cpp/h): High-performance SYCL implementation that computes inverse permutations on XPU devices
Operator Registration (ops_registry.cpp): Registers the operator with PyTorch's dispatch system using conditional schema registration to avoid conflicts
Python API (ops.py): Clean Python wrapper function with type hints for easy integration

Infrastructure

Build System (CMakeLists.txt): Added SYCL kernel to build configuration
Torch Library Utilities (torch_library.h): Helper utilities for schema existence checking and operator registration
Cleanup: Removed .gitkeep files from populated directories

Testing

Comprehensive Test Suite (test_invert_permute.py):
- Correctness validation for int32/int64 data types
- Edge cases (empty, single element, identity, reverse permutations)
- Input validation and error handling
- CPU-XPU consistency verification

cc: @manuelhsantana, @flezaalv

Remove .gitkeep placeholders

- Add invert_permute kernel to CMake build - Implement invert_permute Python wrapper in ops.py - Register invert_permute operator with schema existence check - Add torch_library.h utility for schema validation

Add SYCL/XPU kernel implementation for invert_permute operation.

Add complete test coverage for invert_permute operator on XPU devices, covering correctness, validation, parity, and performance. Test coverage includes: - Correctness tests for int32/int64 with edge cases (empty, single element, identity, reverse, random permutations) - Input validation tests for invalid dimensions and dtypes - Meta function tests for torch.compile compatibility - PyTorch opcheck validation for operator conventions - Parametric tests with varying sizes (1 to 1M elements) - CPU-XPU parity tests to ensure consistent results - Performance benchmarks measuring execution time and bandwidth

dvrogozh

@aagalleg, I missed the beginning of the PRs chain and left all the comments in #73. Comments are pretty generic, can you please, refer to them?

For the convenience, please, mark all follow up PRs as drafts - in this way we will clearly know what should be reviewed first.

dvrogozh

In addition to inline comments:

Repository has lint CI - please, address identified issues.
What we implement and merge must be tested under CI. And IMHO doing that by engineers who implemented the code is the best variant rather than offloading that to dedicated CI engineers. Thus, please, modify .github/workflows/ci.yml and add a test section for FBGEMM following the pattern outlined for torchcodec (see link below). The test job for FBGEMM will be much simpler. Target only bmg for now is fine. Note that fbgemm wheel is already getting build by CI.

torchlib-xpu/.github/workflows/ci.yml

Line 199 in d4230aa

test-torchcodec:

dvrogozh · 2026-06-22T18:37:39Z

+      .has_value();
+}
+
+} // namespace fbgemm_xpu::utils::torch


Add empty line in the end of file.

dvrogozh · 2026-06-22T18:38:15Z

+// SYCL Kernel Functor Implementations
+// ============================================================================
+
+void InvertPermuteKernelInt32::operator()(const sycl::nd_item<1>& item) const {


This seems templetable kernel over the operand type. Why don't we do that especially considering that the reference CUDA kernel is doing exactly that [1]?

[1] https://github.com/pytorch/FBGEMM/blob/8744183e4bd7390a674a0d8736bf4d7ef2d0516e/fbgemm_gpu/src/sparse_ops/sparse_invert_permute.cu#L15

dvrogozh · 2026-06-22T18:38:48Z

+
 __all__ = [
-	"dense_embedding_codegen_lookup_function",
+    "invert_permute",


Here I am confused. Why do we expose these XPU specific APIs from fbgemm xpu? who is supposed to use them? My understanding was that we are just register low level entrypoints for XPU key so that standard fbgemm APIs start to work on XPU device. Shouldn't that be that user calls fbgemm::invert_permute for example?

aagalleg added 4 commits June 18, 2026 22:37

chore(fbgemm-xpu): remove .gitkeep files from populated directories

081cc88

Remove .gitkeep placeholders

feat(fbgemm-xpu): register invert_permute operation

07bfb4b

- Add invert_permute kernel to CMake build - Implement invert_permute Python wrapper in ops.py - Register invert_permute operator with schema existence check - Add torch_library.h utility for schema validation

feat(fbgemm-xpu): add SYCL kernel implementation for invert_permute

d6d6bad

Add SYCL/XPU kernel implementation for invert_permute operation.

aagalleg requested review from dvrogozh, eromomon and luis-real as code owners June 18, 2026 23:20

aagalleg mentioned this pull request Jun 18, 2026

Add permute_1D_sparse_data Operation for Intel XPU #72

Draft

dvrogozh reviewed Jun 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add invert_permute operator with SYCL implementation for XPU#71

Add invert_permute operator with SYCL implementation for XPU#71
aagalleg wants to merge 4 commits into
intel:mainfrom
aagalleg:feat/invert_permute

aagalleg commented Jun 18, 2026 •

edited

Loading

Uh oh!

dvrogozh left a comment

Uh oh!

dvrogozh left a comment

Uh oh!

dvrogozh Jun 22, 2026

Uh oh!

dvrogozh Jun 22, 2026

Uh oh!

dvrogozh Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

aagalleg commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Core Implementation

Infrastructure

Testing

Uh oh!

dvrogozh left a comment

Choose a reason for hiding this comment

Uh oh!

dvrogozh left a comment

Choose a reason for hiding this comment

Uh oh!

dvrogozh Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

dvrogozh Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

dvrogozh Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aagalleg commented Jun 18, 2026 •

edited

Loading