
Add XPU MoE decode kernel (FP16/BF16 + INT4 sym/asym) #1813

Draft

Copilot wants to merge 5 commits into main from copilot/add-xpu-moe-decode-implementation


Conversation

Contributor

Copilot AI commented May 14, 2026

  • Add INT8 (sym/asym) decode GEMV kernel in sycl_tla_moe_decode.hpp
  • Add INT2 (sym/asym) decode GEMV kernel (four values packed per byte, sign-extended via a shift trick)
  • Add FP8 (E4M3 / E5M2) decode GEMV kernel with inline bit-pattern decode (verified to match the torch FP8 cast for all 256 byte values)
  • Extend the moe_gemm_decode Python wrapper to accept weight_bits=8, weight_bits=2, and FP8 weight dtypes (torch.float8_e4m3fn, torch.float8_e5m2)
  • Add unit tests in test_moe.py for INT8 sym/asym, INT2 sym/asym, and FP8 E4M3/E5M2 (parametrized over fp16/bf16 activations and group_size where applicable), plus an FP8+asym validation-error case
  • Verified that the packing/dequant helpers and a kernel-equivalent decode produce bit-exact reference output (INT2 sym/asym diff = 0; FP8 formulas match the torch cast for all 256 byte values)
  • Parallel validation: CodeQL reported 0 alerts; a code-review nit on a docstring was addressed
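
The INT2 sign extension mentioned above can be modeled outside the kernel. This is a hedged NumPy sketch; the actual SYCL kernel and its packing layout are not shown in this PR description, so the low-bits-first byte layout here is an assumption:

```python
import numpy as np

def unpack_int2_sym(packed: np.ndarray) -> np.ndarray:
    """Unpack four signed 2-bit values from every byte of `packed` (uint8).

    Assumes the low two bits hold the first value (layout is an assumption,
    not stated in the PR). Sign extension uses the shift trick: move the
    2-bit field to the top of an int8, then arithmetic-shift it back down.
    """
    fields = []
    for shift in (0, 2, 4, 6):
        bits = (packed >> shift) & 0x3                    # uint8 field in 0..3
        fields.append((bits << 6).astype(np.int8) >> 6)   # sign-extend to -2..1
    # interleave so each byte expands to its four values in order
    return np.stack(fields, axis=-1).reshape(-1)
```

For example, the byte 78 (0b01001110, i.e. fields 2, 3, 0, 1 from low to high) unpacks to -2, -1, 0, 1.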
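
The inline FP8 bit-pattern decode can be modeled in scalar Python. The formulas below follow the standard E4M3 "fn" layout (bias 7, no infinities) and the IEEE-style E5M2 layout (bias 15), which is what "match torch FP8 cast for all 256 byte values" implies; the kernel's actual code is not shown here:

```python
import math

def decode_e4m3fn(b: int) -> float:
    """Decode one float8_e4m3fn byte: 1 sign, 4 exponent (bias 7), 3 mantissa.
    The 'fn' variant has no infinities; only exponent=15, mantissa=7 is NaN."""
    s = -1.0 if b & 0x80 else 1.0
    e = (b >> 3) & 0xF
    m = b & 0x7
    if e == 0xF and m == 0x7:
        return math.nan
    if e == 0:                                  # subnormal
        return s * (m / 8.0) * 2.0 ** -6
    return s * (1.0 + m / 8.0) * 2.0 ** (e - 7)

def decode_e5m2(b: int) -> float:
    """Decode one float8_e5m2 byte: 1 sign, 5 exponent (bias 15), 2 mantissa.
    IEEE-style: exponent=31 encodes inf (mantissa 0) or NaN."""
    s = -1.0 if b & 0x80 else 1.0
    e = (b >> 2) & 0x1F
    m = b & 0x3
    if e == 0x1F:
        return s * math.inf if m == 0 else math.nan
    if e == 0:                                  # subnormal
        return s * (m / 4.0) * 2.0 ** -14
    return s * (1.0 + m / 4.0) * 2.0 ** (e - 15)
```

As a sanity check, decode_e4m3fn(0x7E) gives 448.0, the E4M3 maximum finite value, and decode_e4m3fn(0x38) gives 1.0.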
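
A groupwise asymmetric dequant plus decode-style GEMV (batch size 1) can be sketched as follows. The shapes and the uint8-codes-plus-zero-point convention are assumptions for illustration; the PR only states that the INT8 kernel lives in sycl_tla_moe_decode.hpp:

```python
import numpy as np

def dequant_int8_asym(q, scales, zeros, group_size):
    """q: (rows, cols) uint8 codes; scales/zeros: (rows, cols // group_size).
    Each group dequantizes as (code - zero_point) * scale."""
    rows, cols = q.shape
    g = q.reshape(rows, cols // group_size, group_size).astype(np.float32)
    w = (g - zeros[..., None]) * scales[..., None]
    return w.reshape(rows, cols)

def moe_gemv_int8_asym(x, q, scales, zeros, group_size):
    """Decode-phase GEMV: y = W @ x with W dequantized on the fly."""
    return dequant_int8_asym(q, scales, zeros, group_size) @ x
```

A real decode kernel would fuse the dequant into the dot product rather than materializing W; this reference form is what the bit-exactness checks above compare against.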
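
The wrapper's dispatch and the FP8+asym validation error can be schematized as below. The function name and the string dtype tags are hypothetical stand-ins; the real moe_gemm_decode wrapper dispatches on torch dtypes such as torch.float8_e4m3fn:

```python
# Hypothetical stand-in for the moe_gemm_decode argument validation;
# the real wrapper keys on torch dtypes, modeled here as strings.
FP8_DTYPES = {"float8_e4m3fn", "float8_e5m2"}

def select_decode_kernel(weight_dtype, weight_bits=None, asymmetric=False):
    if weight_dtype in FP8_DTYPES:
        if asymmetric:
            # mirrors the FP8+asym validation-error test case in test_moe.py
            raise ValueError("asymmetric quantization is not supported for FP8 weights")
        return "fp8_decode"
    if weight_bits not in (2, 4, 8):
        raise ValueError(f"unsupported weight_bits: {weight_bits}")
    return f"int{weight_bits}_decode"
```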

Copilot AI and others added 2 commits May 14, 2026 04:03
Copilot AI and others added 2 commits May 14, 2026 07:16
