Add rotary embedding onnx domain support #29261

Open
xiaoyu-work wants to merge 2 commits into main from xiaoyu/re

@xiaoyu-work

Contributor

Description

Mobius exports the standard ONNX-domain RotaryEmbedding op. This PR adds support for it.

Motivation and Context

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the GroupQueryAttention fusion optimizer pass to recognize standard (ONNX-domain) RotaryEmbedding nodes when extracting the cos_cache/sin_cache inputs needed to fuse rotary embedding into the com.microsoft.GroupQueryAttention node.

Changes:

  • Adds a helper to retrieve the cos_cache/sin_cache NodeArgs for both com.microsoft.RotaryEmbedding and ONNX-domain RotaryEmbedding, which place these inputs at different positions.
  • Updates the fusion pattern-matching logic to use that helper and to require that rotary cache inputs were successfully identified before fusing.
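
To illustrate the input-ordering difference the helper has to handle, here is a minimal, self-contained sketch. It is not the code from this PR: `MockNode`, `RotaryCaches`, and `GetRotaryCaches` are hypothetical names, and plain strings stand in for onnxruntime's NodeArg pointers. It assumes the input orders given in the operator specs: `input, position_ids, cos_cache, sin_cache` for com.microsoft.RotaryEmbedding, and `X, cos_cache, sin_cache, position_ids (optional)` for the ONNX-domain op.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; this is NOT the onnxruntime Node class.
struct MockNode {
  std::string op_type;
  std::string domain;               // "" (or "ai.onnx") denotes the standard ONNX domain
  std::vector<std::string> inputs;  // input names in declaration order; "" means absent
};

// Names of the two cache inputs extracted from a RotaryEmbedding node.
struct RotaryCaches {
  std::string cos_cache;
  std::string sin_cache;
};

// Returns the cache inputs, or std::nullopt when the node is not a recognized
// RotaryEmbedding variant or the cache inputs are missing. A caller would skip
// the fusion in the nullopt case.
std::optional<RotaryCaches> GetRotaryCaches(const MockNode& node) {
  if (node.op_type != "RotaryEmbedding") return std::nullopt;

  std::size_t cos_index = 0;
  if (node.domain == "com.microsoft") {
    cos_index = 2;  // assumed order: input, position_ids, cos_cache, sin_cache
  } else if (node.domain.empty() || node.domain == "ai.onnx") {
    cos_index = 1;  // assumed order: X, cos_cache, sin_cache, position_ids (optional)
  } else {
    return std::nullopt;  // unknown domain
  }

  const std::size_t sin_index = cos_index + 1;
  if (node.inputs.size() <= sin_index) return std::nullopt;
  if (node.inputs[cos_index].empty() || node.inputs[sin_index].empty()) {
    return std::nullopt;
  }
  return RotaryCaches{node.inputs[cos_index], node.inputs[sin_index]};
}
```

Returning an empty optional rather than a default index mirrors the second bullet: the fusion proceeds only when both caches were positively identified.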

Comment thread onnxruntime/core/optimizer/group_query_attention_fusion.cc
Comment thread onnxruntime/core/optimizer/group_query_attention_fusion.cc
@titaiwangms

Contributor

cc @tianleiwu

3 participants