Modified qwen_2.5 modelling file to allow replicate_kv_script to work for custom num_kv_heads. #595

Closed
quic-dhirajku wants to merge 1 commit into quic:main from quic-dhirajku:qwen_2.5_vl_replicate_kv_heads

Conversation

@quic-dhirajku
Contributor

Edited the replicate_kv_heads script so that it can load the VLM and export it properly after the KV_heads update.

The export goes through successfully, but the full model needs to be exported for it to work, due to TF version issues.
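
For readers unfamiliar with the idea, here is a minimal sketch of what replicating KV heads means for a key or value projection weight. It is not the code from this PR or from the replicate_kv_heads script: the function name `replicate_kv_weight`, the plain-list weight layout (rows grouped by head), and the `repeat` parameter are all assumptions made for illustration. Each head's block of rows is copied `repeat` times in place, so that every query head still attends to the same original KV head after the head count grows.

```python
def replicate_kv_weight(weight, num_kv_heads, repeat):
    """Return a copy of `weight` with each KV head's rows repeated `repeat` times.

    Hypothetical helper for illustration only. `weight` is assumed to be a
    list of rows shaped (num_kv_heads * head_dim, hidden_size), with the rows
    of each head stored contiguously.
    """
    rows = len(weight)
    if num_kv_heads <= 0 or rows % num_kv_heads != 0:
        raise ValueError("row count must be a multiple of num_kv_heads")
    if repeat < 1:
        raise ValueError("repeat must be at least 1")

    head_dim = rows // num_kv_heads
    replicated = []
    for head in range(num_kv_heads):
        # Rows belonging to this KV head.
        block = weight[head * head_dim:(head + 1) * head_dim]
        # Copies of one head stay adjacent, so new head j maps to old head j // repeat.
        for _ in range(repeat):
            replicated.extend(list(row) for row in block)
    return replicated


# Two KV heads with head_dim = 2 and hidden_size = 2 (toy values).
toy_weight = [[1, 1], [2, 2], [3, 3], [4, 4]]
expanded = replicate_kv_weight(toy_weight, num_kv_heads=2, repeat=2)
new_num_kv_heads = 2 * 2  # the config value would need the same update
```

In this sketch the copies of a head are kept next to each other rather than appended at the end, because grouped-query attention assigns consecutive query heads to the same KV head; interleaving the copies preserves that assignment.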

… for custom num_kv_heads.

Signed-off-by: quic-dhirajku <quic_dhirajku@quicinc.com>
@quic-hemagnih
Contributor

Hi @quic-dhirajku, as discussed, please raise a ticket for SIT to verify the perf hit. If we are good on the perf side and in CI, then let's go ahead and merge it.

In parallel, we can work on the PR below:
#625
