Support to_huggingface checkpoint conversion for Gemma4#3813

Merged
copybara-service[bot] merged 1 commit into main from aireen/gemma4_to_hf
May 6, 2026

@aireenmei aireenmei commented May 5, 2026

Description

Checkpoint conversion from HF to MaxText (to_maxtext) is already supported. This PR adds support for the other direction (to_huggingface). The following changes are included to reduce friction during conversion:

  • Previously, the param mapping dict did not include multimodal keys when use_multimodal=False. This caused an error if the provided MaxText checkpoint contained multimodal keys while the user only wanted to convert the text-related weights to an HF checkpoint. param_mapping now always loads the full dict, and multimodal keys are kept or skipped later depending on the use_multimodal flag.
  • Adjust the MoE tile size in the scripts for 26b so that it fits in the memory of a v5p-8.
  • Add a hint when the scan_layers config doesn't match the checkpoint.
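
The first and third items can be illustrated with a minimal sketch. It is not the code in this PR: the function names, the "vision" marker for multimodal keys, the `layers_<N>` pattern for unscanned checkpoints, and the toy key names are all assumptions made for illustration.

```python
import re
from typing import Dict, Iterable, Optional

# Hypothetical marker for multimodal parameter names; real key names may differ.
MULTIMODAL_MARKER = "vision"


def is_multimodal_key(key: str) -> bool:
    """Return True if a (toy) parameter name belongs to the multimodal part."""
    return MULTIMODAL_MARKER in key


def convert(ckpt: Dict[str, object], full_mapping: Dict[str, str],
            use_multimodal: bool) -> Dict[str, object]:
    """Rename checkpoint entries using the full mapping.

    The full mapping is always consulted, so multimodal keys in the checkpoint
    are recognized. They are skipped (not treated as errors) when
    use_multimodal is False.
    """
    out = {}
    for key, value in ckpt.items():
        if key not in full_mapping:
            # Truly unknown keys are still an error.
            raise KeyError(f"unmapped checkpoint key: {key}")
        if not use_multimodal and is_multimodal_key(key):
            continue  # known multimodal key, intentionally skipped
        out[full_mapping[key]] = value
    return out


def scan_layers_hint(ckpt_keys: Iterable[str], scan_layers: bool) -> Optional[str]:
    """Return a hint if the scan_layers setting looks inconsistent with the keys.

    Assumption: unscanned checkpoints have per-layer keys such as "layers_0",
    while scanned checkpoints stack all layers under a single key.
    """
    has_per_layer = any(re.search(r"layers_\d+", k) for k in ckpt_keys)
    if scan_layers and has_per_layer:
        return "checkpoint has per-layer keys; try scan_layers=False"
    if not scan_layers and not has_per_layer:
        return "checkpoint has no per-layer keys; try scan_layers=True"
    return None
```

With a checkpoint that holds both text and multimodal entries, `convert(..., use_multimodal=False)` returns only the text entries instead of failing on the multimodal ones.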

b/508269624

Tests

For gemma4-26b-a4b-it model:

  • 26b/test_gemma4_to_hf.sh: the script runs to_huggingface.py for the MaxText -> HF checkpoint conversion, then validates the conversion with forward_pass_logit_checker.py (text-only results).
  • The above results are comparable to the results from convert_gemma4.sh, which runs to_maxtext.py for the HF -> MaxText checkpoint conversion and then validates it with forward_pass_logit_checker.py (text-only results).
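
For readers unfamiliar with logit-based validation, the sketch below shows one common way to compare the logits of two models at a single token position. It does not reproduce forward_pass_logit_checker.py; the metrics (maximum absolute difference and KL divergence), the function names, and the default thresholds are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized vector."""
    m = max(logits)
    log_total = math.log(sum(math.exp(x - m) for x in logits))
    return [x - m - log_total for x in logits]


def kl_divergence(ref_logits: Sequence[float], new_logits: Sequence[float]) -> float:
    """KL(ref || new) between the softmax distributions of two logit vectors."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(new_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two logit vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def logits_match(ref: Sequence[float], new: Sequence[float],
                 atol: float = 1e-1, max_kl: float = 1e-2) -> bool:
    """Hypothetical acceptance rule: both metrics must be under their thresholds."""
    return max_abs_diff(ref, new) <= atol and kl_divergence(ref, new) <= max_kl
```

KL divergence is insensitive to a constant shift of all logits, whereas the absolute difference is not, which is one reason to look at both when comparing a converted checkpoint against its source.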

For gemma4-31b:

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.


codecov Bot commented May 5, 2026

@aireenmei aireenmei force-pushed the aireen/gemma4_to_hf branch from c7838c0 to 3fc8ce3 Compare May 6, 2026 00:30
Comment thread tests/end_to_end/tpu/gemma4/26b/test_gemma4_to_hf.sh

@gagika gagika left a comment

Thanks!

@copybara-service copybara-service Bot merged commit fb83eaa into main May 6, 2026
73 of 77 checks passed
@copybara-service copybara-service Bot deleted the aireen/gemma4_to_hf branch May 6, 2026 07:05

3 participants