Skip to content

refactor autoencoder tests (temporal decoder, cosmos, kvae, mochi)#13832

Open
akshan-main wants to merge 7 commits into
huggingface:mainfrom
akshan-main:tests-refactor-autoencoder-temporal-decoder
Open

refactor autoencoder tests (temporal decoder, cosmos, kvae, mochi)#13832
akshan-main wants to merge 7 commits into
huggingface:mainfrom
akshan-main:tests-refactor-autoencoder-temporal-decoder

Conversation

@akshan-main
Copy link
Copy Markdown
Contributor

What does this PR do?

Part of the ongoing modeling-test migration (following #13369 and #13153). Migrates the AutoencoderKLTemporalDecoder, AutoencoderKLCosmos, AutoencoderKLKVAE, and AutoencoderKLMochi test suites to the mixin-based structure (Config + ModelTesterMixin / TrainingTesterMixin / MemoryTesterMixin / NewAutoencoderTesterMixin).

Also fixes AutoencoderKLMochi.forward to propagate return_dict to decode, which was returning a DecoderOutput instead of a tensor with return_dict=False.

Before submitting

Who can review?

@sayakpaul @DN6

@github-actions github-actions Bot added size/L PR with diff > 200 LOC models tests and removed size/L PR with diff > 200 LOC labels May 29, 2026
@sayakpaul
Copy link
Copy Markdown
Member

@askserge could you do a review?

@github-actions github-actions Bot added the size/L PR with diff > 200 LOC label May 29, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤗 Serge says:

Overall

Clean migration of four autoencoder test suites to the new mixin-based structure. The pattern is consistent with other already-migrated tests in the repo. The model-side fix to AutoencoderKLMochi.forward is correct and well-motivated.

Correctness

  • Mochi forward fix — The old code called self.decode(z) (always return_dict=True), then manually checked if not return_dict: return (dec,), which would wrap a DecoderOutput in a tuple instead of returning a raw tensor tuple. The new code correctly propagates return_dict to decode, which already handles both branches. Good fix.

  • Dropped test_sharded_checkpoints_device_map skip — The old Mochi test skipped this test (with a multi-GPU error reason). The new ModelTesterMixin (from tests/models/testing_utils/common.py) does not define test_sharded_checkpoints_device_map, so the skip is correctly dropped.

  • Renamed test_effective_gradient_checkpointingtest_gradient_checkpointing_equivalence — Matches the method name in the new TrainingTesterMixin. Correct.

  • input_shape property removed — Not used anywhere in the new mixin framework or NewAutoencoderTesterMixin. Correct to drop.

Minor observations

  • The generator property creates a new torch.Generator on every access. This is intentional (ensures fresh state per call) and matches the pattern in other migrated tests (test_models_autoencoder_kl.py, test_models_autoencoder_dc.py, etc.).

  • The main_input_name property is only consumed by MemoryTesterMixin (in test_layerwise_casting_training). All four configs correctly define it.

  • Pre-existing: the return type annotation on AutoencoderKLMochi.forward is torch.Tensor | torch.Tensor (line 1095, not in this diff) — should probably be DecoderOutput | tuple[torch.Tensor], but that's out of scope for this PR.

20 LLM turns · 28 tool calls · 89.2s · 439893 in / 4044 out tokens

@akshan-main
Copy link
Copy Markdown
Contributor Author

akshan-main commented May 29, 2026

@sayakpaul
every other VAE uses

DecoderOutput | torch.Tensor

return type annotation

I'll make that change too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

models size/L PR with diff > 200 LOC tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants