refactor autoencoder tests (temporal decoder, cosmos, kvae, mochi)#13832
refactor autoencoder tests (temporal decoder, cosmos, kvae, mochi)#13832akshan-main wants to merge 7 commits into
Conversation
|
@askserge could you do a review? |
There was a problem hiding this comment.
🤗 Serge says:
Overall
Clean migration of four autoencoder test suites to the new mixin-based structure. The pattern is consistent with other already-migrated tests in the repo. The model-side fix to AutoencoderKLMochi.forward is correct and well-motivated.
Correctness
-
Mochi
forwardfix — The old code calledself.decode(z)(alwaysreturn_dict=True), then manually checkedif not return_dict: return (dec,), which would wrap aDecoderOutputin a tuple instead of returning a raw tensor tuple. The new code correctly propagatesreturn_dicttodecode, which already handles both branches. Good fix. -
Dropped
test_sharded_checkpoints_device_mapskip — The old Mochi test skipped this test (with a multi-GPU error reason). The newModelTesterMixin(fromtests/models/testing_utils/common.py) does not definetest_sharded_checkpoints_device_map, so the skip is correctly dropped. -
Renamed
test_effective_gradient_checkpointing→test_gradient_checkpointing_equivalence— Matches the method name in the newTrainingTesterMixin. Correct. -
input_shapeproperty removed — Not used anywhere in the new mixin framework orNewAutoencoderTesterMixin. Correct to drop.
Minor observations
-
The
generatorproperty creates a newtorch.Generatoron every access. This is intentional (ensures fresh state per call) and matches the pattern in other migrated tests (test_models_autoencoder_kl.py,test_models_autoencoder_dc.py, etc.). -
The
main_input_nameproperty is only consumed byMemoryTesterMixin(intest_layerwise_casting_training). All four configs correctly define it. -
Pre-existing: the return type annotation on
AutoencoderKLMochi.forwardistorch.Tensor | torch.Tensor(line 1095, not in this diff) — should probably beDecoderOutput | tuple[torch.Tensor], but that's out of scope for this PR.
20 LLM turns · 28 tool calls · 89.2s · 439893 in / 4044 out tokens
|
@sayakpaul
return type annotation I'll make that change too |
What does this PR do?
Part of the ongoing modeling-test migration (following #13369 and #13153). Migrates the AutoencoderKLTemporalDecoder, AutoencoderKLCosmos, AutoencoderKLKVAE, and AutoencoderKLMochi test suites to the mixin-based structure (Config + ModelTesterMixin / TrainingTesterMixin / MemoryTesterMixin / NewAutoencoderTesterMixin).
Also fixes AutoencoderKLMochi.forward to propagate return_dict to decode, which was returning a DecoderOutput instead of a tensor with return_dict=False.
Before submitting
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@sayakpaul @DN6