Skip to content

[Distillation] Fix eod masking + strategy refactoring#3478

Draft
vlad-karp wants to merge 2 commits intomainfrom
vladk/distill-eod-refactor
Draft

[Distillation] Fix eod masking + strategy refactoring#3478
vlad-karp wants to merge 2 commits intomainfrom
vladk/distill-eod-refactor

Conversation

@vlad-karp
Copy link
Collaborator

Description

  1. The current code didn't correctly mask eod tokens when they are segment delimiters - sft mode was fine, but the pretraining incorrectly included that token into prediction.
    The last token in the sample still need to predict the eod token as expected, it is the oed input token which need to be excluded from the loss.

  2. The kl divergence averaging wasn't correct as well.

  3. Since this logic was in anonymous labels_fn function, it is hard to unit test it.
    Since we are not longer inheriting from tunix distillation strategy, we can add it directly to our custom strategy class.
    The base strategy class has been intrduced.

Tests

A new test test_strategy_ignores_segmentation_zero_tokens() has been added to tests/post_training/unit/train_distill_test.py

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 76.47059% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ners/post_train/distillation/distillation_utils.py 74.19% 7 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant