You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
fix: remove non-tensorizable columns before SDFT training
Two changes to prevent "Unable to create tensor" ValueError when the
data collator encounters the 'messages' column (list of dicts):
1. SDFTDataCollator.__call__: filter features to only pass columns
the base DataCollatorForSeq2Seq can handle before calling it.
2. sdft_train(): after dataset preprocessing, explicitly remove all
columns except 'text', 'teacher_input_ids', 'teacher_attention_mask'
so non-tensorizable columns (messages, demonstration, etc.) never
reach the collator.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
0 commit comments