Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
bvandermoon
left a comment
There was a problem hiding this comment.
Thank you @charlesli640. Is it possible to migrate train.py directly instead of forking this logic? I am concerned these two files will get out of sync before they are merged, this setup will skip running unit tests, and it makes the code a bit more complicated/harder to follow
94324cd to
bdc6ec6
Compare
Definitely we can move the logic to train.py and make it controlled by enable_nnx config. Actually this is one of experimental solutions I am doing internally - try to create brand-new pre-train using pure NNX style, leaving old linen style pre-train untouched/co-existing. Another solution is submitted on PR #3427. This solution tries to keep/re-use current linen style train loop as much as possible. It created TrainStateNNX class and make existing linen functions compatible to both linen model and nnx model. Please also review the PR3427. We can discuss more on which direction we are going. |
bdc6ec6 to
82365f7
Compare
Description
Implement pre-train using NNX style
Tests
Checklist
Before submitting this PR, please make sure (put X in square brackets):
gemini-reviewlabel.