
Add docs for training hub#159

Open
typhoonzero wants to merge 3 commits into master from add_docs_for_training_hub

Conversation

@typhoonzero
Contributor

@typhoonzero typhoonzero commented Mar 23, 2026

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive OSFT tutorial notebook with end-to-end workflow, example model configurations, parameter reference (including unfreeze_rank_ratio, target_patterns, unmasking options), distributed training resource calculations, run/troubleshooting guidance, and post-training validation tips.
    • Added Training Hub fine-tuning guide covering setup, JSONL data format, SFT/OSFT workflows, parameter reference, and multi-node training guidance.


coderabbitai bot commented Mar 23, 2026

Walkthrough

Adds two documentation resources: an OSFT end-to-end tutorial notebook with dataset creation, model/config examples, distributed training setup, run wrapper, and post-training inspection; and a training_hub fine-tuning guide covering installation, JSONL data format, SFT/OSFT loss-masking behavior, parameter reference, and multi-node launch notes.
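The guide's JSONL conversation schema can be sanity-checked with a few lines of Python. This is a minimal sketch: the field names (`messages`, `role`, `content`) are assumptions about the schema the guide describes, not confirmed details from it.

```python
import json

# Hypothetical example of one training record in the JSONL conversation
# format; the field names "messages", "role", and "content" are assumed.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What does OSFT stand for?"},
        {"role": "assistant", "content": "Orthogonal Subspace Fine-Tuning."},
    ]
}

# JSONL means one JSON object per line; write and re-read to sanity-check.
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")

with open("train.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]

print(len(rows), rows[0]["messages"][1]["role"])  # 1 user
```

Each line must parse as a standalone JSON object; a trailing comma or a multi-line pretty-printed object would break most JSONL loaders.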

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **OSFT Tutorial Notebook**<br/>`docs/en/workbench/how_to/osft_comprehensive_tutorial.ipynb` | New comprehensive Jupyter notebook: dummy JSONL dataset cell, OSFT conceptual docs, example model configs (Qwen, Llama, Phi, generic), full parameter reference (including `unfreeze_rank_ratio`, `target_patterns`, `unmask_messages`, `use_processed_dataset`), distributed-resource computation, `training_params` assembly, `osft(**training_params)` run wrapped in try/except with troubleshooting, and post-training checkpoint inspection and guidance. |
| **Training Hub Fine-tuning Guide**<br/>`docs/en/workbench/how_to/training_hub_fine_tuning.mdx` | New MDX page: installation/prerequisites for Workbench, expected JSONL conversation schema and supported role values, SFT vs OSFT loss-masking behavior and preprocessed-dataset option, example `sft(...)`/`osft(...)` calls, consolidated parameter table (including OSFT-specific options), and multi-node rendezvous guidance (`rdzv_id`, `rdzv_endpoint`, `node_rank`). |

Sequence Diagram(s)

(omitted — changes are documentation-only and do not introduce new multi-component control flow)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • zhaomingkun1030

Poem

🐰 I hopped through notebooks, line by line,
I tucked configs snug and docs to shine,
OSFT's ranks and masks in gentle tune,
Checkpoints hum beneath the training moon,
Hop on—models warm by dawn's first light 🌙✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title 'Add docs for training hub' directly and clearly summarizes the main change: adding documentation for the training hub feature, which matches the PR's addition of two documentation files (a comprehensive tutorial notebook and a fine-tuning guide). |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/workbench/how_to/osft_comprehensive_tutorial.ipynb`:
- Line 402: Remove the internal author TODO string "# TODO: revert these
overrides after we've concluded training" from the notebook cell in the tutorial
flow; either delete that line or replace it with user-facing instructions that
explain the current state (e.g., "Temporary overrides applied for training — no
action required") so the published tutorial contains only user-directed
guidance; locate the cell containing that exact TODO string to make the edit
(search for "# TODO: revert these overrides after we've concluded training").
- Line 886: The docs show conflicting defaults for the setting 'use_liger'—the
table row lists default False while the configuration cell assigns use_liger =
True and the companion MDX page also differs; pick the canonical default (e.g.,
True) and update all occurrences to match: change the table/default example
entry to the chosen value and/or update the configuration cell to the chosen
value so the table row, the config cell (variable use_liger), and the companion
MDX description all state the same default and example.
- Around line 539-552: The multi-node template entries (keys like
"multi_node_master" and "multi_node_worker") are inconsistent: change "nnodes"
to match the described 4-node topology (set nnodes: 4 for master and workers),
update "node_rank" comments so workers indicate ranks 1..3, and make
"rdzv_endpoint" a clearly labeled placeholder (e.g., "MASTER_IP:PORT") rather
than a concrete cluster IP; keep "rdzv_id" unchanged but ensure descriptions
(the "description" fields) reflect 4 nodes and indicate that workers must change
their node_rank per node.
- Around line 297-299: The preset is inconsistent: the JSON fields model_name
and model_path disagree (model_name currently "Qwen 2.5 7B Instruct" while
model_path points to "Qwen3-0.6B"); update one to match the other (e.g., set
model_name to "Qwen3-0.6B" if model_path is correct, or change model_path to the
correct HuggingFace/local path for "Qwen 2.5 7B Instruct"), and ensure the
adjacent comment and example_unfreeze_rank_ratio remain accurate for the chosen
model.

In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx`:
- Line 136: The max_seq_len value in the OSFT example is a typo: update the
assignment of max_seq_len from 8196 to 8192 in the OSFT/example config (look for
the max_seq_len=8196 occurrence) so it matches the 8K examples elsewhere and
maintain consistency across docs.
- Line 23: The table row under the "Catastrophic forgetting" column uses awkward
modal phrasing "Requires mixed/replay data to mitigate"; change it to a clearer
verb form such as "Mitigated by using mixed/replay data" (or "Mitigated using
mixed/replay data") so the row reads: | **Catastrophic forgetting** | Mitigated
by using mixed/replay data | Prevented algorithmically |, updating the cell text
accordingly.
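The multi-node fix requested above (consistent `nnodes`, per-worker `node_rank`, placeholder endpoint) might look like the following. The surrounding template structure and the `"osft-job"` rendezvous ID are illustrative assumptions; `"MASTER_IP:PORT"` is deliberately a placeholder, not a real endpoint.

```python
# Corrected multi-node templates as the review comment describes: a 4-node
# topology where every node agrees on nnodes and rendezvous settings and
# only node_rank differs. "MASTER_IP:PORT" is a placeholder to replace.
MULTI_NODE_TEMPLATES = {
    "multi_node_master": {
        "nnodes": 4,
        "node_rank": 0,                     # master is always rank 0
        "rdzv_id": "osft-job",              # any shared job ID works
        "rdzv_endpoint": "MASTER_IP:PORT",  # replace with the master's address
        "description": "Master node of a 4-node run",
    },
    "multi_node_worker": {
        "nnodes": 4,
        "node_rank": 1,                     # workers use ranks 1..3, one per node
        "rdzv_id": "osft-job",
        "rdzv_endpoint": "MASTER_IP:PORT",
        "description": "Worker node of a 4-node run; set node_rank per node (1..3)",
    },
}

print(MULTI_NODE_TEMPLATES["multi_node_worker"]["nnodes"])  # 4
```

Every node must see the same `rdzv_id` and `rdzv_endpoint`; only `node_rank` changes per machine.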

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 380e0c93-7883-4a94-b58a-fde7a5e43c12

📥 Commits

Reviewing files that changed from the base of the PR and between 904d33b and 5edfb29.

📒 Files selected for processing (3)
  • docs/en/workbench/how_to/osft_comprehensive_tutorial.ipynb
  • docs/en/workbench/how_to/sft_comprehensive_tutorial.ipynb
  • docs/en/workbench/how_to/training_hub_fine_tuning.mdx


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
docs/en/workbench/how_to/training_hub_fine_tuning.mdx (1)

23-23: ⚠️ Potential issue | 🟡 Minor

Use clearer phrasing for catastrophic forgetting mitigation.

Line 23 reads awkwardly (“Requires mixed/replay data to mitigate”). Please rephrase for clarity, e.g. “Mitigated by using mixed/replay data.”

Suggested edit
-| **Catastrophic forgetting** | Requires mixed/replay data to mitigate | Prevented algorithmically |
+| **Catastrophic forgetting** | Mitigated by using mixed/replay data | Prevented algorithmically |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx` at line 23, update the
table row in the file where the "Catastrophic forgetting" entry appears (the
cell currently reads "Requires mixed/replay data to mitigate") to a clearer
phrasing such as "Mitigated by using mixed/replay data" so the mitigation column
reads concisely and reads naturally; locate the exact string "Catastrophic
forgetting" in the table and replace the adjacent mitigation cell accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx`:
- Line 23: Update the table row in the file where the "Catastrophic forgetting"
entry appears (the cell currently reads "Requires mixed/replay data to
mitigate") to a clearer phrasing such as "Mitigated by using mixed/replay data"
so the mitigation column reads concisely and reads naturally; locate the exact
string "Catastrophic forgetting" in the table and replace the adjacent
mitigation cell accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6df118c7-39ae-4e8a-8818-15adca701633

📥 Commits

Reviewing files that changed from the base of the PR and between 5edfb29 and 96f8d37.

📒 Files selected for processing (2)
  • docs/en/workbench/how_to/osft_comprehensive_tutorial.ipynb
  • docs/en/workbench/how_to/training_hub_fine_tuning.mdx
✅ Files skipped from review due to trivial changes (1)
  • docs/en/workbench/how_to/osft_comprehensive_tutorial.ipynb


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/en/workbench/how_to/training_hub_fine_tuning.mdx (1)

23-23: ⚠️ Potential issue | 🟡 Minor

Fix awkward modal phrasing in the table row.

The phrasing "Requires mixed/replay data to mitigate" is grammatically awkward. Consider "Mitigated using mixed/replay data" or "Needs mixed/replay data to mitigate."

📝 Suggested edit
-| **Catastrophic forgetting** | Requires mixed/replay data to mitigate | Prevented algorithmically |
+| **Catastrophic forgetting** | Mitigated using mixed/replay data | Prevented algorithmically |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx` at line 23, update the
table row that currently reads "Requires mixed/replay data to mitigate" to a
clearer phrasing such as "Mitigated using mixed/replay data" (or alternatively
"Needs mixed/replay data to mitigate") in the
docs/en/workbench/how_to/training_hub_fine_tuning.mdx table so the modal
phrasing is grammatically correct and consistent with other rows; replace the
exact cell text in the row labelled "Catastrophic forgetting" with the chosen
suggestion.
🧹 Nitpick comments (2)
docs/en/workbench/how_to/training_hub_fine_tuning.mdx (2)

45-51: Clarify the pretraining role.

Line 45 lists pretraining as a supported role value, but the masking behavior section (lines 47-51) doesn't explain what this role does or how it differs from using the "unmask": true flag. Consider adding a brief explanation.

📝 Suggested addition

After line 45, add:

- `pretraining` role: Content with this role is always included in the loss calculation, similar to enabling `unmask` for the sample.

Or integrate into the masking behavior section.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx` around lines 45-51,
add a brief explanation of the "pretraining" role and how it differs from the
other roles: state that the `pretraining` role means content marked with
role="pretraining" is always included in the loss (equivalent to setting
`"unmask": true` for that sample), and mention how it relates to SFT and OSFT
masking (i.e., SFT masks assistant-only by default, OSFT uses `unmask_messages`,
while `pretraining` forces inclusion regardless of those flags). Reference the
`pretraining` role string and the `"unmask": true` flag and `unmask_messages`
parameter so readers can find the relevant behavior descriptions.

98-99: Complete the guidance for unfreeze_rank_ratio range.

The comment provides guidance for 0.1–0.5 but doesn't explain what values above 0.5 represent, leaving users uncertain about the upper half of the valid range (0.5–1.0).

📝 Suggested enhancement
-unfreeze_rank_ratio = 0.25  # 0.1–0.3 conservative, 0.3–0.5 balanced
+unfreeze_rank_ratio = 0.25  # 0.1–0.3 conservative, 0.3–0.5 balanced, 0.5–1.0 aggressive
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx` around lines 98-99,
the documentation for unfreeze_rank_ratio stops at 0.5 and leaves the upper half
of the valid range undefined; update the guidance for the unfreeze_rank_ratio
parameter (used in the training/fine-tuning docs) to explain what values in
0.5–1.0 mean for training behavior and trade-offs (e.g., >0.5 progressively
unfreezes more rank components, increasing model plasticity and compute/memory
requirements, higher risk of overfitting but faster adaptation), and give
concrete recommendations for when to choose conservative (0.1–0.3), balanced
(0.3–0.5), and aggressive (0.5–1.0) settings along with a short sentence on
pitfalls (compute, memory, overfitting) to help users pick a value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx`:
- Around line 60-61: The two table links for "SFT Comprehensive Tutorial" and
"OSFT Comprehensive Tutorial" currently point to the directory view; update
their URLs so they reference the actual notebook files
(sft_comprehensive_tutorial.ipynb and osft_comprehensive_tutorial.ipynb) using
either the /blob/master/ path for direct file view or /raw/master/ for direct
download, ensuring the markdown link targets the specific file rather than the
folder.

---

Duplicate comments:
In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx`:
- Line 23: Update the table row that currently reads "Requires mixed/replay data
to mitigate" to a clearer phrasing such as "Mitigated using mixed/replay data"
(or alternatively "Needs mixed/replay data to mitigate") in the
docs/en/workbench/how_to/training_hub_fine_tuning.mdx table so the modal
phrasing is grammatically correct and consistent with other rows; replace the
exact cell text in the row labelled "Catastrophic forgetting" with the chosen
suggestion.

---

Nitpick comments:
In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx`:
- Around line 45-51: Add a brief explanation of the "pretraining" role and how
it differs from the other roles: state that the `pretraining` role means content
marked with role="pretraining" is always included in the loss (equivalent to
setting `"unmask": true` for that sample), and mention how it relates to SFT and
OSFT masking (i.e., SFT masks assistant-only by default, OSFT uses
`unmask_messages`, while `pretraining` forces inclusion regardless of those
flags). Reference the `pretraining` role string and the `"unmask": true` flag
and `unmask_messages` parameter so readers can find the relevant behavior
descriptions.
- Around line 98-99: The documentation for unfreeze_rank_ratio stops at 0.5 and
leaves the upper half of the valid range undefined; update the guidance for the
unfreeze_rank_ratio parameter (used in the training/fine-tuning docs) to explain
what values in 0.5–1.0 mean for training behavior and trade-offs (e.g., >0.5
progressively unfreezes more rank components, increasing model plasticity and
compute/memory requirements, higher risk of overfitting but faster adaptation),
and give concrete recommendations for when to choose conservative (0.1–0.3),
balanced (0.3–0.5), and aggressive (0.5–1.0) settings along with a short
sentence on pitfalls (compute, memory, overfitting) to help users pick a value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c7e52585-b4c7-4a3c-abed-718a1e780ed0

📥 Commits

Reviewing files that changed from the base of the PR and between 96f8d37 and ce9309d.

📒 Files selected for processing (1)
  • docs/en/workbench/how_to/training_hub_fine_tuning.mdx

Comment on lines +60 to +61
| SFT Comprehensive Tutorial | Supervised Fine-Tuning | [Download sft_comprehensive_tutorial.ipynb](https://github.com/alauda/aml-docs/tree/master/docs/en/workbench/how_to) |
| OSFT Comprehensive Tutorial | Orthogonal Subspace Fine-Tuning | [Download osft_comprehensive_tutorial.ipynb](https://github.com/alauda/aml-docs/tree/master/docs/en/workbench/how_to) |

⚠️ Potential issue | 🟠 Major

Fix incomplete download links.

Both download links point to the same directory view (/tree/master/docs/en/workbench/how_to) rather than to specific notebook files. Users expecting to download sft_comprehensive_tutorial.ipynb and osft_comprehensive_tutorial.ipynb will land on a directory listing instead.

🔗 Suggested fix

Update the links to point to the actual files:

-| SFT Comprehensive Tutorial | Supervised Fine-Tuning | [Download sft_comprehensive_tutorial.ipynb](https://github.com/alauda/aml-docs/tree/master/docs/en/workbench/how_to) |
-| OSFT Comprehensive Tutorial | Orthogonal Subspace Fine-Tuning | [Download osft_comprehensive_tutorial.ipynb](https://github.com/alauda/aml-docs/tree/master/docs/en/workbench/how_to) |
+| SFT Comprehensive Tutorial | Supervised Fine-Tuning | [Download sft_comprehensive_tutorial.ipynb](https://github.com/alauda/aml-docs/tree/master/docs/en/workbench/how_to/sft_comprehensive_tutorial.ipynb) |
+| OSFT Comprehensive Tutorial | Orthogonal Subspace Fine-Tuning | [Download osft_comprehensive_tutorial.ipynb](https://github.com/alauda/aml-docs/tree/master/docs/en/workbench/how_to/osft_comprehensive_tutorial.ipynb) |

Or use /blob/master/ for direct file view, or /raw/master/ for raw download.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/workbench/how_to/training_hub_fine_tuning.mdx` around lines 60 - 61,
The two table links for "SFT Comprehensive Tutorial" and "OSFT Comprehensive
Tutorial" currently point to the directory view; update their URLs so they
reference the actual notebook files (sft_comprehensive_tutorial.ipynb and
osft_comprehensive_tutorial.ipynb) using either the /blob/master/ path for
direct file view or /raw/master/ for direct download, ensuring the markdown link
targets the specific file rather than the folder.

@cloudflare-workers-and-pages

Deploying alauda-ai with Cloudflare Pages

Latest commit: ce9309d
Status: ✅  Deploy successful!
Preview URL: https://07f2b7a9.alauda-ai.pages.dev
Branch Preview URL: https://add-docs-for-training-hub.alauda-ai.pages.dev

View logs

@typhoonzero typhoonzero changed the title WIP: Add docs for training hub Add docs for training hub Mar 24, 2026
