Commit 5090864

fix: shell cd syntax, proper pip install, os.chdir for working dir
1 parent 1a9fd83 commit 5090864

1 file changed

Lines changed: 156 additions & 168 deletions

@@ -1,170 +1,158 @@
 {
-"cells": [
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# 🎯 Stack 2.9 — 128K Context Fine-tuning\n",
-"\n",
-"Fine-tune **Qwen2.5-Coder-1.5B** with **packed 128K context windows**.\n",
-"\n",
-"**Key innovation:** Instead of training on short ~500-token examples, we **pack 200+ examples** into each 128K window. This multiplies training signal and teaches the model to track tool state across long, multi-turn interactions.\n",
-"\n",
-"**Runtime:** Runtime → Change runtime type → **GPU (T4 16GB recommended)**\n",
-"**Time:** ~6-8 hours on free Colab T4"
-]
+"cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# \ud83c\udfaf Stack 2.9 \u2014 128K Context Fine-tuning\n",
+"\n",
+"Fine-tune **Qwen2.5-Coder-1.5B** with **packed 128K context windows**.\n",
+"1500 examples \u2192 packed into long 128K sequences for massive training signal.\n",
+"\n",
+"**Runtime:** Runtime \u2192 Change runtime type \u2192 **GPU (T4 16GB)**\n",
+"**Time:** ~6-8 hours on free Colab T4"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Step 1: Clone Stack 2.9 & Install Dependencies"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"# Clone and install in ONE cell \u2014 use ! for shell commands\n",
+"!git clone https://github.com/my-ai-stack/stack-2.9.git\n",
+"!pip install -q transformers peft datasets bitsandbytes>=0.46.1 accelerate huggingface_hub scipy\n",
+"!pip install -q torch --upgrade\n",
+"import os\n",
+"os.chdir('/content/stack-2.9') # Change to repo dir for subsequent cells\n",
+"print('\u2705 Cloned & installed. Working dir:', os.getcwd())"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Step 2: Login to HuggingFace"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from huggingface_hub import login\n",
+"# \ud83d\udc47 Put YOUR HF token here\n",
+"login(token=\"YOUR_HF_TOKEN_HERE\")\n",
+"print('\u2705 Logged into HuggingFace')"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Step 3: Mount Google Drive"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from google.colab import drive\n",
+"drive.mount('/content/drive')\n",
+"OUTPUT_DIR = '/content/drive/MyDrive/stack-2.9-128k-output'\n",
+"os.makedirs(OUTPUT_DIR, exist_ok=True)\n",
+"print(f'\ud83d\udcc1 Output: {OUTPUT_DIR}')"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Step 4: Download Training Data"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"import huggingface_hub, shutil, json\n",
+"\n",
+"print('Downloading from HuggingFace...')\n",
+"path = huggingface_hub.hf_hub_download(\n",
+" repo_id='walidsobhie/stack-2-9-tool-examples',\n",
+" filename='tool_examples_combined.jsonl',\n",
+" repo_type='dataset',\n",
+" local_dir='/content/',\n",
+")\n",
+"shutil.move(path, '/content/tool_examples.jsonl')\n",
+"\n",
+"with open('/content/tool_examples.jsonl') as f:\n",
+" lines = f.readlines()\n",
+"print(f'\u2705 {len(lines)} examples ready')"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Step 5: Train! \ud83c\udfaf"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"import subprocess\n",
+"\n",
+"result = subprocess.run([\n",
+" \"python3\", \"training/train_extended_context.py\",\n",
+" \"--model-path\", \"Qwen/Qwen2.5-Coder-1.5B\",\n",
+" \"--data-path\", \"/content/tool_examples.jsonl\",\n",
+" \"--output-dir\", OUTPUT_DIR,\n",
+" \"--context-length\", \"131072\",\n",
+" \"--lora-rank\", \"32\",\n",
+" \"--epochs\", \"3\",\n",
+" \"--batch-size\", \"1\",\n",
+" \"--grad-accum\", \"16\",\n",
+" \"--lr\", \"2e-4\",\n",
+" \"--use-packing\",\n",
+" \"--push-to-hub\",\n",
+" \"--hub-model-id\", \"walidsobhie/stack-2.9-128k-context\"\n",
+"], cwd=\"/content/stack-2.9\")\n",
+"\n",
+"print('STDOUT:\\n', result.stdout[-2000:] if result.stdout else '')\n",
+"print('STDERR:\\n', result.stderr[-2000:] if result.stderr else '')"
+]
+}
+],
+"metadata": {
+"colab": {
+"provenance": [],
+"name": "stack-2.9-128k-packed-training"
+},
+"kernelspec": {
+"name": "python3",
+"display_name": "Python 3"
+},
+"language_info": {
+"name": "python",
+"version": "3.10.0"
+}
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": ["## Step 1: Clone Stack 2.9 & Install Dependencies"]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"# Clone the repo (gets the fixed training script)\n",
-"!git clone https://github.com/my-ai-stack/stack-2.9.git\n",
-"cd stack-2.9\n",
-"\n",
-"# Install all dependencies\n",
-"!pip install -q transformers peft datasets bitsandbytes>=0.46.1 accelerate huggingface_hub scipy\n",
-"!pip install -q torch --upgrade\n",
-"\n",
-"print('✅ Dependencies installed')"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": ["## Step 2: Login to HuggingFace\n\nGet your token at: https://huggingface.co/settings/tokens"]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from huggingface_hub import login\n",
-"# 👇 Replace with YOUR HuggingFace token\n",
-"login(token=\"YOUR_HF_TOKEN_HERE\") # ← 🔴 PUT YOUR HF TOKEN HERE\n",
-"print('✅ Logged into HuggingFace')"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Step 3: Mount Google Drive\n\nTraining checkpoints and the final adapter will be saved here."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from google.colab import drive\n",
-"drive.mount('/content/drive')\n",
-"OUTPUT_DIR = '/content/drive/MyDrive/stack-2.9-128k-output'\n",
-"import os; os.makedirs(OUTPUT_DIR, exist_ok=True)\n",
-"print(f'📁 Output directory: {OUTPUT_DIR}')"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Step 4: Download Training Data\n\nWe use the dataset uploaded to HuggingFace Hub — 1500 tool-calling examples, packed into 128K sequences."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"import huggingface_hub\n",
-"\n",
-"DATA_FILE = '/content/tool_examples.jsonl'\n",
-"\n",
-"print('Downloading training data from HuggingFace...')\n",
-"hf_id = 'walidsobhie/stack-2-9-tool-examples'\n",
-"path = huggingface_hub.hf_hub_download(\n",
-" repo_id=hf_id,\n",
-" filename='tool_examples_combined.jsonl',\n",
-" repo_type='dataset',\n",
-" local_dir='/content/',\n",
-" local_dir_use_symlinks=False,\n",
-")\n",
-"import shutil\n",
-"shutil.move(path, DATA_FILE)\n",
-"print(f'✅ Dataset ready: {DATA_FILE}')\n",
-"\n",
-"# Quick sanity check\n",
-"import json\n",
-"with open(DATA_FILE) as f:\n",
-" lines = f.readlines()\n",
-"print(f' Total examples: {len(lines)}')\n",
-"ex = json.loads(lines[0])\n",
-"print(f' Keys: {list(ex.keys())}')"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Step 5: Run 128K Packed Context Fine-tuning\n\n**This cell runs the full training. On free Colab T4 it takes ~6-8 hours.**\n",
-"\n",
-"If Colab disconnects, your checkpoints are safe in Google Drive. Reconnect and re-run this cell — it will resume from the last checkpoint."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"import subprocess\n",
-"\n",
-"# Run the fixed training script with packing enabled\n",
-"result = subprocess.run([\n",
-" \"python3\", \"training/train_extended_context.py\",\n",
-" \"--model-path\", \"Qwen/Qwen2.5-Coder-1.5B\",\n",
-" \"--data-path\", \"/content/tool_examples.jsonl\",\n",
-" \"--output-dir\", OUTPUT_DIR,\n",
-" \"--context-length\", \"131072\",\n",
-" \"--lora-rank\", \"32\",\n",
-" \"--epochs\", \"3\",\n",
-" \"--batch-size\", \"1\",\n",
-" \"--grad-accum\", \"16\",\n",
-" \"--lr\", \"2e-4\",\n",
-" \"--use-packing\",\n",
-" \"--push-to-hub\",\n",
-" \"--hub-model-id\", \"walidsobhie/stack-2.9-128k-context\"\n",
-"], cwd=\"/content/stack-2.9\")\n",
-"\n",
-"print('STDOUT:', result.stdout)\n",
-"print('STDERR:', result.stderr[-3000:] if result.stderr else '(none)')"
-]
-}
-],
-"metadata": {
-"colab": {
-"provenance": [],
-"name": "stack-2.9-128k-packed-training"
-},
-"kernelspec": {
-"name": "python3",
-"display_name": "Python 3"
-},
-"language_info": {
-"name": "python",
-"version": "3.10.0"
-}
-},
-"nbformat": 4,
-"nbformat_minor": 4
-}
+"nbformat": 4,
+"nbformat_minor": 4
+}
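Why the `cd stack-2.9` line had to go: in a notebook, a shell command runs in a child process that exits as soon as the line finishes, so it can never change the kernel's working directory, while `os.chdir` changes it for the kernel process itself and every later cell inherits it. A minimal sketch of the difference (the temporary directory stands in for `/content/stack-2.9`):

```python
import os
import subprocess
import tempfile

start = os.getcwd()
target = tempfile.mkdtemp()  # stand-in for /content/stack-2.9

# What `!cd target` does in a notebook: a child shell cds, then exits.
# The parent process (the kernel) is untouched.
subprocess.run(["sh", "-c", f"cd {target}"], check=True)
print(os.getcwd() == start)  # True -- still in the original directory

# os.chdir changes the working directory of the kernel process itself,
# so subsequent cells (and the !commands they spawn) inherit it.
os.chdir(target)
print(os.path.samefile(os.getcwd(), target))  # True

os.chdir(start)  # restore for anything that runs after this
```

The `%cd` line magic would also work here, since magics run in the kernel process rather than a subshell.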
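A related shell hazard worth knowing about, not addressed by this commit: in `!pip install bitsandbytes>=0.46.1`, the shell parses `>=0.46.1` as a redirect into a file literally named `=0.46.1`, so pip only ever sees the bare package name; quoting the specifier (`"bitsandbytes>=0.46.1"`) passes it through intact. A sketch using `echo` as a harmless stand-in for `pip install`, run in a scratch directory:

```shell
set -eu
cd "$(mktemp -d)"

# Unquoted: the shell splits this into `echo install bitsandbytes`
# with stdout redirected into a file named `=0.46.1`.
echo install bitsandbytes>=0.46.1
cat '=0.46.1'    # the version spec never reached the command

# Quoted: the full specifier is passed through as one argument.
echo install 'bitsandbytes>=0.46.1'
```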
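On the training cell: `subprocess.run` leaves `result.stdout` and `result.stderr` as `None` unless asked to capture them, which is why the new version guards with `if result.stdout else ''` where the old one printed `result.stdout` directly. To actually get a log tail to slice, the call would also need `capture_output=True, text=True`. A minimal sketch of the behavior:

```python
import subprocess
import sys

# Default: child output goes straight to the console; nothing is captured.
r1 = subprocess.run([sys.executable, "-c", "print('hello')"])
print(r1.stdout)  # None

# Capturing with text=True yields decoded strings, so slicing a tail
# like result.stdout[-2000:] works as intended.
r2 = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True,
    text=True,
)
print(repr(r2.stdout.strip()))  # 'hello'
```

The trade-off is that captured output is buffered until the child exits, so for a multi-hour training run, streaming to the console (the default) may actually be the better choice.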
