1 1 {
2- "cells" : [
3- {
4- "cell_type" : " markdown" ,
5- "metadata" : {},
6- "source" : [
7- " # 🎯 Stack 2.9 — 128K Context Fine-tuning\n " ,
8- " \n " ,
9- " Fine-tune **Qwen2.5-Coder-1.5B** with **packed 128K context windows**.\n " ,
10- " \n " ,
11- " **Key innovation:** Instead of training on short ~500-token examples, we **pack 200+ examples** into each 128K window. This multiplies training signal and teaches the model to track tool state across long, multi-turn interactions.\n " ,
12- " \n " ,
13- " **Runtime:** Runtime → Change runtime type → **GPU (T4 16GB recommended)**\n " ,
14- " **Time:** ~6-8 hours on free Colab T4"
15- ]
2+ "cells" : [
3+ {
4+ "cell_type" : " markdown" ,
5+ "metadata" : {},
6+ "source" : [
7+ " # \ud83c\udfaf Stack 2.9 \u2014 128K Context Fine-tuning\n " ,
8+ " \n " ,
9+ " Fine-tune **Qwen2.5-Coder-1.5B** with **packed 128K context windows**.\n " ,
10+ " 1500 tool-calling examples \u2192 packed into 128K-token sequences for a much denser training signal.\n " ,
11+ " \n " ,
12+ " **Runtime:** Runtime \u2192 Change runtime type \u2192 **GPU (T4 16GB)**\n " ,
13+ " **Time:** ~6-8 hours on free Colab T4"
14+ ]
15+ },
16+ {
17+ "cell_type" : " markdown" ,
18+ "metadata" : {},
19+ "source" : [
20+ " ## Step 1: Clone Stack 2.9 & Install Dependencies"
21+ ]
22+ },
23+ {
24+ "cell_type" : " code" ,
25+ "execution_count" : null ,
26+ "metadata" : {},
27+ "outputs" : [],
28+ "source" : [
29+ " # Clone and install in ONE cell \u2014 use ! for shell commands\n " ,
30+ " !git clone https://github.com/my-ai-stack/stack-2.9.git\n " ,
31+ " !pip install -q transformers peft datasets 'bitsandbytes>=0.46.1' accelerate huggingface_hub scipy # quote the version spec so the shell does not treat >= as a redirect\n " ,
32+ " !pip install -q torch --upgrade\n " ,
33+ " import os\n " ,
34+ " os.chdir('/content/stack-2.9') # Change to repo dir for subsequent cells\n " ,
35+ " print('\u2705 Cloned & installed. Working dir:', os.getcwd())"
36+ ]
37+ },
38+ {
39+ "cell_type" : " markdown" ,
40+ "metadata" : {},
41+ "source" : [
42+ " ## Step 2: Login to HuggingFace"
43+ ]
44+ },
45+ {
46+ "cell_type" : " code" ,
47+ "execution_count" : null ,
48+ "metadata" : {},
49+ "outputs" : [],
50+ "source" : [
51+ " from huggingface_hub import login\n " ,
52+ " # \ud83d\udc47 Put YOUR HF token here\n " ,
53+ " login(token=\" YOUR_HF_TOKEN_HERE\" )\n " ,
54+ " print('\u2705 Logged into HuggingFace')"
55+ ]
56+ },
57+ {
58+ "cell_type" : " markdown" ,
59+ "metadata" : {},
60+ "source" : [
61+ " ## Step 3: Mount Google Drive"
62+ ]
63+ },
64+ {
65+ "cell_type" : " code" ,
66+ "execution_count" : null ,
67+ "metadata" : {},
68+ "outputs" : [],
69+ "source" : [
70+ " from google.colab import drive\n " ,
71+ " drive.mount('/content/drive')\n " ,
72+ " OUTPUT_DIR = '/content/drive/MyDrive/stack-2.9-128k-output'\n " ,
73+ " import os; os.makedirs(OUTPUT_DIR, exist_ok=True)\n " ,
74+ " print(f'\ud83d\udcc1 Output: {OUTPUT_DIR}')"
75+ ]
76+ },
77+ {
78+ "cell_type" : " markdown" ,
79+ "metadata" : {},
80+ "source" : [
81+ " ## Step 4: Download Training Data"
82+ ]
83+ },
84+ {
85+ "cell_type" : " code" ,
86+ "execution_count" : null ,
87+ "metadata" : {},
88+ "outputs" : [],
89+ "source" : [
90+ " import huggingface_hub, shutil, json\n " ,
91+ " \n " ,
92+ " print('Downloading from HuggingFace...')\n " ,
93+ " path = huggingface_hub.hf_hub_download(\n " ,
94+ " repo_id='walidsobhie/stack-2-9-tool-examples',\n " ,
95+ " filename='tool_examples_combined.jsonl',\n " ,
96+ " repo_type='dataset',\n " ,
97+ " local_dir='/content/',\n " ,
98+ " )\n " ,
99+ " shutil.move(path, '/content/tool_examples.jsonl')\n " ,
100+ " \n " ,
101+ " with open('/content/tool_examples.jsonl') as f:\n " ,
102+ " lines = f.readlines()\n " ,
103+ " print(f'\u2705 {len(lines)} examples ready')"
104+ ]
105+ },
106+ {
107+ "cell_type" : " markdown" ,
108+ "metadata" : {},
109+ "source" : [
110+ " ## Step 5: Train! \ud83c\udfaf "
111+ ]
112+ },
113+ {
114+ "cell_type" : " code" ,
115+ "execution_count" : null ,
116+ "metadata" : {},
117+ "outputs" : [],
118+ "source" : [
119+ " import subprocess\n " ,
120+ " \n " ,
121+ " result = subprocess.run([\n " ,
122+ " \" python3\" , \" training/train_extended_context.py\" ,\n " ,
123+ " \" --model-path\" , \" Qwen/Qwen2.5-Coder-1.5B\" ,\n " ,
124+ " \" --data-path\" , \" /content/tool_examples.jsonl\" ,\n " ,
125+ " \" --output-dir\" , OUTPUT_DIR,\n " ,
126+ " \" --context-length\" , \" 131072\" ,\n " ,
127+ " \" --lora-rank\" , \" 32\" ,\n " ,
128+ " \" --epochs\" , \" 3\" ,\n " ,
129+ " \" --batch-size\" , \" 1\" ,\n " ,
130+ " \" --grad-accum\" , \" 16\" ,\n " ,
131+ " \" --lr\" , \" 2e-4\" ,\n " ,
132+ " \" --use-packing\" ,\n " ,
133+ " \" --push-to-hub\" ,\n " ,
134+ " \" --hub-model-id\" , \" walidsobhie/stack-2.9-128k-context\"\n " ,
135+ " ], cwd=\" /content/stack-2.9\" , capture_output=True, text=True)\n " ,
136+ " \n " ,
137+ " print('STDOUT:\\n', result.stdout[-2000:] if result.stdout else '')\n " ,
138+ " print('STDERR:\\n', result.stderr[-2000:] if result.stderr else '')"
139+ ]
140+ }
141+ ],
142+ "metadata" : {
143+ "colab" : {
144+ "provenance" : [],
145+ "name" : " stack-2.9-128k-packed-training"
146+ },
147+ "kernelspec" : {
148+ "name" : " python3" ,
149+ "display_name" : " Python 3"
150+ },
151+ "language_info" : {
152+ "name" : " python" ,
153+ "version" : " 3.10.0"
154+ }
16 155 },
17- {
18- "cell_type" : " markdown" ,
19- "metadata" : {},
20- "source" : [" ## Step 1: Clone Stack 2.9 & Install Dependencies" ]
21- },
22- {
23- "cell_type" : " code" ,
24- "execution_count" : null ,
25- "metadata" : {},
26- "outputs" : [],
27- "source" : [
28- " # Clone the repo (gets the fixed training script)\n " ,
29- " !git clone https://github.com/my-ai-stack/stack-2.9.git\n " ,
30- " cd stack-2.9\n " ,
31- " \n " ,
32- " # Install all dependencies\n " ,
33- " !pip install -q transformers peft datasets bitsandbytes>=0.46.1 accelerate huggingface_hub scipy\n " ,
34- " !pip install -q torch --upgrade\n " ,
35- " \n " ,
36- " print('✅ Dependencies installed')"
37- ]
38- },
39- {
40- "cell_type" : " markdown" ,
41- "metadata" : {},
42- "source" : [" ## Step 2: Login to HuggingFace\n\n Get your token at: https://huggingface.co/settings/tokens" ]
43- },
44- {
45- "cell_type" : " code" ,
46- "execution_count" : null ,
47- "metadata" : {},
48- "outputs" : [],
49- "source" : [
50- " from huggingface_hub import login\n " ,
51- " # 👇 Replace with YOUR HuggingFace token\n " ,
52- " login(token=\" YOUR_HF_TOKEN_HERE\" ) # ← 🔴 PUT YOUR HF TOKEN HERE\n " ,
53- " print('✅ Logged into HuggingFace')"
54- ]
55- },
56- {
57- "cell_type" : " markdown" ,
58- "metadata" : {},
59- "source" : [
60- " ## Step 3: Mount Google Drive\n\n Training checkpoints and the final adapter will be saved here."
61- ]
62- },
63- {
64- "cell_type" : " code" ,
65- "execution_count" : null ,
66- "metadata" : {},
67- "outputs" : [],
68- "source" : [
69- " from google.colab import drive\n " ,
70- " drive.mount('/content/drive')\n " ,
71- " OUTPUT_DIR = '/content/drive/MyDrive/stack-2.9-128k-output'\n " ,
72- " import os; os.makedirs(OUTPUT_DIR, exist_ok=True)\n " ,
73- " print(f'📁 Output directory: {OUTPUT_DIR}')"
74- ]
75- },
76- {
77- "cell_type" : " markdown" ,
78- "metadata" : {},
79- "source" : [
80- " ## Step 4: Download Training Data\n\n We use the dataset uploaded to HuggingFace Hub — 1500 tool-calling examples, packed into 128K sequences."
81- ]
82- },
83- {
84- "cell_type" : " code" ,
85- "execution_count" : null ,
86- "metadata" : {},
87- "outputs" : [],
88- "source" : [
89- " import huggingface_hub\n " ,
90- " \n " ,
91- " DATA_FILE = '/content/tool_examples.jsonl'\n " ,
92- " \n " ,
93- " print('Downloading training data from HuggingFace...')\n " ,
94- " hf_id = 'walidsobhie/stack-2-9-tool-examples'\n " ,
95- " path = huggingface_hub.hf_hub_download(\n " ,
96- " repo_id=hf_id,\n " ,
97- " filename='tool_examples_combined.jsonl',\n " ,
98- " repo_type='dataset',\n " ,
99- " local_dir='/content/',\n " ,
100- " local_dir_use_symlinks=False,\n " ,
101- " )\n " ,
102- " import shutil\n " ,
103- " shutil.move(path, DATA_FILE)\n " ,
104- " print(f'✅ Dataset ready: {DATA_FILE}')\n " ,
105- " \n " ,
106- " # Quick sanity check\n " ,
107- " import json\n " ,
108- " with open(DATA_FILE) as f:\n " ,
109- " lines = f.readlines()\n " ,
110- " print(f' Total examples: {len(lines)}')\n " ,
111- " ex = json.loads(lines[0])\n " ,
112- " print(f' Keys: {list(ex.keys())}')"
113- ]
114- },
115- {
116- "cell_type" : " markdown" ,
117- "metadata" : {},
118- "source" : [
119- " ## Step 5: Run 128K Packed Context Fine-tuning\n\n **This cell runs the full training. On free Colab T4 it takes ~6-8 hours.**\n " ,
120- " \n " ,
121- " If Colab disconnects, your checkpoints are safe in Google Drive. Reconnect and re-run this cell — it will resume from the last checkpoint."
122- ]
123- },
124- {
125- "cell_type" : " code" ,
126- "execution_count" : null ,
127- "metadata" : {},
128- "outputs" : [],
129- "source" : [
130- " import subprocess\n " ,
131- " \n " ,
132- " # Run the fixed training script with packing enabled\n " ,
133- " result = subprocess.run([\n " ,
134- " \" python3\" , \" training/train_extended_context.py\" ,\n " ,
135- " \" --model-path\" , \" Qwen/Qwen2.5-Coder-1.5B\" ,\n " ,
136- " \" --data-path\" , \" /content/tool_examples.jsonl\" ,\n " ,
137- " \" --output-dir\" , OUTPUT_DIR,\n " ,
138- " \" --context-length\" , \" 131072\" ,\n " ,
139- " \" --lora-rank\" , \" 32\" ,\n " ,
140- " \" --epochs\" , \" 3\" ,\n " ,
141- " \" --batch-size\" , \" 1\" ,\n " ,
142- " \" --grad-accum\" , \" 16\" ,\n " ,
143- " \" --lr\" , \" 2e-4\" ,\n " ,
144- " \" --use-packing\" ,\n " ,
145- " \" --push-to-hub\" ,\n " ,
146- " \" --hub-model-id\" , \" walidsobhie/stack-2.9-128k-context\"\n " ,
147- " ], cwd=\" /content/stack-2.9\" )\n " ,
148- " \n " ,
149- " print('STDOUT:', result.stdout)\n " ,
150- " print('STDERR:', result.stderr[-3000:] if result.stderr else '(none)')"
151- ]
152- }
153- ],
154- "metadata" : {
155- "colab" : {
156- "provenance" : [],
157- "name" : " stack-2.9-128k-packed-training"
158- },
159- "kernelspec" : {
160- "name" : " python3" ,
161- "display_name" : " Python 3"
162- },
163- "language_info" : {
164- "name" : " python" ,
165- "version" : " 3.10.0"
166- }
167- },
168- "nbformat" : 4 ,
169- "nbformat_minor" : 4
170- }
156+ "nbformat" : 4 ,
157+ "nbformat_minor" : 4
158+ }