
Commit 3c55760

add: notebook workflow tests to execute code (#76)
## Changes

- Add a CI workflow that runs every notebook end-to-end in a Modal GPU function.
- Standard model-loading workflows run on small GPUs; modified versions of the training scripts run for `0.01` epochs on larger GPUs to verify that training executes.
- A README.md in the `util` folder explains how the commands can be used.
- Skip commands can exclude entire notebooks, or specific cells within a notebook, when required.
- The `util` folder also contains `modal_runner.py`, which holds the Modal function that is actually deployed.
1 parent 5babdfb commit 3c55760
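The skip commands mentioned above appear in the notebook diffs below as plain comment markers in a cell's source (`# !modal_skip`, `# !modal_skip_rest`). The actual handling lives in `util/run_notebook_test.py` and `util/modal_runner.py`, which are not shown in this diff; the following is only a minimal sketch of the semantics, with `executable_cells` as a hypothetical helper name:

```python
# Illustrative sketch of the skip-directive semantics; the real
# implementation in util/run_notebook_test.py may differ.
import nbformat

def executable_cells(path: str) -> list[str]:
    nb = nbformat.read(path, as_version=4)
    sources = []
    for cell in nb.cells:
        if cell.cell_type != "code":
            continue
        # Check the longer marker first: "!modal_skip" is a prefix of it.
        if "!modal_skip_rest" in cell.source:
            break      # skip this cell and every cell after it
        if "!modal_skip" in cell.source:
            continue   # skip just this cell
        sources.append(cell.source)
    return sources
```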

14 files changed: 2358 additions & 1391 deletions
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+name: Run notebooks
+
+on:
+  push:
+    branches:
+      - main
+    paths:
+      - 'notebooks/**'
+  pull_request:
+    branches:
+      - main
+    paths:
+      - 'notebooks/**'
+  workflow_dispatch:
+
+jobs:
+  discover-notebooks:
+    runs-on: ubuntu-latest
+    outputs:
+      notebooks: ${{ steps.list.outputs.notebooks }}
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: List notebooks
+        id: list
+        run: |
+          notebooks=$(ls notebooks/*.ipynb | xargs -I{} basename {} | jq -R -s -c 'split("\n") | map(select(. != ""))')
+          echo "notebooks=$notebooks" >> "$GITHUB_OUTPUT"
+          echo "Found notebooks: $notebooks"
+
+  run-notebooks:
+    needs: discover-notebooks
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        notebook: ${{ fromJSON(needs.discover-notebooks.outputs.notebooks) }}
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Install modal
+        run: uv pip install --system modal
+
+      - name: Set up Modal token
+        run: modal token set --token-id ${{ secrets.MODAL_TOKEN_ID }} --token-secret ${{ secrets.MODAL_TOKEN_SECRET }}
+
+      - name: Run notebook on Modal
+        run: python util/run_notebook_test.py --notebook "notebooks/${{ matrix.notebook }}" --skip-packages flash-attn
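The two-job split computes the notebook list once, then fans each notebook out as an independent matrix job; with `fail-fast: false`, one failing notebook does not cancel the others. For illustration, the jq pipeline in the `List notebooks` step is equivalent to this Python (a sketch, not part of the workflow):

```python
# Python equivalent of the jq pipeline in the "List notebooks" step:
# emit the notebook basenames as a compact JSON array for fromJSON().
import json
from pathlib import Path

notebooks = [p.name for p in Path("notebooks").glob("*.ipynb")]
# e.g. notebooks=["LFM2_Inference_with_Ollama.ipynb", ...]
print("notebooks=" + json.dumps(notebooks, separators=(",", ":")))
```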

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -160,3 +160,8 @@ Thumbs.db
 *~
 .vscode/
 .onnx-tests/
+
+env
+env/*
+
+__pycache__/

notebooks/LFM2_Inference_with_Ollama.ipynb

Lines changed: 10 additions & 2 deletions
@@ -3,7 +3,13 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "# 💧 LFM2 Inference with Ollama\n\nThis notebook demonstrates how to use the [Ollama](https://ollama.com) API to run [LFM2](https://huggingface.co/collections/LiquidAI/lfm2-67d775f3b4b6fe79fbb21bda) and [LFM2.5](https://huggingface.co/collections/LiquidAI/lfm25-6839e3e26b2a9fdbde95b341) models.\n\n> ⚠️ **Note:** Ollama is intended to run locally on your machine. This notebook shows the Python and curl API usage to get Ollama running in Colab. Install Ollama from [ollama.com/download](https://ollama.com/download) and follow the [Liquid Docs](https://docs.liquid.ai/docs/inference/ollama) to get started. Also, right now LFM VL models are currently not working with ollama, we have an [open PR](https://github.com/ollama/ollama/pull/14069) to resolve this quickly."
+   "source": [
+    "# 💧 LFM2 Inference with Ollama\n",
+    "\n",
+    "This notebook demonstrates how to use the [Ollama](https://ollama.com) API to run [LFM2](https://huggingface.co/collections/LiquidAI/lfm2-67d775f3b4b6fe79fbb21bda) and [LFM2.5](https://huggingface.co/collections/LiquidAI/lfm25-6839e3e26b2a9fdbde95b341) models.\n",
+    "\n",
+    "> ⚠️ **Note:** Ollama is intended to run locally on your machine. This notebook shows the Python and curl API usage to get Ollama running in Colab. Install Ollama from [ollama.com/download](https://ollama.com/download) and follow the [Liquid Docs](https://docs.liquid.ai/docs/inference/ollama) to get started. Also, LFM VL models currently do not work with Ollama; we have an [open PR](https://github.com/ollama/ollama/pull/14069) to resolve this."
+   ]
   },
   {
    "cell_type": "markdown",
@@ -19,6 +25,7 @@
    "outputs": [],
    "source": [
     "# Colab specific settings\n",
+    "# !modal_skip\n",
     "!sudo apt install zstd\n",
     "!sudo apt update\n",
     "!sudo apt install -y pciutils"
@@ -170,6 +177,7 @@
    "outputs": [],
    "source": [
     "# Chat API\n",
+    "# !modal_skip_rest\n",
     "%%bash\n",
     "curl -s http://localhost:11434/api/chat -d '{\n",
     "  \"model\": \"hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF\",\n",
@@ -219,4 +227,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 0
-}
+}

notebooks/LFM2_Inference_with_Transformers.ipynb

Lines changed: 71 additions & 4 deletions
@@ -26,7 +26,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!uv pip install \"transformers>=5.0.0\" \"torch==2.9.0\" accelerate"
+    "!uv pip install \"transformers>=5.0.0\" \"torch==2.9.0\" accelerate torchvision"
    ]
   },
   {
@@ -86,7 +86,30 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "from transformers import GenerationConfig\n\ngeneration_config = GenerationConfig(\n    do_sample=True,\n    temperature=0.1,\n    top_k=50,\n    repetition_penalty=1.05,\n    max_new_tokens=512,\n)\n\nprompt = \"Explain quantum computing in simple terms.\"\ninputs = tokenizer.apply_chat_template(\n    [{\"role\": \"user\", \"content\": prompt}],\n    add_generation_prompt=True,\n    return_tensors=\"pt\",\n    return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, generation_config=generation_config)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
+   "source": [
+    "from transformers import GenerationConfig\n",
+    "\n",
+    "generation_config = GenerationConfig(\n",
+    "    do_sample=True,\n",
+    "    temperature=0.1,\n",
+    "    top_k=50,\n",
+    "    repetition_penalty=1.05,\n",
+    "    max_new_tokens=512,\n",
+    ")\n",
+    "\n",
+    "prompt = \"Explain quantum computing in simple terms.\"\n",
+    "inputs = tokenizer.apply_chat_template(\n",
+    "    [{\"role\": \"user\", \"content\": prompt}],\n",
+    "    add_generation_prompt=True,\n",
+    "    return_tensors=\"pt\",\n",
+    "    return_dict=True,\n",
+    ").to(model.device)\n",
+    "\n",
+    "output = model.generate(**inputs, generation_config=generation_config)\n",
+    "input_length = inputs[\"input_ids\"].shape[1]\n",
+    "response = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\n",
+    "print(response)"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -131,7 +154,51 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "from transformers import AutoProcessor, AutoModelForImageTextToText\nfrom transformers.image_utils import load_image\n\n# Load vision model and processor\nmodel_id = \"LiquidAI/LFM2.5-VL-1.6B\"\nvision_model = AutoModelForImageTextToText.from_pretrained(\n    model_id,\n    device_map=\"auto\",\n    dtype=\"bfloat16\"\n)\n\n# IMPORTANT: tie lm_head to input embeddings (transformers v5 bug)\nvision_model.lm_head.weight = vision_model.get_input_embeddings().weight\n\nprocessor = AutoProcessor.from_pretrained(model_id)\n\n# Load image\nurl = \"https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg\"\nimage = load_image(url)\n\n# Create conversation\nconversation = [\n    {\n        \"role\": \"user\",\n        \"content\": [\n            {\"type\": \"image\", \"image\": image},\n            {\"type\": \"text\", \"text\": \"What is in this image?\"},\n        ],\n    },\n]\n\n# Generate response\ninputs = processor.apply_chat_template(\n    conversation,\n    add_generation_prompt=True,\n    return_tensors=\"pt\",\n    return_dict=True,\n    tokenize=True,\n).to(vision_model.device)\n\noutputs = vision_model.generate(**inputs, do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05, max_new_tokens=64)\nresponse = processor.batch_decode(outputs, skip_special_tokens=True)[0]\nprint(response)"
+   "source": [
+    "from transformers import AutoProcessor, AutoModelForImageTextToText\n",
+    "from transformers.image_utils import load_image\n",
+    "\n",
+    "# Load vision model and processor\n",
+    "model_id = \"LiquidAI/LFM2.5-VL-1.6B\"\n",
+    "vision_model = AutoModelForImageTextToText.from_pretrained(\n",
+    "    model_id,\n",
+    "    device_map=\"auto\",\n",
+    "    dtype=\"bfloat16\"\n",
+    ")\n",
+    "\n",
+    "# IMPORTANT: tie lm_head to input embeddings (transformers v5 bug)\n",
+    "vision_model.lm_head.weight = vision_model.get_input_embeddings().weight\n",
+    "\n",
+    "processor = AutoProcessor.from_pretrained(model_id)\n",
+    "\n",
+    "# Load image\n",
+    "url = \"https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg\"\n",
+    "image = load_image(url)\n",
+    "\n",
+    "# Create conversation\n",
+    "conversation = [\n",
+    "    {\n",
+    "        \"role\": \"user\",\n",
+    "        \"content\": [\n",
+    "            {\"type\": \"image\", \"image\": image},\n",
+    "            {\"type\": \"text\", \"text\": \"What is in this image?\"},\n",
+    "        ],\n",
+    "    },\n",
+    "]\n",
+    "\n",
+    "# Generate response\n",
+    "inputs = processor.apply_chat_template(\n",
+    "    conversation,\n",
+    "    add_generation_prompt=True,\n",
+    "    return_tensors=\"pt\",\n",
+    "    return_dict=True,\n",
+    "    tokenize=True,\n",
+    ").to(vision_model.device)\n",
+    "\n",
+    "outputs = vision_model.generate(**inputs, do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05, max_new_tokens=64)\n",
+    "response = processor.batch_decode(outputs, skip_special_tokens=True)[0]\n",
+    "print(response)"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -161,4 +228,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 0
-}
+}
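The `lm_head` assignment in the vision cell above works around a transformers v5 weight-tying issue by pointing the output head at the same tensor as the input embeddings. A quick sanity check of the tie, as a sketch assuming the `vision_model` from that cell:

```python
# Illustrative check: after the assignment, both modules should
# reference the same underlying tensor storage.
head = vision_model.lm_head.weight
emb = vision_model.get_input_embeddings().weight
assert head.data_ptr() == emb.data_ptr(), "lm_head is not tied to the embeddings"
```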

notebooks/LFM2_Inference_with_llama_cpp.ipynb

Lines changed: 51 additions & 5 deletions
@@ -44,7 +44,14 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "!llama-b7633/llama-cli \\\n    -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF:Q4_K_M \\\n    -p \"What is C. elegans?\" \\\n    -n 256 \\\n    --temp 0.1 --top-k 50 --top-p 0.1 --repeat-penalty 1.05"
+   "source": [
+    "# !modal_skip\n",
+    "!llama-b7633/llama-cli \\\n",
+    "    -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF:Q4_K_M \\\n",
+    "    -p \"What is C. elegans?\" \\\n",
+    "    -n 256 \\\n",
+    "    --temp 0.1 --top-k 50 --top-p 0.1 --repeat-penalty 1.05"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -99,15 +106,34 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!uv pip install -qqq openai"
+    "!uv pip install -qqq openai requests"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "from openai import OpenAI\n\nclient = OpenAI(\n    base_url=\"http://localhost:8000/v1\",\n    api_key=\"not-needed\"\n)\n\nresponse = client.chat.completions.create(\n    model=\"lfm2.5-1.2b-instruct\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"What is machine learning?\"}\n    ],\n    temperature=0.1,\n    top_p=0.1,\n    max_tokens=512,\n    extra_body={\"top_k\": 50, \"repetition_penalty\": 1.05},\n)\nprint(response.choices[0].message.content)"
+   "source": [
+    "from openai import OpenAI\n",
+    "\n",
+    "client = OpenAI(\n",
+    "    base_url=\"http://localhost:8000/v1\",\n",
+    "    api_key=\"not-needed\"\n",
+    ")\n",
+    "\n",
+    "response = client.chat.completions.create(\n",
+    "    model=\"lfm2.5-1.2b-instruct\",\n",
+    "    messages=[\n",
+    "        {\"role\": \"user\", \"content\": \"What is machine learning?\"}\n",
+    "    ],\n",
+    "    temperature=0.1,\n",
+    "    top_p=0.1,\n",
+    "    max_tokens=512,\n",
+    "    extra_body={\"top_k\": 50, \"repetition_penalty\": 1.05},\n",
+    ")\n",
+    "print(response.choices[0].message.content)"
+   ]
   },
   {
    "cell_type": "code",
@@ -148,7 +174,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "!llama-b7633/llama-cli \\\n    -hf LiquidAI/LFM2.5-VL-1.6B-GGUF:Q4_0 \\\n    --image test_image.jpg \\\n    --image-max-tokens 64 \\\n    -p \"What's in this image?\" \\\n    -n 128 \\\n    --temp 0.1 --min-p 0.15 --repeat-penalty 1.05"
+   "source": "# !modal_skip\n!llama-b7633/llama-cli \\\n    -hf LiquidAI/LFM2.5-VL-1.6B-GGUF:Q4_0 \\\n    --image test_image.jpg \\\n    --image-max-tokens 64 \\\n    -p \"What's in this image?\" \\\n    -n 128 \\\n    --temp 0.1 --min-p 0.15 --repeat-penalty 1.05"
   },
   {
    "cell_type": "markdown",
@@ -202,7 +228,27 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "client = OpenAI(\n    base_url=\"http://localhost:8000/v1\",\n    api_key=\"not-needed\"\n)\n\nresponse = client.chat.completions.create(\n    model=\"lfm2.5-vl-1.6b\",\n    messages=[{\n        \"role\": \"user\",\n        \"content\": [\n            {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n            {\"type\": \"text\", \"text\": \"What's in this image?\"}\n        ]\n    }],\n    temperature=0.1,\n    max_tokens=512,\n    extra_body={\"min_p\": 0.15, \"repetition_penalty\": 1.05},\n)\nprint(response.choices[0].message.content)"
+   "source": [
+    "client = OpenAI(\n",
+    "    base_url=\"http://localhost:8000/v1\",\n",
+    "    api_key=\"not-needed\"\n",
+    ")\n",
+    "\n",
+    "response = client.chat.completions.create(\n",
+    "    model=\"lfm2.5-vl-1.6b\",\n",
+    "    messages=[{\n",
+    "        \"role\": \"user\",\n",
+    "        \"content\": [\n",
+    "            {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
+    "            {\"type\": \"text\", \"text\": \"What's in this image?\"}\n",
+    "        ]\n",
+    "    }],\n",
+    "    temperature=0.1,\n",
+    "    max_tokens=512,\n",
+    "    extra_body={\"min_p\": 0.15, \"repetition_penalty\": 1.05},\n",
+    ")\n",
+    "print(response.choices[0].message.content)"
+   ]
   },
   {
    "cell_type": "code",