Merged
58 changes: 58 additions & 0 deletions .github/workflows/run-notebooks.yaml
@@ -0,0 +1,58 @@
name: Run notebooks

on:
push:
branches:
- main
paths:
- 'notebooks/**'
pull_request:
branches:
- main
paths:
- 'notebooks/**'
workflow_dispatch:

jobs:
discover-notebooks:
runs-on: ubuntu-latest
outputs:
notebooks: ${{ steps.list.outputs.notebooks }}
steps:
- name: Checkout code
uses: actions/checkout@v4

- name: List notebooks
id: list
run: |
notebooks=$(ls notebooks/*.ipynb | xargs -I{} basename {} | jq -R -s -c 'split("\n") | map(select(. != ""))')
echo "notebooks=$notebooks" >> "$GITHUB_OUTPUT"
echo "Found notebooks: $notebooks"

run-notebooks:
needs: discover-notebooks
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
notebook: ${{ fromJSON(needs.discover-notebooks.outputs.notebooks) }}
steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'

- name: Install uv
uses: astral-sh/setup-uv@v4

- name: Install modal
run: uv pip install --system modal

- name: Set up Modal token
run: modal token set --token-id ${{ secrets.MODAL_TOKEN_ID }} --token-secret ${{ secrets.MODAL_TOKEN_SECRET }}

- name: Run notebook on Modal
run: python util/run_notebook_test.py --notebook "notebooks/${{ matrix.notebook }}" --skip-packages flash-attn
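The `discover-notebooks` job above turns a directory listing into a compact JSON array that `fromJSON` can feed into the matrix. A minimal sketch of that `jq` pipeline, runnable outside CI (the sample filenames are placeholders):

```shell
# Turn newline-separated filenames into a compact JSON array,
# dropping the empty string produced by the trailing newline.
printf 'a.ipynb\nb.ipynb\n' \
  | jq -R -s -c 'split("\n") | map(select(. != ""))'
# → ["a.ipynb","b.ipynb"]
```

`-R` reads raw (non-JSON) input, `-s` slurps it into one string, and `-c` emits a single compact line, which is what `$GITHUB_OUTPUT` expects.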
5 changes: 5 additions & 0 deletions .gitignore
@@ -160,3 +160,8 @@ Thumbs.db
*~
.vscode/
.onnx-tests/

env
env/*

__pycache__/
12 changes: 10 additions & 2 deletions notebooks/LFM2_Inference_with_Ollama.ipynb
@@ -3,7 +3,13 @@
{
"cell_type": "markdown",
"metadata": {},
"source": "# 💧 LFM2 Inference with Ollama\n\nThis notebook demonstrates how to use the [Ollama](https://ollama.com) API to run [LFM2](https://huggingface.co/collections/LiquidAI/lfm2-67d775f3b4b6fe79fbb21bda) and [LFM2.5](https://huggingface.co/collections/LiquidAI/lfm25-6839e3e26b2a9fdbde95b341) models.\n\n> ⚠️ **Note:** Ollama is intended to run locally on your machine. This notebook shows the Python and curl API usage to get Ollama running in Colab. Install Ollama from [ollama.com/download](https://ollama.com/download) and follow the [Liquid Docs](https://docs.liquid.ai/docs/inference/ollama) to get started. Also, right now LFM VL models are currently not working with ollama, we have an [open PR](https://github.com/ollama/ollama/pull/14069) to resolve this quickly."
"source": [
"# 💧 LFM2 Inference with Ollama\n",
"\n",
"This notebook demonstrates how to use the [Ollama](https://ollama.com) API to run [LFM2](https://huggingface.co/collections/LiquidAI/lfm2-67d775f3b4b6fe79fbb21bda) and [LFM2.5](https://huggingface.co/collections/LiquidAI/lfm25-6839e3e26b2a9fdbde95b341) models.\n",
"\n",
    "> ⚠️ **Note:** Ollama is intended to run locally on your machine. This notebook shows the Python and curl API usage to get Ollama running in Colab. Install Ollama from [ollama.com/download](https://ollama.com/download) and follow the [Liquid Docs](https://docs.liquid.ai/docs/inference/ollama) to get started. Also, LFM VL models do not currently work with Ollama; we have an [open PR](https://github.com/ollama/ollama/pull/14069) to resolve this."
]
},
{
"cell_type": "markdown",
@@ -19,6 +25,7 @@
"outputs": [],
"source": [
"# Colab specific settings\n",
"# !modal_skip\n",
"!sudo apt install zstd\n",
"!sudo apt update\n",
"!sudo apt install -y pciutils"
@@ -170,6 +177,7 @@
"outputs": [],
"source": [
"# Chat API\n",
"# !modal_skip_rest\n",
"%%bash\n",
"curl -s http://localhost:11434/api/chat -d '{\n",
" \"model\": \"hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF\",\n",
@@ -219,4 +227,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
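The notebook's chat-API cell drives Ollama's `/api/chat` endpoint via curl; an equivalent Python sketch (the prompt is illustrative, and the final POST assumes an Ollama server is already listening on localhost:11434):

```python
import json

# Request body for Ollama's /api/chat endpoint
payload = {
    "model": "hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "stream": False,  # return one complete JSON object instead of a token stream
}
print(json.dumps(payload, indent=2))

# With the server running, send it (uncomment to use):
# import requests
# r = requests.post("http://localhost:11434/api/chat", json=payload, timeout=120)
# print(r.json()["message"]["content"])
```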
75 changes: 71 additions & 4 deletions notebooks/LFM2_Inference_with_Transformers.ipynb
@@ -26,7 +26,7 @@
"metadata": {},
"outputs": [],
"source": [
"!uv pip install \"transformers>=5.0.0\" \"torch==2.9.0\" accelerate"
    "!uv pip install \"transformers>=5.0.0\" \"torch==2.9.0\" accelerate torchvision"
]
},
{
@@ -86,7 +86,30 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "from transformers import GenerationConfig\n\ngeneration_config = GenerationConfig(\n do_sample=True,\n temperature=0.1,\n top_k=50,\n repetition_penalty=1.05,\n max_new_tokens=512,\n)\n\nprompt = \"Explain quantum computing in simple terms.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, generation_config=generation_config)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
"source": [
"from transformers import GenerationConfig\n",
"\n",
"generation_config = GenerationConfig(\n",
" do_sample=True,\n",
" temperature=0.1,\n",
" top_k=50,\n",
" repetition_penalty=1.05,\n",
" max_new_tokens=512,\n",
")\n",
"\n",
"prompt = \"Explain quantum computing in simple terms.\"\n",
"inputs = tokenizer.apply_chat_template(\n",
" [{\"role\": \"user\", \"content\": prompt}],\n",
" add_generation_prompt=True,\n",
" return_tensors=\"pt\",\n",
" return_dict=True,\n",
").to(model.device)\n",
"\n",
"output = model.generate(**inputs, generation_config=generation_config)\n",
"input_length = inputs[\"input_ids\"].shape[1]\n",
"response = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
@@ -131,7 +154,51 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "from transformers import AutoProcessor, AutoModelForImageTextToText\nfrom transformers.image_utils import load_image\n\n# Load vision model and processor\nmodel_id = \"LiquidAI/LFM2.5-VL-1.6B\"\nvision_model = AutoModelForImageTextToText.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\"\n)\n\n# IMPORTANT: tie lm_head to input embeddings (transformers v5 bug)\nvision_model.lm_head.weight = vision_model.get_input_embeddings().weight\n\nprocessor = AutoProcessor.from_pretrained(model_id)\n\n# Load image\nurl = \"https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg\"\nimage = load_image(url)\n\n# Create conversation\nconversation = [\n {\n \"role\": \"user\",\n \"content\": [\n {\"type\": \"image\", \"image\": image},\n {\"type\": \"text\", \"text\": \"What is in this image?\"},\n ],\n },\n]\n\n# Generate response\ninputs = processor.apply_chat_template(\n conversation,\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n tokenize=True,\n).to(vision_model.device)\n\noutputs = vision_model.generate(**inputs, do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05, max_new_tokens=64)\nresponse = processor.batch_decode(outputs, skip_special_tokens=True)[0]\nprint(response)"
"source": [
"from transformers import AutoProcessor, AutoModelForImageTextToText\n",
"from transformers.image_utils import load_image\n",
"\n",
"# Load vision model and processor\n",
"model_id = \"LiquidAI/LFM2.5-VL-1.6B\"\n",
"vision_model = AutoModelForImageTextToText.from_pretrained(\n",
" model_id,\n",
" device_map=\"auto\",\n",
" dtype=\"bfloat16\"\n",
")\n",
"\n",
"# IMPORTANT: tie lm_head to input embeddings (transformers v5 bug)\n",
"vision_model.lm_head.weight = vision_model.get_input_embeddings().weight\n",
"\n",
"processor = AutoProcessor.from_pretrained(model_id)\n",
"\n",
"# Load image\n",
"url = \"https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg\"\n",
"image = load_image(url)\n",
"\n",
"# Create conversation\n",
"conversation = [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\"type\": \"image\", \"image\": image},\n",
" {\"type\": \"text\", \"text\": \"What is in this image?\"},\n",
" ],\n",
" },\n",
"]\n",
"\n",
"# Generate response\n",
"inputs = processor.apply_chat_template(\n",
" conversation,\n",
" add_generation_prompt=True,\n",
" return_tensors=\"pt\",\n",
" return_dict=True,\n",
" tokenize=True,\n",
").to(vision_model.device)\n",
"\n",
"outputs = vision_model.generate(**inputs, do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05, max_new_tokens=64)\n",
"response = processor.batch_decode(outputs, skip_special_tokens=True)[0]\n",
"print(response)"
]
},
{
"cell_type": "markdown",
Expand Down Expand Up @@ -161,4 +228,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
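The `vision_model.lm_head.weight = vision_model.get_input_embeddings().weight` workaround above ties two modules to a single parameter tensor. A minimal sketch of the same mechanic with toy dimensions (the modules here are stand-ins, not the LFM2.5 architecture):

```python
import torch.nn as nn

vocab_size, hidden = 10, 4
embeddings = nn.Embedding(vocab_size, hidden)        # input embedding table, shape (10, 4)
lm_head = nn.Linear(hidden, vocab_size, bias=False)  # output projection, weight shape (10, 4)

# Tie: after this assignment both modules hold the same Parameter,
# so an update through one is visible through the other.
lm_head.weight = embeddings.weight

assert lm_head.weight.data_ptr() == embeddings.weight.data_ptr()
```

Tying works here because `nn.Linear(hidden, vocab_size)` stores its weight as `(vocab_size, hidden)`, the same shape as the embedding table.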
56 changes: 51 additions & 5 deletions notebooks/LFM2_Inference_with_llama_cpp.ipynb
@@ -44,7 +44,14 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "!llama-b7633/llama-cli \\\n -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF:Q4_K_M \\\n -p \"What is C. elegans?\" \\\n -n 256 \\\n --temp 0.1 --top-k 50 --top-p 0.1 --repeat-penalty 1.05"
"source": [
"# !modal_skip\n",
"!llama-b7633/llama-cli \\\n",
" -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF:Q4_K_M \\\n",
" -p \"What is C. elegans?\" \\\n",
" -n 256 \\\n",
" --temp 0.1 --top-k 50 --top-p 0.1 --repeat-penalty 1.05"
]
},
{
"cell_type": "markdown",
@@ -99,15 +106,34 @@
"metadata": {},
"outputs": [],
"source": [
"!uv pip install -qqq openai"
"!uv pip install -qqq openai requests"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "from openai import OpenAI\n\nclient = OpenAI(\n base_url=\"http://localhost:8000/v1\",\n api_key=\"not-needed\"\n)\n\nresponse = client.chat.completions.create(\n model=\"lfm2.5-1.2b-instruct\",\n messages=[\n {\"role\": \"user\", \"content\": \"What is machine learning?\"}\n ],\n temperature=0.1,\n top_p=0.1,\n max_tokens=512,\n extra_body={\"top_k\": 50, \"repetition_penalty\": 1.05},\n)\nprint(response.choices[0].message.content)"
"source": [
"from openai import OpenAI\n",
"\n",
"client = OpenAI(\n",
" base_url=\"http://localhost:8000/v1\",\n",
" api_key=\"not-needed\"\n",
")\n",
"\n",
"response = client.chat.completions.create(\n",
" model=\"lfm2.5-1.2b-instruct\",\n",
" messages=[\n",
" {\"role\": \"user\", \"content\": \"What is machine learning?\"}\n",
" ],\n",
" temperature=0.1,\n",
" top_p=0.1,\n",
" max_tokens=512,\n",
" extra_body={\"top_k\": 50, \"repetition_penalty\": 1.05},\n",
")\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "code",
@@ -148,7 +174,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "!llama-b7633/llama-cli \\\n -hf LiquidAI/LFM2.5-VL-1.6B-GGUF:Q4_0 \\\n --image test_image.jpg \\\n --image-max-tokens 64 \\\n -p \"What's in this image?\" \\\n -n 128 \\\n --temp 0.1 --min-p 0.15 --repeat-penalty 1.05"
"source": "# !modal_skip\n!llama-b7633/llama-cli \\\n -hf LiquidAI/LFM2.5-VL-1.6B-GGUF:Q4_0 \\\n --image test_image.jpg \\\n --image-max-tokens 64 \\\n -p \"What's in this image?\" \\\n -n 128 \\\n --temp 0.1 --min-p 0.15 --repeat-penalty 1.05"
},
{
"cell_type": "markdown",
@@ -202,7 +228,27 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "client = OpenAI(\n base_url=\"http://localhost:8000/v1\",\n api_key=\"not-needed\"\n)\n\nresponse = client.chat.completions.create(\n model=\"lfm2.5-vl-1.6b\",\n messages=[{\n \"role\": \"user\",\n \"content\": [\n {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n {\"type\": \"text\", \"text\": \"What's in this image?\"}\n ]\n }],\n temperature=0.1,\n max_tokens=512,\n extra_body={\"min_p\": 0.15, \"repetition_penalty\": 1.05},\n)\nprint(response.choices[0].message.content)"
"source": [
"client = OpenAI(\n",
" base_url=\"http://localhost:8000/v1\",\n",
" api_key=\"not-needed\"\n",
")\n",
"\n",
"response = client.chat.completions.create(\n",
" model=\"lfm2.5-vl-1.6b\",\n",
" messages=[{\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
" {\"type\": \"text\", \"text\": \"What's in this image?\"}\n",
" ]\n",
" }],\n",
" temperature=0.1,\n",
" max_tokens=512,\n",
" extra_body={\"min_p\": 0.15, \"repetition_penalty\": 1.05},\n",
")\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "code",