3 changes: 2 additions & 1 deletion docs.json
@@ -253,7 +253,8 @@
"icon": "globe",
"pages": [
"examples/web/vl-webgpu-demo",
"examples/web/audio-webgpu-demo",
"examples/web/hand-voice-racer"
]
},
{
112 changes: 73 additions & 39 deletions examples/index.mdx
@@ -2,58 +2,92 @@
title: "Examples Library"
---

## Laptop

<CardGroup cols={2}>
<Card title="Invoice Extractor Tool" icon="file-invoice" href="/examples/laptop-examples/invoice-extractor-tool-with-liquid-nanos">
Turn invoices into structured JSON using a lightweight Vision Language Model. 100% local, no API costs.
</Card>

<Card title="Audio Transcription in Real-Time" icon="microphone" href="/examples/laptop-examples/audio-to-text-in-real-time">
Build a real-time audio transcription CLI using LFM2-Audio-1.5B with llama.cpp. 100% local processing without internet connection.
</Card>

<Card title="English-Korean Translation" icon="globe" href="/examples/laptop-examples/lfm2-english-to-korean">
Efficient bidirectional translation system powered by LFM2 1.2B fine-tuned for Korean-English translation with automatic language detection.
</Card>

<Card title="Flight Search Assistant" icon="plane-departure" href="/examples/laptop-examples/flight-search-assistant">
Python CLI leveraging LFM2.5-1.2B-Thinking for multi-step reasoning and tool calling to find and book flights.
</Card>

<Card title="Audio Car Cockpit Demo" icon="car" href="/examples/laptop-examples/audio-car-cockpit">
Voice-controlled car cockpit interface combining LFM2.5-Audio-1.5B in TTS/STT modes with LFM2-1.2B-Tool. Real-time local processing.
</Card>

<Card title="Meeting Summarization CLI" icon="users" href="/examples/laptop-examples/meeting-summarization">
100% local meeting summarization tool using LFM2-2.6B-Transcript and llama.cpp. No cloud services or API keys required.
</Card>

<Card title="Browser Control with GRPO" icon="browser" href="/examples/laptop-examples/browser-control">
Train language models for web automation using reinforcement learning. Demonstrates GRPO fine-tuning with BrowserGym environments.
</Card>

</CardGroup>

## Android

<CardGroup cols={2}>

<Card title="Product Slogan Generator" icon="sparkles" href="/examples/android/slogan-generator">
Android app for single-turn generation of creative product slogans using local AI models. Built with traditional Android Views.
</Card>

<Card title="Web Content Summarizer" icon="newspaper" href="/examples/android/web-content-summarizer">
Share web pages from any browser to this Android app for instant AI-powered summarization. Complete privacy with local processing.
</Card>

<Card title="Structured Recipe Generator" icon="utensils" href="/examples/android/recipe-generator-constrained-output">
Generate recipes with guaranteed JSON structure using constrained generation. Demonstrates automatic model downloading with LeapSDK.
</Card>

<Card title="Vision Language Model Demo" icon="eye" href="/examples/android/vision-language-model-example">
Analyze images and answer visual questions on Android using Vision Language Models. Built with Jetpack Compose and Coil.
</Card>

<Card title="AI Agents with Koog" icon="robot" href="/examples/android/leap-koog-agent">
Build intelligent AI agents on Android with the Koog framework. Demonstrates tool invocation, context management, and MCP integration.
</Card>

</CardGroup>

## Web

<CardGroup cols={2}>

<Card title="Hand & Voice Racer" icon="gamepad" href="/examples/web/hand-voice-racer">
A browser driving game controlled with your hands and voice. MediaPipe tracks hand gestures for steering while LFM2.5-Audio-1.5B transcribes voice commands. Fully local, no server round-trips.
</Card>

<Card title="Audio Browser Demo" icon="waveform-lines" href="/examples/web/audio-webgpu-demo">
Run LFM2.5-Audio-1.5B entirely in the browser with WebGPU. Supports ASR, TTS, and interleaved audio-text conversations. No data sent to external servers.
</Card>

<Card title="Real-Time Video Captioning" icon="video" href="/examples/web/vl-webgpu-demo">
Real-time video captioning with LFM2.5-VL-1.6B running fully client-side via WebGPU and ONNX Runtime Web. No cloud inference required.
</Card>

</CardGroup>

## Model Customization

<CardGroup cols={2}>

<Card title="Car Maker Identification" icon="car" href="/examples/customize-models/car-maker-identification">
Fine-tune LFM2-VL to identify car makers from images. Learn structured generation with Outlines and parameter-efficient fine-tuning with LoRA.
</Card>

</CardGroup>

## Can't find the example you need?
85 changes: 85 additions & 0 deletions examples/web/hand-voice-racer.mdx
@@ -0,0 +1,85 @@
---
title: "Hand & Voice Racer"
---

<Card title="View Source Code" icon="github" href="https://github.com/Liquid4All/cookbook/tree/main/examples/hand-voice-racer">
Browse the complete example on GitHub
</Card>

<iframe
className="w-full aspect-video rounded-xl"
src="https://www.youtube.com/embed/PdmTeDNMP2s"
title="Hand & Voice Racer demo"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
></iframe>

**A browser driving game you control with your hands and voice, powered by models running fully locally.**

Steer by holding both hands up like a steering wheel. Speak commands to accelerate, brake, toggle headlights, and play music. No cloud calls, no server round-trips. Everything runs in your browser tab.

## How it works

Two models run in parallel, entirely client-side:

- **[MediaPipe Hand Landmarker](https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker)** tracks your hand positions via webcam at ~30 fps. The angle between your two wrists drives the steering.
- **[LFM2.5-Audio-1.5B](https://docs.liquid.ai/lfm/models/lfm25-audio-1.5b)** runs in a Web Worker with ONNX Runtime Web. It listens for speech via the [Silero VAD](https://github.com/snakers4/silero-vad) and transcribes each utterance on-device. Matched keywords control game state.

The audio model loads from Hugging Face and is cached in IndexedDB after the first run, so subsequent starts are instant.
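
The wrist-angle steering described above can be sketched roughly as follows. This is an illustrative sketch, not the demo's actual source: `steeringAngle` and the ±45° clamp are assumptions, though MediaPipe Hand Landmarker does report landmark coordinates normalized to [0, 1], with index 0 being the wrist.

```javascript
// Illustrative sketch (assumed names, not the demo's code): map the two
// wrist landmarks to a steering angle, as if the player held a wheel.
// Coordinates are normalized to [0, 1], as MediaPipe reports them.
function steeringAngle(leftWrist, rightWrist) {
  // Angle of the line connecting the wrists, converted to degrees.
  const dx = rightWrist.x - leftWrist.x;
  const dy = rightWrist.y - leftWrist.y;
  const degrees = (Math.atan2(dy, dx) * 180) / Math.PI;
  // Clamp to an assumed steering range of ±45 degrees.
  return Math.max(-45, Math.min(45, degrees));
}

// Level hands → drive straight.
console.log(steeringAngle({ x: 0.3, y: 0.5 }, { x: 0.7, y: 0.5 })); // 0
```

Note that screen y grows downward, so raising the right hand produces a negative angle; the sign convention is a design choice for the game loop to interpret.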

## Voice commands

| Say | Effect |
|-----|--------|
| `speed` / `fast` / `go` | Accelerate to 120 km/h |
| `slow` / `stop` / `brake` | Decelerate to 0 km/h |
| `lights on` | Enable headlights |
| `lights off` | Disable headlights |
| `music` / `play` | Start the techno beat |
| `stop music` / `silence` | Stop the beat |
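
A minimal keyword matcher over the table above might look like this. It is a hedged sketch, not the demo's actual API — `COMMANDS`, `matchCommand`, and the action names are assumptions. Multi-word phrases are checked first so that "stop music" is not swallowed by the bare "stop" rule.

```javascript
// Illustrative sketch (assumed names): map a transcript to a game action.
// Ordering matters: multi-word phrases come before their substrings.
const COMMANDS = [
  { phrases: ['stop music', 'silence'], action: 'musicOff' },
  { phrases: ['lights off'], action: 'lightsOff' },
  { phrases: ['lights on'], action: 'lightsOn' },
  { phrases: ['music', 'play'], action: 'musicOn' },
  { phrases: ['speed', 'fast', 'go'], action: 'accelerate' },
  { phrases: ['slow', 'stop', 'brake'], action: 'brake' },
];

function matchCommand(transcript) {
  const text = transcript.toLowerCase();
  for (const { phrases, action } of COMMANDS) {
    if (phrases.some((p) => text.includes(p))) return action;
  }
  return null; // no recognized keyword in this utterance
}

console.log(matchCommand('please stop music now')); // musicOff
```

Substring matching keeps the matcher forgiving of transcription noise around the keyword, at the cost of occasional false positives on longer utterances.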

## Prerequisites

<Note>
**Browser Requirements**

- Chrome 113+ or Edge 113+ (WebGPU recommended for fast audio inference; falls back to WASM when WebGPU is unavailable)
- Webcam and microphone access
- Node.js 18+
</Note>

## Run locally

```bash
npm install
npm run dev
```

Then open [http://localhost:3001](http://localhost:3001).

On first load, the audio model (~900 MB at Q4 quantization) downloads from Hugging Face and is cached in your browser. Hand-detection assets load from a CDN and MediaPipe's model storage.

## Architecture

```
Browser tab
├── main thread
│ ├── MediaPipe HandLandmarker (webcam → hand angles → steering)
│ ├── Canvas 2D renderer (road, scenery, dashboard, HUD)
│ └── Web Audio API (procedural techno synthesizer)
└── audio-worker.js (Web Worker)
├── Silero VAD (mic → speech segments)
└── LFM2.5-Audio-1.5B ONNX (speech segment → transcript → keyword)
```

The game loop runs on `requestAnimationFrame`. Hand detection is throttled to ~30 fps so it does not block rendering. Voice processing happens off the main thread and delivers results via `postMessage`.
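
The ~30 fps gate can be sketched like so. This is an illustration under stated assumptions — `makeThrottle` is a hypothetical helper, not the demo's code: the `requestAnimationFrame` loop still runs at full display rate, and detection only fires once the interval has elapsed.

```javascript
// Illustrative sketch (assumed helper): gate hand detection to ~30 fps
// inside a game loop that itself runs every animation frame.
function makeThrottle(intervalMs) {
  let last = -Infinity;
  return (nowMs) => {
    if (nowMs - last >= intervalMs) {
      last = nowMs;
      return true; // enough time elapsed: run detection this frame
    }
    return false; // too soon: skip detection, keep rendering
  };
}

const shouldDetect = makeThrottle(1000 / 30);
// In the rAF callback: if (shouldDetect(performance.now())) detectHands();
```

Gating by elapsed time rather than frame count keeps the detection rate stable on 60 Hz and 120 Hz displays alike.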

## Need help?

<CardGroup cols={1}>

<Card title="Join our Discord" icon="discord" iconType="brands" href="https://discord.gg/DFU3WQeaYD">
Connect with the community and ask questions about this example.
</Card>

</CardGroup>