# LLM Home Assistant


LLM Home Assistant brings true natural language understanding to your smart home. Instead of memorizing rigid, robotic voice commands, this custom integration allows you to control your IoT devices by speaking or typing to your home naturally. By routing requests through advanced Large Language Models (like GPT-4o and Llama 3.3) and filtering device context to save tokens, it bridges the gap between complex smart home infrastructure and effortless human interaction.


## 🌟 Key Features

- **Talk to Your Home Naturally:** No more "Turn off light zero one." Say "I'm heading to bed, can you shut down the house?" and the integration understands the context and executes the right services (see the example below).
- **Direct Audio Processing:** Speak commands directly into your dashboard. Audio is processed end-to-end via multimodal models (GPT-4o-Audio) for blazing-fast responses without needing a separate transcription step.
- **Token-Efficient & Cost-Effective:** Smart whitelist-based entity filtering means only the devices you care about are sent to the LLM, reducing context size, lowering API costs, and speeding up response times.
- **Native Dashboard Cards:** Includes custom Lovelace UI cards (`llm-card` and `llm-recording-card`) so you can type or record audio directly from your Home Assistant overview.
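For example, the request from the first feature above can be sent as a single service call; the `text` field matches the input used by the `chat` service in the Testing section below:

```yaml
# Send a conversational request; matching actions
# (e.g. light.turn_off) are executed automatically.
service: llm_home_assistant.chat
data:
  text: "I'm heading to bed, can you shut down the house?"
```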

## See It In Action

*Screenshot: a dark-themed Home Assistant dashboard showing the custom LLM Voice Control and LLM Assistant cards, with model selection options and a sensor confirming that the bedroom lights were turned on with warm white.*


## 🚀 How to Try It

Ready to upgrade your smart home?

See the Installation section below for step-by-step setup instructions, or spin up the full stack instantly using our provided Docker Compose file.


## 👥 Team & Credits

This project was built by a team of software engineering students at Oregon State University.

Feedback or bugs? Please open an issue on our GitHub Issue Tracker.


## Architecture

### Text Pipeline

```
User text
  -> llm_home_assistant.chat service
  -> call_model_wrapper()
  -> async_query_openai() -> GPT-4o-Mini (JSON mode)
  -> parse actions via Pydantic (Plan/Action models)
  -> _execute_tool_call() for each action
  -> sensor update + event fire
```
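Because the pipeline ends in a sensor update, the result is easy to consume from ordinary automations. A minimal sketch, assuming the response lands in `sensor.llm_model_response` (the entity checked in the Testing section):

```yaml
# Surface each new LLM reply as a persistent notification.
automation:
  - alias: "Announce LLM response"
    trigger:
      - platform: state
        entity_id: sensor.llm_model_response
    action:
      - service: persistent_notification.create
        data:
          title: "LLM Home Assistant"
          message: "{{ trigger.to_state.state }}"
```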

### Audio-Direct Pipeline

```
Microphone
  -> start_recording -> ffmpeg records WAV to _audios/current_request.wav
  -> stop_recording -> stops ffmpeg
    -> process_audio_direct service
      -> reads WAV file from disk
      -> base64 encodes audio
      -> ONE call to gpt-4o-audio-preview (multimodal: audio + HA context)
      -> function calling returns { actions, explanation }
      -> _execute_tool_call() for each action (same as text path)
      -> sensor update + event fire
      -> tts_fallback (gTTS / espeak)
```

The audio-direct pipeline sends speech directly to gpt-4o-audio-preview which understands audio and produces structured actions in a single API call — no separate transcription step required.
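The two recording services make it easy to wire this flow to a button or wake word. A minimal script sketch, relying on `stop_recording` triggering `process_audio_direct` automatically (see the service table below); the five-second window is an arbitrary choice:

```yaml
# Record a short voice command, then let stop_recording hand
# the WAV off to process_audio_direct on its own.
script:
  llm_voice_command:
    sequence:
      - service: llm_home_assistant.start_recording
      - delay: "00:00:05"
      - service: llm_home_assistant.stop_recording
```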

### Legacy Audio Pipeline (Deprecated)

```
transcribe_audio service
  -> whisper.cpp STT -> text
  -> llm_home_assistant.chat service -> GPT-4o-Mini
  -> actions + TTS
```

This path is preserved for backwards compatibility but is no longer the recommended approach.

## Installation

### Option 1: Manual Installation (Recommended for testing/development)

1. Clone this repository into your Home Assistant's custom components directory:

   ```bash
   cd /path/to/your/homeassistant/config
   git clone https://github.com/your-repo/llm-home-assistant.git custom_components/llm_home_assistant
   ```

2. Create a `.env` file in the project root with your OpenAI API key:

   ```bash
   echo "OPENAI_API_KEY=sk-your-key-here" > .env
   ```

3. Add the integration configuration to your `configuration.yaml`:

   ```yaml
   llm_home_assistant:
     openai_api_key: !env_var OPENAI_API_KEY
     model: gpt-4o-mini
   ```

4. Restart Home Assistant.

### Option 2: Docker Compose (Full Stack)

```bash
git clone https://github.com/your-repo/llm-home-assistant.git
cd llm-home-assistant
echo "OPENAI_API_KEY=sk-your-key-here" > .env
docker compose up -d
```

Then open Home Assistant at http://localhost:8123.
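The compose file shipped in the repo is authoritative; as a rough sketch of its shape (the image tag and volume paths here are assumptions), the stack boils down to a Home Assistant container with the integration mounted into `custom_components` and the `.env` file injected:

```yaml
# Illustrative sketch only -- defer to the repo's docker-compose file.
services:
  homeassistant:
    image: ghcr.io/home-assistant/home-assistant:stable
    ports:
      - "8123:8123"
    volumes:
      # Assumed layout: integration mounted into HA's config volume.
      - ./custom_components/llm_home_assistant:/config/custom_components/llm_home_assistant
    env_file:
      - .env
    restart: unless-stopped
```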

## Configuration

### Basic Configuration

```yaml
llm_home_assistant:
  openai_api_key: !env_var OPENAI_API_KEY
  model: gpt-4o-mini
```

### Whitelist-Based Entity Filtering

To reduce context size and token usage, configure an allowlist in `configuration.yaml`:

```yaml
llm_home_assistant:
  openai_api_key: !env_var OPENAI_API_KEY
  model: gpt-4o-mini
  allow:
    domains:
      - light
      - switch
      - cover
      - binary_sensor
      - sensor
    services:
      - light.turn_on
      - light.turn_off
      - switch.turn_on
      - switch.turn_off
    entities:
      - light.kitchen
      - light.living_room
      - switch.bedroom
      - binary_sensor.door
      - binary_sensor.window
      - sensor.temperature
```

| Option | Description |
| --- | --- |
| `domains` | Only include entities from these domains. Common domains: `light`, `switch`, `cover`, `binary_sensor`, `sensor`, `climate`, `fan`, `lock`, `media_player`, `vacuum`. |
| `services` | Only allow these specific services to be called. If `allow` is configured, services must be explicitly listed (fail-closed). |
| `entities` | Only include these specific entity IDs in the context. |

Using whitelist filtering significantly reduces context size, which lowers token usage and latency.
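Because `services` is fail-closed, the safest starting point is the narrowest list that covers your use case. For example, this allowlist lets the model see switch states but actuate only lights; a proposed `switch.turn_on` would be rejected:

```yaml
llm_home_assistant:
  openai_api_key: !env_var OPENAI_API_KEY
  model: gpt-4o-mini
  allow:
    domains:
      - light
      - switch          # switch states are visible in context...
    services:
      - light.turn_on   # ...but only light services may be called
      - light.turn_off
```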

## Registered Services

| Service | Description |
| --- | --- |
| `llm_home_assistant.chat` | Send text to the LLM. Actions are executed automatically in Home Assistant. |
| `llm_home_assistant.process_command` | Legacy handler for button input. |
| `llm_home_assistant.start_recording` | Start ffmpeg audio recording from the microphone. |
| `llm_home_assistant.stop_recording` | Stop recording. Automatically triggers `process_audio_direct`. |
| `llm_home_assistant.process_audio_direct` | Send the recorded WAV directly to `gpt-4o-audio-preview`. |
| `llm_home_assistant.transcribe_audio` | Legacy: whisper.cpp STT, then the `chat` service. |
| `llm_home_assistant.tts_fallback` | Text-to-speech using gTTS with espeak fallback. |
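As a usage sketch, spoken feedback can also be requested directly. The payload field name below (`message`) is an assumption, so check `services.yaml` for the actual schema:

```yaml
# Hypothetical payload -- the field name is an assumption;
# see services.yaml for the real schema.
service: llm_home_assistant.tts_fallback
data:
  message: "The kitchen light is now off."
```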

## File Structure

### Core Files

| File | Purpose |
| --- | --- |
| `__init__.py` | Integration setup. Registers all services, loads platforms and the frontend. |
| `call_model.py` | Main orchestration: routes text or audio to the correct OpenAI caller, executes actions in parallel, caches responses, enforces `allow_cfg` restrictions. |
| `audio_utils.py` | Audio validation (format, size) and base64 encoding. |
| `text_audio_processing.py` | ffmpeg recording, whisper.cpp STT, gTTS/espeak TTS. |
| `device_info.py` | Device state/service formatting plus a compact context builder with whitelist filtering. |
| `interaction_logger.py` | Interaction logging system for a full audit trail. |
| `services.yaml` | Home Assistant service definitions for the Developer Tools UI. |

### Model Files

| File | Purpose |
| --- | --- |
| `models/openai/call_openai.py` | OpenAI caller using GPT-4o-Mini (JSON mode) or `gpt-4o-audio-preview` (function calling). Contains `build_compact_context()` with whitelist support. |
| `models/openai/call_openai_audio.py` | Audio-capable OpenAI caller using `gpt-4o-audio-preview`. |
| `models/openai/tool_defs.py` | OpenAI function calling schema (`PROPOSE_ACTIONS_TOOL`). |
| `models/openai/call_JSON_mode.py` | JSON mode implementation for structured responses. |
| `models/llama3.3/call_llama.py` | Llama 3.3 stub (experimental). |

### Frontend

| File | Purpose |
| --- | --- |
| `www/llm-card.js` | Text input card for the Lovelace dashboard. |
| `www/llm-recording-card.js` | Recording toggle card with model selection. |
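To place the cards on a dashboard, add them as manual cards. The `type` values below are inferred from the JS filenames and are assumptions; confirm them in the card source:

```yaml
# Assumed card types (inferred from www/llm-card.js and
# www/llm-recording-card.js) -- verify in the card source.
type: vertical-stack
cards:
  - type: custom:llm-card
  - type: custom:llm-recording-card
```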

## Testing

### Verify the Integration Loaded

```bash
docker compose logs homeassistant | grep -i "llm_home_assistant\|Error"
```

You should see service registration messages with no import errors.

### Test the Text Pipeline

In Home Assistant Developer Tools > Services:

1. Select the service `llm_home_assistant.chat`.
2. Set the service data:

   ```yaml
   text: "What devices are available?"
   ```

3. Click Call Service.

Check `sensor.llm_model_response` for the LLM's response.
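The same sensor can be surfaced in the UI, for example with a Markdown card that mirrors the latest reply:

```yaml
# Dashboard card showing the most recent LLM response.
type: markdown
content: >-
  **Last LLM response:** {{ states('sensor.llm_model_response') }}
```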

### Test the Audio Pipeline

#### Option A: Full Recording Flow

1. Call `llm_home_assistant.start_recording`.
2. Speak a command (e.g., "turn off the kitchen light").
3. Call `llm_home_assistant.stop_recording`.

This automatically triggers `process_audio_direct`, which calls `gpt-4o-audio-preview`.

#### Option B: Manual WAV File

```bash
# Record five seconds of 16 kHz mono audio to the expected path
arecord -D plughw:3,0 -f S16_LE -r 16000 -c 1 -d 5 _audios/current_request.wav
```

Then call `llm_home_assistant.process_audio_direct` from Developer Tools.
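In the YAML mode of Developer Tools, the call needs no payload, assuming the service reads `_audios/current_request.wav` on its own as the pipeline diagram above suggests:

```yaml
# Assumes the service picks up _audios/current_request.wav itself,
# per the audio-direct pipeline.
service: llm_home_assistant.process_audio_direct
```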

## Troubleshooting

| Symptom | Solution |
| --- | --- |
| Import error on startup | Check the `docker compose logs homeassistant` output for the failing import traceback. |
| Service not found | Check startup logs for errors before service registration. |
| Audio model call failed | API key issue, or `gpt-4o-audio-preview` is not available on your OpenAI plan. |
| Empty response / no actions | Verify the WAV file is valid audio (not empty or corrupt). |
| Wrong ALSA device | Run `arecord -l` to find your microphone device, then update it in `text_audio_processing.py`. |
| High token usage | Verify the `allow` configuration restricts domains/services/entities. |

## License

This project is licensed under the MIT License.
