mohith-das/basic_voice_agent

Voice Agent (Local)

Full-stack local voice agent:

  • Frontend: React + Vite
  • Backend: FastAPI (faster-whisper + Coqui TTS)
  • LLM: Ollama (llama3.2:3b by default)

Project Structure

voice_agent/
  backend/
    main.py
    run.sh
    requirements.txt
    .env.example
  frontend/
    src/
    package.json
    .env.example

Prerequisites

  • macOS/Linux shell
  • Python 3.10 or 3.11
  • Node.js 18+ and npm
  • Ollama installed and running
  • ffmpeg installed and in PATH
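The prerequisites above can be checked before any setup begins. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it only checks that the tools are on PATH and that the interpreter is 3.10 or 3.11. It does not verify the Node.js version or that Ollama is actually running.

```python
import shutil
import sys

# Command names taken from the prerequisites list.
REQUIRED_TOOLS = ("ffmpeg", "ollama", "node", "npm")


def python_ok(version=sys.version_info):
    """Return True when the interpreter is Python 3.10 or 3.11."""
    return tuple(version[:2]) in {(3, 10), (3, 11)}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.10 or 3.11 is expected, found", sys.version.split()[0])
    for tool in missing_tools():
        print("Missing from PATH:", tool)
```

Run it with the same interpreter you plan to use for the backend virtual environment, since the version check applies to whichever Python executes the script.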

macOS install helpers:

brew install ffmpeg
brew install ollama

1) Backend Setup

cd backend
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env

Notes:

  • If the python3.11 command is unavailable, use any other Python 3.10 or 3.11 interpreter (for example python3.10).
  • The backend converts uploaded audio via ffmpeg, so ffmpeg must be installed.
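To illustrate why ffmpeg is required, here is a hedged sketch of how an upload could be converted to mono 16 kHz WAV before transcription. The actual conversion in main.py may differ: the function names, the sample rate, and the channel count are assumptions, while the ffmpeg flags (-y, -i, -ac, -ar) are standard.

```python
import subprocess


def ffmpeg_to_wav_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that writes mono audio at the given sample rate."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(src),            # input file (any format ffmpeg can decode)
        "-ac", "1",                # one audio channel (mono)
        "-ar", str(sample_rate),   # output sample rate in Hz
        str(dst),
    ]


def convert_to_wav(src, dst):
    """Run the conversion; raise a readable error when ffmpeg is not installed."""
    try:
        subprocess.run(ffmpeg_to_wav_cmd(src, dst), check=True, capture_output=True)
    except FileNotFoundError as exc:
        # subprocess raises FileNotFoundError when the executable is not on PATH.
        raise RuntimeError("ffmpeg is missing on server") from exc
```

Keeping the command builder separate from the subprocess call makes the flags easy to inspect without running ffmpeg.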

2) Ollama Setup

Start Ollama (if not already running):

ollama serve

In another terminal, pull the default model once:

ollama pull llama3.2:3b
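To confirm the pull worked, you can ask the running Ollama server which models it has. The sketch below assumes the default address http://127.0.0.1:11434 and uses Ollama's /api/tags endpoint, which lists local models under a "models" key. The helper names are hypothetical.

```python
import json
import urllib.request


def installed_models(tags_payload):
    """Extract model names from a decoded /api/tags response."""
    return [model["name"] for model in tags_payload.get("models", [])]


def has_model(tags_payload, model="llama3.2:3b"):
    """Return True when the given model name appears in the payload."""
    return model in installed_models(tags_payload)


def fetch_tags(base_url="http://127.0.0.1:11434"):
    """Fetch the list of local models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With Ollama running, `has_model(fetch_tags())` should be true once the default model has been pulled.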

3) Frontend Setup

cd frontend
npm install
cp .env.example .env

If the backend runs on the same machine, set the following in frontend/.env:

VITE_JETSON_API=http://127.0.0.1:8000

4) Run the App

Use three terminals, one per process.

Terminal A (Ollama):

ollama serve

Terminal B (Backend):

cd backend
./run.sh

Alternative backend command:

cd backend
source .venv/bin/activate
uvicorn main:app --host 0.0.0.0 --port 8000

Terminal C (Frontend):

cd frontend
npm run dev

Open:

  • Frontend: http://127.0.0.1:5173
  • Backend health: http://127.0.0.1:8000/health
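The backend may take a while to become reachable after launch, so a small poller can save some manual refreshing. This is a sketch under assumptions: it treats any HTTP 200 from the health URL as "up" and does not inspect the response body, whose format is not documented here. The function names are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True when a GET to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all land here.
        return False


def wait_until_up(url, attempts=10, delay=1.0, probe=http_ok, sleep=time.sleep):
    """Probe the URL up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    up = wait_until_up("http://127.0.0.1:8000/health")
    print("backend is up" if up else "backend did not respond")
```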

5) Quick Validation

  1. Open the frontend in a browser.
  2. Click Start Recording.
  3. Speak a short sentence, then click Stop & Send.
  4. Confirm that the transcript appears and the reply audio plays.

Common Issues

Backend error: ffmpeg is missing on server

Install ffmpeg:

brew install ffmpeg
ffmpeg -version

Then restart the backend.

ModuleNotFoundError: No module named 'app'

Use:

uvicorn main:app --host 0.0.0.0 --port 8000

Do not use app.main:app for this repo layout: main.py sits directly in backend/, so there is no app package to import.
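The uvicorn target has the form module:attribute, which is why the module part must match the file layout. The sketch below mimics that lookup with importlib to show where the error comes from. It is an illustration only, not uvicorn's actual loader, and resolve_target is a hypothetical name.

```python
import importlib


def resolve_target(target):
    """Resolve a 'module:attribute' string the way an ASGI server would look it up."""
    module_name, _, attr = target.partition(":")
    # Raises ModuleNotFoundError when no such module or package can be imported.
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

Run from backend/, `resolve_target("main:app")` imports main.py and returns its app object. `resolve_target("app.main:app")` fails with the ModuleNotFoundError above because nothing named app is importable from that directory.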

Frontend talking to wrong backend

Check frontend/.env:

VITE_JETSON_API=http://127.0.0.1:8000

Then restart the frontend (npm run dev).
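To see which backend URL the frontend is configured with, a small parser can print the value from frontend/.env. This is a rough sketch: it handles plain KEY=VALUE lines, comments, and optional quotes, which may not cover every syntax Vite accepts. parse_env is a hypothetical helper.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    env_path = Path("frontend/.env")
    if env_path.exists():
        print(parse_env(env_path.read_text()).get("VITE_JETSON_API", "<not set>"))
    else:
        print("frontend/.env not found; run this from the repository root")
```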

Gibberish or repeated self-transcription

  • Use headphones so the speaker output does not feed back into the mic.
  • Wait for reply audio playback to finish before recording again.
  • Keep LANGUAGE in backend/.env aligned with the spoken language.

Environment Variables

Backend (backend/.env):

  • OLLAMA_URL: default http://127.0.0.1:11434
  • OLLAMA_MODEL: default llama3.2:3b
  • ASR_MODEL: default small
  • LANGUAGE: default en
  • TTS_MODEL, TTS_FALLBACK_MODEL
  • LOG_LEVEL
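The documented defaults can be expressed as a small configuration loader. This is a sketch, not the repository's code: BackendConfig and load_config are hypothetical names, and TTS_MODEL, TTS_FALLBACK_MODEL, and LOG_LEVEL are left out because their defaults are not stated here. It also assumes the values from backend/.env have already been exported into the process environment; how the repo loads that file is not shown.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    ollama_url: str
    ollama_model: str
    asr_model: str
    language: str


def load_config(env=None):
    """Read backend settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return BackendConfig(
        ollama_url=env.get("OLLAMA_URL", "http://127.0.0.1:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2:3b"),
        asr_model=env.get("ASR_MODEL", "small"),
        language=env.get("LANGUAGE", "en"),
    )
```

Passing a plain dict instead of relying on os.environ makes it easy to try an override, for example `load_config({"LANGUAGE": "de"})`.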

Frontend (frontend/.env):

  • VITE_JETSON_API: backend base URL

Git Ignore Note

  • Keep backend/.env.example and frontend/.env.example committed.
  • Keep real secrets/local values in backend/.env and frontend/.env (ignored by .gitignore).
