Voice command upgrade: answer_question() with camera + memory context #6

## Parent PRD

#1

## What to build

Upgrade VoiceHandler to route voice queries through Gemma4Brain.answer_question() instead of the old Llama 3.1 text model. Each voice query then carries the full memory context plus an optional camera frame, so Vector can give richer, personalized answers.
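
A minimal sketch of the routing described above, under stated assumptions: the real signature of Gemma4Brain.answer_question() is not given in this issue, so the keyword arguments `memory_context` and `image` are hypothetical, as are the helper name `handle_voice_query` and the `capture_frame` callable. The point is only to show the intended shape: build memory context, try to grab a frame, and fall back to text-only if the camera fails.

```python
from typing import Any, Callable, Optional, Protocol, Sequence


class Brain(Protocol):
    """Hypothetical interface; the real Gemma4Brain signature may differ."""

    def answer_question(
        self, question: str, memory_context: str, image: Optional[Any] = None
    ) -> str: ...


def handle_voice_query(
    question: str,
    brain: Brain,
    recent_observations: Sequence[str],
    capture_frame: Optional[Callable[[], Any]] = None,
) -> str:
    """Route one voice query through the brain with memory and optional camera."""
    # Assumption: memory context is passed as plain text, one observation per line.
    memory_context = "\n".join(recent_observations)

    image = None
    if capture_frame is not None:
        try:
            image = capture_frame()
        except Exception:
            # Camera unavailable: continue text-only rather than failing the query.
            image = None

    return brain.answer_question(question, memory_context=memory_context, image=image)
```

Passing the camera as a callable keeps the handler testable without a robot attached, and keeps the text-only fallback in one place.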

## Acceptance criteria

- [ ] Asking Vector 'What do you see?' uses Gemma4Brain.describe_view(pil_image), not LLaVA
- [ ] Asking 'What have you seen today?' references MemoryBank recent observations
- [ ] Asking 'What is that?' with a pointing gesture analyzes the current camera frame
- [ ] Response is spoken via say_text() with TTS chunking
- [ ] Voice queries work when GEMMA4=1 env var is set
- [ ] If camera is unavailable, answer_question() still works text-only
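
Two of the criteria above (the GEMMA4=1 gate and TTS chunking) can be sketched as small pure functions. This is an illustration only: the helper names `gemma4_enabled` and `chunk_for_tts`, the 120-character default, and the sentence-boundary splitting rule are assumptions, not the project's existing implementation.

```python
import os
import re
from typing import List, Mapping, Optional


def gemma4_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the GEMMA4 env var is exactly '1'."""
    env = os.environ if env is None else env
    return env.get("GEMMA4") == "1"


def chunk_for_tts(text: str, max_chars: int = 120) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    A single word longer than max_chars is emitted as its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        # The sentence does not fit: flush what has accumulated so far.
        if current:
            chunks.append(current)
            current = ""
        if len(sentence) <= max_chars:
            current = sentence
            continue
        # Oversized sentence: fall back to packing word by word.
        for word in sentence.split():
            candidate = f"{current} {word}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = word
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be passed to say_text() in order; how the handler obtains the robot handle is outside this sketch.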

## Blocked by

- Blocked by #5 (E2E test must pass first)

## User stories addressed

- Voice commands get intelligent, memory-aware answers from Gemma 4
