feat(core-ai): add safe local runtime preloader by ucguy4u · Pull Request #421 · DevelopersCoffee/airo

ucguy4u · 2026-06-28T06:02:48Z

Summary

This PR adds a shared residency manager and a global preloader to the local runtime orchestration layer, and routes chat startup warmup through them.
Goal: centralize safe local model warmup so chat startup can reduce first-use latency without background evictions or duplicated runtime-specific startup logic.

Core outcome:

- adds a residency policy and manager for safe local runtime loading decisions
- adds a global preloader with abort, generation gating, and no-eviction background warmup
- wires chat startup through the shared preloader instead of separate Gemini Nano and LiteRT warmups
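
The preloader behavior listed above (abort, generation gating, no-eviction warmup) can be illustrated with a minimal sketch. This is not the code in `model_preloader.dart`: `PreloaderSketch`, `RuntimeCandidate`, `PreloadResult`, and the reason strings are hypothetical names chosen for illustration, and the eviction gate is injected as a plain callback.

```dart
import 'dart:async';

/// Result of one preload attempt, with a reason that can be logged.
class PreloadResult {
  const PreloadResult(this.runtimeId, this.reason, {required this.loaded});

  final String runtimeId;
  final String reason;
  final bool loaded;
}

/// Hypothetical description of a local runtime that can be warmed up.
class RuntimeCandidate {
  const RuntimeCandidate({
    required this.id,
    required this.estimatedMb,
    required this.warmUp,
    this.imageCapable = false,
  });

  final String id;
  final int estimatedMb;
  final Future<void> Function() warmUp;
  final bool imageCapable;
}

/// Hypothetical preloader; the real one in core_ai may differ.
class PreloaderSketch {
  PreloaderSketch({required this.canLoadWithoutEviction});

  /// Injected gate, for example backed by a residency manager.
  final bool Function(RuntimeCandidate candidate) canLoadWithoutEviction;

  bool _aborted = false;

  /// Set to true while the user is generating a response.
  bool generationActive = false;

  /// Cooperative abort: stops candidates that have not started warming yet.
  void abort() => _aborted = true;

  /// Warms candidates one at a time and records why each was loaded or skipped.
  Future<List<PreloadResult>> preloadAll(
    List<RuntimeCandidate> candidates,
  ) async {
    final results = <PreloadResult>[];
    for (final candidate in candidates) {
      final skipReason = _skipReason(candidate);
      if (skipReason != null) {
        results.add(PreloadResult(candidate.id, skipReason, loaded: false));
        continue;
      }
      await candidate.warmUp();
      results.add(PreloadResult(candidate.id, 'warmed', loaded: true));
    }
    return results;
  }

  /// Returns null when the candidate may be warmed in the background.
  String? _skipReason(RuntimeCandidate candidate) {
    if (_aborted) return 'aborted';
    if (generationActive) return 'generation_active';
    if (candidate.imageCapable) return 'image_capable_skipped';
    if (!canLoadWithoutEviction(candidate)) return 'would_evict';
    return null;
  }
}
```

In this sketch the skip checks run before every candidate, so an abort or a started generation takes effect between warmups rather than interrupting one that is already in flight.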

Changes

Code

Added:
- packages/core_ai/lib/src/residency/model_residency_manager.dart
- packages/core_ai/lib/src/preload/model_preloader.dart
- app/lib/core/services/local_runtime_preloader_service.dart
- focused residency, preloader, service, and widget tests
Updated:
- packages/core_ai/lib/core_ai.dart exports for the new shared runtime orchestration primitives
- app/lib/features/agent_chat/presentation/screens/chat_screen.dart to trigger shared preload asynchronously and abort when the screen is disposed or generation starts

Logic

- background preload now checks `canLoadWithoutEviction` before warming a runtime
- active/user-triggered runtime work can use `makeRoomFor`, `ensureResident`, and `runExclusive` through the shared manager
- image-capable packages are explicitly skipped during background preload
- STT/TTS hooks exist as no-op adapters so future local runtimes can plug into the same flow
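
The split between the background gate and the user-triggered path can be sketched as follows. The method names `canLoadWithoutEviction`, `makeRoomFor`, and `runExclusive` come from this PR, but their signatures, the memory budget, the oldest-first eviction order, and the helper `markResident` are assumptions made for illustration; the real `ModelResidencyManager` may work differently.

```dart
import 'dart:async';

/// Hypothetical sketch; not the ModelResidencyManager from core_ai.
class ResidencySketch {
  ResidencySketch({required this.budgetMb});

  /// Assumed total memory budget for resident runtimes.
  final int budgetMb;

  // Resident runtimes, oldest first (Dart map literals keep insertion order).
  final Map<String, int> _residentMb = <String, int>{};

  // Tail of the chain that serializes runExclusive calls.
  Future<void> _tail = Future<void>.value();

  int get usedMb => _residentMb.values.fold<int>(0, (sum, mb) => sum + mb);

  bool isResident(String id) => _residentMb.containsKey(id);

  /// Background gate: true only if the runtime fits without evicting anything.
  bool canLoadWithoutEviction(String id, int estimatedMb) =>
      isResident(id) || usedMb + estimatedMb <= budgetMb;

  /// User-triggered path: evicts the oldest runtimes until the estimate fits.
  /// Returns the ids that were evicted.
  List<String> makeRoomFor(String id, int estimatedMb) {
    final evicted = <String>[];
    if (isResident(id)) return evicted;
    while (usedMb + estimatedMb > budgetMb && _residentMb.isNotEmpty) {
      final oldest = _residentMb.keys.first;
      _residentMb.remove(oldest);
      evicted.add(oldest);
    }
    return evicted;
  }

  /// Records a runtime as loaded and moves it to the newest position.
  void markResident(String id, int estimatedMb) {
    _residentMb.remove(id);
    _residentMb[id] = estimatedMb;
  }

  /// Runs actions one at a time, in the order they were submitted.
  Future<T> runExclusive<T>(Future<T> Function() action) {
    final Future<T> result = _tail.then<T>((_) => action());
    // Keep the chain alive even if this action fails; the caller still
    // receives the error through the returned future.
    _tail = result.then<void>((_) {}, onError: (Object _) {});
    return result;
  }
}
```

Under these assumptions, a background warmup would call only `canLoadWithoutEviction` and skip when it returns false, while a user-triggered load would call `makeRoomFor` inside `runExclusive` so that evictions and loads never interleave.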

API Changes (if any)

None externally.

Database Changes (if any)

None.

Observability / Logging

Added preload completion logging in chat startup with per-runtime result reasons.

Performance Impact

- Latency: expected improvement for repeat chat initialization and for the first local response after preload
- Throughput: no meaningful change expected
- Memory/CPU: background warmup now uses a no-eviction gate and shared serialization to reduce unsafe concurrent load pressure

Risks

- runtime memory estimates are conservative and may skip some optional warmups on constrained devices
- STT/TTS remain extension hooks only until concrete local assistant runtimes are added
Rollback plan:
- revert this PR to restore the prior direct Gemini Nano and LiteRT warmup behavior

Testing

Unit tests:
- cd packages/core_ai && flutter analyze lib/core_ai.dart lib/src/residency/model_residency_manager.dart lib/src/preload/model_preloader.dart test/residency/model_residency_manager_test.dart test/preload/model_preloader_test.dart
- cd packages/core_ai && flutter test test/residency/model_residency_manager_test.dart test/preload/model_preloader_test.dart
- cd app && flutter analyze lib/core/services/local_runtime_preloader_service.dart lib/features/agent_chat/presentation/screens/chat_screen.dart test/core/services/local_runtime_preloader_service_test.dart test/features/agent_chat/presentation/screens/chat_screen_preloader_test.dart
- cd app && flutter test --no-pub test/core/services/local_runtime_preloader_service_test.dart test/features/agent_chat/presentation/screens/chat_screen_preloader_test.dart
Integration tests:
- none
Manual testing:
- device verification instructions for cold vs. warm latency and memory behavior were added as a comment on issue #391 ([AGENT] Add safe global local LLM preloader and residency manager)

Deployment Notes

Config changes:
- none
Order of deployment:
- normal app deployment

Related Commits

feat(core-ai): add safe local runtime preloader

Notes

Closes #391 ([AGENT] Add safe global local LLM preloader and residency manager)

- add residency policy and manager primitives for safe local runtime loading
- add a global preloader with abort, generation gating, and no-eviction background warmup
- wire chat startup through the shared preloader instead of ad hoc Gemini/LiteRT warmups
- add focused policy, preloader, service, and widget coverage

This centralizes local runtime warmup so first-use latency can improve without surprise evictions or duplicated startup logic.

github-actions · 2026-06-28T06:03:00Z

Plugin Module Size Gate

Policy: modules over 3 MB must be delivered as plugins; plugin modules over 5 MB must document cache management.

| Module | Size | Type | Status |
| --- | --- | --- | --- |
| `packages/core_ai` | 0.34 MB | bundled | OK |
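
The policy stated above can be read as a small decision function. The sketch below is only an interpretation of that one-line policy, not the CI script that produced this comment; `GateStatus`, `checkModule`, and the `documentsCacheManagement` flag are hypothetical names.

```dart
/// Hypothetical outcomes of the size gate.
enum GateStatus { ok, mustBePlugin, needsCacheDocs }

/// Applies the stated policy: bundled modules over 3 MB must become plugins,
/// and plugin modules over 5 MB must document cache management.
GateStatus checkModule({
  required double sizeMb,
  required bool isPlugin,
  bool documentsCacheManagement = false,
}) {
  if (!isPlugin) {
    return sizeMb > 3 ? GateStatus.mustBePlugin : GateStatus.ok;
  }
  if (sizeMb > 5 && !documentsCacheManagement) {
    return GateStatus.needsCacheDocs;
  }
  return GateStatus.ok;
}
```

Under this reading, a bundled module of 0.34 MB such as `packages/core_ai` stays well below the 3 MB threshold and passes.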

sonarqubecloud · 2026-06-28T06:03:46Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

github-actions · 2026-06-28T06:05:05Z

🚀 PR Quick Check Summary

| Check | Status | Description |
| --- | --- | --- |
| PR Validation | ❌ failure | Title format, docs, bundled model guardrail |
| Code Quality | ❌ failure | Analyze, formatting |
| Core Tests | ✅ success | Core package unit tests |

💡 Note: Full app tests, coverage reports, and security scans run on merge to main.

ucguy4u merged commit fa833f0 into main Jun 28, 2026
7 of 12 checks passed

ucguy4u merged 1 commit into main from codex/issue-391-local-preloader
