feat: MetalRT VLM backend + bolder screen overlay by AmanSwar · Pull Request #26 · RunanywhereAI/RCLI

AmanSwar · 2026-03-15T19:39:35Z

Summary

MetalRT VLM backend: VLM commands (vlm, camera, screen) now use MetalRT's native vision pipeline when running on the MetalRT engine. They fall back to llama.cpp gracefully if no MetalRT VLM model is found in the HF cache.
Screen capture overlay: bolder border (8px), larger corner handles (28px), wider edge grab zones (20px), new edge midpoint handles, a double-layer glow, and a heavier label font.
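To make the overlay sizes concrete, here is a minimal hit-testing sketch in C++. Only the 28px and 20px values come from this PR. The `Rect`, `Hit`, and `hit_test` names are hypothetical, and the sketch assumes that handles and grab zones are centered on the selection border. The actual logic lives in rcli_overlay.m and may differ; the edge midpoint handles are not modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Sizes from the PR description; how rcli_overlay.m applies them is assumed.
constexpr double kCornerHandle = 28.0;  // corner handle side, px
constexpr double kEdgeGrab = 20.0;      // edge grab zone width, px

struct Rect { double x, y, w, h; };     // hypothetical selection rectangle

enum class Hit { None, Corner, Edge, Inside };

// Classify a point relative to the selection rectangle.
// Corners win over edges, and edges win over the interior.
Hit hit_test(const Rect& r, double px, double py) {
    assert(r.w >= 0 && r.h >= 0);
    const double left = r.x, right = r.x + r.w;
    const double top = r.y, bottom = r.y + r.h;

    // Distance to the nearest vertical and horizontal border line.
    const double dx = std::min(std::fabs(px - left), std::fabs(px - right));
    const double dy = std::min(std::fabs(py - top), std::fabs(py - bottom));

    const double ch = kCornerHandle / 2.0;
    const double eg = kEdgeGrab / 2.0;

    if (dx <= ch && dy <= ch) return Hit::Corner;

    const bool in_x = px >= left && px <= right;
    const bool in_y = py >= top && py <= bottom;
    if ((dx <= eg && in_y) || (dy <= eg && in_x)) return Hit::Edge;

    if (in_x && in_y) return Hit::Inside;
    return Hit::None;
}
```

Under these assumptions, a larger handle or grab zone simply widens the band of pointer positions that start a resize instead of a move, which is the usability gain the summary describes.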

Changes

New MetalRTVlmEngine class (metalrt_vlm_engine.h/.cpp) that wraps the MetalRT vision C API, with symbols resolved via dlsym
vlm_init_locked() now tries MetalRT first and falls back to llama.cpp; it no longer hard-rejects the MetalRT backend
All VLM functions (rcli_vlm_analyze, rcli_vlm_analyze_stream, rcli_vlm_get_stats, rcli_vlm_exit, handle_screen_intent) branch on vlm_use_metalrt flag
Updated error messages to be backend-agnostic
Visual improvements in rcli_overlay.m to make dragging and resizing easier
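The init-time fallback and the per-call branching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: only the `vlm_use_metalrt` flag name comes from the description, while `Engine`, `VlmBackend`, `VlmState`, `select_vlm_backend`, and `analyze` are hypothetical stand-ins. The strings returned by `analyze` are placeholders for the real backend calls.

```cpp
#include <cassert>
#include <string>

// Hypothetical types: the PR description does not show the real ones.
enum class Engine { MetalRT, LlamaCpp };
enum class VlmBackend { MetalRT, LlamaCpp };

struct VlmState {
    VlmBackend backend = VlmBackend::LlamaCpp;
    bool vlm_use_metalrt = false;  // flag name taken from the PR description
};

// Prefer MetalRT only when the active engine is MetalRT and a MetalRT
// VLM model was found; in every other case fall back to llama.cpp.
VlmState select_vlm_backend(Engine engine, bool metalrt_model_found) {
    VlmState s;
    if (engine == Engine::MetalRT && metalrt_model_found) {
        s.backend = VlmBackend::MetalRT;
        s.vlm_use_metalrt = true;
    }
    return s;
}

// Each VLM entry point then branches on the flag. The returned strings
// stand in for calls into the respective backend.
std::string analyze(const VlmState& s, const std::string& image_path) {
    assert(!image_path.empty());
    return s.vlm_use_metalrt ? "metalrt:" + image_path
                             : "llama.cpp:" + image_path;
}
```

Deciding once at init and storing the result in a flag keeps each entry point to a single branch, which matches the list of functions said to branch on `vlm_use_metalrt`.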

Test plan

Run rcli vlm <image> "describe this" on the MetalRT engine: should use the MetalRT VLM
Run rcli vlm <image> on the llama.cpp engine: should use the llama.cpp VLM as before
Run rcli vlm on the MetalRT engine with no MetalRT VLM model in the HF cache: should fall back to llama.cpp
Run rcli screen — verify overlay is bolder with larger handles
Verify overlay drag and resize from corners and edges

AmanSwar and others added 3 commits March 16, 2026 01:08

feat: add MetalRT VLM backend for vision-language models

02faedd

When running on MetalRT engine, VLM commands (vlm, camera, screen) now use MetalRT's native vision pipeline instead of requiring llama.cpp. Falls back to llama.cpp gracefully if MetalRT VLM model not available.

ui: make screen capture overlay bolder and easier to interact with

a9d3f5f

Thicker border (8px), larger corner handles (28px), wider edge grab zones (20px), added edge midpoint handles, double-layer outer glow, and heavier label font for better visibility and usability.

improving the v mode

66a38b7

AmanSwar wants to merge 3 commits into main from vlm_metalrt_integration

2 participants
