
Add llama.cpp support for local OpenAI-compatible LLM backends #29

Open

vmlinuzx wants to merge 2 commits into MiniMax-AI:main from vmlinuzx:feat/local-llamacpp-support

Conversation

@vmlinuzx

Summary

This PR adds first-class support for local llama.cpp-style LLM endpoints in the web UI.

What changed

  • adds llama.cpp as an LLM provider preset
  • routes llama.cpp through the existing OpenAI-compatible chat path
  • makes API keys optional for local/OpenAI-compatible endpoints
  • updates the settings UI to expose the new provider and clarify that API keys can be optional for local servers
  • strips <think>...</think> reasoning blocks from assistant text before rendering/storing responses
  • adds unit coverage for the new provider path and reasoning tag cleanup
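The reasoning-tag cleanup described above could look roughly like this. This is a minimal sketch, not the PR's actual code: the function name `stripReasoning` and the exact regex are assumptions.

```typescript
// Hypothetical helper illustrating the <think>...</think> cleanup step.
// Removes reasoning blocks (including multi-line content) before the
// assistant text is rendered or stored.
export function stripReasoning(text: string): string {
  // [\s\S]*? matches across newlines, non-greedily, so multiple
  // <think> blocks in one response are each removed.
  return text.replace(/<think>[\s\S]*?<\/think>/g, "").trimStart();
}
```

For example, `stripReasoning("<think>step 1...</think>Hello!")` yields `"Hello!"`, while text with no reasoning markup passes through unchanged.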

Why

The app already supports several hosted providers, but local inference backends were awkward to use because:

  • there was no provider preset for llama.cpp
  • chat startup assumed an API key was always required
  • some local/Qwen-family models may emit reasoning markup that should not be shown directly in the UI

This change keeps the existing provider architecture intact while making local OpenAI-compatible backends much easier to use.
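One way the preset and the optional-key behavior could fit together is sketched below. The interface shape, identifiers, and default URL are illustrative assumptions, not the PR's actual implementation (llama.cpp's built-in server defaults to port 8080, but any OpenAI-compatible endpoint would work).

```typescript
// Illustrative provider-preset shape; the real registry may differ.
interface ProviderPreset {
  id: string;
  label: string;
  baseUrl: string;         // OpenAI-compatible /v1 endpoint
  apiKeyRequired: boolean; // false for local servers like llama.cpp
}

const llamaCppPreset: ProviderPreset = {
  id: "llamacpp",
  label: "llama.cpp (local)",
  baseUrl: "http://127.0.0.1:8080/v1", // assumed default; user-configurable
  apiKeyRequired: false,
};

// Chat startup can then skip the key check for local presets instead of
// assuming a key is always required.
function authHeaders(preset: ProviderPreset, apiKey?: string): Record<string, string> {
  if (!apiKey) {
    if (preset.apiKeyRequired) {
      throw new Error(`API key required for ${preset.label}`);
    }
    return {}; // local servers typically accept requests without Authorization
  }
  return { Authorization: `Bearer ${apiKey}` };
}
```

With this shape, hosted providers keep `apiKeyRequired: true` and behave exactly as before; only presets that opt out of the requirement skip the check.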

Verification

Ran:

  • pnpm --filter @openroom/webuiapps test -- --run src/lib/__tests__/llmClient.test.ts

Result:

  • 46 tests passed

