
Add llama.cpp support for local OpenAI-compatible LLM backends #29

Open

vmlinuzx wants to merge 2 commits into MiniMax-AI:main from vmlinuzx:feat/local-llamacpp-support

Conversation

@vmlinuzx

Summary

This PR adds first-class support for local llama.cpp-style LLM endpoints in the web UI.

What changed

  • adds llama.cpp as an LLM provider preset
  • routes llama.cpp through the existing OpenAI-compatible chat path
  • makes API keys optional for local/OpenAI-compatible endpoints
  • updates the settings UI to expose the new provider and clarify that API keys can be optional for local servers
  • strips <think>...</think> reasoning blocks from assistant text before rendering/storing responses
  • adds unit coverage for the new provider path and reasoning tag cleanup
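The reasoning-tag cleanup described above could look roughly like this. This is a minimal sketch, not the PR's actual code: the function name `stripReasoning` and the exact regex are assumptions.

```typescript
// Hypothetical helper illustrating the <think>...</think> cleanup step.
// Removes reasoning blocks (including multi-line content) before the
// assistant text is rendered or stored.
export function stripReasoning(text: string): string {
  // [\s\S]*? matches across newlines, non-greedily, so multiple
  // <think> blocks in one response are each removed.
  return text.replace(/<think>[\s\S]*?<\/think>/g, "").trimStart();
}
```

For example, `stripReasoning("<think>step 1...</think>Hello!")` yields `"Hello!"`, while text with no reasoning markup passes through unchanged.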

Why

The app already supports several hosted providers, but local inference backends were awkward to use because:

  • there was no provider preset for llama.cpp
  • chat startup assumed an API key was always required
  • some local/Qwen-family models may emit reasoning markup that should not be shown directly in the UI

This change keeps the existing provider architecture intact while making local OpenAI-compatible backends much easier to use.
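One way the preset and the optional-key behavior could fit together is sketched below. The interface shape, identifiers, and default URL are illustrative assumptions, not the PR's actual implementation (llama.cpp's built-in server defaults to port 8080, but any OpenAI-compatible endpoint would work).

```typescript
// Illustrative provider-preset shape; the real registry may differ.
interface ProviderPreset {
  id: string;
  label: string;
  baseUrl: string;         // OpenAI-compatible /v1 endpoint
  apiKeyRequired: boolean; // false for local servers like llama.cpp
}

const llamaCppPreset: ProviderPreset = {
  id: "llamacpp",
  label: "llama.cpp (local)",
  baseUrl: "http://127.0.0.1:8080/v1", // assumed default; user-configurable
  apiKeyRequired: false,
};

// Chat startup can then skip the key check for local presets instead of
// assuming a key is always required.
function authHeaders(preset: ProviderPreset, apiKey?: string): Record<string, string> {
  if (!apiKey) {
    if (preset.apiKeyRequired) {
      throw new Error(`API key required for ${preset.label}`);
    }
    return {}; // local servers typically accept requests without Authorization
  }
  return { Authorization: `Bearer ${apiKey}` };
}
```

With this shape, hosted providers keep `apiKeyRequired: true` and behave exactly as before; only presets that opt out of the requirement skip the check.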

Verification

Ran:

  • pnpm --filter @openroom/webuiapps test -- --run src/lib/__tests__/llmClient.test.ts

Result:

  • 46 tests passed

