fix(model): deterministic, type-filtered backend auto-detection (#9287) by localai-bot · Pull Request #10286 · mudler/LocalAI

localai-bot · 2026-06-12T21:49:11Z

Problem

When a model config has no explicit backend:, (*ModelLoader).Load built the auto-detect candidate list by ranging an unordered Go map of installed backends with no filtering, then loaded the first one whose gRPC LoadModel succeeded. Every installed backend is registered there - including non-LLM ones like the opus audio codec - so after installing such a backend it could win a GGUF/LLM load, sending the model to the wrong backend.
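To make the failure mode concrete, here is a minimal sketch of a "first success wins" trial loop. The names `firstLoadable` and `tryLoad` are hypothetical stand-ins and not LocalAI's actual code; `tryLoad` plays the role of the gRPC LoadModel call. The point is only that when more than one candidate reports success, the order of the candidate list decides which backend gets the model.

```go
package main

import "errors"

// firstLoadable returns the first candidate for which tryLoad succeeds.
// tryLoad is a hypothetical stand-in for a backend's LoadModel call.
func firstLoadable(candidates []string, tryLoad func(backend string) error) (string, error) {
	for _, name := range candidates {
		if err := tryLoad(name); err == nil {
			// The loop stops at the first success, so an earlier
			// candidate always beats a later, better-suited one.
			return name, nil
		}
	}
	return "", errors.New("no candidate backend could load the model")
}
```

If the candidate list comes from ranging a Go map, its order can differ between runs, so the winner can differ too.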

Fix

New pure, unit-tested SelectAutoLoadBackends(available, modelFile):

- deterministic sort (no more map-iteration randomness),
- for .gguf files, filters to LLM-capable backends (chat/completion/edit/embeddings usecases via core/config.BackendCapabilities) with llama-cpp first,
- zero-candidate fallback returns the full sorted set, so nothing previously loadable becomes unloadable.
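The three rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `selectCandidates` and the `isLLM` predicate are hypothetical names, with `isLLM` standing in for the lookup the PR performs through core/config.BackendCapabilities. The case-insensitive extension match is also an assumption.

```go
package main

import (
	"path/filepath"
	"sort"
	"strings"
)

// preferredGGUFBackend is tried first for .gguf models, per the PR description.
const preferredGGUFBackend = "llama-cpp"

// selectCandidates is a hypothetical sketch of the selection rules.
// isLLM is an assumed stand-in for a capability lookup.
func selectCandidates(available []string, modelFile string, isLLM func(string) bool) []string {
	// Rule 1: copy and sort so the result never depends on input order.
	sorted := append([]string(nil), available...)
	sort.Strings(sorted)

	// Only .gguf models are filtered (case-insensitive match is an assumption).
	if !strings.EqualFold(filepath.Ext(modelFile), ".gguf") {
		return sorted
	}

	// Rule 2: keep only LLM-capable backends.
	filtered := make([]string, 0, len(sorted))
	for _, name := range sorted {
		if isLLM(name) {
			filtered = append(filtered, name)
		}
	}

	// Rule 3: never return an empty list; fall back to the full sorted set.
	if len(filtered) == 0 {
		return sorted
	}

	// Move the preferred backend to the front, keeping the rest in sorted order.
	sort.SliceStable(filtered, func(i, j int) bool {
		return filtered[i] == preferredGGUFBackend && filtered[j] != preferredGGUFBackend
	})
	return filtered
}
```

Because the function is pure (no map ranging, no I/O), it can be tested with plain slices, which is what makes the specs in the test plan straightforward to write.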

Load() now calls this instead of ranging the map directly. (Verified pkg/model -> core/config introduces no import cycle.)

Test plan

- New Ginkgo specs in pkg/model/autoload_test.go (red -> green): given {opus, llama-cpp} + a .gguf, opus is excluded and llama-cpp is first; deterministic order; zero-candidate fallback returns the original set.
- go test ./pkg/model/... ./core/config/... green; scoped golangci-lint --new-from-merge-base clean.

Follow-up (noted, not done)

Did not force cfg.Backend = "llama-cpp" in the empty-backend GGUF hook (more blast radius on non-llama GGUFs); the candidate filter alone fixes the bug. A metadata-based GGUF architecture check is a possible refinement.

Assisted-by: claude:claude-opus-4-8 [Claude Code]

When a model config declares no explicit `backend:`, Load() fell into a trial loop built by ranging the external-backends Go map (random order) with no filtering, returning the first backend whose gRPC LoadModel succeeded. An unrelated installed backend - e.g. the "opus" audio codec - could therefore win a GGUF/LLM model load, so a model that should run on llama.cpp wrongly tried to use opus.

Extract the candidate selection into a pure, testable function SelectAutoLoadBackends that:

- sorts the candidate list deterministically (no more map-order nondeterminism), and
- for a `.gguf` model, filters to LLM-capable backends (via core/config.BackendCapabilities) and puts llama-cpp first, so an incompatible audio/codec/image backend can never win the trial loop.

If filtering would leave zero candidates, the full sorted set is returned unchanged, so a previously-loadable model is never made unloadable.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: claude:claude-opus-4-8 [Claude Code]

localai-bot wants to merge 1 commit into master from fix/9287-backend-autodetect
