
feat: make llama.cpp a deferred backend on macOS/Windows #737

Draft

doringeman wants to merge 1 commit into docker:main from doringeman:llamacpp-on-demand

Conversation

@doringeman
Contributor

```console
$ MODEL_RUNNER_PORT=8080 make run LLAMA_SERVER_VERSION=v0.1.0
```

`docker/model-runner:llamacpp-metal-v0.1.0`

```console
$ MODEL_RUNNER_HOST=http://localhost:8080 docker model run smollm2 hi
Installing llama.cpp backend...
llama.cpp backend installed successfully
Hello. I'm happy to help you with any questions or queries you may have. What would you like to discuss today?

$ MODEL_RUNNER_HOST=http://localhost:8080 docker model status
Docker Model Runner is running

BACKEND    STATUS         DETAILS
llama.cpp  Running        llama.cpp llamacpp-metal-v0.1.0 (sha256:6a242e1021bc7dc504efcd9c59dca998aa5e9cf7148b9dfa514d1dae4582b952) 3191462
vllm       Running        vllm-metal v0.1.0-20260126-121650
diffusers  Not Installed
mlx        Not Installed  package not installed
sglang     Not Installed  only supported on Linux
```
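The install-on-first-use flow shown in the transcript above ("Installing llama.cpp backend..." appearing only on the first request) can be sketched roughly as follows. This is an illustrative sketch only; the type and function names (`deferredBackend`, `ensureInstalled`) are assumptions, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// deferredBackend wraps a backend whose binaries are fetched lazily
// on first use instead of at startup (hypothetical sketch).
type deferredBackend struct {
	name    string
	once    sync.Once
	install func() error // e.g. download and unpack backend binaries
	err     error
}

// ensureInstalled runs the install step exactly once, even when
// several first requests arrive concurrently.
func (b *deferredBackend) ensureInstalled() error {
	b.once.Do(func() {
		fmt.Printf("Installing %s backend...\n", b.name)
		b.err = b.install()
		if b.err == nil {
			fmt.Printf("%s backend installed successfully\n", b.name)
		}
	})
	return b.err
}

func main() {
	b := &deferredBackend{
		name:    "llama.cpp",
		install: func() error { return nil }, // stub installer
	}
	// First call installs; subsequent calls are no-ops.
	if err := b.ensureInstalled(); err != nil {
		panic(err)
	}
	_ = b.ensureInstalled()
}
```

Using `sync.Once` makes the install idempotent and safe under concurrent requests, which matters once the backend is no longer guaranteed to be present at startup.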

Signed-off-by: Dorin Geman <dorin.geman@docker.com>

@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the llama.cpp backend to support deferred installation on macOS and Windows, which is a great improvement. The changes are well-structured, introducing a unified installDir and a BackendUpdater interface for managing binary updates. My main concern is with error handling in main.go, where a failure to determine the llama.cpp server path is logged but not treated as fatal, which could cause problems later in the run. I've added a critical comment with a suggestion to address this.

Comment on lines +62 to +64

```go
if err != nil {
	log.Error("Failed to get llama.cpp server path", "error", err)
}
```


critical

The program continues to run even if envconfig.LlamaServerPath() returns an error. If this happens, llamaServerPath will be an empty string, which will cause the model runner to attempt to write to the current working directory instead of the intended installation directory. This can lead to unexpected behavior and potential permission issues. This error should be treated as fatal, and the program should exit.

Suggested change

```diff
 if err != nil {
 	log.Error("Failed to get llama.cpp server path", "error", err)
+	exitFunc(1)
 }
```

