feat: support vLLM nightly builds via wheels.vllm.ai #736

Draft

doringeman wants to merge 1 commit into main from vllm-from-commit


Conversation


@doringeman doringeman commented Mar 5, 2026

Install vLLM from https://wheels.vllm.ai/{VLLM_VERSION}/{VLLM_CUDA_VERSION} instead of GitHub Releases, allowing nightly builds to be used via make docker-run-vllm VLLM_VERSION=nightly (or pinned to a specific commit hash for reproducible builds).

vLLM stable releases (0.16.x) do not yet support Qwen3.5 (#731) — support is available on the main branch ahead of 0.17.0. vLLM publishes pre-built wheels for every merged commit at wheels.vllm.ai, which this change allows us to use.
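For reference, the wheel-index URL described above can be sketched roughly as follows. This is an illustrative sketch, not the actual Dockerfile contents; the default values (and the `cu129` CUDA variant name) are assumptions.

```shell
# Illustrative: build the per-version wheel index URL from the PR's template
# https://wheels.vllm.ai/{VLLM_VERSION}/{VLLM_CUDA_VERSION}.
VLLM_VERSION="${VLLM_VERSION:-nightly}"          # "nightly", a commit SHA, or a release
VLLM_CUDA_VERSION="${VLLM_CUDA_VERSION:-cu129}"  # assumed CUDA variant name
WHEEL_INDEX="https://wheels.vllm.ai/${VLLM_VERSION}/${VLLM_CUDA_VERSION}"
echo "$WHEEL_INDEX"
# The install step would then point pip at this index, e.g.:
# pip install vllm --pre --extra-index-url "$WHEEL_INDEX"
```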

Tested in https://github.com/docker/model-runner/actions/runs/22712921339.

Usage

  • Latest nightly
make docker-run-vllm VLLM_VERSION=nightly

E.g.,

$ docker model status | grep vllm
vllm       Running        vllm 0.16.1rc1.dev268+ge2b31243c
  • Pinned to a specific commit (recommended for reproducible builds)
make docker-run-vllm VLLM_VERSION=e2b31243c092e9f4ade5ffe4bf9a5d5ddae06ca7

E.g. (intentionally the same commit as the nightly example above):

$ docker model status | grep vllm
vllm       Running        vllm 0.16.1rc1.dev268+ge2b31243c
  • Default stable release (unchanged)
make docker-run-vllm

E.g.,

$ docker model status | grep vllm
vllm       Running        vllm 0.12.0
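As a side note, the reported dev version carries a `+g<shortsha>` local-version suffix (`+ge2b31243c` above), so the pinned commit can be cross-checked against the `docker model status` output. A small sketch, with hypothetical variable names:

```shell
# Illustrative: confirm the version suffix "+g<shortsha>" matches the pinned commit.
PINNED=e2b31243c092e9f4ade5ffe4bf9a5d5ddae06ca7
REPORTED='0.16.1rc1.dev268+ge2b31243c'
SHORT="${REPORTED##*+g}"   # abbreviated hash from the version string
case "$PINNED" in
  "$SHORT"*) echo "reported build matches pinned commit" ;;
  *)         echo "mismatch" ;;
esac
# prints "reported build matches pinned commit"
```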


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for installing vLLM from wheels.vllm.ai, enabling the use of nightly builds and specific commit hashes.

However, a critical command injection vulnerability has been identified in both the Dockerfile and the Makefile: the user-provided VLLM_VERSION is interpolated into shell commands inside double quotes, allowing shell expansion and potential arbitrary command execution, and even affecting the validation check itself.

Additionally, release versions are not prefixed with v (e.g., v0.12.0) as wheels.vllm.ai requires, which will cause default builds to fail.

To mitigate the command injection, it is recommended to use single quotes around these variables in shell commands to prevent shell expansion.
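The injection concern above can be sketched as a validation step. This is an illustrative helper, not the actual Makefile fix; the function name and accepted patterns are assumptions (only the `nightly` channel and a full 40-character commit SHA are allowed).

```shell
# Illustrative: allowlist-validate VLLM_VERSION before it ever reaches a
# double-quoted shell command, so values like '$(touch /tmp/pwned)' are rejected.
validate_vllm_version() {
  # Accept the literal channel name "nightly"...
  [ "$1" = nightly ] && return 0
  # ...or a full 40-character lowercase-hex commit SHA; reject everything else,
  # including shell metacharacters.
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

validate_vllm_version nightly && echo "nightly: ok"
validate_vllm_version e2b31243c092e9f4ade5ffe4bf9a5d5ddae06ca7 && echo "sha: ok"
validate_vllm_version '$(touch /tmp/pwned)' || echo "injection attempt: rejected"
```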

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
@ericcurtin
Contributor

CUDA/ROCm and Metal make sense for us to integrate; there's now a really simplified installation guide here:

https://vllm.ai/

