Add an initial varlink IPC interface#222
Conversation
Refactor the container removal logic out of remove_all_ephemeral_containers into a standalone remove_ephemeral_containers function that returns per-container results. This enables the varlink IPC layer to call container removal without going through the interactive CLI prompt. Also make list_ephemeral_containers pub(crate) so other modules can call it directly. Assisted-by: OpenCode (claude-opus-4-6) Signed-off-by: Colin Walters <walters@verbum.org>
I'm thinking about bootc-dev/bootc#522 again, and varlink is definitely an option there. In order for us to gain experience with it, I think it makes sense to start here. In some experimentation, it seems zlink is indeed better maintained than varlink-rs and more complete. (there's some unreleased async support in varlink-rs, and the activation/service stuff was not complete). In this initial pass, we first cover the `images` API (which is really just a thin wrapper for enumerating images in the container runtime store). The more interesting one is `io.bootc.vk.ephemeral` which supports spawning. Note: The workspace lint for unused_must_use is relaxed from forbid to deny so that zlink's proc-macro-generated #[allow(unused)] does not conflict. the practical enforcement is identical. Assisted-by: OpenCode (claude-opus-4-6) Signed-off-by: Colin Walters <walters@verbum.org>
There was a problem hiding this comment.
Code Review
This pull request introduces a varlink IPC interface for managing ephemeral VMs, a significant and well-implemented feature. While it includes service implementation, refactoring in ephemeral.rs and ssh.rs, and comprehensive integration tests, it also introduces argument injection vulnerabilities and broken access control. Specifically, several methods accept user-supplied container IDs or image names and pass them directly to podman commands, potentially allowing operations on arbitrary containers and images not managed by bcvk. To mitigate this, it is recommended to use the -- separator in all podman calls and verify that target containers possess the expected bcvk.ephemeral=1 label before performing operations. Additionally, there are a few suggestions to improve maintainability and fix a version mismatch that would break the build.
| interface = "io.bootc.vk.images", | ||
| vendor = "io.bootc.vk", | ||
| product = "bcvk", | ||
| version = "0.11.0", |
There was a problem hiding this comment.
The version in the #[zlink::service] attribute is hardcoded to "0.11.0", but the crate version in crates/kit/Cargo.toml is "0.12.0". This will cause the version_matches_cargo_toml test at the end of this file to fail. Please update this version to match the crate's version. You will also need to update the hardcoded version in the test itself to ensure they stay in sync.
| version = "0.11.0", | |
| version = "0.12.0", |
| assert_eq!( | ||
| "0.11.0", | ||
| env!("CARGO_PKG_VERSION"), | ||
| "varlink service version must match Cargo.toml" |
There was a problem hiding this comment.
This test will fail because env!("CARGO_PKG_VERSION") will expand to "0.12.0", but it's being compared against the hardcoded string "0.11.0". Please update this to match the crate's version to ensure the test passes and correctly validates the version sync.
| assert_eq!( | |
| "0.11.0", | |
| env!("CARGO_PKG_VERSION"), | |
| "varlink service version must match Cargo.toml" | |
| assert_eq!( | |
| env!("CARGO_PKG_VERSION"), | |
| "0.12.0", | |
| "varlink service version must match Cargo.toml" | |
| ); |
| /// block used by both the CLI (`rm-all`) and the varlink `Rm` method. | ||
| pub(crate) fn remove_single_container(container_id: &str) -> RemoveContainerResult { | ||
| let result = Command::new("podman") | ||
| .args(["rm", "-f", container_id]) |
There was a problem hiding this comment.
The container_id parameter is passed directly to the podman rm -f command without a -- separator. Since this value comes from an external IPC interface (Varlink), an attacker could provide a flag such as --all instead of a container ID. This would cause podman to remove all containers accessible to the process, leading to a significant denial of service. Additionally, the function does not verify that the container has the bcvk.ephemeral=1 label, allowing the removal of arbitrary containers not managed by this tool.
| .args(["rm", "-f", container_id]) | |
| .args(["rm", "-f", "--", container_id]) |
| if options.allocate_tty { | ||
| cmd.args(["exec", "-it", container_name, "ssh"]); | ||
| } else { | ||
| cmd.args(["exec", container_name, "ssh"]); | ||
| } |
There was a problem hiding this comment.
The container_name parameter is passed to podman exec without a -- separator, which could allow argument injection if the name starts with a hyphen. Furthermore, the function does not verify that the target container is an ephemeral VM managed by bcvk (e.g., by checking for the bcvk.ephemeral=1 label). This could allow an attacker with access to the Varlink interface to execute SSH commands in any container on the system, potentially leading to unauthorized access or privilege escalation.
| if options.allocate_tty { | |
| cmd.args(["exec", "-it", container_name, "ssh"]); | |
| } else { | |
| cmd.args(["exec", container_name, "ssh"]); | |
| } | |
| if options.allocate_tty { | |
| cmd.args(["exec", "-it", "--", container_name, "ssh"]); | |
| } else { | |
| cmd.args(["exec", "--", container_name, "ssh"]); | |
| } |
| Err(e) => { | ||
| let msg = e.to_string(); | ||
| if msg.contains("No such image") || msg.contains("image not known") { | ||
| Err(ImagesError::NotFound { name }) | ||
| } else { | ||
| Err(ImagesError::PodmanError { message: msg }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Relying on string matching of the error message from podman to detect a "not found" condition is brittle. If podman changes its error messages in a future version, this logic will break.
While this might be the only option if podman doesn't provide structured error information or distinct exit codes for this case, it would be good to add a comment here acknowledging this fragility. If podman provides a more reliable way to distinguish this error (e.g., a specific exit code), it would be better to use that.
| let containers = tokio::task::spawn_blocking(crate::ephemeral::list_ephemeral_containers) | ||
| .await | ||
| .map_err(|e| EphemeralError::PodmanError { | ||
| message: e.to_string(), | ||
| })? | ||
| .map_err(|e| EphemeralError::PodmanError { | ||
| message: e.to_string(), | ||
| })?; |
There was a problem hiding this comment.
The pattern of mapping both JoinError and the inner Result's error to a PodmanError is repeated in several methods (ps, rm_all, run, exec, wait_ssh_ready). To reduce duplication and improve readability, consider defining a local error mapping closure.
For example, you could change this to:
async fn ps(&self) -> Result<PsReply, EphemeralError> {
let map_err = |e: impl std::fmt::Display| EphemeralError::PodmanError {
message: e.to_string(),
};
let containers = tokio::task::spawn_blocking(crate::ephemeral::list_ephemeral_containers)
.await
.map_err(map_err)?
.map_err(map_err)?;
// ...
}This same pattern can be applied to the other methods in this service implementation.
I'm thinking about bootc-dev/bootc#522
again, and varlink is definitely an option there.
In order for us to gain experience with it, I think it makes
sense to start here. In some experimentation, it seems zlink
is indeed better maintained than varlink-rs and more complete.
(there's some unreleased async support in varlink-rs, and
the activation/service stuff was not complete).
In this initial pass, we first cover the
imagesAPI (which isreally just a thin wrapper for enumerating images in the container
runtime store). The more interesting one is
io.bootc.vk.ephemeralwhich supports spawning.
Note: The workspace lint for unused_must_use is relaxed from forbid to deny
so that zlink's proc-macro-generated #[allow(unused)] does not conflict.
the practical enforcement is identical.