[WIP] feat(sandbox): opt-in best-effort bootstrap via OPENSHELL_BEST_EFFORT_FAILURES #1548
Open
dims wants to merge 1 commit into
All contributors have signed the DCO ✍️ ✅
3369616 to
31eb530
Compare
…_FAILURES

When the OPENSHELL_BEST_EFFORT_FAILURES env var is set, failures from the three subsystems an outer sandbox typically degrades (network namespace creation, the supervisor seccomp prelude, and the workload seccomp filter) are logged and skipped instead of aborting startup. Default remains strict.

The gVisor runtime, when invoked with --network=host on Kubernetes, returns EPERM from unshare(CLONE_NEWNET), EINVAL from seccomp(2) on filters it does not yet model, and EPERM from setresuid/setresgid when the container entrypoint already dropped to a non-root uid. These protections are defense-in-depth on a bare-metal host but duplicative when the workload already runs inside a strong outer sandbox. The env-var gate keeps the strict default for standalone deployments while letting outer-sandbox integrations (gVisor, Firecracker, Kata) opt in.

Also make drop_privileges idempotent: when the process is already at the resolved target uid/gid, skip initgroups/setresgid/setresuid instead of failing with EPERM. This lets a container entrypoint pre-drop privileges before exec'ing the sandbox without breaking the verification path.

Signed-off-by: Davanum Srinivas <dsrinivas@nvidia.com>
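The strict-by-default gate described in the commit message could look roughly like the sketch below. The function names best_effort_failures and handle_bootstrap_failure come from the PR description, but their signatures, the rule that any non-empty value enables the mode, and the helper best_effort_from are assumptions made for illustration; the actual diff may differ.

```rust
use std::env;
use std::fmt::Display;

/// Variable name taken from the PR title.
const BEST_EFFORT_ENV: &str = "OPENSHELL_BEST_EFFORT_FAILURES";

/// Hypothetical parsing rule (an assumption, not confirmed by the PR):
/// any non-empty value turns best-effort mode on.
fn best_effort_from(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty())
}

/// Probe the process environment; unset or non-UTF-8 means strict mode.
fn best_effort_failures() -> bool {
    best_effort_from(env::var(BEST_EFFORT_ENV).ok().as_deref())
}

/// Strict mode: return the error so startup aborts.
/// Best-effort mode: log the failure and report success so startup continues.
/// The `best_effort` flag is passed in explicitly here to keep the sketch testable.
fn handle_bootstrap_failure<E: Display>(
    subsystem: &str,
    result: Result<(), E>,
    best_effort: bool,
) -> Result<(), E> {
    match result {
        Err(err) if best_effort => {
            eprintln!("warning: {subsystem} failed, continuing in best-effort mode: {err}");
            Ok(())
        }
        other => other,
    }
}
```

A call site would then wrap each degradable step, for example `handle_bootstrap_failure("netns", create_netns(), best_effort_failures())?`, where create_netns is a placeholder for the real bootstrap call.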
31eb530 to
835f1f9
Compare
Author
I have read the DCO document and I hereby sign the DCO.
Author
recheck
dims
added a commit
to dims/openshell-driver-substrate
that referenced
this pull request
May 23, 2026
Now that the three companion changes are filed upstream as

- NVIDIA/OpenShell#1548 (env-var-gated best-effort bootstrap)
- agent-substrate/substrate#66 (ateom-gvisor eth0 fix)
- agent-substrate/substrate#67 (install-ate.sh publishes ateom-gvisor)

rewrite the README + poc-intro.md to point at the PRs rather than at specific commits or fork branches. Easier to follow for any reader who isn't already deep in our local-fork state.

Also fold the operator-handshake follow-up into the §3 component table and §9 "Where to next" list with the PR reference.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Collaborator
Author
Thanks @drew!
dims
added a commit
to dims/openshell-driver-substrate
that referenced
this pull request
May 23, 2026
Items 1-3 in §9 "Where to next" have all been filed as PRs (NVIDIA/OpenShell#1548, agent-substrate/substrate#66, #67); marking them with strike-through and an "awaiting review" callout so readers don't think they're still TODO. Signed-off-by: Davanum Srinivas <dsrinivas@nvidia.com>
Summary
Add an OPENSHELL_BEST_EFFORT_FAILURES env var that lets the supervisor tolerate bootstrap-syscall failures from outer-sandbox runtimes (gVisor, Firecracker, Kata). Default stays strict; standalone deployments are unaffected.
Related Issue
None — proposed as a small, opt-in change to unblock outer-sandbox integrations.
Changes
- crates/openshell-sandbox/src/lib.rs: add best_effort_failures() env-var probe and handle_bootstrap_failure() helper. Route the netns-create and supervisor-seccomp call sites through the helper.
- crates/openshell-sandbox/src/sandbox/linux/mod.rs: route the workload seccomp call site through the helper.
- crates/openshell-sandbox/src/process.rs: make drop_privileges idempotent by skipping initgroups/setresgid/setresuid when the process is already at the resolved target uid/gid.

Net diff: 3 files, +51 / −7.
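The idempotent drop_privileges change can be pictured as a check that runs before any credential syscall. The sketch below is a minimal model under stated assumptions: the Ids and DropPlan types, plan_privilege_drop, and drop_privileges_with are hypothetical names, and the closure stands in for the real initgroups/setresgid/setresuid sequence, which is not reproduced here.

```rust
/// Hypothetical pair of ids; the real code may also track supplementary groups.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ids {
    uid: u32,
    gid: u32,
}

/// Outcome of the decision step.
#[derive(Debug, PartialEq, Eq)]
enum DropPlan {
    /// Process already runs as the target; no syscalls are issued.
    AlreadyAtTarget,
    /// Credentials were switched via the supplied closure.
    Switched,
}

/// Decide whether a switch is needed at all.
fn plan_privilege_drop(current: Ids, target: Ids) -> bool {
    current != target
}

/// `switch` is a stand-in for initgroups + setresgid + setresuid.
/// It is only invoked when the current ids differ from the target,
/// so a pre-dropped process never hits EPERM from those calls.
fn drop_privileges_with<F>(current: Ids, target: Ids, switch: F) -> Result<DropPlan, String>
where
    F: FnOnce(Ids) -> Result<(), String>,
{
    if !plan_privilege_drop(current, target) {
        return Ok(DropPlan::AlreadyAtTarget);
    }
    switch(target).map(|()| DropPlan::Switched)
}
```

With this shape, an entrypoint that has already dropped to the target uid/gid passes straight through, while a genuine mismatch still surfaces the underlying error from the switch step.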
Testing
- cargo fmt --all -- --check: clean on the touched files.
- cargo test -p openshell-sandbox --lib: 777 tests pass.
- cargo clippy -p openshell-sandbox --lib --tests -- -D warnings: zero new warnings introduced (pre-existing warnings on main unchanged).
- mise run pre-commit: to be re-verified by CI after copy-pr-bot mirrors.

Checklist
feat(sandbox): …
Motivation
Integrating the OpenShell supervisor with NVIDIA Agent Substrate (gVisor + checkpoint/restore via runsc). With this gate the stock supervisor binary boots cleanly inside that runtime; without it a fork is required.