fix(cli): roll back gateway registration when auth fails during gateway add#1538
Open
zanetworker wants to merge 1 commit into
Open
fix(cli): roll back gateway registration when auth fails during gateway add#1538zanetworker wants to merge 1 commit into
zanetworker wants to merge 1 commit into
Conversation
…ay add When OIDC or Cloudflare browser auth fails during gateway add, remove the gateway registration and restore the previously active gateway instead of leaving a broken entry on disk. Previously, store_gateway_metadata and save_active_gateway were called before the auth attempt. On failure, the registration persisted with an 'authenticate later' message, causing stale entries to accumulate when users retried with different flags or names. The rollback is skipped when the browser is intentionally suppressed (OPENSHELL_NO_BROWSER=1), since the user intends to authenticate later with gateway login. Fixes NVIDIA#1537 Signed-off-by: Adel Zaalouk <azaalouk@redhat.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
When
openshell gateway addfails to authenticate (OIDC discovery error, browser timeout, Cloudflare callback failure), the gateway registration was left on disk. Users who retried with different names or flags accumulated broken entries that required manualgateway removecleanup.Root cause:
store_gateway_metadata()andsave_active_gateway()are called before the auth attempt. On failure, the code printed "Authentication skipped" but never cleaned up. There was no rollback at all, so every failed attempt left a stale registration behind regardless of why auth failed.The problem in practice
Debugging auth produces this after a few retries with different flags and names:
Only 2 of 9 entries work. The rest are leftovers from failed
gateway addattempts that were never cleaned up.After this fix
The failed registration is gone. The previously active gateway is restored.
Behavior change
OPENSHELL_NO_BROWSER=1)The only case where a failed registration is preserved is when the user explicitly set
OPENSHELL_NO_BROWSER=1, which signals "I know auth will not complete now, I will authenticate separately withgateway login." This env var is used in CI/headless environments where no browser is available.Related Issue
Fixes #1537
Changes
is_browser_suppressed()helper that checksOPENSHELL_NO_BROWSERrollback_gateway_registration()helper that removes the registration and restores the previously active gatewayTesting
135 lib tests pass, 0 failures
2 new tests:
gateway_add_oidc_rolls_back_on_auth_failure: registers a seed gateway, attempts OIDC add against unreachable issuer, verifies the failed registration is removed and active gateway is restoredgateway_add_cloud_rolls_back_on_auth_failure: same pattern for the Cloudflare auth pathmise run pre-commitpassesUnit tests added/updated
E2E tests not applicable (auth rollback is a CLI-local operation)
Note
The cloud auth rollback test opens a browser tab to
https://127.0.0.1:1/auth/connectwhich Chrome blocks withERR_UNSAFE_PORT. This is expected and the tab can be closed.Checklist