fix: Harden webhook, vault, session, and base client paths#654
fix: Harden webhook, vault, session, and base client paths#654gjtorikian wants to merge 12 commits intomainfrom
Conversation
Centralize path-segment encoding so user-supplied identifiers cannot escape their intended URL segment via reserved characters such as '/', '?', '#', or '%'. A new helper `_encode_path` splits on '/' and applies `urllib.parse.quote(seg, safe='')` to each segment, and is invoked from both the sync and async `request` methods. This covers all ~115 generated call sites without per-site changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds new members to `AuthenticateWithSessionCookieFailureReason` (`MFA_CHALLENGE_REQUIRED`, `SSO_REQUIRED`, `EMAIL_VERIFICATION_REQUIRED`, `ORGANIZATION_SELECTION_REQUIRED`, `REFRESH_DENIED`, `REFRESH_NETWORK_ERROR`) and maps the relevant auth-flow / network exceptions raised during refresh to those reasons instead of stringifying the exception. The `reason` field is already typed `Union[..., str]`, so existing string-based consumers keep working — additive only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`create_public_client` now passes `api_key=None` and `is_public=True` to `WorkOSClient`. The base client honors `is_public` by forcing `_api_key` to `None` and ignoring the `WORKOS_API_KEY` environment variable, so a public/PKCE client cannot accidentally pick up an API key from the process environment. The existing `if self._client._api_key` guards in PKCE token- exchange paths (`sso/_resource.py`, `user_management/_resource.py`) keep `client_secret` out of the request body for public clients without further changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the freshness window only rejected timestamps that were older than the tolerance — clock skew that put the issued timestamp in the future was silently accepted, opening the verifier to replay-style attacks once the attacker's clock drifted forward. Compare `abs(seconds_since_issued)` against `max_seconds_since_issued` in both the sync and async `verify_header` implementations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_decode_u32_leb128` previously kept reading continuation bytes after the 4th byte (5+ continuations could yield a 35+ bit value that would later be used as `key_len` for slicing the payload). Tighten the loop to reject a 5th continuation byte (`i >= 4 and b & 0x80`) and validate the decoded result against the 32-bit ceiling (`> 0xFFFFFFFF`) before returning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the truthy `if aad:` / `if associated_data` guards with explicit `is not None` checks in `_aes_gcm_encrypt`, `_aes_gcm_decrypt`, and the four `aad_buffer` construction sites. Previously an empty string passed as `associated_data` (or an empty `bytes` aad) was silently dropped, so encrypt and decrypt paths disagreed on the AAD bound to the GCM tag. Wire-format risk: any locally-persisted vault ciphertext that was encrypted by an earlier version of this SDK with `associated_data=""` (or `aad=b""`) was actually written without AAD. After this fix the SDK now binds an empty AAD, so re-encrypting with the same empty string will produce a tag the old SDK cannot verify, and decrypting old empty-AAD ciphertexts must continue to use no AAD. WorkOS Vault ciphertexts live server-side, so production impact is bounded; applications that have stashed ciphertexts locally must migrate. Document in the next release migration note. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`tolerance or DEFAULT_TOLERANCE` silently coerced an explicit `tolerance=0` (caller wants no tolerance) to the default 180-second window. Use `tolerance if tolerance is not None else DEFAULT_TOLERANCE` so callers that ask for a strict zero window get one. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ruff format --check flagged the ORGANIZATION_SELECTION_REQUIRED return as exceeding line length. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 7f58e9f closed the future-timestamp replay gap in `webhooks/_resource.verify_header`, but the same vulnerable comparison still lived in `actions._verify_signature` and the standalone `webhooks/_verification.verify_header` helper — both rejected only past-skewed timestamps, silently accepting issued timestamps in the future. Compare `abs(seconds_since_issued)` in both helpers and add regression tests covering the future-timestamp case for actions, the webhook resource, and the standalone verifier.
Greptile SummaryThis PR hardens five distinct security and correctness surface areas across the WorkOS Python SDK: webhook/action timestamp validation, vault LEB128 payload parsing, public-client credential isolation, session refresh exception mapping, and URL path encoding.
Confidence Score: 5/5Safe to merge — all five hardening areas are narrowly scoped, well-tested, and additive; the only behavioural break is the empty-AAD vault wire format, which the PR description already flags for release notes. Each change is a tightly contained correctness fix (abs-wrap, None-guard, LEB128 bit-boundary check, public-client credential null-out, path segment encoding). The MfaEnrollmentError gap flagged in a prior review thread is now mapped. Tests cover future-timestamp rejection, tolerance=0, all eleven exception-to-reason mappings, and the unknown-exception fallback. No regressions introduced on the reviewed paths. No files require special attention. The vault _aes_gcm_encrypt/_aes_gcm_decrypt if-aad guards are unchanged (the empty-AAD wire-format note in the PR description is documentation/release-notes scope, not a code defect in this diff). Important Files Changed
Reviews (3): Last reviewed commit: "Revert "fix: preserve empty AAD in vault..." | Re-trigger Greptile |
…tured reasons Refresh failures from MfaEnrollmentError, OrganizationAuthMethodsRequiredError, AuthenticationMethodNotAllowedError, and RadarChallengeError previously fell through the AuthenticationError check (they inherit from AuthorizationError) and surfaced as bare strings. Add enum members and isinstance branches so all explicit AuthKit flow outcomes resolve to structured reasons. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regression guard for the standalone _verification.verify_header tolerance handling: a 1-second-old signature with tolerance=0 must be rejected, so reverting to the prior `tolerance or DEFAULT_TOLERANCE` form would fail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 0c12462. The original truthy `if aad:` guard was self-consistent within the SDK (encrypt and decrypt agreed: empty string and None both meant "no AAD"), and matches the convention used by sibling WorkOS SDKs. Switching to `is not None` made `aad=""` semantically distinct from `aad=None`, which is cryptographically more precise but breaks decryption of every vault ciphertext previously encrypted with `associated_data=""` once a caller upgrades — the new SDK calls `authenticate_additional_data(b"")` and the GCM tag (computed without AAD on the encrypt side) will not verify. The threat being closed (a caller passing "" expecting an empty AAD to be authenticated, but silently getting none) is a narrow integrity misperception, not an exploitable hole, and there is no signal anyone in the wild relies on it. The wire-format break is the larger harm. A coordinated cross-SDK change with a versioned ciphertext header is the right path if this ever needs to be revisited. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
webhooks/_resource,webhooks/_verification, andactions._verify_signatureby comparingabs(seconds_since_issued); honor an explicittolerance=0(was silently coerced to the 180s default).associated_data/aad(was dropped, causing encrypt/decrypt AAD divergence) and bound_decode_u32_leb128to 32 bits so a malformed payload can't yield a >32-bitkey_len. Empty-AAD ciphertexts written by older SDKs need migration — noted for release notes.create_public_clientnow forcesapi_key=Noneand setsis_public=True, so a PKCE/public client cannot pick upWORKOS_API_KEYfrom the process env.AuthenticateWithSessionCookieFailureReasonmembers (MFA_CHALLENGE_REQUIRED,SSO_REQUIRED,EMAIL_VERIFICATION_REQUIRED,ORGANIZATION_SELECTION_REQUIRED,REFRESH_DENIED,REFRESH_NETWORK_ERROR) instead of stringifying. Additive — string consumers keep working._encode_pathhelper so user-supplied identifiers can't escape their segment with/,?,#, or%. Covers all ~115 generated call sites.Test plan
uv run ruff format --check .uv run ruff check .uv run pyrightuv run pytest(in particulartests/test_actions.py,tests/test_webhook_verification.py, vault, session, and base-client suites)create_public_clientdoes not includeclient_secreteven whenWORKOS_API_KEYis set in the environment