fix(ee): eliminate boot-time race that latched deployment as community #43
Merged
Staging logs (NUXT_DEPLOYMENT_PROFILE=managed, ee/ bundled, native deps
healthy) showed two identical warnings followed by a successful bridge
load:
[deployment] NUXT_DEPLOYMENT_PROFILE="managed" requires the Enterprise
Edition (ee/) but the enterprise bridge did not load; falling back
to 'community'.
[deployment] (same line, again)
[ee] Enterprise bridge loaded successfully.
Root cause: a microtask vs macro-task ordering mistake in the boot
sequence. `server/plugins/00.billing-flag.ts` was a sync plugin that
called `resolveDeployment()` before the bridge was awaited, then
scheduled a second pass via `queueMicrotask`. The plugin's comment
claimed the microtask would run "after the ee init plugin has loaded
the bridge", but a dynamic `import()` settles only after module loading
completes, on a later macro-task. The microtask drained first, so both
passes locked `_cached` to 'community' before the bridge finished
resolving.
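
The ordering can be reproduced in isolation. This is a minimal sketch, not the plugin's code: `fakeDynamicImport` is a hypothetical stand-in that uses `setTimeout` to model an import that settles on a later event-loop turn, and `runBootOrderDemo` and the `state` object are illustrative names.

```typescript
interface Bridge {
  edition: string;
}

// Hypothetical stand-in for the real dynamic import of the ee bridge:
// it settles on a later event-loop turn, modelled here with a timer.
function fakeDynamicImport(): Promise<Bridge> {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ edition: "ee" }), 0);
  });
}

async function runBootOrderDemo(): Promise<string[]> {
  const order: string[] = [];
  const state: { bridge: Bridge | null } = { bridge: null };

  // Kick off the load without awaiting it, as the old sync plugin did.
  const loading = fakeDynamicImport().then((mod) => {
    state.bridge = mod;
    order.push("bridge loaded");
  });

  // First pass: synchronous, so the bridge cannot be there yet.
  order.push(`pass 1 sees ${state.bridge ? "ee" : "community"}`);

  // Second pass: a microtask runs before any timer callback,
  // so it still observes a missing bridge.
  queueMicrotask(() => {
    order.push(`pass 2 sees ${state.bridge ? "ee" : "community"}`);
  });

  await loading;
  return order;
}
```

Under these assumptions the returned order is `pass 1 sees community`, `pass 2 sees community`, `bridge loaded`, which matches the two warnings followed by the success line in the staging logs.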
Blast radius was wider than a stray warning: with `_cached` poisoned,
`getEdition()` returned 'agpl' for the rest of the process, so
`hasFeature()` denied every `requires_ee` feature server-side, the
billing middleware (`03.billing.ts`) ran the `fixed`-plan branch
(community), and `runEnterpriseRoute` rejected AI keys / webhooks /
conversation API with 403. Client UI was only correct because operators
had set `NUXT_PUBLIC_DEPLOYMENT_*` env vars; with those unset, the
client snapshot would have been broken too.
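
To illustrate why one poisoned value fans out this far, here is a rough sketch of a feature gate that depends on the edition. The `FEATURES` table, its entries, and the `makeGates` factory are hypothetical; only `getEdition`, `hasFeature`, `requires_ee`, and the `'agpl'` value come from this PR.

```typescript
type Edition = "agpl" | "ee";

interface FeatureDef {
  requires_ee: boolean;
}

// Hypothetical feature table; the real registry is not shown in this PR.
const FEATURES: Record<string, FeatureDef> = {
  "ai-keys": { requires_ee: true },
  webhooks: { requires_ee: true },
  "basic-chat": { requires_ee: false },
};

// Every gate reads the edition through the same getter, so a cache that
// latched 'agpl' at boot denies all requires_ee features from then on.
function makeGates(getEdition: () => Edition) {
  return {
    hasFeature(name: string): boolean {
      const def = FEATURES[name];
      if (!def) return false; // unknown features are denied
      return !def.requires_ee || getEdition() === "ee";
    },
  };
}

const poisoned = makeGates(() => "agpl"); // what the race produced
const healthy = makeGates(() => "ee"); // what a managed deployment expects
```

With the poisoned getter, `poisoned.hasFeature("ai-keys")` is `false` while `healthy.hasFeature("ai-keys")` is `true`; community-level features are unaffected in both cases.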
Fix is defense-in-depth across three files:
1. `server/utils/enterprise.ts` — new `isEnterpriseBridgeSettled()`
that exposes whether the dynamic import has resolved (either to a
bridge or definitively to null for CE). Pure read of existing global
state; no semantic change to load/get/init helpers.
2. `server/utils/deployment.ts` — `resolveDeployment()` now gates BOTH
the `_cached` assignment AND the misconfiguration warning on
`isEnterpriseBridgeSettled()`. Pre-settle calls return a transient
community shape without latching, so a later post-settle call gets
the correct ee/managed result and caches it. The warning only fires
when ee/ is genuinely absent at settle time, not as a spurious
side-effect of a pre-settle reading.
3. `server/plugins/00.billing-flag.ts` — single async pass:
`await initEnterpriseBridge()` then `applyDeploymentSnapshot()`. The
`queueMicrotask` two-pass trick and `__resetDeploymentCache` import
are gone. `01.init-ee.ts` keeps awaiting the bridge too — the
memoized promise makes the double-await a no-op and removes the
"filename-ordering is load-bearing" coupling.
Tests: two new regression cases in `tests/unit/deployment.test.ts`
under a `cache settlement (race-condition regression)` block. The first
asserts that a pre-settle `resolveDeployment()` call does not latch
`_cached` and that a post-settle call returns the correct ee/managed
shape. The second asserts the misconfig warning is silent before
settle and fires once after settle when ee/ is genuinely absent. Full
suite: 568/568 pass (was 566, +2 new).
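
The behaviour these regression cases pin down can be sketched in one file. This is a simplified illustration under stated assumptions, not the merged code: the `load` and `requested` parameters, the `warnings` array, the `billingFlagPlugin` name, and treating a failed load as "absent" are all hypothetical; the real helpers take no arguments and live in separate modules.

```typescript
type Profile = "community" | "managed";
interface Deployment { edition: "agpl" | "ee"; profile: Profile }
interface Bridge { name: string }

// enterprise.ts (sketch): memoized load plus a "settled" flag.
let bridge: Bridge | null = null;
let settled = false;
let initPromise: Promise<Bridge | null> | null = null;

// `load` is injected only so the sketch runs without an ee/ directory.
function initEnterpriseBridge(load: () => Promise<Bridge | null>): Promise<Bridge | null> {
  if (!initPromise) {
    initPromise = load()
      .catch(() => null) // assumption: a failed load counts as "absent"
      .then((loaded) => {
        bridge = loaded;
        settled = true;
        return loaded;
      });
  }
  return initPromise; // repeat callers await the same promise
}

function isEnterpriseBridgeSettled(): boolean {
  return settled; // pure read of existing state
}

// deployment.ts (sketch): cache and warning are both gated on settlement.
let cached: Deployment | null = null;
const warnings: string[] = [];

function resolveDeployment(requested: Profile): Deployment {
  if (cached) return cached;
  const community: Deployment = { edition: "agpl", profile: "community" };
  if (!isEnterpriseBridgeSettled()) {
    return community; // transient answer: not cached, no warning
  }
  if (requested === "managed" && bridge) {
    cached = { edition: "ee", profile: "managed" };
  } else {
    if (requested === "managed") {
      warnings.push("managed profile requires ee/; falling back to community");
    }
    cached = community;
  }
  return cached;
}

// 00.billing-flag.ts (sketch): one async pass, no queueMicrotask.
async function billingFlagPlugin(load: () => Promise<Bridge | null>): Promise<Deployment> {
  await initEnterpriseBridge(load);
  return resolveDeployment("managed");
}
```

A pre-settle `resolveDeployment("managed")` returns the community shape without touching `cached`; after `billingFlagPlugin` has awaited the bridge, the same call returns the ee/managed shape and caches it. A second `initEnterpriseBridge` call from another plugin reuses `initPromise`, which is why the double-await is harmless.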
Summary
Staging deploy logs (`NUXT_DEPLOYMENT_PROFILE=managed`, `ee/` bundled, native deps healthy) showed two identical `[deployment]` fallback warnings followed by a successful bridge load.
PR #39's diagnostics confirmed the bridge itself loads fine; the bug was a microtask vs macro-task ordering mistake in the boot sequence. `server/plugins/00.billing-flag.ts` ran sync, called `resolveDeployment()` before the bridge was awaited, then queued a `queueMicrotask` "second pass" that the plugin's own comment claimed would run after the bridge loaded. But a dynamic `import()` settles only after module loading completes, on a later macro-task, so the microtask drained first and both passes locked `_cached` to `'community'` for the rest of the process lifetime.
Why this mattered
Wider than a stray warning. With `_cached` poisoned:
- `getEdition()` returns `'agpl'` for the rest of the process
- `hasFeature()` denies every `requires_ee` feature server-side
- Billing middleware (`03.billing.ts`) runs the `fixed`-plan branch (community); subscription billing checks never run
- `runEnterpriseRoute` plan gate denies AI keys / webhooks / conversation API with 403
- Client UI was only correct because operators had set `NUXT_PUBLIC_DEPLOYMENT_*` env vars; without those it would have been broken too
Fix: defense in depth across three files
1. `server/utils/enterprise.ts`: new `isEnterpriseBridgeSettled()` exposes whether the dynamic import has resolved (either to a bridge or definitively to `null` for CE). Pure read of existing global state.
2. `server/utils/deployment.ts`: `resolveDeployment()` gates BOTH `_cached` and the misconfig warning on `isEnterpriseBridgeSettled()`. Pre-settle calls return a transient community shape without latching; post-settle calls get the correct ee/managed shape and cache it. Warning only fires when ee/ is genuinely absent at settle time.
3. `server/plugins/00.billing-flag.ts`: single async pass: `await initEnterpriseBridge()` → `applyDeploymentSnapshot()`. The `queueMicrotask` two-pass trick is gone. `01.init-ee.ts` keeps awaiting the bridge; the memoized promise makes the double-await a no-op and removes the "filename-ordering is load-bearing" coupling.
Why this approach (vs alternatives)
Considered (and rejected):
- `resolveDeployment()`: cascades async through `hasFeature` / `getPlanLimit` / `getEdition` and every server route that calls them; huge blast radius for a problem that's really just "cache locked too early"
The chosen hybrid (cache settlement guard + single-pass async plugin) fixes the race at the cache layer (correct for any caller, not just the plugin path) AND writes a deterministic SSR snapshot post-settle. No public API change.
Test plan
- `pnpm test tests/unit/deployment.test.ts`: 14/14 pass (12 existing + 2 new regression cases)
- `pnpm test` full suite: 568/568 pass (was 566, +2 new)
- `pnpm lint`: 0 errors (7 pre-existing warnings, none in touched files)
- `pnpm typecheck`: clean
- `[deployment] ...did not load` warnings are gone from boot logs, only `[ee] Enterprise bridge loaded successfully` remains
- … `_cached='community'`)
- … `fixed` community fallback)
Out-of-scope (worth follow-up issues)
- `00.`-prefix plugins (`00.validate-config.ts`, `00.billing-flag.ts`): ordering between them is filesystem-dependent
- `useDeployment().isCommunity` failsafe collapses `edition === ''` (snapshot not yet hydrated) into "community"; worth distinguishing "not yet known" from "definitively community" so UI can render a brief skeleton instead of mis-rendering
- `_billingConfigured` cache is global, not edition-aware
- `runEnterpriseRoute` plan-gate silently skips when `event.context.billing` is absent; documented but worth a defensive default-deny path
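
For the `useDeployment()` follow-up, one possible shape is a tri-state classification. This is a rough sketch only: `classifyEdition`, `renderMode`, and the `EditionState` type are hypothetical names, and it assumes any non-empty edition other than `'agpl'` means enterprise, which this PR does not confirm.

```typescript
// Hypothetical tri-state view of the client snapshot's `edition` field.
type EditionState = "unknown" | "community" | "enterprise";

function classifyEdition(edition: string): EditionState {
  if (edition === "") return "unknown"; // snapshot not yet hydrated
  if (edition === "agpl") return "community"; // definitively community
  return "enterprise"; // assumption: any other value means ee
}

// The UI can then show a placeholder instead of guessing "community".
function renderMode(edition: string): "skeleton" | "community-ui" | "enterprise-ui" {
  switch (classifyEdition(edition)) {
    case "unknown":
      return "skeleton";
    case "community":
      return "community-ui";
    case "enterprise":
      return "enterprise-ui";
  }
}
```

Keeping "unknown" separate means a pre-hydration render shows a skeleton rather than community UI that later flips to enterprise.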