diff --git a/docs/reference/detail/legacy-module-registry.md b/docs/reference/detail/legacy-module-registry.md index 0aa7ad40dc5..30ffbcac0f6 100644 --- a/docs/reference/detail/legacy-module-registry.md +++ b/docs/reference/detail/legacy-module-registry.md @@ -3,7 +3,8 @@ Reference for reasoning about the original (legacy) module registry implementation, how it interfaces with V8's module APIs, and how modules flow from config through compilation to evaluation. For V8 module internals, see -[v8-module-internals.md](v8-module-internals.md). +[v8-module-internals.md](v8-module-internals.md). For the new replacement +implementation, see [new-module-registry.md](new-module-registry.md). ## Source Files diff --git a/docs/reference/detail/new-module-registry.md b/docs/reference/detail/new-module-registry.md index 8910ed071c4..69c6d1d014a 100644 --- a/docs/reference/detail/new-module-registry.md +++ b/docs/reference/detail/new-module-registry.md @@ -118,6 +118,19 @@ IsolateModuleRegistry | `EVAL` | 0x04 | Requires evaluation outside IoContext (deferred to `EvalCallback`) | | `WASM` | 0x08 | WebAssembly module | +### `Module::ContentType` enum + +Used to validate import attributes (e.g. `with { type: 'json' }`). Set at +module construction time. + +| ContentType | Meaning | +| ----------- | ---------------------------------------------------- | +| `NONE` | No specific content type (ESM, CJS, builtin objects) | +| `JSON` | JSON module | +| `TEXT` | Text module | +| `DATA` | Data/binary module | +| `WASM` | WebAssembly module | + ## Module Subclasses `Module` is an abstract base class. Two concrete implementations exist @@ -165,9 +178,14 @@ SyntheticModule extends Module { - `getDescriptor()`: Calls `v8::Module::CreateSyntheticModule` with the declared export names (always includes `"default"` plus any `namedExports`). -- `evaluate()`: Creates a resolved Promise, calls the `callback` which uses the - `ModuleNamespace` helper to set exports via `SetSyntheticModuleExport`. -- For modules with `Flags::EVAL`, delegates to the `Evaluator` first. +- `evaluate()`: Ensures the module is instantiated, optionally delegates to the + `Evaluator` (for `Flags::EVAL` modules), then calls `module->Evaluate()` to + enter V8's status machine. V8 calls back into `evaluationSteps` which calls + `actuallyEvaluate()`. +- `actuallyEvaluate()`: Creates a resolved Promise, calls the `callback` which + uses the `ModuleNamespace` helper to set exports via + `SetSyntheticModuleExport`. This is always invoked via V8's evaluation steps + callback, never called directly from external code. ### Static `evaluationSteps` Callback @@ -177,7 +195,14 @@ as the V8 `SyntheticModuleEvaluationSteps`. When V8 calls it: 1. Gets the `IsolateModuleRegistry` from context embedder data. 2. Looks up the `Entry` by v8::Module identity (hash-indexed, O(1) — unlike the legacy registry's O(n) linear scan). -3. Calls `module.evaluate()` on the found entry. +3. Calls `actuallyEvaluate()` on the found module — not `evaluate()`, to avoid + reentry into `module->Evaluate()`. + +This design ensures V8 always manages the status transitions (`kEvaluating` → +`kEvaluated` or `kErrored`) regardless of whether evaluation was initiated by +V8 (static import) or by our code (dynamic import, require). The `evaluate()` +method is the external entry point that goes through V8; `evaluationSteps` is +V8's callback entry point that goes directly to the work. ## ModuleBundle — Sources of Modules @@ -355,17 +380,22 @@ builder.setEvalCallback([](Lock& js, const auto& module, auto v8Module, ### How it flows 1. `module.evaluate(js, v8Module, observer, evaluator)` is called. -2. For modules with `Flags::EVAL` set (all ESM modules, and synthetic modules - that opt in): +2. For ESM modules (`Flags::EVAL` always set): - The `evaluator(js, module, v8Module, observer)` is invoked. - The evaluator calls `ModuleRegistry::evaluateImpl` which invokes the - `EvalCallback`. - - If the callback returns a `Promise`, it is wrapped and returned. -3. If the evaluator returns `kj::none` (no callback set), or the module doesn't - have `Flags::EVAL`, the module's `actuallyEvaluate()` is called directly. -4. For ESM: `actuallyEvaluate` calls `v8::Module::Evaluate()`. -5. For Synthetic: `actuallyEvaluate` creates a resolved Promise, then invokes - the synthetic `callback` to set exports. + `EvalCallback` (wrapping evaluation in `SuppressIoContextScope`). + - The `EvalCallback` calls `v8Module->Evaluate()`. V8 evaluates the + source text directly. +3. For Synthetic modules (dynamic import and require paths): + - If `Flags::EVAL` is set, the evaluator is invoked first (same as ESM). + - Otherwise, `module->Evaluate()` is called directly. + - V8 sets status to `kEvaluating` and calls the `evaluationSteps` callback. + - `evaluationSteps` calls `actuallyEvaluate()` which runs the callback + to set exports. + - V8 receives the result and sets status to `kEvaluated` or `kErrored`. +4. For static imports of synthetic modules, V8 drives `Module::Evaluate()` + itself, which calls `evaluationSteps` → `actuallyEvaluate()`. The same + code path executes; only the initial caller differs. ## `import.meta` Support @@ -419,12 +449,14 @@ compile cache). `Module::newCjsStyleModuleHandler` is a template that creates a handler for CommonJS-style modules: -1. Guards against re-evaluation with `EvaluateOnce` (CJS modules evaluate once). -2. Allocates a JSG resource object of type `T` (e.g. `CommonJsModuleContext`). -3. Compiles the source as a function via `ScriptCompiler::CompileFunction` with +1. Allocates a JSG resource object of type `T` (e.g. `CommonJsModuleContext`). +2. Compiles the source as a function via `ScriptCompiler::CompileFunction` with the JSG object as extension object (providing `module`, `exports`, `require`). -4. Calls the compiled function. -5. Extracts `exports` from the JSG object and sets them on the module namespace. +3. Calls the compiled function. +4. Extracts `exports` from the JSG object and sets them on the module namespace. + +Re-evaluation is prevented by V8's module status machine (`kEvaluated` prevents +re-entry), not by application-level guards. ## Specifier Processing @@ -622,6 +654,22 @@ Aliases are single-level redirects. When a bundle lookup returns a string 2. `lookupImpl` recurses with the new specifier, but with a `recursed=true` flag that prevents further recursion. Only one level of aliasing is supported. +### Error Propagation for Errored Dependencies + +V8's `InnerModuleEvaluation` only recurses into `SourceTextModule` dependencies +when checking for `kErrored` status — it skips `SyntheticModule` dependencies +entirely. This means if a synthetic module is `kErrored`, an ESM that imports +it would evaluate successfully with `undefined` export values instead of +propagating the error. + +To work around this, `resolveModuleCallback` checks the resolved module's +status after resolution. If the module is `kErrored`, the callback throws the +module's cached exception instead of returning the handle to V8. This prevents +V8 from instantiating an ESM graph containing errored synthetic dependencies. + +Similarly, `dynamicResolve` checks for `kErrored` before calling `evaluate()`, +returning a rejected promise with the cached exception. + ## Thread Safety Model ### Shared State (ModuleRegistry, ModuleBundle, Module) @@ -667,14 +715,46 @@ Aliases are single-level redirects. When a bundle lookup returns a string functions can be called multiple times from multiple isolates. They create fresh JS objects each time. -7. **Import attributes are rejected.** The current implementation throws - `TypeError` for any import attributes. This is the spec-recommended default - for unrecognized attributes. - -8. **No `require(esm)` convention matching.** Unlike the legacy registry which - had complex `require()` return value logic (checking `__cjsUnwrapDefault`, - `module.exports` key, etc.), the new registry's `require()` always returns - the full module namespace. CJS interop is handled at the module handler level. +7. **Import attributes are validated.** The `type` import attribute is supported + for both static and dynamic imports. Each `Module` has a `ContentType` + (`NONE`, `JSON`, `TEXT`, `DATA`, `WASM`) set at construction time. Supported + type values and their corresponding TC39 proposals: + - `type: 'json'` — JSON Modules (TC39 Stage 4, finished) → `ContentType::JSON`. + This is the only type currently enabled. + - `type: 'text'` — Import Text (TC39 Stage 3) → `ContentType::TEXT`. + Recognized but rejected with "not yet supported" pending Stage 4. + - `type: 'bytes'` — Import Bytes (TC39 Stage 2.7) → `ContentType::DATA`. + Recognized but rejected with "not yet supported". The proposal requires + `Uint8Array` backed by an immutable `ArrayBuffer`, which is not yet + implemented. Data modules currently expose mutable `ArrayBuffer`s. + All three types are parsed in `parseImportAttributes`. Only `json` passes + validation in `validateImportType`; `text` and `bytes` are rejected there + with specific error messages. The `ContentType` enum and module tagging are + fully plumbed for all three, ready to enable when the proposals are ready. + When a type attribute is specified, the resolved module's content type must + match or a `TypeError` is thrown. Unrecognized attribute keys (anything + other than `type`) and unsupported type values are rejected with `TypeError`. + Attribute parsing is handled + by `parseImportAttributes()` which reads V8's `FixedArray` of key-value- + location triples. Validation is handled by `validateImportType()` which + compares the declared type against `Module::contentType()`. Both static + imports (`resolveModuleCallback`) and dynamic imports + (`dynamicImportModuleCallback` → `dynamicResolve`) use these helpers. + +8. **`require()` return value semantics (UNWRAP_DEFAULT).** When callers use the + `UNWRAP_DEFAULT` require option (used by `createRequire` and + `CommonJsModuleContext::require`), the return value depends on the module type: + - **User bundle ESM**: returns the module namespace (matching Node.js + `require(esm)` behavior), unless the module exports `__cjsUnwrapDefault` + as truthy (a convention used by bundlers like esbuild when transpiling CJS + to ESM), in which case the `default` export is returned. + - **Builtin ESM** (`node:assert`, `node:buffer`, etc.): returns the `default` + export, because workerd implements builtins as ESM that wrap CJS-style APIs + in a default export. + - **Synthetic modules** (CJS, JSON, Text, Data, WASM): returns the `default` + export, which is where the module's value lives (`module.exports` for CJS, + parsed value for JSON, etc.). + Without `UNWRAP_DEFAULT`, `require()` always returns the full module namespace. 9. **Python modules are not yet supported.** The new registry currently throws `KJ_FAIL_ASSERT` for `PythonModule` content. Python support remains on the diff --git a/docs/reference/detail/v8-module-internals.md b/docs/reference/detail/v8-module-internals.md index 788113a735b..536a32afcf6 100644 --- a/docs/reference/detail/v8-module-internals.md +++ b/docs/reference/detail/v8-module-internals.md @@ -1,7 +1,11 @@ # V8 Module System Internals Reference for reasoning about V8's module lifecycle, binding resolution, and evaluation -semantics. All file paths are relative to the V8 source root +semantics. For how workerd interfaces with these V8 APIs, see +[new-module-registry.md](new-module-registry.md) (current implementation) and +[legacy-module-registry.md](legacy-module-registry.md) (legacy implementation). + +All file paths are relative to the V8 source root (`external/+http_archive+v8/`). Line numbers are approximate and may drift across V8 versions. diff --git a/src/workerd/api/commonjs.c++ b/src/workerd/api/commonjs.c++ index 08a767a9087..63cd0d2bca1 100644 --- a/src/workerd/api/commonjs.c++ +++ b/src/workerd/api/commonjs.c++ @@ -25,9 +25,15 @@ jsg::JsValue CommonJsModuleContext::require(jsg::Lock& js, kj::String specifier) } if (FeatureFlags::get(js).getNewModuleRegistry()) { - return jsg::modules::ModuleRegistry::resolve(js, specifier, "default"_kj, - jsg::modules::ResolveContext::Type::BUNDLE, jsg::modules::ResolveContext::Source::REQUIRE, - KJ_ASSERT_NONNULL(pathOrSpecifier.tryGet())); + auto& referrer = KJ_ASSERT_NONNULL(pathOrSpecifier.tryGet()); + KJ_IF_SOME(ns, + jsg::modules::ModuleRegistry::tryResolveModuleNamespace(js, specifier, + jsg::modules::ResolveContext::Type::BUNDLE, + jsg::modules::ResolveContext::Source::REQUIRE, referrer, + jsg::modules::UnwrapDefault::YES)) { + return ns; + } + JSG_FAIL_REQUIRE(Error, kj::str("Module not found: ", specifier)); } auto& path = KJ_ASSERT_NONNULL(pathOrSpecifier.tryGet()); diff --git a/src/workerd/api/node/module.c++ b/src/workerd/api/node/module.c++ index 9dc915fc64a..479433376f5 100644 --- a/src/workerd/api/node/module.c++ +++ b/src/workerd/api/node/module.c++ @@ -32,9 +32,14 @@ jsg::JsValue ModuleUtil::createRequire(jsg::Lock& js, kj::String path) { specifier = kj::mv(nodeSpec); } } - return jsg::modules::ModuleRegistry::resolve(js, specifier, "default"_kj, - jsg::modules::ResolveContext::Type::BUNDLE, jsg::modules::ResolveContext::Source::REQUIRE, - referrer); + KJ_IF_SOME(val, + jsg::modules::ModuleRegistry::tryResolveModuleNamespace(js, specifier, + jsg::modules::ResolveContext::Type::BUNDLE, + jsg::modules::ResolveContext::Source::REQUIRE, referrer, + jsg::modules::UnwrapDefault::YES)) { + return val; + } + JSG_FAIL_REQUIRE(Error, kj::str("Module not found: ", specifier)); })); } diff --git a/src/workerd/api/tests/new-module-registry-test.js b/src/workerd/api/tests/new-module-registry-test.js index c086ab2c4da..ed433eb4565 100644 --- a/src/workerd/api/tests/new-module-registry-test.js +++ b/src/workerd/api/tests/new-module-registry-test.js @@ -7,6 +7,7 @@ import { ok, rejects, strictEqual, + throws, deepStrictEqual, } from 'assert'; // Intentionally omit the 'node:' prefix import { foo, default as def } from 'foo'; @@ -190,6 +191,17 @@ await rejects(import('%90%E8%54%C1'), { message: /Module not found/, }); +// The cjs6 module attempts to require and esm with a top-level await, which is rejected +// following node.js' established require(esm) precedent. +await rejects(import('cjs6'), { + message: /^Top-level await is not supported/, +}); + +// Cannot directly require an ESM with top-level await either. +throws(() => myRequire('tla'), { + message: /^Top-level await is not supported/, +}); + // Verify that a module is unable to perform IO operations at the top level, even if // the dynamic import is initiated within the scope of an active IoContext. export const noTopLevelIo = { @@ -246,15 +258,14 @@ export const queryAndFragment = { }, }; -// We do not currently support import attributes. Per the recommendation -// in the spec, we throw an error when they are encountered. +// Unrecognized import attributes are rejected. export const importAssertionsFail = { async test() { await rejects(import('ia'), { - message: /^Import attributes are not supported/, + message: /^Unsupported import attribute: "a"/, }); await rejects(import('foo', { with: { a: 'abc' } }), { - message: /^Import attributes are not supported/, + message: /^Unsupported import attribute: "a"/, }); }, }; diff --git a/src/workerd/api/tests/new-module-registry-test.wd-test b/src/workerd/api/tests/new-module-registry-test.wd-test index a83880b7aa7..0715485c7ad 100644 --- a/src/workerd/api/tests/new-module-registry-test.wd-test +++ b/src/workerd/api/tests/new-module-registry-test.wd-test @@ -34,6 +34,7 @@ const unitTests :Workerd.Config = ( # Intentional circular dependency (name = "cjs4", commonJsModule = "module.exports = require('cjs5')"), (name = "cjs5", commonJsModule = "module.exports = require('cjs4')"), + (name = "cjs6", commonJsModule = "module.exports = require('tla')"), # Other module types work (name = "text", text = "abc"), @@ -68,6 +69,8 @@ const unitTests :Workerd.Config = ( # Unicode characters in the path should be handled as UTF-8 and supported. (name = "部品", esModule = "export default 1;"), + + (name = "tla", esModule = "export default await import('text')"), ], compatibilityFlags = [ "nodejs_compat_v2", diff --git a/src/workerd/io/worker-modules.h b/src/workerd/io/worker-modules.h index aa09355e59e..ba602ab2bd4 100644 --- a/src/workerd/io/worker-modules.h +++ b/src/workerd/io/worker-modules.h @@ -155,8 +155,9 @@ static kj::Arc newWorkerModuleRegistry( // module registry. We can safely pass a reference to the module handler. // It will not be copied into a JS string until the module is actually // evaluated. - bundleBuilder.addSyntheticModule( - def.name, jsg::modules::Module::newTextModuleHandler(content.body)); + bundleBuilder.addSyntheticModule(def.name, + jsg::modules::Module::newTextModuleHandler(content.body), nullptr, + jsg::modules::Module::ContentType::TEXT); break; } KJ_CASE_ONEOF(content, Worker::Script::DataModule) { @@ -164,8 +165,9 @@ static kj::Arc newWorkerModuleRegistry( // module registry. We can safely pass a reference to the module handler. // It will not be copied into a JS string until the module is actually // evaluated. - bundleBuilder.addSyntheticModule( - def.name, jsg::modules::Module::newDataModuleHandler(content.body)); + bundleBuilder.addSyntheticModule(def.name, + jsg::modules::Module::newDataModuleHandler(content.body), nullptr, + jsg::modules::Module::ContentType::DATA); break; } KJ_CASE_ONEOF(content, Worker::Script::WasmModule) { @@ -181,8 +183,9 @@ static kj::Arc newWorkerModuleRegistry( // module registry. We can safely pass a reference to the module handler. // It will not be copied into a JS string until the module is actually // evaluated. - bundleBuilder.addSyntheticModule( - def.name, jsg::modules::Module::newJsonModuleHandler(content.body)); + bundleBuilder.addSyntheticModule(def.name, + jsg::modules::Module::newJsonModuleHandler(content.body), nullptr, + jsg::modules::Module::ContentType::JSON); break; } KJ_CASE_ONEOF(content, Worker::Script::CommonJsModule) { diff --git a/src/workerd/jsg/jsg.c++ b/src/workerd/jsg/jsg.c++ index d53a6b0a134..1fbd4ccf030 100644 --- a/src/workerd/jsg/jsg.c++ +++ b/src/workerd/jsg/jsg.c++ @@ -358,7 +358,8 @@ kj::Maybe Lock::resolveInternalModule(kj::StringPtr specifier) { auto& isolate = IsolateBase::from(v8Isolate); if (isolate.isUsingNewModuleRegistry()) { return jsg::modules::ModuleRegistry::tryResolveModuleNamespace( - *this, specifier, jsg::modules::ResolveContext::Type::BUILTIN); + *this, specifier, jsg::modules::ResolveContext::Type::BUILTIN) + .map([](JsValue val) { return KJ_ASSERT_NONNULL(val.tryCast()); }); } // Use the original module registry implementation @@ -372,14 +373,16 @@ kj::Maybe Lock::resolvePublicBuiltinModule(kj::StringPtr specifier) { auto& isolate = IsolateBase::from(v8Isolate); KJ_ASSERT(isolate.isUsingNewModuleRegistry()); return jsg::modules::ModuleRegistry::tryResolveModuleNamespace( - *this, specifier, jsg::modules::ResolveContext::Type::PUBLIC_BUILTIN); + *this, specifier, jsg::modules::ResolveContext::Type::PUBLIC_BUILTIN) + .map([](JsValue val) { return KJ_ASSERT_NONNULL(val.tryCast()); }); } kj::Maybe Lock::resolveModule(kj::StringPtr specifier, RequireEsm requireEsm) { auto& isolate = IsolateBase::from(v8Isolate); if (isolate.isUsingNewModuleRegistry()) { return jsg::modules::ModuleRegistry::tryResolveModuleNamespace( - *this, specifier, jsg::modules::ResolveContext::Type::BUNDLE); + *this, specifier, jsg::modules::ResolveContext::Type::BUNDLE) + .map([](JsValue val) { return KJ_ASSERT_NONNULL(val.tryCast()); }); } auto moduleRegistry = jsg::ModuleRegistry::from(*this); diff --git a/src/workerd/jsg/modules-new-test.c++ b/src/workerd/jsg/modules-new-test.c++ index bb7c8705619..2bed754be1c 100644 --- a/src/workerd/jsg/modules-new-test.c++ +++ b/src/workerd/jsg/modules-new-test.c++ @@ -1405,6 +1405,131 @@ KJ_TEST("Recursively require ESM from CJS required from ESM fails as expected (s }); } +// ====================================================================================== + +KJ_TEST("ESM -> CJS -> require(ESM) -> static import CJS circular dependency fails gracefully") { + // This tests a specific crash scenario: when a CJS module (b) is mid-evaluation + // (kEvaluating), and it require()s an ESM (c) that statically imports the same + // CJS module (b), V8's InnerModuleEvaluation would call Module::Evaluate on + // the kEvaluating synthetic module, hitting a CHECK crash. The resolveModuleCallback + // kEvaluating guard prevents this by rejecting the circular dependency at + // instantiation time. + PREAMBLE([&](Lock& js) { + ResolveObserverImpl observer; + CompilationObserver compilationObserver; + + ModuleBundle::BundleBuilder bundleBuilder(BASE); + + // a.js (ESM) -> imports b (CJS) + auto a = kj::str("import b from 'b'; export default b;"); + bundleBuilder.addEsmModule("a", a, Module::Flags::MAIN); + + // b (CJS) -> require('c') which is an ESM that imports b back + auto bSource = kj::str("exports = require('c');"); + bundleBuilder.addSyntheticModule( + "b", Module::newCjsStyleModuleHandler(bSource, "b"_kj)); + + // c.js (ESM) -> imports b (CJS) — creates the circular dependency + auto c = kj::str("import b from 'b'; export default b;"); + bundleBuilder.addEsmModule("c", c); + + auto registry = ModuleRegistry::Builder(observer, BASE).add(bundleBuilder.finish()).finish(); + + auto attached = registry->attachToIsolate(js, compilationObserver); + + js.tryCatch([&] { + ModuleRegistry::resolve(js, "file:///a", "default"_kjc); + JSG_FAIL_REQUIRE(Error, "Should have thrown"); + }, [&](Value exception) { + auto str = kj::str(exception.getHandle(js)); + KJ_ASSERT(str == "TypeError: Circular dependency when resolving module: b"); + }); + }); +} + +// ====================================================================================== + +KJ_TEST("UNWRAP_DEFAULT returns namespace for bundle ESM, default for others") { + PREAMBLE([&](Lock& js) { + ResolveObserverImpl observer; + CompilationObserver compilationObserver; + + ModuleBundle::BundleBuilder bundleBuilder(BASE); + + // Bundle ESM with named exports (no __cjsUnwrapDefault) + auto esm = kj::str("export default 42; export const name = 'esm';"); + bundleBuilder.addEsmModule("esm-mod", esm, Module::Flags::MAIN); + + // Bundle ESM with __cjsUnwrapDefault convention + auto esmCjs = kj::str("export default 'unwrapped'; export const __cjsUnwrapDefault = true;"); + bundleBuilder.addEsmModule("esm-cjs", esmCjs); + + // JSON synthetic module + auto json = kj::str("{\"key\": \"value\"}"); + bundleBuilder.addSyntheticModule( + "data.json", Module::newJsonModuleHandler(json.first(json.size()))); + + // Text synthetic module + auto text = kj::str("hello world"); + bundleBuilder.addSyntheticModule( + "data.txt", Module::newTextModuleHandler(text.first(text.size()))); + + auto registry = ModuleRegistry::Builder(observer, BASE).add(bundleBuilder.finish()).finish(); + auto attached = registry->attachToIsolate(js, compilationObserver); + + // Bundle ESM without __cjsUnwrapDefault: tryResolveModuleNamespace with UnwrapDefault::YES + // returns the full namespace (has "default", "name" properties). + js.tryCatch([&] { + auto val = KJ_ASSERT_NONNULL(ModuleRegistry::tryResolveModuleNamespace(js, "file:///esm-mod", + ResolveContext::Type::BUNDLE, ResolveContext::Source::REQUIRE, kj::none, + modules::UnwrapDefault::YES)); + auto ns = KJ_ASSERT_NONNULL(val.tryCast()); + // The namespace should have both "default" and "name" properties. + auto nameVal = ns.get(js, "name"); + KJ_ASSERT(!nameVal.isUndefined()); + KJ_ASSERT(kj::str(nameVal) == "esm"); + auto defaultVal = ns.get(js, "default"); + KJ_ASSERT(!defaultVal.isUndefined()); + }, [&](Value exception) { js.throwException(kj::mv(exception)); }); + + // Bundle ESM with __cjsUnwrapDefault: returns the default export. + js.tryCatch([&] { + auto result = KJ_ASSERT_NONNULL(ModuleRegistry::tryResolveModuleNamespace(js, + "file:///esm-cjs", ResolveContext::Type::BUNDLE, ResolveContext::Source::REQUIRE, + kj::none, modules::UnwrapDefault::YES)); + KJ_ASSERT(kj::str(result) == "unwrapped"); + }, [&](Value exception) { js.throwException(kj::mv(exception)); }); + + // JSON synthetic module: returns the parsed value (the default export). + js.tryCatch([&] { + auto result = KJ_ASSERT_NONNULL(ModuleRegistry::tryResolveModuleNamespace(js, + "file:///data.json", ResolveContext::Type::BUNDLE, ResolveContext::Source::REQUIRE, + kj::none, modules::UnwrapDefault::YES)); + // Should be the parsed JSON object, not the namespace. + auto obj = KJ_ASSERT_NONNULL(result.tryCast()); + KJ_ASSERT(kj::str(obj.get(js, "key")) == "value"); + }, [&](Value exception) { js.throwException(kj::mv(exception)); }); + + // Text synthetic module: returns the string value (the default export). + js.tryCatch([&] { + auto result = KJ_ASSERT_NONNULL(ModuleRegistry::tryResolveModuleNamespace(js, + "file:///data.txt", ResolveContext::Type::BUNDLE, ResolveContext::Source::REQUIRE, + kj::none, modules::UnwrapDefault::YES)); + // The default is a string — JsValue directly. + KJ_ASSERT(kj::str(result) == "hello world"); + }, [&](Value exception) { js.throwException(kj::mv(exception)); }); + + // Without UnwrapDefault, all modules return the namespace. + js.tryCatch([&] { + auto val = KJ_ASSERT_NONNULL(ModuleRegistry::tryResolveModuleNamespace( + js, "file:///data.json", ResolveContext::Type::BUNDLE, ResolveContext::Source::REQUIRE)); + auto ns = KJ_ASSERT_NONNULL(val.tryCast()); + // Should have a "default" property (it's the namespace). + KJ_ASSERT(!ns.get(js, "default").isUndefined()); + }, [&](Value exception) { js.throwException(kj::mv(exception)); }); + }); +} + // ====================================================================================== KJ_TEST("Resolution occurs relative to the referrer") { ResolveObserver observer; @@ -1839,12 +1964,69 @@ KJ_TEST("Aliased modules (import maps) work") { // ====================================================================================== -KJ_TEST("Import attributes are currently unsupported") { +KJ_TEST("Import attribute type:json succeeds for JSON modules") { + ResolveObserver resolveObserver; + CompilationObserver compilationObserver; + ModuleBundle::BundleBuilder builder(BASE); + + // An ESM that imports a JSON module with the type attribute + auto entry = kj::str("import data from 'data.json' with { type: 'json' }; export default data;"); + builder.addEsmModule("entry", entry, Module::Flags::MAIN); + + auto json = kj::str("{\"key\": \"value\"}"); + builder.addSyntheticModule("data.json", Module::newJsonModuleHandler(json.first(json.size())), + nullptr, Module::ContentType::JSON); + + auto registry = ModuleRegistry::Builder(resolveObserver, BASE).add(builder.finish()).finish(); + + PREAMBLE([&](Lock& js) { + auto attached = registry->attachToIsolate(js, compilationObserver); + + js.tryCatch([&] { + auto val = ModuleRegistry::resolve(js, "file:///entry", "default"_kjc); + // The JSON module should have been resolved and its value should be the parsed object. + KJ_ASSERT(!val.isUndefined()); + }, [&](Value exception) { js.throwException(kj::mv(exception)); }); + }); +} + +// ====================================================================================== + +KJ_TEST("Import attribute type:json fails for non-JSON modules") { + ResolveObserver resolveObserver; + CompilationObserver compilationObserver; + ModuleBundle::BundleBuilder builder(BASE); + + // An ESM that imports another ESM with type:json (should fail - ESM is not JSON) + auto entry = kj::str("import foo from 'other' with { type: 'json' }; export default foo;"); + builder.addEsmModule("entry", entry, Module::Flags::MAIN); + + auto other = kj::str("export default 42;"); + builder.addEsmModule("other", other); + + auto registry = ModuleRegistry::Builder(resolveObserver, BASE).add(builder.finish()).finish(); + + PREAMBLE([&](Lock& js) { + auto attached = registry->attachToIsolate(js, compilationObserver); + + js.tryCatch([&] { + ModuleRegistry::resolve(js, "file:///entry", "default"_kjc); + JSG_FAIL_REQUIRE(Error, "Should have thrown"); + }, [&](Value exception) { + auto str = kj::str(exception.getHandle(js)); + KJ_ASSERT(str == "TypeError: Module \"other\" is not of type \"json\""); + }); + }); +} + +// ====================================================================================== + +KJ_TEST("Unsupported import attributes are rejected") { ResolveObserver resolveObserver; CompilationObserver compilationObserver; ModuleBundle::BundleBuilder builder(BASE); - auto foo = kj::str("import abc from 'foo' with { type: 'json' };"); + auto foo = kj::str("import abc from 'foo' with { unsupported: 'value' };"); builder.addEsmModule("foo", foo); auto registry = ModuleRegistry::Builder(resolveObserver, BASE).add(builder.finish()).finish(); @@ -1857,7 +2039,7 @@ KJ_TEST("Import attributes are currently unsupported") { JSG_FAIL_REQUIRE(Error, "Should have thrown"); }, [&](Value exception) { auto str = kj::str(exception.getHandle(js)); - KJ_ASSERT(str == "TypeError: Import attributes are not supported"); + KJ_ASSERT(str == "TypeError: Unsupported import attribute: \"unsupported\""); }); }); } diff --git a/src/workerd/jsg/modules-new.c++ b/src/workerd/jsg/modules-new.c++ index 93d224dc5ba..37dd37170f2 100644 --- a/src/workerd/jsg/modules-new.c++ +++ b/src/workerd/jsg/modules-new.c++ @@ -170,7 +170,7 @@ class EsModule final: public Module { // since we're either consuming cached data or not using any options at all. KJ_ASSERT(v8::ScriptCompiler::CompileOptionsIsValid(options)); if (!v8::ScriptCompiler::CompileModule(js.v8Isolate, &source, options).ToLocal(&module)) { - return v8::MaybeLocal(); + return {}; } } @@ -220,8 +220,11 @@ class EsModule final: public Module { const CompilationObserver& observer, const Evaluator& maybeEvaluate) const override { if (!ensureInstantiated(js, module, observer, *this)) { - return v8::MaybeLocal(); - }; + if (!js.v8Isolate->HasPendingException()) { + js.v8Isolate->ThrowError(js.str("Failed to instantiate module"_kj)); + } + return {}; + } KJ_IF_SOME(result, maybeEvaluate(js, *this, module, observer)) { return js.wrapSimplePromise(kj::mv(result)); @@ -250,8 +253,9 @@ class SyntheticModule final: public Module { Type type, ModuleBundle::BundleBuilder::EvaluateCallback callback, kj::Array namedExports, - Flags flags = Flags::NONE) - : Module(kj::mv(id), type, flags), + Flags flags = Flags::NONE, + ContentType contentType = ContentType::NONE) + : Module(kj::mv(id), type, flags, contentType), callback(kj::mv(callback)), namedExports(kj::mv(namedExports)) { // Synthetic modules can never be ESM or Main @@ -279,19 +283,15 @@ class SyntheticModule final: public Module { Lock& js, v8::Local module, const CompilationObserver& observer) const override { // The return value will be a resolved promise. v8::Local resolver; - if (!v8::Promise::Resolver::New(js.v8Context()).ToLocal(&resolver)) { - return v8::MaybeLocal(); + return {}; } ModuleNamespace ns(module, namedExports); - if (!callback(js, id(), ns, observer)) { + if (!callback(js, id(), ns, observer) || + resolver->Resolve(js.v8Context(), js.v8Undefined()).IsNothing()) { // An exception should already be scheduled with the isolate - return v8::MaybeLocal(); - } - - if (resolver->Resolve(js.v8Context(), js.v8Undefined()).IsNothing()) { - return v8::MaybeLocal(); + return {}; } return resolver->GetPromise(); @@ -302,9 +302,11 @@ class SyntheticModule final: public Module { const CompilationObserver& observer, const Evaluator& maybeEvaluate) const override { if (!ensureInstantiated(js, module, observer, *this)) { - return v8::MaybeLocal(); + if (!js.v8Isolate->HasPendingException()) { + js.v8Isolate->ThrowError(js.str("Failed to instantiate module"_kj)); + } + return {}; } - // If this synthetic module is marked with Flags::EVAL, and the evalCallback // is specified, then we defer evaluation to the given callback. if (isEval()) { @@ -312,8 +314,7 @@ class SyntheticModule final: public Module { return js.wrapSimplePromise(kj::mv(result)); } } - - return actuallyEvaluate(js, module, observer); + return module->Evaluate(js.v8Context()); } // Marked mutable because kj::Function::operator() is non-const, but evaluation @@ -329,6 +330,72 @@ class SyntheticModule final: public Module { WD_STRONG_BOOL(SourcePhase); #pragma clang diagnostic pop +// Parses import attributes from V8's FixedArray format (key-value-location triples). +// Returns the value of the "type" attribute if present, or kj::none if no attributes. +// Throws TypeError for any unrecognized attribute keys or unsupported type values. +kj::Maybe parseImportAttributes( + Lock& js, v8::Local import_attributes) { + if (import_attributes.IsEmpty() || import_attributes->Length() == 0) { + return kj::none; + } + // V8 encodes import attributes as a FixedArray of triples: [key, value, location, ...] + kj::Maybe typeValue; + for (int i = 0; i < import_attributes->Length(); i += 3) { + auto key = js.toString(import_attributes->Get(i).As()); + if (key == "type"_kj) { + auto value = js.toString(import_attributes->Get(i + 1).As()); + if (value == "json"_kj) { + typeValue = "json"_kjc; + } else if (value == "text"_kj) { + typeValue = "text"_kjc; + } else if (value == "bytes"_kj) { + typeValue = "bytes"_kjc; + } else { + js.throwException( + js.typeError(kj::str("Unsupported import attribute type: \"", value, "\""))); + } + } else { + js.throwException(js.typeError(kj::str("Unsupported import attribute: \"", key, "\""))); + } + } + return typeValue; +} + +// Validates that the resolved module's content type matches the import attribute "type" value. +// Throws TypeError on mismatch. Does nothing if no type attribute was specified. +void validateImportType( + Lock& js, kj::Maybe importType, const Module& module, kj::StringPtr specifier) { + KJ_IF_SOME(type, importType) { + // Import Text (TC39 Stage 3) and Import Bytes (TC39 Stage 2.7) are + // recognized but not yet supported. Text support is pending the proposal + // reaching Stage 4. Bytes support requires Uint8Array backed by an + // immutable ArrayBuffer, which is not yet implemented. + if (type == "text"_kj) { + js.throwException(js.typeError("Import attribute type \"text\" is not yet supported"_kj)); + } + if (type == "bytes"_kj) { + js.throwException(js.typeError("Import attribute type \"bytes\" is not yet supported"_kj)); + } + + Module::ContentType expected = Module::ContentType::NONE; + if (type == "json"_kj) { + expected = Module::ContentType::JSON; + } + // TODO(later): Enable when Import Text (TC39) reaches Stage 4. + // else if (type == "text"_kj) { + // expected = Module::ContentType::TEXT; + // } + // TODO(later): Enable when immutable ArrayBuffer is implemented. + // else if (type == "bytes"_kj) { + // expected = Module::ContentType::DATA; + // } + if (module.contentType() != expected) { + js.throwException( + js.typeError(kj::str("Module \"", specifier, "\" is not of type \"", type, "\""))); + } + } +} + // Binds a ModuleRegistry to an Isolate. class IsolateModuleRegistry final { public: @@ -385,13 +452,17 @@ class IsolateModuleRegistry final { Url normalizedSpecifier, Url referrer, kj::StringPtr rawSpecifier, - SourcePhase sourcePhase) { - static constexpr auto evaluate = [](Lock& js, Entry& entry, const CompilationObserver& observer, - const Module::Evaluator& maybeEvaluate) { - auto module = entry.key.getHandle(js); + SourcePhase sourcePhase, + kj::Maybe importType = kj::none) { + // Note: Takes v8::Local and const Module& directly rather than + // Entry& for the same reason as require()'s evaluate lambda — the lookupCache + // table may rehash during evaluate(), invalidating Entry& references. + static constexpr auto evaluate = + [](Lock& js, v8::Local module, const Module& moduleDef, + const CompilationObserver& observer, const Module::Evaluator& maybeEvaluate) { return js .toPromise( - check(entry.module.evaluate(js, module, observer, maybeEvaluate)).As()) + check(moduleDef.evaluate(js, module, observer, maybeEvaluate)).As()) .then(js, [module = js.v8Ref(module)](Lock& js, Value) mutable -> Promise { return js.resolvedPromise(js.v8Ref(module.getHandle(js)->GetModuleNamespace())); }); @@ -415,8 +486,21 @@ class IsolateModuleRegistry final { }; auto handleFoundModule = [&](Entry& found) -> Promise { - auto evaluatePromise = evaluate(js, found, getObserver(), inner.getEvaluator()); - auto isWasm = found.module.isWasm(); + // Extract module handle and Module& before calling evaluate, since + // evaluate may trigger table rehashing that invalidates the Entry&. + auto v8Module = found.key.getHandle(js); + auto& moduleDef = found.module; + + // Validate import type attribute against the resolved module's content type. + validateImportType(js, importType, moduleDef, rawSpecifier); + + if (v8Module->GetStatus() == v8::Module::kErrored) { + return js.rejectedPromise(v8Module->GetException()); + } + + auto evaluatePromise = + evaluate(js, v8Module, moduleDef, getObserver(), inner.getEvaluator()); + auto isWasm = moduleDef.isWasm(); if (!sourcePhase) { return evaluatePromise; @@ -442,8 +526,9 @@ class IsolateModuleRegistry final { return js.v8Ref(defaultExport); } } - KJ_FAIL_REQUIRE(v8::Exception::SyntaxError, - "Source phase import not available for module: ", normalizedSpecifier.getHref()); + js.throwException(js.v8Ref(v8::Exception::SyntaxError( + js.str(kj::str("Source phase import not available for module: "_kj, + normalizedSpecifier.getHref()))))); }); } return js.rejectedPromise(js.v8Ref(v8::Exception::SyntaxError(js.strIntern(kj::str( @@ -469,20 +554,74 @@ class IsolateModuleRegistry final { } enum class RequireOption { - DEFAULT, - RETURN_EMPTY, + DEFAULT = 0, + RETURN_EMPTY = 1 << 0, + NO_TOP_LEVEL_AWAIT = 1 << 1, + // When set, the default export is returned instead of the module namespace. + // This matches Node.js require() semantics where require() returns the + // default export (module.exports for CJS, default export for ESM builtins, + // parsed value for JSON, etc.). + UNWRAP_DEFAULT = 1 << 2, }; + friend constexpr RequireOption operator|(RequireOption a, RequireOption b) { + return static_cast(static_cast(a) | static_cast(b)); + } + friend constexpr RequireOption operator&(RequireOption a, RequireOption b) { + return static_cast(static_cast(a) & static_cast(b)); + } + // Used to implement the synchronous dynamic import of modules in support of APIs // like the CommonJS require. Returns the instantiated/evaluated module namespace. // If an empty v8::MaybeLocal is returned and the default option is given, then an // exception has been scheduled. - v8::MaybeLocal require( + v8::MaybeLocal require( Lock& js, const ResolveContext& context, RequireOption option = RequireOption::DEFAULT) { - static constexpr auto evaluate = [](Lock& js, Entry& entry, const Url& id, - const CompilationObserver& observer, - const Module::Evaluator& maybeEvaluate) { - auto module = entry.key.getHandle(js); + // Returns either the module namespace or, when UNWRAP_DEFAULT is set and + // the module is not ESM, the default export from the namespace. This matches + // Node.js require() semantics: require('esm') returns the namespace, + // require('data.json') returns the parsed value. + // When UNWRAP_DEFAULT is set, returns the default export for all module types + // except user bundle ESM, which returns the namespace (matching Node.js require(esm) + // behavior). Builtin ESM returns default because workerd wraps CJS-style APIs in + // ESM default exports. Synthetic modules (CJS, JSON, Text, etc.) return default + // because that's where their value lives. + static constexpr auto maybeUnwrapDefault = + [](Lock& js, v8::Local module, const Module& moduleDef, + RequireOption option) -> v8::MaybeLocal { + auto ns = module->GetModuleNamespace().As(); + if ((option & RequireOption::UNWRAP_DEFAULT) == RequireOption::UNWRAP_DEFAULT) { + // User bundle ESM returns the full namespace, matching Node.js require(esm), + // unless the module has __cjsUnwrapDefault set (a convention used by bundlers + // like esbuild when transpiling CJS to ESM), in which case we return the + // default export. + if (moduleDef.type() == Module::Type::BUNDLE && moduleDef.isEsm()) { + auto unwrap = ns->Get(js.v8Context(), js.strIntern("__cjsUnwrapDefault"_kj)); + v8::Local unwrapValue; + if (unwrap.ToLocal(&unwrapValue) && unwrapValue->BooleanValue(js.v8Isolate)) { + return check(ns->Get(js.v8Context(), js.strIntern("default"_kj))); + } + return ns; + } + // Everything else (builtins, synthetic modules) returns the default export. + // Note: The default export may be a primitive (e.g. Text module returns a string). + // We cast to v8::Object here because require() returns MaybeLocal, but + // callers immediately convert to JsValue. The cast is safe because v8::Local is + // just a pointer wrapper. + return check(ns->Get(js.v8Context(), js.strIntern("default"_kj))); + } + return ns; + }; + + // Note: This lambda takes v8::Local and const Module& directly + // rather than Entry& because the lookupCache table may rehash during + // ensureInstantiated() or evaluate() (when V8 resolves static import + // dependencies via resolveModuleCallback -> resolveWithCaching -> upsert), + // which would invalidate any Entry& reference into the table. + static constexpr auto evaluate = + [](Lock& js, v8::Local module, const Module& moduleDef, const Url& id, + const CompilationObserver& observer, const Module::Evaluator& maybeEvaluate, + RequireOption option) -> v8::MaybeLocal { auto status = module->GetStatus(); // If status is kErrored, that means a prior attempt to evaluate the module @@ -496,7 +635,7 @@ class IsolateModuleRegistry final { // because v8 will not allow us to grab the default export while the module // is still evaluating. - if (entry.module.isEsm() && status == v8::Module::kEvaluating) { + if (moduleDef.isEsm() && status == v8::Module::kEvaluating) { JSG_FAIL_REQUIRE(Error, "Circular dependency when resolving module: ", id); } @@ -508,45 +647,66 @@ class IsolateModuleRegistry final { // to a degree. Just like in Node.js, however, such circular dependencies // can still be problematic depending on how they are used. if (status == v8::Module::kEvaluated || status == v8::Module::kEvaluating) { - return module->GetModuleNamespace().As(); + return maybeUnwrapDefault(js, module, moduleDef, option); + } + + // Matches the require(esm) behavior implemented in Node.js, which is to + // throw if the module being imported uses top-level await. + if ((option & RequireOption::NO_TOP_LEVEL_AWAIT) == RequireOption::NO_TOP_LEVEL_AWAIT) { + // We have to ensure the module is instantiated before we can check for top-level await. + JSG_REQUIRE(ensureInstantiated(js, module, observer, moduleDef), Error, + "Failed to instantiate module: ", id); + JSG_REQUIRE(!module->IsGraphAsync(), Error, + "Top-level await is not supported in this context for module: ", id); } // Evaluate the module and grab the default export from the module namespace. auto promise = - check(entry.module.evaluate(js, module, observer, maybeEvaluate)).As(); + check(moduleDef.evaluate(js, module, observer, maybeEvaluate)).As(); // Run the microtasks to ensure that any promises that happen to be scheduled - // during the evaluation of the top-level scope have a chance to be settled, - js.runMicrotasks(); - - static const auto kTopLevelAwaitError = - "Use of top-level await in a synchronously required module is restricted to " - "promises that are resolved synchronously. This includes any top-level awaits " - "in the entrypoint module for a worker."_kj; - - switch (promise->State()) { - case v8::Promise::kFulfilled: { - // This is what we want. The module namespace should be fully populated - // and evaluated at this point. - return module->GetModuleNamespace().As(); + // during the evaluation of the top-level scope have a chance to be settled. + // We only pump the microtasks queue if NO_TOP_LEVEL_AWAIT is not set. + if ((option & RequireOption::NO_TOP_LEVEL_AWAIT) != RequireOption::NO_TOP_LEVEL_AWAIT) { + js.runMicrotasks(); + + static const auto kTopLevelAwaitError = + "Use of top-level await in a synchronously required module is restricted to " + "promises that are resolved synchronously. This includes any top-level awaits " + "in the entrypoint module for a worker."_kj; + + switch (promise->State()) { + case v8::Promise::kFulfilled: { + // This is what we want. The module namespace should be fully populated + // and evaluated at this point. + return maybeUnwrapDefault(js, module, moduleDef, option); + } + case v8::Promise::kRejected: { + // Oops, there was an error. We should throw it. + js.throwException(JsValue(promise->Result())); + break; + } + case v8::Promise::kPending: { + // The module evaluation could not complete in a single drain of the + // microtask queue. This means we've got a pending promise somewhere + // that is being awaited preventing the module from being ready to + // go. We can't have that! Throw! Throw! + JSG_FAIL_REQUIRE(Error, kTopLevelAwaitError, " Specifier: \"", id, "\"."); + } } - case v8::Promise::kRejected: { - // Oops, there was an error. We should throw it. + } else { + KJ_ASSERT(promise->State() != v8::Promise::kPending, + "Top-level await is not supported in this context, so the module promise " + "should never be pending"); + if (promise->State() == v8::Promise::kRejected) { js.throwException(JsValue(promise->Result())); - break; - } - case v8::Promise::kPending: { - // The module evaluation could not complete in a single drain of the - // microtask queue. This means we've got a pending promise somewhere - // that is being awaited preventing the module from being ready to - // go. We can't have that! Throw! Throw! - JSG_FAIL_REQUIRE(Error, kTopLevelAwaitError, " Specifier: \"", id, "\"."); } + return maybeUnwrapDefault(js, module, moduleDef, option); } KJ_UNREACHABLE; }; - return js.tryCatch([&]() -> v8::MaybeLocal { + return js.tryCatch([&]() -> v8::MaybeLocal { KJ_IF_SOME(processUrl, maybeRedirectNodeProcess(js, context.normalizedSpecifier.getHref())) { ResolveContext newContext{ .type = ResolveContext::Type::BUILTIN_ONLY, @@ -560,23 +720,29 @@ class IsolateModuleRegistry final { // Do we already have a cached module for this context? KJ_IF_SOME(found, lookupCache.find>(context)) { - return evaluate( - js, found, context.normalizedSpecifier, getObserver(), inner.getEvaluator()); + // Extract module handle and Module& before calling evaluate, since + // evaluate may trigger table rehashing that invalidates the Entry&. + auto foundModule = found.key.getHandle(js); + auto& foundModuleDef = found.module; + return evaluate(js, foundModule, foundModuleDef, context.normalizedSpecifier, getObserver(), + inner.getEvaluator(), option); } KJ_IF_SOME(found, resolveWithCaching(js, context)) { - return evaluate( - js, found, context.normalizedSpecifier, getObserver(), inner.getEvaluator()); + auto foundModule = found.key.getHandle(js); + auto& foundModuleDef = found.module; + return evaluate(js, foundModule, foundModuleDef, context.normalizedSpecifier, getObserver(), + inner.getEvaluator(), option); } - if (option == RequireOption::RETURN_EMPTY) { - return v8::MaybeLocal(); + if ((option & RequireOption::RETURN_EMPTY) == RequireOption::RETURN_EMPTY) { + return {}; } JSG_FAIL_REQUIRE(Error, kj::str("Module not found: ", context.normalizedSpecifier.getHref())); - }, [&](Value exception) { + }, [&](Value exception) -> v8::MaybeLocal { // Use the isolate to rethrow the exception here instead of using the lock. js.v8Isolate->ThrowException(exception.getHandle(js)); - return v8::MaybeLocal(); + return {}; }); } @@ -678,22 +844,20 @@ class IsolateModuleRegistry final { v8::MaybeLocal SyntheticModule::evaluationSteps( v8::Local context, v8::Local module) { - try { - auto& js = Lock::current(); + auto& js = Lock::current(); + KJ_TRY { auto& registry = IsolateModuleRegistry::from(js.v8Isolate); - KJ_IF_SOME(found, registry.lookup(js, module)) { - return found.module.evaluate( - js, module, registry.getObserver(), registry.inner.getEvaluator()); + return found.module.actuallyEvaluate(js, module, registry.getObserver()); } - - // This case really should never actually happen but we handle it anyway. KJ_LOG(ERROR, "Synthetic module not found in registry for evaluation"); - js.v8Isolate->ThrowError(js.str("Requested module does not exist"_kj)); - return v8::MaybeLocal(); - } catch (...) { - kj::throwFatalException(kj::getCaughtExceptionAsKj()); + return {}; + } + KJ_CATCH(exception) { + auto ex = js.exceptionToJsValue(kj::mv(exception)); + js.v8Isolate->ThrowException(ex.getHandle(js)); + return {}; } } @@ -778,27 +942,19 @@ v8::MaybeLocal dynamicImportModuleCallback(v8::Local c v8::Local resolver; if (!v8::Promise::Resolver::New(js.v8Context()).ToLocal(&resolver) || resolver->Reject(js.v8Context(), error).IsNothing()) { - return v8::MaybeLocal(); + return {}; } return resolver->GetPromise(); }; auto& registry = IsolateModuleRegistry::from(js.v8Isolate); - try { + KJ_TRY { return js.tryCatch([&]() -> v8::MaybeLocal { auto spec = specifierToString(js, specifier); - // The proposed specification for import attributes strongly recommends that - // embedders reject import attributes and types they do not understand/implement. - // This is because import attributes can alter the interpretation of a module. - // Throwing an error for things we do not understand is the safest thing to do - // for backwards compatibility. - // - // For now, we do not support any import attributes, so if there are any at all - // we will reject the import. - if (!import_attributes.IsEmpty() && import_attributes->Length() > 0) { - return rejected(js, js.typeError("Import attributes are not supported")); - }; + // Parse import attributes. Throws for unrecognized attribute keys. + // Returns the "type" value if specified, or kj::none. + auto importType = parseImportAttributes(js, import_attributes); Url referrer = ([&] { if (resource_name.IsEmpty()) { @@ -820,13 +976,12 @@ v8::MaybeLocal dynamicImportModuleCallback(v8::Local c KJ_IF_SOME(processUrl, maybeRedirectNodeProcess(js, spec.asPtr())) { auto processSpec = kj::str(processUrl.getHref()); return registry.dynamicResolve( - js, processUrl.clone(), kj::mv(referrer), processSpec, isSourcePhase); + js, processUrl.clone(), kj::mv(referrer), processSpec, isSourcePhase, importType); } KJ_IF_SOME(url, referrer.tryResolve(spec.asPtr())) { - auto normalized = url.clone(Url::EquivalenceOption::NORMALIZE_PATH); - return registry.dynamicResolve( - js, kj::mv(normalized), kj::mv(referrer), spec, isSourcePhase); + return registry.dynamicResolve(js, url.clone(Url::EquivalenceOption::NORMALIZE_PATH), + kj::mv(referrer), spec, isSourcePhase, importType); } // We were not able to parse the specifier. We'll return a rejected promise. @@ -838,8 +993,10 @@ v8::MaybeLocal dynamicImportModuleCallback(v8::Local c // anyway. return rejected(js, jsg::JsValue(exception.getHandle(js))); }); - } catch (...) { - kj::throwFatalException(kj::getCaughtExceptionAsKj()); + } + KJ_CATCH(exception) { + auto ex = js.exceptionToJsValue(kj::mv(exception)); + return rejected(js, ex.getHandle(js)); } } @@ -899,19 +1056,20 @@ v8::MaybeLocal> resolv // Throwing an error for things we do not understand is the safest thing to do // for backwards compatibility. // - // For now, we do not support any import attributes, so if there are any at all - // we will reject the import. - if (!import_attributes.IsEmpty() && import_attributes->Length() > 0) { - js.throwException(js.typeError("Import attributes are not supported")); - } + // Parse import attributes. Throws for unrecognized attribute keys. + auto importType = parseImportAttributes(js, import_attributes); ResolveContext::Type type = ResolveContext::Type::BUNDLE; - auto& referrerUrl = registry.lookup(js, referrer) - .map([&](IsolateModuleRegistry::Entry& entry) -> const Url& { + // Clone the referrer URL out of the lookup cache entry rather than holding + // a reference into it. The lookupCache table may rehash during resolve() + // (via resolveWithCaching -> upsert), which would invalidate any reference + // into the table's storage. + Url referrerUrl = registry.lookup(js, referrer) + .map([&](IsolateModuleRegistry::Entry& entry) -> Url { type = moduleTypeToResolveContextType(entry.module.type()); - return entry.context.id; - }).orDefault(registry.getBundleBase()); + return entry.context.id.clone(); + }).orDefault(registry.getBundleBase().clone()); // If Node.js Compat v2 mode is enable, we have to check to see if the specifier // is a bare node specifier and resolve it to a full node: URL. @@ -932,8 +1090,25 @@ v8::MaybeLocal> resolv .referrerNormalizedSpecifier = referrerUrl, .rawSpecifier = processSpec.asPtr(), }; - - return registry.resolve(js, resolveContext); + auto maybeResolved = registry.resolve(js, resolveContext); + v8::Local resolved; + if (!maybeResolved.ToLocal(&resolved)) { + return {}; + } + if (resolved->GetStatus() == v8::Module::kErrored) { + js.throwException(JsValue(resolved->GetException())); + return {}; + } + if (resolved->GetStatus() == v8::Module::kEvaluating) { + js.throwException( + js.typeError(kj::str("Circular dependency when resolving module: ", spec))); + return {}; + } + // Validate import type attribute against the resolved module's content type. + KJ_IF_SOME(entry, registry.lookup(js, resolved)) { + validateImportType(js, importType, entry.module, spec); + } + return resolved; } } @@ -949,11 +1124,27 @@ v8::MaybeLocal> resolv }; auto maybeResolved = registry.resolve(js, resolveContext); - if (maybeResolved.IsEmpty()) { + + v8::Local resolved; + if (!maybeResolved.ToLocal(&resolved)) { + return {}; + } + + // If the resolved module is in an errored state, we will rethrow the same exception here. + if (resolved->GetStatus() == v8::Module::kErrored) { + js.throwException(JsValue(resolved->GetException())); + return {}; + } + if (resolved->GetStatus() == v8::Module::kEvaluating) { + js.throwException( + js.typeError(kj::str("Circular dependency when resolving module: ", spec))); return v8::MaybeLocal(); } - auto resolved = check(maybeResolved); + // Validate import type attribute against the resolved module's content type. + KJ_IF_SOME(entry, registry.lookup(js, resolved)) { + validateImportType(js, importType, entry.module, spec); + } if constexpr (!IsSourcePhase) { return resolved; @@ -968,35 +1159,42 @@ v8::MaybeLocal> resolv // with the compiled record, so that we can just directly read `sourceObject_` off of // entry.module instead. if (entry.module.isWasm()) { - v8::Local moduleNamespace; + v8::Local moduleNamespace; if (registry .require(js, resolveContext, IsolateModuleRegistry::RequireOption::RETURN_EMPTY) .ToLocal(&moduleNamespace)) { v8::Local defaultExport; - if (moduleNamespace->Get(js.v8Context(), js.strIntern("default"_kj)) + if (moduleNamespace.As() + ->Get(js.v8Context(), js.strIntern("default"_kj)) .ToLocal(&defaultExport)) { if (defaultExport->IsWasmModuleObject()) { return defaultExport.As(); } } } + // If require() failed with an exception (e.g. WASM compilation error), + // propagate that instead of masking it with the generic message below. + if (js.v8Isolate->HasPendingException()) { + return {}; + } } } js.throwException(js.v8Ref(v8::Exception::SyntaxError( - js.strIntern(kj::str("Source phase import not available for module: "_kj, spec))))); - return v8::MaybeLocal(); + js.str(kj::str("Source phase import not available for module: "_kj, spec))))); + return {}; } + KJ_UNREACHABLE; } js.throwException(js.error(kj::str("Invalid module specifier: "_kj, specifier))); - return v8::MaybeLocal(); + return {}; }, [&](Value exception) -> v8::MaybeLocal { // If there are any synchronously thrown exceptions, we want to catch them // here and convert them into a rejected promise. The only exception are // fatal cases where the isolate is terminating which won't make it here // anyway. js.v8Isolate->ThrowException(exception.getHandle(js)); - return v8::MaybeLocal(); + return {}; }); } @@ -1333,15 +1531,17 @@ const jsg::Url processModuleName(kj::StringPtr name, const jsg::Url& base) { } // namespace -ModuleBundle::BundleBuilder& ModuleBundle::BundleBuilder::addSyntheticModule( - kj::StringPtr name, EvaluateCallback callback, kj::Array namedExports) { +ModuleBundle::BundleBuilder& ModuleBundle::BundleBuilder::addSyntheticModule(kj::StringPtr name, + EvaluateCallback callback, + kj::Array namedExports, + Module::ContentType contentType) { const auto url = processModuleName(name, bundleBase); add(url, [url = url.clone(), callback = kj::mv(callback), namedExports = kj::mv(namedExports), - type = type()](const ResolveContext& context) mutable + type = type(), contentType](const ResolveContext& context) mutable -> kj::Maybe>> { - kj::Own mod = - Module::newSynthetic(kj::mv(url), type, kj::mv(callback), kj::mv(namedExports)); + kj::Own mod = Module::newSynthetic(kj::mv(url), type, kj::mv(callback), + kj::mv(namedExports), Module::Flags::NONE, contentType); return kj::Maybe>>(kj::mv(mod)); }); return *this; @@ -1380,8 +1580,8 @@ ModuleBundle::BundleBuilder& ModuleBundle::BundleBuilder::addWasmModule( [url = url.clone(), callback = kj::mv(callback), type = type()]( const ResolveContext& context) mutable -> kj::Maybe>> { - kj::Own mod = - Module::newSynthetic(kj::mv(url), type, kj::mv(callback), nullptr, EsModule::Flags::WASM); + kj::Own mod = Module::newSynthetic(kj::mv(url), type, kj::mv(callback), nullptr, + EsModule::Flags::WASM, Module::ContentType::WASM); return kj::Maybe>>(kj::mv(mod)); }); return *this; @@ -1600,11 +1800,12 @@ kj::Maybe ModuleRegistry::lookup(const ResolveContext& context) c return kj::none; } -kj::Maybe ModuleRegistry::tryResolveModuleNamespace(Lock& js, +kj::Maybe ModuleRegistry::tryResolveModuleNamespace(Lock& js, kj::StringPtr specifier, ResolveContext::Type type, ResolveContext::Source source, - kj::Maybe maybeReferrer) { + kj::Maybe maybeReferrer, + UnwrapDefault unwrapDefault) { auto& bound = IsolateModuleRegistry::from(js.v8Isolate); auto url = ([&] { KJ_IF_SOME(referrer, maybeReferrer) { @@ -1621,13 +1822,23 @@ kj::Maybe ModuleRegistry::tryResolveModuleNamespace(Lock& js, .rawSpecifier = specifier, }; v8::TryCatch tryCatch(js.v8Isolate); - auto ns = bound.require(js, context, IsolateModuleRegistry::RequireOption::RETURN_EMPTY); + auto option = IsolateModuleRegistry::RequireOption::RETURN_EMPTY; + // Following the behavior of Node.js' require(esm) implementation, we disallow top-level await + // in synchronously required modules. + if (source == ResolveContext::Source::REQUIRE) { + option = option | IsolateModuleRegistry::RequireOption::NO_TOP_LEVEL_AWAIT; + } + if (unwrapDefault == UnwrapDefault::YES) { + option = option | IsolateModuleRegistry::RequireOption::UNWRAP_DEFAULT; + } + + auto ns = bound.require(js, context, option); if (tryCatch.HasCaught()) { tryCatch.ReThrow(); throw JsExceptionThrown(); } if (ns.IsEmpty()) return kj::none; - return JsObject(check(ns)); + return JsValue(check(ns)); } JsValue ModuleRegistry::resolve(Lock& js, @@ -1636,7 +1847,8 @@ JsValue ModuleRegistry::resolve(Lock& js, ResolveContext::Type type, ResolveContext::Source source, kj::Maybe maybeReferrer) { - KJ_IF_SOME(ns, tryResolveModuleNamespace(js, specifier, type, source, maybeReferrer)) { + KJ_IF_SOME(val, tryResolveModuleNamespace(js, specifier, type, source, maybeReferrer)) { + auto ns = KJ_ASSERT_NONNULL(val.tryCast()); return ns.get(js, exportName); } JSG_FAIL_REQUIRE(Error, kj::str("Module not found: ", specifier)); @@ -1644,7 +1856,11 @@ JsValue ModuleRegistry::resolve(Lock& js, // ====================================================================================== -Module::Module(Url id, Type type, Flags flags): id_(kj::mv(id)), type_(type), flags_(flags) {} +Module::Module(Url id, Type type, Flags flags, ContentType contentType) + : id_(kj::mv(id)), + type_(type), + flags_(flags), + contentType_(contentType) {} kj::Maybe> Module::Evaluator::operator()(jsg::Lock& js, const Module& module, @@ -1689,9 +1905,14 @@ bool Module::evaluateContext(const ResolveContext& context) const { return true; } -kj::Own Module::newSynthetic( - Url id, Type type, EvaluateCallback callback, kj::Array namedExports, Flags flags) { - return kj::heap(kj::mv(id), type, kj::mv(callback), kj::mv(namedExports), flags); +kj::Own Module::newSynthetic(Url id, + Type type, + EvaluateCallback callback, + kj::Array namedExports, + Flags flags, + ContentType contentType) { + return kj::heap( + kj::mv(id), type, kj::mv(callback), kj::mv(namedExports), flags, contentType); } kj::Own Module::newEsm(Url id, Type type, kj::Array code, Flags flags) { diff --git a/src/workerd/jsg/modules-new.h b/src/workerd/jsg/modules-new.h index eebab368199..08756d25020 100644 --- a/src/workerd/jsg/modules-new.h +++ b/src/workerd/jsg/modules-new.h @@ -237,6 +237,17 @@ class Module { FALLBACK, }; + // The content type of the module, used to validate import attributes. + // For instance, `import data from './data.json' with { type: 'json' }` + // will only succeed if the module's content type is JSON. + enum class ContentType : uint8_t { + NONE, // No specific content type (ESM, CJS, builtin objects, etc.) + JSON, // JSON module + TEXT, // Text module + DATA, // Data/binary module + WASM, // WebAssembly module + }; + // The flags are set internally and are used to identify various properties // of the module. enum class Flags : uint8_t { @@ -300,6 +311,11 @@ class Module { // If isWasm() returns true, then the module is a WebAssembly module. inline bool isWasm() const; + // Returns the content type of the module. + inline ContentType contentType() const { + return contentType_; + } + // Returns a v8::Module representing this Module definition for the given isolate. // The return value follows the established v8 rules for Maybe. If the returned // maybe is empty, then an exception should have been scheduled on the isolate @@ -359,14 +375,22 @@ class Module { // is different from the Module::Evaluator, which is used to ensure that // evaluation of a module occurs outside of an IoContext. This callback // is always called to actually perform the evaluation of a synthetic module. + // If false is returned, then an exception should have been scheduled on the isolate. using EvaluateCallback = Function; + // Returns a new synthetic module. The callback is invoked to evaluate the module. Due to + // the nature of synthetic modules, the callback is expected to perform all necessary evaluation + // synchronously and return a boolean indicating whether the evaluation succeeded or not. The + // evaluation cannot be async because V8 does not wait for the synthetic module evaluation + // promises to resolve before it considers the module to be evaluated. The most it will do is + // track errors thrown synchronously from the callback to determine whether evaluation failed. static kj::Own newSynthetic(Url id, Type type, EvaluateCallback callback, kj::Array namedExports = nullptr, - Flags flags = Flags::NONE); + Flags flags = Flags::NONE, + ContentType contentType = ContentType::NONE); // Creates a new ESM module that takes ownership of the given code array. // This is generally used to construct ESM modules from a worker bundle. @@ -401,27 +425,6 @@ class Module { kj::Maybe compileExtensions, const CompilationObserver& observer) KJ_WARN_UNUSED_RESULT; - // Some modules may need to protect against being evaluated recursively. The - // EvaluateOnce class makes it possible to guard against that, returning false - // if evaluation has already been started. Once setEvaluating() returns true, - // subsequent calls will always return false — this is single-use by design - // (CJS modules should only be evaluated once). - class EvaluateOnce final { - public: - EvaluateOnce() = default; - KJ_DISALLOW_COPY_AND_MOVE(EvaluateOnce); - - // On the first call, this returns true. On subsequent calls, it returns false. - bool setEvaluating() { - if (evaluating) return false; - evaluating = true; - return true; - } - - private: - bool evaluating = false; - }; - // A CjsStyleModuleHandler is used for CommonJS style modules (including // The template type T must be a jsg::Object that implements a getExports(Lock&) // method returning a JsValue. This is set as the default export of the @@ -430,15 +433,9 @@ class Module { template static EvaluateCallback newCjsStyleModuleHandler( kj::StringPtr source, kj::StringPtr name) KJ_WARN_UNUSED_RESULT { - return [source, name, evaluateOnce = kj::heap()](Lock& js, const Url& id, - const Module::ModuleNamespace& ns, + return [source, name](Lock& js, const Url& id, const Module::ModuleNamespace& ns, const CompilationObserver& observer) mutable -> bool { return js.tryCatch([&] { - // A CJS module can only be evaluated once. Return early if evaluation - // has already been started. - if (!evaluateOnce->setEvaluating()) { - return true; - } auto& wrapper = TypeWrapper::from(js.v8Isolate); auto ext = js.alloc(js, id); ns.setDefault(js, ext->getExports(js)); @@ -474,12 +471,13 @@ class Module { } protected: - Module(Url id, Type type, Flags flags = Flags::NONE); + Module(Url id, Type type, Flags flags = Flags::NONE, ContentType contentType = ContentType::NONE); private: const Url id_; Type type_; Flags flags_; + ContentType contentType_; // TODO: Support source objects as optional instantiation-hook creations and move // Wasm compilation to start at instantiation-time instead of evaluation-time. @@ -545,7 +543,8 @@ class ModuleBundle { BundleBuilder& addSyntheticModule(kj::StringPtr name, EvaluateCallback callback, - kj::Array namedExports = nullptr) KJ_LIFETIMEBOUND; + kj::Array namedExports = nullptr, + Module::ContentType contentType = Module::ContentType::NONE) KJ_LIFETIMEBOUND; BundleBuilder& addEsmModule(kj::StringPtr name, kj::ArrayPtr code, @@ -651,6 +650,12 @@ class ModuleBundle { // and owned by a single Worker instance. In production, however, a // single ModuleRegistry instance may be shared by multiple replicas // of a Worker and therefore must be AtomicRefcounted. + +// When passed to tryResolveModuleNamespace, controls whether non-ESM +// (synthetic) modules return the default export instead of the full +// module namespace. Matches Node.js require() semantics. +WD_STRONG_BOOL(UnwrapDefault); + class ModuleRegistry final: public kj::AtomicRefcounted, public ModuleRegistryBase { private: enum BundleIndices { kBundle, kBuiltin, kBuiltinOnly, kFallback, kBundleCount }; @@ -726,11 +731,12 @@ class ModuleRegistry final: public kj::AtomicRefcounted, public ModuleRegistryBa // JsExceptionThrown exception if an error occurs while the module is being evaluated. // Modules resolved with this method must be capable of fully evaluating within one // drain of the microtask queue. - static kj::Maybe tryResolveModuleNamespace(Lock& js, + static kj::Maybe tryResolveModuleNamespace(Lock& js, kj::StringPtr specifier, ResolveContext::Type type = ResolveContext::Type::BUNDLE, ResolveContext::Source source = ResolveContext::Source::INTERNAL, - kj::Maybe maybeReferrer = kj::none); + kj::Maybe maybeReferrer = kj::none, + UnwrapDefault unwrapDefault = UnwrapDefault::NO); // The constructor is public because kj::heap requires is to be. Do not // use the constructor directly. Use the ModuleRegistry::Builder