Commit dabdd52

chore: update docs
1 parent b904a06 commit dabdd52

2 files changed

Lines changed: 49 additions & 12 deletions

AGENTS.md

Lines changed: 48 additions & 11 deletions
@@ -13,36 +13,60 @@
1313
- `build.zig.zon` — minimal manifest (name, version, fingerprint).
1414
- `.zig-cache/cargo-target/<triple>/` — per-target cargo target directory, isolated so parallel cross builds don't clobber each other's fingerprints.
1515
- `dist/rg-<triple>[.exe|.wasm]` — install outputs. The published package only ships the wasm one.
16-
- `build.ts` — inlines `dist/rg-wasm32-wasip1.wasm` into `lib/_rg.wasm.mjs` (brotli + z85). Run with `node build.ts`.
16+
- `build.ts` — inlines `dist/rg-wasm32-wasip1.wasm` into `lib/_rg.wasm.mjs` (brotli + z85) and stamps the wasm hash into `lib/_rg.mjs` for disk-cache invalidation. Run with `node build.ts`.
1717

1818
### npm package (`lib/`)
1919

20-
- `lib/index.mjs` — ESM entry. Exports `ripgrep(args, options)` and doubles as the `rg` bin (`bin: { rg: "./lib/index.mjs" }`). Compiles the wasm module lazily once via `WebAssembly.compile(getRgWasmBytes())` and caches the `Promise<WebAssembly.Module>`; instances are still created per-call since they own WASI state.
20+
- `lib/index.mjs` — ESM entry. Exports `ripgrep(args, options)` and `rgPath`. Delegates wasm loading and WASI runtime creation to `_rg.mjs`.
21+
- `lib/rg.mjs` — CLI bin entry (`bin: { rg, ripgrep }`). Calls `enableCompileCache()` from `node:module` for faster repeated runs, then forwards `process.argv` to `ripgrep()` and exits with its code.
22+
- `lib/_rg.mjs` — orchestration layer between the wasm blob and WASI runtimes:
23+
- `getRgWasmModule()` — memoizes the `Promise<WebAssembly.Module>`. Decompresses the blob via `brotliDecompressSync` and caches the raw `.wasm` bytes to disk (`os.tmpdir()/ripgrep-wasm-<hash>.wasm`) so subsequent runs skip decompression entirely.
24+
- `createWasiRuntime(options)` — picks the WASI backend: uses `node:wasi` by default on Node (suppresses `ExperimentalWarning`), falls back to the custom shim if `node:wasi` import fails or if custom `stdout`/`stderr` streams without a numeric `fd` are provided. On Bun and Deno the custom shim is always used.
2125
- `lib/_wasi.mjs` — minimal WASI preview1 shim. Implements only the ~23 syscalls ripgrep actually imports (`fd_read`, `fd_write`, `fd_readdir`, `fd_seek`, `fd_tell`, `fd_close`, `fd_fdstat_get`, `fd_fdstat_set_flags`, `fd_filestat_get`, `fd_prestat_*`, `path_open`, `path_filestat_get`, `path_readlink`, `args_*`, `environ_*`, `clock_time_get`, `random_get`, `proc_exit`, `sched_yield`, `poll_oneoff` stub). Backed by `node:fs` sync APIs so it works on Node, Bun, and Deno uniformly. `proc_exit` throws a `WASIExit` that `start()` catches to return the exit code.
22-
- `lib/_rg.wasm.mjs` — auto-generated by `build.ts`. Exports `getRgWasmBytes()` which z85-decodes + `brotliDecompressSync`s the blob into a `Uint8Array` on demand. The encoded blob lives inside a `getEncoded = () => "…"` function so V8 lazy-parses the large string literal and only allocates it on first call, not at import time.
23-
- `lib/index.d.mts` — hand-written types, including an `RgFlag` union of known ripgrep long/short flags for autocomplete on `ripgrep` args (still accepts arbitrary strings via `RgArg = RgFlag | (string & {})`).
24-
- `package.json` — `name: "ripgrep"`, `type: module`, `files: ["lib"]`, `bin.rg`, exports only `./lib/index.mjs`.
26+
- `lib/_rg.wasm.mjs` — auto-generated by `build.ts`. Exports `getCompressedBytes()` which z85-decodes the blob into raw brotli-compressed bytes. The encoded blob lives inside a `getEncoded = () => "…"` function so V8 lazy-parses the large string literal and only allocates it on first call, not at import time.
27+
- `lib/index.d.mts` — hand-written types. Includes an `RgFlag` union of known ripgrep long/short flags for autocomplete on `ripgrep` args (still accepts arbitrary strings via `RgArg = RgFlag | (string & {})`). Overloads `ripgrep()` for `{ buffer: true }` → `RipgrepBufferedResult` (with `stdout` and `stderr` strings).
28+
- `package.json` — `name: "ripgrep"`, `type: module`, `files: ["lib"]`, `bin: { rg, ripgrep }` → `./lib/rg.mjs`, exports only `./lib/index.mjs`.
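
The backend choice described for `createWasiRuntime(options)` can be sketched as a pure decision function. This is a minimal sketch, not the code in `lib/_rg.mjs`: the names `pickWasiBackend` and `hasNumericFd`, the `runtime` and `nodeWasiAvailable` parameters, and the precedence of the `nodeWasi` option over the environment variable are all assumptions made for illustration.

```javascript
// Hypothetical helper: a missing stream is fine, a custom stream needs a numeric fd.
function hasNumericFd(stream) {
  return stream == null || typeof stream.fd === "number";
}

// Hypothetical helper: returns "node:wasi" or "shim".
function pickWasiBackend({ runtime, options = {}, env = {}, nodeWasiAvailable = true }) {
  // Streams without a numeric fd cannot be handed to node:wasi, so the shim is forced.
  if (!hasNumericFd(options.stdout) || !hasNumericFd(options.stderr)) return "shim";

  // Assumed precedence: runtime default, then env var, then explicit option.
  let wantNodeWasi = runtime === "node";
  if (env.RIPGREP_NODE_WASI === "1") wantNodeWasi = true;
  if (env.RIPGREP_NODE_WASI === "0") wantNodeWasi = false;
  if (typeof options.nodeWasi === "boolean") wantNodeWasi = options.nodeWasi;

  // Fall back to the shim when node:wasi could not be imported.
  return wantNodeWasi && nodeWasiAvailable ? "node:wasi" : "shim";
}
```

Keeping the decision separate from the runtime construction makes each rule (stream check, override, import fallback) easy to test in isolation.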
2529

2630
### Runtime behavior
2731

2832
- Default preopens map `.` → `process.cwd()`; absolute paths passed as args are auto-added as preopens so they work without extra configuration.
29-
- ripgrep's TTY auto-detection doesn't work through WASI preview1 (it always sees a non-TTY), so `ripgrep` auto-injects `--color=ansi` when `process.stdout.isTTY` and the caller hasn't specified a color flag. Detection checks `--color`, `--color=…`, and `--no-color`.
30-
- `nodeWasi: true` (or `RIPGREP_NODE_WASI=1`) swaps the custom shim for Node's built-in `node:wasi`. Same `{ imports, start }` shape either way. Node's version prints an `ExperimentalWarning` on every run — that's why the custom shim is the default.
33+
- ripgrep's TTY auto-detection doesn't work through WASI preview1 (it always sees a non-TTY), so `ripgrep` auto-injects `--color=ansi` when `process.stdout.isTTY` is true, the caller hasn't specified a color flag, and no custom `stdout` stream is provided. Detection checks `--color`, `--color=…`, and `--no-color`.
34+
- **WASI backend selection:** On Node, `node:wasi` is used by default (the `ExperimentalWarning` is silently suppressed). On Bun and Deno, the custom shim is used. Override with `nodeWasi: true/false` or `RIPGREP_NODE_WASI=1/0`. If `node:wasi` import fails at runtime, it falls back to the custom shim automatically. When custom `stdout`/`stderr` streams without a numeric `.fd` property are provided, the custom shim is forced regardless.
35+
- **Wasm disk cache:** The decompressed wasm bytes are cached in `os.tmpdir()/ripgrep-wasm-<hash>.wasm`. On subsequent runs, the cached file is read directly via `readFileSync`, skipping z85 decode + brotli decompression. The hash in the filename changes when the wasm binary is rebuilt.
36+
- **Buffered mode:** `ripgrep(args, { buffer: true })` captures stdout/stderr into strings returned as `result.stdout` / `result.stderr`. Custom streams take precedence over buffering for their respective fd.
3137
- `start()` returns `0` on clean exit or the exit code from `WASIExit`; `node:wasi`'s `start()` returns `undefined` on success, so the adapter coerces with `?? 0`. The public `ripgrep()` function wraps this into a `{ code }` result object.
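
The color auto-injection rule above can be sketched as two small functions. This is a minimal sketch under assumptions: `hasColorFlag`, `withAutoColor`, and the `customStdout` parameter are hypothetical names, and only the three checked flag forms and the `--color=ansi` value come from the notes above.

```javascript
// Hypothetical helper: true if the caller already chose a color mode.
// Matches `--color`, `--color=…`, and `--no-color`, but not `--colors`.
function hasColorFlag(args) {
  return args.some(
    (arg) => arg === "--color" || arg.startsWith("--color=") || arg === "--no-color",
  );
}

// Hypothetical helper: prepends `--color=ansi` only for an interactive terminal
// with no custom stdout stream and no explicit color flag.
function withAutoColor(args, { isTTY, customStdout = false }) {
  if (!isTTY || customStdout || hasColorFlag(args)) return args;
  return ["--color=ansi", ...args];
}
```

For example, `withAutoColor(["fn main"], { isTTY: true })` yields `["--color=ansi", "fn main"]`, while the same call with `isTTY: false` returns the arguments unchanged.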
3238

39+
### Testing
40+
41+
- `test/ripgrep.test.mjs` — vitest tests covering the programmatic API (`ripgrep()` with buffered/non-buffered/custom streams), `rgPath` export, and CLI execution via `spawn`.
42+
- `test/fixture/` — test data: `hello.txt`, `subdir/nested.txt`, `link.txt` → symlink to `hello.txt`.
43+
- Run: `pnpm vitest run test/ripgrep.test.mjs`.
44+
45+
### Benchmarks
46+
47+
- `bench/cli.mjs` — cold-start comparison: native `rg` binary vs `node lib/rg.mjs` (both via `execFileSync`). Uses [`mitata`](https://github.com/evanwashere/mitata).
48+
- `bench/api.mjs` — warm comparison: native `exec(rg)` vs `ripgrep()` API (wasm module pre-warmed, measures per-call overhead). Also uses `mitata`.
49+
- Both benchmark against `vendor/ripgrep/crates` searching for `fn main`.
50+
- Run: `node bench/cli.mjs` or `node bench/api.mjs`.
51+
3352
## Build flavors
3453

3554
There is only one cargo profile: cross-compile with the smallest-size settings. No mode option, no native install, no sub-steps for tweaking.
3655

3756
The cargo profile is `release-lto` (defined in `vendor/ripgrep/Cargo.toml`: fat LTO, `codegen-units=1`, `panic="abort"`), with size tuning layered on via `cargo --config`: `opt-level="z"`, `debug=false`, `strip="symbols"`.
3857

58+
For wasm targets, SIMD is enabled via `RUSTFLAGS="-C target-feature=+simd128"` — this unlocks memchr's simd128 vectorized search paths in the wasm binary.
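
To make the shape of the size-tuning overrides concrete, the sketch below builds the `--config` arguments for the `release-lto` profile. It is an illustration only: the actual arguments are assembled in `build.zig`, and `profileConfigArgs` is a hypothetical helper. String values are emitted with quotes because `cargo --config` parses each value as TOML.

```javascript
// Hypothetical helper: turns { key: value } into cargo `--config` arguments
// of the form profile.<name>.<key>=<TOML value>.
function profileConfigArgs(profile, settings) {
  const args = [];
  for (const [key, value] of Object.entries(settings)) {
    // JSON.stringify quotes strings and leaves booleans/numbers bare,
    // which matches TOML syntax for these simple values.
    args.push("--config", `profile.${profile}.${key}=${JSON.stringify(value)}`);
  }
  return args;
}

const sizeArgs = profileConfigArgs("release-lto", {
  "opt-level": "z",
  debug: false,
  strip: "symbols",
});
```

Here `sizeArgs` contains three `--config` pairs, the first being `profile.release-lto.opt-level="z"` with the quotes preserved.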
59+
3960
## Commands
4061

62+
- `pnpm build` — build wasm + inline into JS (equivalent to `zig build wasi && node build.ts`)
4163
- `zig build wasi` — build `wasm32-wasip1` → `dist/rg-wasm32-wasip1.wasm` (default step)
42-
- `zig build native` — cross-compile all native targets → `dist/rg-<triple>[.exe]` (not shipped to npm; kept for local use / future flavors)
64+
- `zig build native` — cross-compile all native targets → `dist/rg-<triple>[.exe]` (not shipped to npm; kept for local use / benchmarks)
4365
- `zig build` — same as `zig build wasi`
44-
- `node build.ts` — inline the built wasm into `lib/_rg.wasm.mjs`. Must be re-run any time the wasm changes.
45-
- `node lib/index.mjs <rg args…>` — run the packaged CLI directly from source. Pass `--node-wasi` as the first arg to use Node's built-in WASI.
66+
- `node build.ts` — inline the built wasm into `lib/_rg.wasm.mjs` and stamp hash into `lib/_rg.mjs`. Must be re-run any time the wasm changes.
67+
- `node lib/rg.mjs <rg args…>` — run the packaged CLI directly from source.
68+
- `pnpm test` — run vitest tests
69+
- `pnpm fmt` — format with oxfmt
4670

4771
## Cross-compile targets
4872

@@ -77,21 +101,34 @@ The install loop uses `.{ .custom = "../dist" }` as the install directory — th
77101
## Gotchas / design notes
78102

79103
- **Wasm is inlined, not a separate file.** `lib/_rg.wasm.mjs` ships the compressed bytes inside an ESM module so the npm package is pure JS — no `.wasm` asset resolution, no postinstall step. The tradeoff is a larger JS file and a one-time decode on first call.
104+
- **Two-layer decompression.** `_rg.wasm.mjs` exports `getCompressedBytes()` (z85 decode only), and `_rg.mjs` handles brotli decompression + disk caching. This separation keeps the generated blob module simple and the caching logic in hand-written code.
105+
- **Disk cache in tmpdir.** The decompressed wasm is written to `os.tmpdir()/ripgrep-wasm-<hash>.wasm` on first run. Subsequent runs load the cached file directly via `readFileSync`, skipping z85 decode and brotli decompression. The `<hash>` is the first 16 hex chars of the wasm's SHA-256, stamped by `build.ts`.
80106
- **Lazy string literal.** The z85-encoded blob is wrapped in `const getEncoded = () => "…"` specifically so V8 doesn't eagerly parse/allocate it at import time. Don't inline it back into a top-level `const`.
81107
- **Wasm module is cached, instances aren't.** `getRgWasmModule()` memoizes the `Promise<WebAssembly.Module>`; each `ripgrep` call still creates a fresh `WebAssembly.Instance` because WASI state (memory, fds) is per-instance.
108+
- **node:wasi ExperimentalWarning suppression.** `createNodeWasi` temporarily replaces `process.emitWarning` with a no-op while constructing the WASI instance, then restores it. This avoids the warning spam without `--no-warnings`.
109+
- **Stream-based stdio forces custom shim.** `node:wasi` only accepts numeric fd values for stdout/stderr. If a custom stream without a `.fd` property is passed, `createWasiRuntime` automatically uses the custom shim instead.
82110
- **Custom shim grants all rights in `fd_fdstat_get`.** ripgrep only reads, so over-granting `fs_rights_base` / `fs_rights_inheriting` (`~0n`) is harmless and avoids tracking precise capability bits.
83111
- **`poll_oneoff` is stubbed to `NOTSUP`.** ripgrep only uses it for stdin-driven modes, which the shim doesn't support anyway (stdin reads return 0 / EOF).
84112
- **`path_open` ENOENT handling.** Only ENOENT is swallowed when `O_CREAT` is set; everything else propagates, so bad paths surface as real errors instead of silent creates.
113+
- **WASM SIMD.** `build.zig` sets `RUSTFLAGS="-C target-feature=+simd128"` for wasm targets, enabling memchr's simd128 vectorized paths for faster search.
85114
- **TOML-quoted `--config` values.** `cargo --config` parses values as TOML, so string values must include the quotes: `opt-level="z"` not `opt-level=z`.
86115
- **Env-var form doesn't work for hyphenated profile names.** `CARGO_PROFILE_RELEASE_LTO_OPT_LEVEL` is ambiguous — cargo parses it as profile `release` with key `lto_opt_level` and fails. Use `--config` instead.
87116
- **Per-target `CARGO_TARGET_DIR`.** Sharing one target dir across triples causes cargo to constantly rebuild dependencies. Keeping them separate under `.zig-cache/cargo-target/<triple>/` gives proper caching.
88117
- **Cargo's output path is `$CARGO_TARGET_DIR/<triple>/release-lto/rg[.exe|.wasm]`** — for custom profiles the profile dir equals the profile name.
89118
- **ripgrep is still a Rust project.** `rustc` does all the code generation; Zig only acts as the C compiler and linker. A pure `zig build` of ripgrep would require rewriting it.
119+
- **`enableCompileCache`.** `rg.mjs` calls `node:module`'s `enableCompileCache()` for faster cold starts on Node. The call is wrapped in try/catch since some Node-compatible runtimes don't support it.
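
The two-layer decompression and tmpdir cache described above can be sketched end to end with a stand-in payload. This is a minimal sketch, not the code in `lib/_rg.mjs` or `build.ts`: `loadCachedBytes`, the `ripgrep-wasm-sketch` file prefix (chosen so it cannot collide with the real cache), and the swallowed write error are assumptions. Only the brotli step, the `os.tmpdir()` location, and the 16-hex-character SHA-256 prefix come from the notes above.

```javascript
import { createHash } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { brotliCompressSync, brotliDecompressSync } from "node:zlib";

// Stand-in payload; the real package would use the compiled ripgrep wasm here.
const rawBytes = Buffer.from("stand-in for rg-wasm32-wasip1.wasm");

// Build step (assumed shape): compress the payload and derive the 16-char hash.
const compressed = brotliCompressSync(rawBytes);
const hash = createHash("sha256").update(rawBytes).digest("hex").slice(0, 16);

// Runtime step: prefer the cached file, otherwise decompress and try to cache.
function loadCachedBytes(getCompressed, hash, prefix = "ripgrep-wasm-sketch") {
  const cachePath = join(tmpdir(), `${prefix}-${hash}.wasm`);
  if (existsSync(cachePath)) return readFileSync(cachePath);
  const bytes = brotliDecompressSync(getCompressed());
  try {
    writeFileSync(cachePath, bytes);
  } catch {
    // The cache is an optimization; an unwritable tmpdir should not fail the load.
  }
  return bytes;
}

const first = loadCachedBytes(() => compressed, hash);
const second = loadCachedBytes(() => compressed, hash);
```

Because the hash is part of the filename, a rebuilt binary naturally misses the old cache entry instead of needing explicit invalidation. A hardened version would likely write to a temporary name and rename it, so that concurrent processes never read a partially written file.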
90120

91121
## Dependencies
92122

93123
- `zig` (tested with 0.15.2)
94124
- `rustc` + `cargo` (tested with 1.94.1)
95125
- `cargo-zigbuild` (`cargo install cargo-zigbuild`)
96126
- Rust std for `wasm32-wasip1` (plus the native targets if building those)
97-
- Node.js 18+ / Bun / Deno at runtime (for `WebAssembly.compileStreaming`, web streams, and `node:fs` sync APIs)
127+
- Node.js 18+ / Bun / Deno at runtime (for `WebAssembly.compile`, `node:fs` sync APIs, and `node:zlib` brotli)
128+
129+
### Dev dependencies
130+
131+
- `vitest` — test runner
132+
- `mitata` — benchmarking
133+
- `oxfmt` — code formatter
134+
- `@vitest/coverage-v8` — coverage

README.md

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ Absolute filesystem path to a JS shim that runs ripgrep via `ripgrep`. Drop-in r
6161
Requirements:
6262

6363
- `zig` (tested with 0.15.2)
64-
- `rustc` + `cargo` (tested with 1.90.0)
64+
- `rustc` + `cargo` (tested with 1.94.1)
6565
- [`cargo-zigbuild`](https://github.com/rust-cross/cargo-zigbuild): `cargo install cargo-zigbuild`
6666
- `rustup target add wasm32-wasip1`
6767
