Skip to content

[SEA-NodeJS] (2/8) napi-rs binding consumer — TS loader + build script#380

Open
msrathore-db wants to merge 6 commits into
msrathore/sea-abstractionfrom
msrathore/sea-napi-binding
Open

[SEA-NodeJS] (2/8) napi-rs binding consumer — TS loader + build script#380
msrathore-db wants to merge 6 commits into
msrathore/sea-abstractionfrom
msrathore/sea-napi-binding

Conversation

@msrathore-db
Copy link
Copy Markdown
Contributor

@msrathore-db msrathore-db commented May 16, 2026

Stack

Linear stack of 8 PRs landing the M0 + M1 Phase 1 SEA NodeJS work. Merge in order from base ↑ to tip. The tip branch (msrathore/sea-auth-u2m, PR #383) is the single snapshot containing everything in flight — point your test or benchmark harness at it for an end-to-end check.

Pos PR Branch Scope
1/8 #378 sea-abstraction IBackend / ISessionBackend / IOperationBackend interfaces
2/8 #380 sea-napi-binding TS loader + build script for the kernel-provided .node artifact
3/8 #377 sea-errors-logging Kernel ErrorCode → JS error-class mapping (M0 minimum)
4/8 #379 sea-auth PAT auth via useSEA: true
5/8 #382 sea-execution executeStatement + openSession (sessionConfig, initialCatalog/Schema)
6/8 #381 sea-results CloudFetch + Inline Arrow result fetching
7/8 #384 sea-operation cancel / close / finished lifecycle + INTERVAL parity + napi-relocation acceptance (absorbed sea-integration content)
8/8 #383 sea-auth-u2m ← TIP M1 Phase 1 OAuth M2M + U2M (5 review rounds, ZERO HIGH at close)

Companion kernel stack (databricks/databricks-sql-kernel): 8 PRs — root #26 (async-public-api) → #27#25#29#28#30#24#23 (tip).

Policy: new PRs always stack on the current tip. No sibling/parallel topology. No force-pushes on existing PRs unless absolutely necessary; if a PR's content is wrong, add a fix-up commit on top of the stack tip rather than rewriting history.


This PR is position 2/8.

Summary

Thin TypeScript loader for the napi-rs binding (SeaNativeLoader), generated native/sea/index.d.ts, and the build:native npm script that compiles against the kernel workspace's napi/ sub-crate.

Companion kernel PR

The Rust napi crate itself lives in the kernel repo — see databricks/databricks-sql-kernel#25 (Database / Connection / Statement / ResultStream classes) and its skeleton predecessor #27.

Test plan

  • npm run build:native produces native/sea/index.linux-x64-gnu.node
  • ✅ Loader correctly require()s the platform-specific binary

Draft until you give the go for review.

Tracking

Co-authored-by: Isaac

@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@msrathore-db msrathore-db force-pushed the msrathore/sea-napi-binding branch from 84d0860 to 20231c8 Compare May 17, 2026 13:22
@msrathore-db msrathore-db changed the base branch from main to msrathore/sea-abstraction May 17, 2026 13:22
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

msrathore-db added a commit that referenced this pull request May 24, 2026
Addresses PR #380 review findings C1, C2 (partial), C3, H1, H2, H3,
H4, H5, H6, H7, H8 (runtime guard), H9, M1, M2, M3, M4 (linguist),
M5, M6, M7, M8, L1, L2, L4.

- C1 lint: drop `.js` ext from native/sea require so eslint
  import/extensions passes; gitignore the auto-generated index.js
  and *.node artifacts; prettier-ignore the napi-rs auto-generated
  index.d.ts / index.js.
- C2 publish: whitelist native/sea/index.{js,d.ts} in .npmignore;
  declare @databricks/sea-native-linux-x64-gnu in optionalDependencies.
  The kernel-side package-name rename (drop the linux-x64-gnu
  prefix) is tracked separately as a cross-PR ask.
- C3 test wiring: move tests/native/version.test.ts ->
  tests/unit/sea/, tests/native/e2e-smoke.test.ts -> tests/e2e/sea/;
  both now picked up by the existing mocharc globs and run via the
  existing `npm test` / `npm run e2e` jobs.
- H1 noise drop: version.test.ts now asserts one meaningful semver
  check; three tautological assertions removed.
- H2 + H3 + H7 lazy load: rewrite SeaNativeLoader.ts on the
  lib/utils/lz4.ts pattern. Lazy require behind getSeaNative();
  capability-detection helper tryGetSeaNative(); structured error
  messages that classify MODULE_NOT_FOUND vs ERR_DLOPEN_FAILED and
  include platform/arch/Node-version + install hint.
- H4 supply chain: pin @napi-rs/cli to 2.18.4 in devDependencies;
  build:native switches to `npx --no-install` so the pinned local
  install is used (no per-build network fetch).
- H5 path: switch build:native default kernel path to the canonical
  `../../databricks-sql-kernel/napi-binding` (the worktree-specific
  `-sea-WT` suffix is gone).
- H6 CI safety: e2e suite hard-fails when CI=true and any required
  env var is missing; dev machines still skip.
- H8 runtime guard: loader throws a structured error on Node <18
  instead of attempting a dlopen that would fail mysteriously.
  Driver's engines.node stays >=14 — Thrift consumers on older
  Node continue to work.
- H9 + M1 + M2 types: tsconfig adds a `@sea-native` path alias to
  native/sea/index.d.ts; loader imports the real Connection /
  Statement / ConnectionOptions / ExecuteOptions / ArrowBatch /
  ArrowSchema types and re-exports them. Tests use the re-exported
  types — the inline-shape duplication across three files collapses.
- M3 README: rewrite native/sea/README.md to match kernel reality.
  The napi crate is a standalone Cargo workspace (not a pyo3
  sibling); explain the tls-rustls choice that the standalone
  workspace exists to enable.
- M4 drift detection: mark native/sea/index.d.ts as
  linguist-generated in .gitattributes so GitHub collapses it in
  diffs and excludes from blame/language stats.
- M5 artifacts: gitignore native/sea/index.js, index.node,
  index.*.node.
- M6 + M7 e2e coverage: decode the IPC payload via apache-arrow's
  tableFromIPC and assert numRows + cell value (not just shape);
  add drain-past-null idempotence and schema-before-fetch
  coverage.
- L1 build scripts: collapse build:native + build:native:debug
  into one script taking BUILD_PROFILE (defaults to --release).
- L2 / L4 / M8: drop the version() alias and the "Round 2+ will…"
  comment debt; the loader now actually delivers the value the
  prior JSDoc was only promising.

Verified on this branch: npm run lint clean (0 errors); npm run
prettier clean for PR-owned files (3 unrelated pre-existing
warnings on PR #378 territory); tsc --project tsconfig.build.json
clean; mocha on tests/unit/sea passes (4/4); mocha on
tests/e2e/sea passes against a live pecotesting warehouse (2/2,
IPC decode confirms SELECT 1 returns 1).
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Creates the napi-rs binding skeleton: Cargo.toml + lib.rs + module
stubs for database/connection/statement/result/error/logger. Captures
napi-rs tokio Handle via OnceCell in runtime.rs. Single working
#[napi] fn version() proves the binding loads + executes end-to-end
in Node.

Depends on krn-async-public-api branch (path dep on kernel).

Round 2 will add open/execute/fetch methods.

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…wired

Adds real async methods on the opaque wrappers backing M0:
- openSession (free function) with PAT → kernel Session
- Connection::execute_statement → kernel ExecutedStatement
- Statement::fetch_next_batch / schema / cancel / close → kernel ResultStream
- Arrow batches returned as IPC bytes (per Layer 2 design)
- Error mapping preserves kernel ErrorCode + SQLSTATE for TS layer
- All entry points wrapped in catch_unwind

End-to-end smoke test against pecotesting passes.
No new dependencies beyond arrow-{ipc,array,schema} + futures.
Uses kernel async public API (no block_on).

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…indings

Round 1 scaffold declared tracing + tracing-subscriber as deps but
never used them. Removed. Logger bridge will re-add in round 3.

Other findings from 6b3affd-2026-05-15.md reviewed:
- Finding 2 (Database::Drop unreachable in Round 1b) — obsoleted by
  Round 2 (40d0b57): database.rs no longer declares a Database struct
  or Drop impl; it is now an `open_session` free function.
- Finding 3 (empty Connection::Drop) — obsoleted by Round 2: the Drop
  impl now spawns a real fire-and-forget close on the captured tokio
  handle.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Per D-006 architectural decision (Python team's workspace pattern):
all language bindings (PyO3, napi-rs) now live as workspace siblings
in the kernel repo at databricks-sql-kernel/{pyo3,napi}/.

What this commit removes from the nodejs repo:
- native/sea/Cargo.toml (path dep relocated; package now at
  databricks-sql-kernel/napi/Cargo.toml with path = "..")
- native/sea/build.rs
- native/sea/src/* (lib, runtime, database, connection, statement,
  result, error, logger, util — all 9 files)
- native/sea/package.json (the @databricks/sea-native-linux-x64-gnu
  sub-package moves to the kernel workspace too)
- native/sea/index.js (regenerated artifact)

What stays in nodejs:
- native/sea/index.d.ts — TS declarations consumed by lib/sea/ adapter
- native/sea/README.md (new) — explains the move; points readers at
  databricks-sql-kernel/napi/

What's updated:
- package.json: `build:native` and `build:native:debug` scripts now
  delegate to the kernel workspace via $DATABRICKS_SQL_KERNEL_REPO
  (defaults to ../../databricks-sql-kernel-sea-WT/napi-binding for the
  local dev worktree layout). Build copies index.node + index.d.ts
  back into native/sea/ for the loader to find.

Why workspace co-location:
- Arrow version pinning lockstep — no silent IPC version drift
- path = ".." (clean) vs ../../../../databricks-sql-kernel-sea-WT/...
- Single CI: cargo build --workspace covers kernel + pyo3 + napi
- Kernel API changes that break either binding caught at PR-review time
- Future cgo binding for Go SEA slots in as another workspace member

This branch (sea-napi-binding) is now a thin consumer of the kernel
napi crate. The actual Rust code lives at krn-napi-binding HEAD on
the kernel repo (commit debe3d7).

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…generated

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Addresses PR #380 review findings C1, C2 (partial), C3, H1, H2, H3,
H4, H5, H6, H7, H8 (runtime guard), H9, M1, M2, M3, M4 (linguist),
M5, M6, M7, M8, L1, L2, L4.

- C1 lint: drop `.js` ext from native/sea require so eslint
  import/extensions passes; gitignore the auto-generated index.js
  and *.node artifacts; prettier-ignore the napi-rs auto-generated
  index.d.ts / index.js.
- C2 publish: whitelist native/sea/index.{js,d.ts} in .npmignore;
  declare @databricks/sea-native-linux-x64-gnu in optionalDependencies.
  The kernel-side package-name rename (drop the linux-x64-gnu
  prefix) is tracked separately as a cross-PR ask.
- C3 test wiring: move tests/native/version.test.ts ->
  tests/unit/sea/, tests/native/e2e-smoke.test.ts -> tests/e2e/sea/;
  both now picked up by the existing mocharc globs and run via the
  existing `npm test` / `npm run e2e` jobs.
- H1 noise drop: version.test.ts now asserts one meaningful semver
  check; three tautological assertions removed.
- H2 + H3 + H7 lazy load: rewrite SeaNativeLoader.ts on the
  lib/utils/lz4.ts pattern. Lazy require behind getSeaNative();
  capability-detection helper tryGetSeaNative(); structured error
  messages that classify MODULE_NOT_FOUND vs ERR_DLOPEN_FAILED and
  include platform/arch/Node-version + install hint.
- H4 supply chain: pin @napi-rs/cli to 2.18.4 in devDependencies;
  build:native switches to `npx --no-install` so the pinned local
  install is used (no per-build network fetch).
- H5 path: switch build:native default kernel path to the canonical
  `../../databricks-sql-kernel/napi-binding` (the worktree-specific
  `-sea-WT` suffix is gone).
- H6 CI safety: e2e suite hard-fails when CI=true and any required
  env var is missing; dev machines still skip.
- H8 runtime guard: loader throws a structured error on Node <18
  instead of attempting a dlopen that would fail mysteriously.
  Driver's engines.node stays >=14 — Thrift consumers on older
  Node continue to work.
- H9 + M1 + M2 types: tsconfig adds a `@sea-native` path alias to
  native/sea/index.d.ts; loader imports the real Connection /
  Statement / ConnectionOptions / ExecuteOptions / ArrowBatch /
  ArrowSchema types and re-exports them. Tests use the re-exported
  types — the inline-shape duplication across three files collapses.
- M3 README: rewrite native/sea/README.md to match kernel reality.
  The napi crate is a standalone Cargo workspace (not a pyo3
  sibling); explain the tls-rustls choice that the standalone
  workspace exists to enable.
- M4 drift detection: mark native/sea/index.d.ts as
  linguist-generated in .gitattributes so GitHub collapses it in
  diffs and excludes from blame/language stats.
- M5 artifacts: gitignore native/sea/index.js, index.node,
  index.*.node.
- M6 + M7 e2e coverage: decode the IPC payload via apache-arrow's
  tableFromIPC and assert numRows + cell value (not just shape);
  add drain-past-null idempotence and schema-before-fetch
  coverage.
- L1 build scripts: collapse build:native + build:native:debug
  into one script taking BUILD_PROFILE (defaults to --release).
- L2 / L4 / M8: drop the version() alias and the "Round 2+ will…"
  comment debt; the loader now actually delivers the value the
  prior JSDoc was only promising.

Verified on this branch: npm run lint clean (0 errors); npm run
prettier clean for PR-owned files (3 unrelated pre-existing
warnings on PR #378 territory); tsc --project tsconfig.build.json
clean; mocha on tests/unit/sea passes (4/4); mocha on
tests/e2e/sea passes against a live pecotesting warehouse (2/2,
IPC decode confirms SELECT 1 returns 1).

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
@msrathore-db msrathore-db force-pushed the msrathore/sea-napi-binding branch from 6b70ce4 to 548a14b Compare May 24, 2026 23:23
Comment thread .npmignore
# selects the per-platform `.node` artifact from `@databricks/sea-native-*`
# optionalDependencies (populated when the kernel CI publishes them);
# the .d.ts is the consumer-facing type contract.
!native/sea/index.js
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High — release-engineering landmine that crashes every consumer the moment PR 3/8+ wires SEA in.

Verified at HEAD 548a14b8:

  • git ls-tree native/sea/ shows ONLY README.md and index.d.ts — no index.js.
  • .gitignore:17 ignores native/sea/index.js (it's regenerated by npm run build:native).
  • .npmignore:10 allow-lists !native/sea/index.js to ship in the tarball.
  • package.json scripts has no prepack or prepublishOnly hook that runs build:native.
  • The loader at lib/sea/SeaNativeLoader.ts:82 does require('../../native/sea') — Node's resolver looks for index.js in that directory.

Result: any npm publish from a workspace that didn't manually run npm run build:native first ships a tarball missing the napi-rs router. Consumers get MODULE_NOT_FOUND with a misleading hint that tells them to install the optional dep (which is already installed) instead of pointing at the missing router shim.

PR #380 doesn't crash anything today because nothing calls getSeaNative(). The moment PR 3/8 wires SeaBackend.connect() to the loader, every install bricks.

  • Suggested fix: either (a) commit native/sea/index.js (analogous to the committed index.d.ts — the file is small and stable), or (b) add a prepack script that runs build:native gated on DATABRICKS_RELEASE_BUILD=1, with an explicit [ -f native/sea/index.js ] || (echo "Missing native router; run npm run build:native first" && exit 1) assertion that fails the pack if it's still missing.

Found by: security, ops, maintainability, devils-advocate (4 reviewers).


Posted by code-review-squad · /full-review

Comment thread tsconfig.json
"forceConsistentCasingInFileNames": true
"forceConsistentCasingInFileNames": true,
"baseUrl": "./",
"paths": {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — published declaration files reference an unresolvable module specifier.

Verified: tsconfig.build.json extends tsconfig.json, so the paths: { "@sea-native": ["./native/sea/index.d.ts"] } mapping propagates into the production build. TypeScript does NOT rewrite path-aliased specifiers in emitted declarations.

When this loader is re-exported from the public surface (which is the goal in PR 3/8+), every downstream TS consumer of @databricks/sql will see import type { ... } from '@sea-native' in the .d.ts and fail type-check with "Cannot find module '@sea-native'".

Today it doesn't break anything because (a) only import type is used in lib/sea/SeaNativeLoader.ts:26 (erased at runtime) and (b) the loader isn't re-exported from lib/index.ts. But the bug is already in the published surface waiting to be triggered.

  • Suggested fix: Drop the paths mapping and use a relative import type { … } from '../../native/sea'. The alias is used by exactly one file — it buys nothing.

Found by: architecture, language, agent-compat, devils-advocate (4 reviewers).


Posted by code-review-squad · /full-review


function loadFailureHint(err: NodeJS.ErrnoException): string {
const platform = `${process.platform}-${process.arch}`;
const installHint = `Install the matching optional dependency (e.g. @databricks/sea-native-${platform}).`;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — every customer who hits this error is sent on a wild-goose chase.

Verified: line 55 builds platform = '${process.platform}-${process.arch}'. On linux/x64 that produces linux-x64; on darwin/arm64 it produces darwin-arm64. The error message then says:

Install the matching optional dependency (e.g. @databricks/sea-native-linux-x64).

But the actual published package (per package.json:95) is @databricks/sea-native-linux-x64-**gnu**. The hint's recommended npm install will 404.

Same issue cross-platform:

  • linux/x64 → suggests @databricks/sea-native-linux-x64; actual is -linux-x64-gnu or -linux-x64-musl

  • linux/arm64 → suggests @databricks/sea-native-linux-arm64; actual is -linux-arm64-gnu

  • win32/x64 → suggests @databricks/sea-native-win32-x64; actual is -win32-x64-msvc

  • Suggested fix: either compute the napi-rs triple correctly (mirror the router's platform→triple table — easier said than done) or drop the package-name example: "Install the matching @databricks/sea-native-* optional dependency for your platform; see the README for the supported triple list."

Found by: ops, agent-compat.


Posted by code-review-squad · /full-review

Comment thread package.json
"optionalDependencies": {
"lz4": "^0.6.5"
"lz4": "^0.6.5",
"@databricks/sea-native-linux-x64-gnu": "0.1.0"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — by far the highest-consensus finding; 7 reviewers flagged it.

Only @databricks/sea-native-linux-x64-gnu: 0.1.0 is in optionalDependencies. macOS-arm64, macOS-x64, linux-arm64-gnu, linux-x64-musl (Alpine), win32-x64-msvc, etc. all have zero installable binding.

Three coupled consequences:

  1. Silent degradation — on every non-linux-x64-gnu platform, the loader will hit MODULE_NOT_FOUND. Users may not realize they're stuck on Thrift.
  2. Misleading hint (see M2) — the loader tells them to install a package that doesn't exist.
  3. Supply-chain squat risk — the unlisted names (@databricks/sea-native-darwin-arm64, …-darwin-x64, etc.) are not on the npm registry. Confirm @databricks org scope is locked to Databricks publishers, OR pre-register empty placeholder packages now, so an external party can't typosquat @databricks/sea-native-darwin-arm64 before Databricks does and have it autoload via the napi-rs router for darwin-arm64 consumers.
  • Suggested fix: Decide between two coherent models:
    • (a) Intentional M0 scope — explicitly document "M0 = linux-x64-gnu only; other platforms continue using Thrift exclusively" in the README and update the loader's error message to match.
    • (b) Multi-platform support — list all planned triples in optionalDependencies (npm tolerates missing optionalDeps on the registry — they skip silently with a warning).

Whichever you pick, address the supply-chain squat question separately (lock the scope or pre-register placeholders).

Found by: security, ops, devils-advocate, performance, architecture, agent-compat, language (7 reviewers).


Posted by code-review-squad · /full-review

export type Connection = NativeConnection;
export type Statement = NativeStatement;

export interface SeaNativeBinding {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — single-source-of-truth violation.

Lines 39-44 declare:

export interface SeaNativeBinding {
  version(): string;
  openSession(options: ConnectionOptions): Promise<NativeConnection>;
  Connection: typeof NativeConnection;
  Statement: typeof NativeStatement;
}

This is a manual subset of what native/sea/index.d.ts already declares. When the kernel adds a new free function (e.g. cancelStatement, future async APIs), the .d.ts regenerates but this interface stays stale until someone notices. The unchecked cast at line 82 (require('../../native/sea') as SeaNativeBinding) hides the drift.

  • Suggested fix: derive the type from the generated module instead of redeclaring it:
import type * as SeaNative from '../../native/sea';
export type SeaNativeBinding = typeof SeaNative;

Then the cast at line 82 stays automatically correct, and the README's claim that "the .d.ts is the contract" is enforced by the type system.

Found by: architecture.


Posted by code-review-squad · /full-review

return `SEA native binding failed to load on platform ${platform} / Node ${process.version}: ${err.message}`;
}

let cached: SeaNativeBinding | null | undefined;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — turns every SeaBackend unit test into an integration test once SEA is wired in.

cached and cachedError are module-level let bindings (lines 65-66). getSeaNative() / tryGetSeaNative() are free functions that close over them. There's no constructor, no DI seam, no exported reset hook.

Consequence for PR 3/8: when SeaBackend.connect() stops throwing NOT_IMPLEMENTED and starts calling getSeaNative(), any unit test that touches SeaBackend will require a real .node to be loadable on the test machine — otherwise it skips. sinon.stub(SeaNativeLoader, 'getSeaNative') works once per file but the cached value persists across files in the same mocha run. Order-dependence is baked in.

The lz4 pattern this is "mirroring" gets away with module-state because the existing codebase wraps it (e.g., Lz4Util and adapters at the call sites accept the resolved module). Here SeaNativeLoader IS the wrapper, and the wrapper has no seam.

  • Suggested fix: expose the loader as a class with an injectable seam:
export class SeaNativeLoader {
  constructor(private readonly load: () => SeaNativeBinding = defaultRequire) {}
  get(): SeaNativeBinding { /* ... */ }
  tryGet(): SeaNativeBinding | undefined { /* ... */ }
}

Then SeaBackend takes loader: SeaNativeLoader via constructor (matches the existing IClientContext pattern in lib/DBSQLClient.ts). Keep getSeaNative() as a thin convenience over a process-global default instance.

Much cheaper to change now (0 callers) than after PRs 3–8 land.

Found by: architecture, devils-advocate, maintainability adjacent (3 reviewers).


Posted by code-review-squad · /full-review

ArrowBatch,
ArrowSchema,
} from '@sea-native';

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — name collisions will surface as IDE autocomplete ambiguity and silent field-shape bugs once SEA is wired in.

Lines 34-37 re-export:

export type { ConnectionOptions, ExecuteOptions, ArrowBatch, ArrowSchema };
export type Connection = NativeConnection;
export type Statement = NativeStatement;

Collisions with existing public types:

  • ConnectionOptions (SEA: hostName/httpPath/token) vs ConnectionOptions from lib/contracts/IDBSQLClient.ts (Thrift: host/path/token + AuthOptions union). The token field is shared → silent shape-mismatch bugs when an IDE auto-imports the wrong one.
  • ArrowBatch (SEA: {ipcBytes: Buffer}) vs ArrowBatch from lib/result/utils.ts (Thrift: {batches: Buffer[]; rowCount: number}).

With 0 callers today this is harmless. The moment SEA is wired into DBSQLClient and these types are imported anywhere outside lib/sea/, the bare names become ambiguous in autocomplete and in code review.

  • Suggested fix: rename the SEA-side exports now, before any production consumer imports them:
export type { ConnectionOptions as SeaConnectionOptions, ExecuteOptions as SeaExecuteOptions, ArrowBatch as SeaArrowBatch, ArrowSchema as SeaArrowSchema };
export type SeaConnection = NativeConnection;
export type SeaStatement = NativeStatement;

The kernel-generated .d.ts can keep the napi-rs default names (it's the kernel's contract). The TS-side wrapper is where disambiguation should happen, and doing it before any consumer imports the bare names is a one-way door.

Found by: agent-compat, architecture, devils-advocate, language, maintainability (5 reviewers).


Posted by code-review-squad · /full-review


describe('SEA native binding — smoke test', function smoke() {
const binding = tryGetSeaNative();
if (binding === undefined) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — coverage gap that masks broken-binding regressions.

Two coupled problems:

  1. tests/unit/sea/version.test.ts:20-30 — silent skip when tryGetSeaNative() returns undefined. No CI escalation. On a linux-x64-gnu CI runner where the optional dep should install, a silently broken @databricks/sea-native-linux-x64-gnu would produce a green build. Compare to tests/e2e/sea/e2e-smoke.test.ts:55-62, which DOES fail loud in CI when env vars are missing — but applies the same logic only to env vars, not to the binding.

  2. Line 33 — the only assertion is expect(binding.version()).to.match(/^\d+\.\d+\.\d+$/). No shape-check on binding.openSession, binding.Connection, binding.Statement. If the kernel renames any of these, the test still passes — green.

  • Suggested fixes:
    • Mirror the CI fail-loud logic from the e2e test: on a runner whose triple is supposed to have a published .node (e.g., process.env.CI === 'true' and process.platform === 'linux' && process.arch === 'x64'), treat binding === undefined as a hard failure.
    • Add shape checks inside tryLoad() (lib/sea/SeaNativeLoader.ts:82 — verify typeof binding.openSession === 'function', binding.Connection exists, binding.Statement exists) so kernel-side shape drift is caught.
    • Add a tests/unit/sea/loader.test.ts with proxyquire covering the loadFailureHint branches (MODULE_NOT_FOUND, ERR_DLOPEN_FAILED, generic), the Node-version gate, and caching. Pure logic tests; run everywhere regardless of binding availability.

Found by: test, devils-advocate, ops, agent-compat.


Posted by code-review-squad · /full-review


function tryLoad(): SeaNativeBinding | undefined {
const nodeMajor = detectNodeMajor();
if (Number.isFinite(nodeMajor) && nodeMajor < MIN_NODE_MAJOR) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — defensive guard works backwards.

if (Number.isFinite(nodeMajor) && nodeMajor < MIN_NODE_MAJOR) evaluates to false when nodeMajor === NaN, so the load proceeds. The intent reads "fail closed if we can't tell what Node we're on" but the code does the opposite.

In practice process.version is always parseable, so the practical risk is near-zero. But it's the kind of guard that future runtime quirks will surface.

Suggested fix:

if (!Number.isFinite(nodeMajor) || nodeMajor < MIN_NODE_MAJOR) {
  // ...
}

Posted by code-review-squad · /full-review

Comment thread native/sea/index.d.ts
executeStatement(sql: string, options: ExecuteOptions): Promise<Statement>
/**
* Explicit close. Marks the connection wrapper as closed so
* subsequent calls on this `Connection` return `InvalidArg`, then
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — agent-friendliness / signal-to-noise.

The docstring on Connection.close discusses tracing::EnteredSpan, !Send futures, napi-rs's execute_tokio_future glue rejects non-Send futures, etc. This is excellent for a Rust reviewer but counterproductive for JS consumers — an LLM agent answering "why doesn't connection.close() await?" will paste the Rust paragraph into a JS issue/Slack where it has no context.

When the kernel fixes this (via cloned span or similar), the .d.ts is regenerated and the now-stale Rust rationale silently disappears — any agent that scraped it for caching is now serving wrong advice.

  • Suggested fix: in the Rust source, move the kernel-internal reasoning into a Rust comment (// SAFETY: style) and keep the #[napi] JSDoc terse and behavior-focused:

Returns immediately after marking the wrapper closed. The server-side close runs in the background; failures are not surfaced to JS. Use Statement.close() (which awaits) if you need to observe close errors per-statement.

Same lens applies to the class-level docs on Connection and StatementArc<Mutex<Option<...>>>, result_stream_mut, ExecutedStatementHandle are all Rust-side concerns the JS audience doesn't need.


Posted by code-review-squad · /full-review

if (err.code === 'MODULE_NOT_FOUND') {
return `SEA native binding not installed for platform ${platform} on Node ${process.version}. ${installHint}`;
}
if (err.code === 'ERR_DLOPEN_FAILED') {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — debuggability.

When dlopen fails (binding installed but won't load), real-world causes include musl-vs-glibc, Node ABI version mismatch, architecture mismatch (x64 binary on arm64 via Rosetta). The current hint says "likely a libc or Node ABI mismatch" without:

  • Including the underlying err.message (which carries the actual dlerror string, e.g. GLIBC_2.32 not found).
  • Suggesting concrete remediation (rm -rf node_modules, install musl variant, etc.).

Suggested fix:

return `SEA native binding present but failed to dlopen on platform ${platform} / Node ${process.version}: ${err.message}. Common causes: glibc/musl mismatch (e.g., Alpine Linux), Node ABI mismatch (try \`rm -rf node_modules && npm install\`), or architecture mismatch.`;

Found by: ops, devils-advocate.


Posted by code-review-squad · /full-review

Comment thread native/sea/index.d.ts
token: string
}
/**
* Open a Databricks SQL session over PAT auth and return an opaque
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low (forward-looking; not blocking M0).

ArrowBatch.ipcBytes is documented as "a complete Arrow IPC stream (schema header + 1 record-batch message)". But Statement.schema() already exists and returns the schema-only payload. Re-encoding the schema in every batch means:

  • Extra bytes serialized in Rust, copied across napi, held in V8 heap on every batch.
  • Consumers construct a fresh RecordBatchReader per batch (re-parses schema each time) — see e2e-smoke.test.ts:87.

For wide schemas (hundreds of columns / nested types) the schema bytes are non-trivial and pure overhead per batch.

  • Suggested for M1: have fetchNextBatch() return just the record-batch IPC message (no schema header), or expose a streaming reader handle that emits a single schema followed by record-batch messages. The schema-only schema() API is already in place to support this.

Found by: performance.


Posted by code-review-squad · /full-review

// (process.env.CI === 'true') missing secrets are fatal — a silent
// skip would let credential-rotation regressions reach prod.

const REQUIRED_ENV = [
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — credential-source proliferation.

This test uses DATABRICKS_PECOTESTING_SERVER_HOSTNAME / ..._HTTP_PATH / ..._TOKEN_PERSONAL. Every other e2e test uses E2E_HOST / E2E_PATH / E2E_ACCESS_TOKEN via tests/e2e/utils/config.ts.

Two separate credential sources in the same npm run e2e invocation means CI may have one set but not the other, leading to mismatched skip behavior, and dev machines need to source two different credential files.

  • Suggested fix: reuse tests/e2e/utils/config.ts (preferred — single source of truth) or document the divergence in native/sea/README.md.

Found by: test.


Posted by code-review-squad · /full-review

@vikrantpuppala
Copy link
Copy Markdown
Collaborator

Code Review Squad — Review

Score: 52/100 — HIGH RISK (driven by supply-chain readiness, not code correctness)

The loader's structural code (SeaNativeLoader.ts) faithfully mirrors the established lib/utils/lz4.ts pattern: lazy load, cached, structured errors, fail-tolerant for Thrift-only consumers. With 0 in-repo callers today, the PR can't crash anything in production yet — but the release-engineering envelope around it has several landmines that will bite the moment PR 3/8 wires SEA into DBSQLClient.

Headline issues: missing native/sea/index.js in the published tarball (gitignored + no prepack hook); the @sea-native path alias leaking into emitted .d.ts; the install-hint suggesting a package name that doesn't exist on the registry (actual name has -gnu suffix). Address H1, M1, M2 + the platform fan-out (M3) before this PR or the next one is wired into the public surface. The remainder are forward-looking design notes that are cheaper to fix now (0 callers) than after the public API is locked in.

Reviewers: 9/9 delivered (security, devils-advocate, language, test, architecture, maintainability, agent-compat, ops, performance).


Could not post inline comments for: M4, L2 — see body below.

Medium Findings

M4 — engines.node = '>=14.0.0' but loader requires Node 18+ — silent contract mismatch at install time

Medium — npm install gives no signal; users find out at runtime.

package.json:12 declares engines.node = ">=14.0.0". lib/sea/SeaNativeLoader.ts:46 defines const MIN_NODE_MAJOR = 18;. The two contracts are not aligned.

A consumer on Node 14 or 16 runs npm install @databricks/sql. npm doesn't warn (14 >= 14). The driver works fine for Thrift. The moment they try SEA — or upgrade to a future driver version where SEA is the default — they get a runtime error.

The defensible read is "SEA is opt-in; Thrift-only consumers are fine on older Node," but that contract isn't documented anywhere user-visible, and there's no install-time signal.

  • Suggested fix: one of:
    • (a) bump engines.node to >=18.0.0 if SEA is the new default backend (will break Thrift-only customers on older Node — probably not yet).
    • (b) extract MIN_NODE_MAJOR to a single source of truth (e.g., lib/version.ts already exists and is generated — could include MIN_SEA_NODE_MAJOR) and reference it from loader + README + CI matrix.
    • (c) at minimum, document the per-feature Node floor in the README.

Found by: ops, devils-advocate, architecture, maintainability (4 reviewers).

Low Findings

Low findings — click to expand

L2 — build:native script is fragile in several dimensions

Low — every developer who runs this on a non-author machine will hit one of these.

Several issues:

  1. bash -c excludes Windows contributors entirely. Use bin/build-native.js (Node script) for cross-platform.
  2. BUILD_PROFILE is misleading — it's substituted as the value of the --platform flag, NOT as a profile name. BUILD_PROFILE=debug produces napi build --platform debug (interprets debug as the target platform). The README documents only BUILD_PROFILE= (empty) for debug, but a contributor reading the var name will reach for debug.
  3. cp index.* glob is overbroad — captures any future stray files in the napi crate dir. Use explicit cp index.js index.d.ts index.*.node native/sea/.
  4. No set -euo pipefail — if cd fails (kernel checkout missing at the default path), bash continues with subsequent commands silently. Add a pre-check: [ -d "${DATABRICKS_SQL_KERNEL_REPO:-../../databricks-sql-kernel/napi-binding}/napi" ] || { echo "Set DATABRICKS_SQL_KERNEL_REPO to your kernel checkout — see native/sea/README.md"; exit 1; }.

Found by: language, maintainability, devils-advocate, ops (4 reviewers).

L7 — Buffer return on fetchNextBatch() forces a copy out of Rust on every batch (FFI hot path)

native/sea/index.d.ts:46-52fetchNextBatch() returns ArrowBatch { ipcBytes: Buffer }. napi-rs's Buffer is an owning Node-allocated buffer; every batch is memcpy'd from Rust into V8 heap.

For the result-streaming hot path this is the dominant per-batch cost beyond the schema duplication (L5). Zero-copy alternatives to consider before the surface is wired into the result layer: napi_create_external_arraybuffer with a finalizer that keeps Rust-side bytes::Bytes alive — Node observes the bytes in place, drop runs on GC.

Cheap to change now (0 callers); breaking after. Forward-looking, not blocking M0.

Found by: performance.

L8 — Per-batch promise round-trip; no async iterator / readable stream

native/sea/index.d.ts:122-128fetchNextBatch(): Promise<ArrowBatch | null> means N batches → N independent promise resolutions across the napi tokio→libuv→V8 boundary. Standard pattern, but the lower-overhead alternative is an async iterator or push-style channel (ThreadsafeFunction calling a JS callback per batch) so the producer doesn't stall waiting for JS to request the next batch.

For pipelined fetches against a fast warehouse the per-batch event-loop ping-pong dominates once the per-batch payload is small. Not blocking M0; revisit in the M1 streaming work flagged in the docstrings.

Found by: performance.

L9 — No batch-size / max-rows hint on executeStatement ExecuteOptions

native/sea/index.d.ts:6-22ExecuteOptions = { initialCatalog?, initialSchema?, sessionConfig? } doesn't surface any batch-size knob. Consumers can't trade memory for throughput. Server-side default may be fine for M0 but worth tracking — once the surface is exported it's harder to extend without a breaking option-shape change. sessionConfig is a partial escape hatch but not discoverable.

Found by: performance.

L10 — No logging/warning when binding load fails; failure is silent until first getSeaNative() call

lib/sea/SeaNativeLoader.ts:65-90lib/utils/lz4.ts at least console.warns on unknown errors. SeaNativeLoader.ts does not — it caches the error and exposes it only via getSeaNative(). So a consumer who calls tryGetSeaNative() at startup and silently gets undefined has no log line indicating why. In production, this manifests as "SEA suddenly stopped working after a deploy, no logs" — extremely hard to diagnose.

Better: emit a telemetry event tagged {backend: 'sea' | 'thrift', binding_loaded: bool, load_error_code, platform, node_version} so fleet-wide rollout visibility exists without per-customer log capture.

Found by: ops, devils-advocate.

L11 — Tests use this.pending = true; redundantly with it.skip(...); duplicated binding-absent preamble across files

tests/unit/sea/version.test.ts:23,27this.pending = true followed by it.skip(...). The it.skip() call already marks the suite as skipped; the this.pending = true line has an eslint-disable no-invalid-this because the function isn't an arrow. tests/e2e/sea/e2e-smoke.test.ts uses just it.skip — pick one style.

As tests/sea/ grows past these two files, factor the "binding absent → skip" preamble (unit:21-30 and e2e:48-53) into a tests/helpers/sea.ts (skipIfSeaUnavailable(suite)).

Found by: test, maintainability, ops, devils-advocate (4 reviewers).

L12 — cachedError lives on as module state after cached=null; pair into a discriminated union

lib/sea/SeaNativeLoader.ts:65-66cached and cachedError are separate let bindings with a 3-state implicit cache (undefined / null / value). Today's pairing is correct, but if a future refactor ever resets cached to undefined without also clearing cachedError, a subsequent success would carry a stale error around.

  • Suggested fix: discriminated union — let state: { ok: true; binding } | { ok: false; err: Error } | undefined.

Found by: language, maintainability.

L13 — linguist-generated=true for native/sea/index.d.ts hides PR diffs

.gitattributes:36 marks native/sea/index.d.ts as linguist-generated=true. This is the canonical "auto-generated" flag, but its side effect is that GitHub collapses diffs of this file by default. Since this file IS the public TS contract for the napi-rs binding, reviewers will NOT see changes to it on subsequent PRs without explicitly expanding the diff.

Consider linguist-detectable=false instead (excludes from language stats only, keeps diff visible).

Found by: devils-advocate.

L14 — No musl vs glibc detection; Alpine consumers will hit ERR_DLOPEN_FAILED with no Alpine-specific hint

Per M3, only linux-x64-gnu is in optionalDependencies. The router would (assuming napi-rs's per-triple package even installs on Alpine via os/cpu gating — depends on the per-triple package's metadata) try to load against ld-linux-x86-64.so.2, which Alpine doesn't have. Result: ERR_DLOPEN_FAILED with a generic "libc or Node ABI mismatch" hint.

  • Suggested fix: either ship a linux-x64-musl build, or detect musl at runtime (read /etc/os-release or check process.report.getReport().header.glibcVersionRuntime) and surface a precise message: "Alpine Linux detected — no musl build is available for this driver. Continue using the Thrift backend on this runtime."

Found by: devils-advocate.

@msrathore-db msrathore-db marked this pull request as ready for review May 29, 2026 10:53
msrathore-db added a commit that referenced this pull request May 29, 2026
… IOperationBackend (#378)

* sea-abstraction: introduce IBackend / ISessionBackend / IOperationBackend

Refactors DBSQLClient/Session/Operation to dispatch through three
backend interfaces. ThriftBackend (lib/thrift-backend/) contains the
relocated existing thrift logic. SeaBackend (lib/sea/) is a stub for
M0; the sea-napi-binding feature wires the real impl.

Public surface (lib/index.ts) unchanged.
No new dependencies. All existing tests pass.

Files:
- lib/contracts/IBackend.ts (new)
- lib/contracts/ISessionBackend.ts (new)
- lib/contracts/IOperationBackend.ts (new)
- lib/contracts/IDBSQLClient.ts (adds useSEA?: boolean to ConnectionOptions)
- lib/thrift-backend/ThriftBackend.ts (new)
- lib/thrift-backend/ThriftSessionBackend.ts (new)
- lib/thrift-backend/ThriftOperationBackend.ts (new)
- lib/sea/SeaBackend.ts (new, M0 stub)
- lib/DBSQLClient.ts (dispatch through IBackend; useSEA picks SeaBackend)
- lib/DBSQLSession.ts (facade over ISessionBackend; staging stays here)
- lib/DBSQLOperation.ts (facade over IOperationBackend; iterators/fetchAll stay here)
- tests/unit/DBSQLClient.test.ts (retarget internal state lookup through backend; pre-seed client.backend in tests that bypass connect())
- tests/unit/DBSQLOperation.test.ts (retarget internal state lookup through backend)

* sea-abstraction: cleanup — restore JSDoc, dedupe test pre-seed, fix inline type

Addresses code-bloat-watchdog findings from commit 0085928:
- Restores public-API JSDoc on DBSQLSession + DBSQLOperation methods
  (was deleted as scope creep; contracts unchanged so docs still apply)
- Adds makeStubbedClient() helper to tests/unit/DBSQLClient.test.ts;
  replaces 14× duplicated ThriftBackend pre-seed
- Imports WaitUntilReadyOptions instead of inline option types in
  IOperationBackend + DBSQLOperation.waitUntilReady

* sea-abstraction: address full-review findings (F1-F17 except F5)

Round-N fixes from the 9-reviewer pre-review. Public IOperation/DBSQLOperation
surface preserved byte-identical; backend interfaces (IBackend / ISessionBackend
/ IOperationBackend) made fully neutral so both Thrift and SEA can implement
the same contract.

F1 — neutral DTOs at IOperationBackend with Thrift-shape preservation on the
  public facade (adapter pattern):
- lib/contracts/OperationStatus.ts (new) — neutral OperationStatus + OperationState
  enum mirroring databricks-sql-python's CommandState and kernel pyo3's
  StatementStatus taxonomy.
- lib/contracts/ResultMetadata.ts (new) — neutral ResultMetadata + ResultFormat
  enum mirroring the three TSparkRowSetType cases.
- IOperationBackend.status()/getResultMetadata() return the neutral DTOs.
- ThriftOperationBackend.status() adapts at the boundary via adaptOperationStatus
  / adaptResultMetadata; module-level helpers thriftStateToOperationState and
  thriftRowSetTypeToResultFormat do the enum maps.
- ThriftOperationBackend exposes thriftStatusResponse() and
  thriftResultMetadataResponse() as public Thrift-only accessors used by the
  facade's zero-loss fast path (kept for internal state-machine + result-handler
  dispatch as well).
- lib/utils/thriftWireSynthesis.ts (new) — synthesizeThriftStatus and
  synthesizeThriftResultSetMetadata: convert neutral DTOs back to Thrift wire
  shape for the non-Thrift backend path. Lossy on Thrift-only fields
  (taskStatus, numModifiedRows, cacheLookupResult, etc.).
- DBSQLOperation.status() and getMetadata() preserved Thrift return shape:
  Thrift backend path returns the real wire response (zero loss); non-Thrift
  backend path synthesizes via the new helpers.
- DBSQLOperation.getResultMetadata() — new additive neutral accessor on
  IOperation; DBSQLSession.handleStagingOperation uses it instead of the
  deprecated Thrift-shaped getMetadata().

F2 — IBackend.connect() is now zero-arg. Backend reads everything it needs
  from IClientContext / constructor; matches Python connector's pattern of
  passing session_configuration via constructor not method-arg.

F3 — Restore the 'Server protocol version' debug log dropped by the original
  PR-378 refactor. Re-added to ThriftSessionBackend.constructor with the
  LogLevel.debug + IClientContext.getLogger() pattern; matches the pre-refactor
  log site at main:lib/DBSQLSession.ts:175.

F4 + F11 + F14 — SeaBackend stub safety:
- close() is a no-op so DBSQLClient.close()'s state-clearing block can finish
  even after a useSEA: true connect() failure.
- connect() and openSession() throw HiveDriverError instead of generic Error,
  matching the rest of the codebase.
- connect(options: ConnectionOptions) and openSession(request: OpenSessionRequest)
  declare their parameters (with @typescript-eslint/no-unused-vars disable)
  so IDE autocomplete prompts the M1 SEA implementer.

F6 + F7 + F9 + F10 — JSDoc on backend interfaces:
- IBackend: connect/openSession/close docstrings; close() doc explicitly
  states transport-layer cleanup is owned by DBSQLClient.
- ISessionBackend: copy IDBSQLSession's per-method one-liner JSDoc.
- IOperationBackend: doc hasResultSet (readonly external; mutates internally),
  waitUntilReady (MUST throw OperationStateError on terminal non-success).

F8 — tests/unit/sea/SeaBackend.test.ts (new) locks in the stub contract:
  connect() rejects HiveDriverError, openSession() rejects HiveDriverError,
  close() resolves no-op. ~30 LOC.

F12 — Drop legacy { handle, ... } ctor branch from DBSQLOperation and
  DBSQLSession:
- Facades accept only { backend, context }.
- DBSQLSession no longer imports ThriftSessionBackend at all.
- DBSQLOperation imports ThriftOperationBackend solely for the F1 typed
  downcast (zero-loss Thrift fast path); this is a deliberate, scoped
  coupling tied to the back-compat decision.
- tests/unit/.stubs/createSessionForTest.ts and createOperationForTest.ts
  (new) wrap the legacy shape; all 48 + 54 test sites mechanically migrated.

F15 — ThriftOperationBackend.waitUntilReady uses imported WaitUntilReadyOptions
  type instead of an anonymous inline shape.

F16 — useSEA flag moved out of public ConnectionOptions:
- Removed useSEA?: boolean from the exported lib/contracts/IDBSQLClient.ts
  ConnectionOptions; no longer ships in the public .d.ts.
- lib/contracts/InternalConnectionOptions.ts (new) declares the flag as a
  non-exported internal extension; DBSQLClient.connect() reads via a typed
  cast. Mirrors Python's kwargs.get('use_sea', False) pattern at
  databricks-sql-python/src/databricks/sql/session.py:111.

F17 — Missing return; after case 'timeout' in forwardConnectionEvent so a
  future fifth case doesn't silently fall through. The trailing return; in
  the last case triggers no-useless-return — quieted with a localized
  eslint-disable-next-line + intent comment.

F5 — deferred per owner instruction (test-only as any cast tightening).

Verification:
- yarn lint clean (3 pre-existing warnings in tests/e2e/protocol_versions.test.ts).
- yarn build clean.
- tsc --noEmit -p tsconfig.json clean (apart from pre-existing
  examples/tokenFederation/* import errors that exist on main).
- Runtime smoke test of SeaBackend stub + Thrift-wire synthesis round-trip
  passes 5/5 assertions.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>

* sea-abstraction: address PR #378 review-comment fixes (H1 / M1-M4 / L1-L10)

Addresses 15 review findings from the code-review-squad pass on PR #378.
L11 (backend kind field on the three interfaces) is deliberately deferred
to avoid a cross-stack cascade ripple while the downstream PRs are still
in flight.

H1 — fetchChunk lost mid-flight failIfClosed regression.
  Add optional `isClosed?: () => boolean` to IOperationBackend.fetchChunk's
  options bag. ThriftOperationBackend.fetchChunk probes it after the
  setTimeout(0) macrotask yield and returns [] when set; the facade's
  post-fetch failIfClosed then raises the user-visible OperationStateError.
  Restores the guard that the refactor split across the facade/backend
  boundary so a cancel/close arriving during the yield window no longer
  runs the data RPC to completion needlessly.

M1 — neutralize WaitUntilReadyOptions callback shape.
  Introduce IOperationBackendWaitOptions { callback?: (status:
  OperationStatus) => unknown } on the backend interface. Facade keeps
  the public Thrift-typed OperationStatusCallback and adapts at the
  boundary by wrapping the user's callback with synthesizeThriftStatus.
  ThriftOperationBackend.waitUntilReady consumes the neutral options and
  passes adaptOperationStatus(response) to the callback.

M2 — synthesizeOkStatus maps OperationState to TStatusCode.
  Add synthesizeStatusFromOperation that returns ERROR_STATUS for
  Failed/Cancelled/Closed (carrying errorMessage + sqlState) and
  SUCCESS_STATUS otherwise. Wire it into synthesizeThriftStatus so
  legacy Status.assert(resp.status) sees the right code on non-Thrift
  backends.

M3 — TelemetryEvent + DriverConfiguration carry a backend tag.
  Add optional backend?: 'thrift' | 'sea' | 'kernel' on both interfaces
  so dashboards can slice latency/error rate by backend without a
  metrics-schema migration once non-Thrift emission goes live.

M4 — test coverage for the synthesize helpers + useSEA failure path.
  New tests/unit/thrift-backend/wireSynthesis.test.ts covering all
  OperationState/ResultFormat mappings, ERROR_STATUS carries
  errorMessage/sqlState, hasResultSet round-trip, schema/arrowSchema/
  lz4Compressed/isStagingOperation preservation, and the L3 throw on
  unknown ResultFormat. New test in DBSQLClient.test.ts asserts that a
  useSEA:true connect failure leaves this.backend === undefined and the
  next openSession() surfaces "not connected" rather than the
  SeaBackend's "not implemented" error.

L1 — forwardConnectionEvent normalizes payload to Error.
  Replace `payload as Error` with `payload instanceof Error ? payload
  : new Error(String(payload))` so a backend that emits a non-Error
  through the cross-backend onConnectionEvent doesn't crash the
  logger.log call.

L2 — DBSQLClient.connect publishes this.backend only on success.
  Construct the backend locally, await connect() in a try/catch, run a
  best-effort backend.close() (per IBackend.close()'s
  safe-on-partial-init contract) and rethrow on failure. Only assign
  this.backend after a clean connect so a failed connect surfaces
  "DBSQLClient: not connected" on the next openSession.

L3 — resultFormatToThrift throws on unknown ResultFormat.
  Replace the silent default fallback to COLUMN_BASED_SET with a
  HiveDriverError. Prevents a future ResultFormat enum extension from
  silently routing results through JsonResultHandler and surfacing
  garbled rows.

L4 — DBSQLOperation.getMetadata carries @deprecated.
  Adds the canonical TypeScript JSDoc tag so IDEs (strikethrough), tsc,
  ESLint plugins, and agentic codegen pick up the soft deprecation in
  favour of getResultMetadata.

L5 — numberToInt64 re-export carries @deprecated.
  Re-export through a named const with a JSDoc block (rather than a
  bare `export { ... } from`) so the @deprecated tag attaches to the
  symbol consumers see in their IDE / .d.ts.

L6 — DBSQLSession.runBackend helper.
  Collapse 11 duplicated `failIfClosed → backend.X → failIfClosed`
  brackets into a single private runBackend<T>(fn) so the
  open-flag-before-and-after contract has a name and can't be forgotten
  in a new delegation method.

L7 — restore three why-comments deleted from DBSQLSession.
  Staging-detection invariant in executeStatement, AWS-vs-Azure 404
  difference on staging-remove, and the Content-Length-required note on
  staging-upload. Verbatim from main; these document non-obvious
  intentional behaviour the refactor inadvertently dropped.

L8 — hasResultSet becomes a method on IOperationBackend.
  The value is state-dependent (the Thrift impl mutates the underlying
  operation handle inside processOperationStatusResponse), so the
  property+readonly+disclaimer-JSDoc pattern was misleading. Method
  form makes the live-read semantics obvious to a fresh implementer.
  3 facade call sites updated.

L9 — wireSynthesis moves under thrift-backend.
  The file imports Thrift IDL types and produces Thrift-typed values;
  it belongs next to ThriftOperationBackend, not in the neutral
  lib/utils/ tree where it would creep into the dependency cone of
  future backend-neutral helpers. Same reasoning that placed
  numberToInt64 and getDirectResultsOptions under thrift-backend/.

L10 — interface-level downcast policy.
  Add a JSDoc paragraph on IOperationBackend grandfathering the two
  existing `instanceof ThriftOperationBackend` downcasts in
  DBSQLOperation.status/getMetadata and prohibiting new ones. Future
  zero-loss back-compat needs should extend the interface (or add an
  optional method) rather than spawn a per-backend branch matrix.

Gates: yarn build (exit 0), yarn lint (0 errors, 3 pre-existing
warnings in tests/e2e/protocol_versions.test.ts), yarn test on touched
files (163 passing, +12 net new tests from M4 work; 2 failures pre-
existing on PR head unchanged: getSchema-directResults and the
LZ4-cloud-fetch flag — both flagged in the team-lead playbook as
known prior regressions).

Cascade implications for downstream PRs (#380 #377 #379 #382 #381
#384 #383): L8 converts hasResultSet from a property to a method,
M1 swaps WaitUntilReadyOptions for IOperationBackendWaitOptions on
the backend interface. Both are mechanical renames at downstream
backend impls when they rebase.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>

* Fix SEA abstraction merge fallout

Restore Thrift compatibility paths needed by existing schema and result-handler tests after merging main telemetry changes.

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>

* Restore Thrift result-handler compatibility hooks

Keep existing e2e-only inspection hooks available through the facade while the new backend abstraction owns result handling.

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>

---------

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants