Skip to content

test(integration): port Northwind test suite to self-contained framework#778

Draft
kriszyp wants to merge 4 commits into
mainfrom
claude/api-tests-northwind
Draft

test(integration): port Northwind test suite to self-contained framework#778
kriszyp wants to merge 4 commits into
mainfrom
claude/api-tests-northwind

Conversation

@kriszyp
Copy link
Copy Markdown
Member

@kriszyp kriszyp commented May 25, 2026

Summary

  • Migrates 7 legacy Northwind-dependent test files (2_dataLoad.mjs, 3_sqlTests.mjs, 4_noSqlTests.mjs, 5_noSqlRoleTesting.mjs, 6_sqlRoleTesting.mjs, 7_jobsAndJobRoleTesting.mjs, 10_otherRoleTests.mjs) into a single self-contained northwind.test.mjs
  • All seven sub-suites share one Harper instance to avoid repeating the expensive CSV load; schemas, tables, and large CSVs are set up once in before()
  • Shim functions (csvFileUpload, csvDataLoad, csvUrlLoad, insert, searchByHash, checkJob, checkJobCompleted) close over client to bridge the legacy utility API to the new createApiClient/awaitJob framework
  • JSON fixture data (long-text remarks, dog/breed/owner records, etc.) defined at module level and inserted by the 2. Data Load sub-suite so those insert tests remain valid

Test plan

  • node --check integrationTests/apiTests/northwind.test.mjs — passes
  • npm run lint:required — 0 errors
  • CI integration run against a live Harper instance

🤖 Generated with Claude Code

— Claude

kriszyp and others added 2 commits May 25, 2026 08:13
Migrates 7 legacy Northwind-dependent test files (2_dataLoad.mjs,
3_sqlTests.mjs, 4_noSqlTests.mjs, 5_noSqlRoleTesting.mjs,
6_sqlRoleTesting.mjs, 7_jobsAndJobRoleTesting.mjs, 10_otherRoleTests.mjs)
into a single self-contained northwind.test.mjs.

All sub-suites share one Harper instance to avoid repeating the expensive
CSV load. Shim functions bridge the legacy utility API to the new
createApiClient/awaitJob framework.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add dev.rando to before() devTables list (was inserted by tests but never created)
- Create schema '123' and table '123.4' in before() (NoSQL tests use them)
- Fix csvDataLoad shim to delegate to awaitJobCompleted so _expectedError and
  _expectedMessage are actually asserted rather than silently ignored

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp kriszyp requested a review from Ethan-Arrowood May 25, 2026 14:28
Comment on lines +2073 to +2124
for (const [table, pk, csv] of northwindTables) {
await client.req().send({ operation: 'create_table', schema: 'northnwd', table, primary_key: pk }).expect(200);
const csvR = await client
.req()
.send({
operation: 'csv_file_load',
action: 'insert',
schema: 'northnwd',
table,
file_path: csvPath + csv,
})
.expect(200);
await awaitJobCompleted(client, getJobId(csvR.body), { timeoutSeconds: 60 });
}

// ── dev tables (created empty; JSON data inserted by 2_dataLoad tests) ─
const devTables = [
['long_text', 'id'],
['AttributeDropTest', 'hashid'],
['invalid_attribute', 'id'],
['remarks_blob', 'id'],
['time_functions', 'id'],
['dog', 'id'],
['breed', 'id'],
['owner', 'id'],
['sql_function', 'id'],
['leading_zero', 'id'],
['dog_conditions', 'id'],
['rando', 'id'],
];
for (const [table, pk] of devTables) {
await client.req().send({ operation: 'create_table', schema: 'dev', table, primary_key: pk }).expect(200);
}
// Large dev CSVs (books, ratings, movie, credits) — no test in 2_dataLoad re-inserts these
for (const [table, pk, csv, timeout] of [
['books', 'id', 'Books.csv', 30],
['ratings', 'id', 'BooksRatings.csv', 60],
['movie', 'id', 'movies.csv', 60],
['credits', 'movie_id', 'credits.csv', 120],
]) {
await client.req().send({ operation: 'create_table', schema: 'dev', table, primary_key: pk }).expect(200);
const r = await client
.req()
.send({
operation: 'csv_file_load',
action: 'insert',
schema: 'dev',
table,
file_path: csvPath + csv,
})
.expect(200);
await awaitJobCompleted(client, getJobId(r.body), { timeoutSeconds: timeout });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double CSV loading makes "2. Data Load" tests hollow.

before() inserts every Northwind table via csv_file_load action: 'insert' (lines 2073-2086) and loads all four large dev CSVs (lines 2107-2124). Then the "2. Data Load" suite re-uploads the exact same files in tests 1-15.

Harper's insert path silently skips records whose primary key already exists (bulkLoad.ts:756-765, buildResponseMsg"successfully loaded 0 of N records"). The job status is still COMPLETE, so awaitJobCompleted asserts nothing wrong — but every record is skipped. Tests 1-15 exercise only the "graceful no-op on duplicate keys" branch, not the actual insertion path.

The comment on line 2106 makes this explicit in the wrong direction: "Large dev CSVs (books, ratings, movie, credits) — no test in 2_dataLoad re-inserts these" — but tests 12-15 do exactly that.

Suggested fix: Create the tables in before() but do not load their CSV data; leave that to the "2. Data Load" suite so those tests actually exercise the insert path with empty tables. Subsequent suites depend on suite 2 running first — add a comment noting that dependency. This matches the PR description's stated goal of loading CSV data once.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 25, 2026

Reviewed; no blockers found. Commits 2–4 address prior findings correctly: csvPath now resolves to the actual CSV directory, missing tables (rando/books/ratings/movie/credits, schema 123/table 4) are created in before(), csvDataLoad properly propagates expectedError to awaitJobCompleted, and checkJobCompleted guards against false-passes on error paths.

- Remove CSV loads from before() to fix duplicate-PK conflicts: Northwind
  and dev CSV tables are now created empty; 2_dataLoad sub-suite loads all
  data first, relying on sequential sub-suite execution for 3-10
- Fix checkJobCompleted shim: assert job.status !== COMPLETE when expecting
  an error (not just message substring match), and guard against empty body

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp
Copy link
Copy Markdown
Member Author

kriszyp commented May 25, 2026

Cross-model review follow-up (Claude + Codex + Gemini/agy)

Three commits on this branch address review findings:

Commit 2 (Codex findings):

Commit 3 (Gemini/agy findings):

  • Removed CSV loads from before() — loading Northwind + dev CSVs in before() then re-loading them in 2. Data Load sub-suite tests caused duplicate-PK conflicts (job failures). Tables are now created empty; 2. Data Load loads all data first, and sequential sub-suite execution ensures 3–10 have the data.
  • Fixed checkJobCompleted: added assert.notEqual(job.status, 'COMPLETE') when expecting errors (previously only checked message substring — a passing job with a matching string would false-pass), and added a guard assertion for empty job response body.

Remaining significant concern (not addressed — intentional scope decision):
User/role cleanup currently happens in test() blocks at the end of sub-suites. If a test fails before reaching the cleanup block, the user/role persists and causes "already exists" errors in the next sub-suite. Moving cleanup to after() hooks would fix this but requires a larger refactor across all 7 sub-suites. Tracking as a follow-up.

— Claude

- Fix csvPath: join(__dirname, '../data') pointed one level too high;
  correct path is join(__dirname, 'data') matching other test files
- Apply prettier formatting (format check was failing in CI)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant