fix(firestore-bigquery-change-tracker): unmask BQ errors and stop spurious table updates#2817
fix(firestore-bigquery-change-tracker): unmask BQ errors and stop spurious table updates#2817
Conversation
…rious table updates Fixes #2801, fixes #2194. #2801 - handleFailedTransactions The dead-letter path called `db.settings({ ignoreUndefinedProperties: true })` on every invocation. `Firestore.settings()` throws if the singleton has already been used, so on warm Cloud Functions containers the second invocation crashed with "Firestore has already been initialized", masking the original BigQuery insert error that triggered the dead-letter path in the first place. Guard the call with a per-instance Set and a try/catch: settings is applied once per process per instance, and any throw is swallowed so the dead-letter write to the backup collection still runs and the real BQ error surfaces. #2194 - tableRequiresUpdate always returning true Two bugs: 1. Clustering check: `JSON.stringify(config.clustering)` returns "null" when clustering is not configured, but `JSON.stringify(metadata.clustering?.fields || [])` returns "[]". They never match, so the table is always flagged as needing update. Normalise null/undefined to `[]` before stringifying. 2. `pathParamsColExists` was assigned the result of `fields.find()` (which is `Field | undefined`), then strict-compared against a boolean (`!!config.wildcardIds !== pathParamsColExists`). When the column is absent, `false !== undefined` is always true. The function signature already declares these params as `boolean`; coerce in the caller. Same fix applied to `documentIdColExists` and `oldDataColExists` for type-honesty (they already worked correctly under truthy checks but the type was being violated).
There was a problem hiding this comment.
Code Review
This pull request improves the robustness of the BigQuery change tracker by handling undefined clustering configurations and safely applying Firestore settings during failed transactions to avoid runtime errors. It also ensures boolean values for column existence checks. Feedback suggests enhancing type safety for the Firestore instance ID and using the more idiomatic Array.prototype.some() method instead of !!find() for checking the presence of specific columns.
| const settingsApplied = new Set<string>(); | ||
|
|
||
| const getConfiguredFirestore = (instanceId: string) => { |
There was a problem hiding this comment.
The instanceId parameter and the settingsApplied Set should ideally account for undefined values to be type-safe, as config.firestoreInstanceId is optional in the configuration. While the current caller ensures a default value, making this utility more robust prevents potential type errors in other contexts.
| const settingsApplied = new Set<string>(); | |
| const getConfiguredFirestore = (instanceId: string) => { | |
| const settingsApplied = new Set<string | undefined>(); | |
| const getConfiguredFirestore = (instanceId?: string) => { |
| db.settings({ | ||
| ignoreUndefinedProperties: true, | ||
| }); | ||
| const db = getConfiguredFirestore(config.firestoreInstanceId!); |
| const documentIdColExists = !!fields.find( | ||
| (column) => column.name === "document_id" | ||
| ); |
There was a problem hiding this comment.
Using Array.prototype.some() is more idiomatic and readable when you only need to check for the existence of an element that matches a condition, as it returns a boolean directly.
| const documentIdColExists = !!fields.find( | |
| (column) => column.name === "document_id" | |
| ); | |
| const documentIdColExists = fields.some( | |
| (column) => column.name === "document_id" | |
| ); |
| const pathParamsColExists = !!fields.find( | ||
| (column) => column.name === "path_params" | ||
| ); |
There was a problem hiding this comment.
Using Array.prototype.some() is more idiomatic and readable when you only need to check for the existence of an element that matches a condition, as it returns a boolean directly.
| const pathParamsColExists = !!fields.find( | |
| (column) => column.name === "path_params" | |
| ); | |
| const pathParamsColExists = fields.some( | |
| (column) => column.name === "path_params" | |
| ); |
| const oldDataColExists = !!fields.find( | ||
| (column) => column.name === "old_data" | ||
| ); |
There was a problem hiding this comment.
Using Array.prototype.some() is more idiomatic and readable when you only need to check for the existence of an element that matches a condition, as it returns a boolean directly.
| const oldDataColExists = !!fields.find( | |
| (column) => column.name === "old_data" | |
| ); | |
| const oldDataColExists = fields.some( | |
| (column) => column.name === "old_data" | |
| ); |
Summary
Two small but high-impact fixes in
firestore-bigquery-change-tracker. Both bugs were diagnosed in detail by the issue reporters with the offending lines called out.#2801: dead-letter path crashes on warm containers
bigquery/handleFailedTransactions.tscalleddb.settings({ ignoreUndefinedProperties: true })inside the exported async body. The Firestore Node SDK contract requiressettings()be called at most once and before any other method runs on the singleton. On warm Cloud Functions containers both conditions are violated by the second invocation; the resultingFirestore has already been initializedthrow masks the real BigQuery insert failure that put us on the dead-letter path in the first place.Fix: guard with a per-instance `Set`, plus a `try/catch` so that even if the singleton was used elsewhere in the process before this module's first call, the dead-letter write to the backup collection still proceeds and the real BQ error surfaces upstream.
#2194: `tableRequiresUpdate` always returns true
Two independent bugs in the same function:
Clustering check:
```ts
const configCluster = JSON.stringify(config.clustering); // "null" when unconfigured
const tableCluster = JSON.stringify(metadata.clustering?.fields || []); // "[]"
if (configCluster !== tableCluster) return true; // always true
```
Fixed by normalising null/undefined to `[]` before stringifying: `JSON.stringify(config.clustering || [])`.
`pathParamsColExists` strict-equality against `undefined`:
`bigquery/index.ts` assigned the result of `fields.find(...)` (which is `Field | undefined`) to a parameter typed `boolean`. The check `!!config.wildcardIds !== pathParamsColExists` then resolved to `false !== undefined` -> always true.
Fixed by coercing in the caller (`!!fields.find(...)`). Applied to `documentIdColExists` and `oldDataColExists` too since the function signature already promises `boolean` for all three (downstream checks happened to tolerate `undefined` for those two, but the type was being violated).
Test plan
The diff is intentionally tight (3 files, +23 / -8). I haven't added new integration tests since they all target a live BigQuery project; happy to add unit-level mocks for both fixes in a follow-up if reviewers prefer.