Skip to content

Harden Node archive Starlark generation#2102

Open
yoshihiro999 wants to merge 2 commits into
GoogleContainerTools:mainfrom
yoshihiro999:harden-node-archive-generation-v2
Open

Harden Node archive Starlark generation#2102
yoshihiro999 wants to merge 2 commits into
GoogleContainerTools:mainfrom
yoshihiro999:harden-node-archive-generation-v2

Conversation

@yoshihiro999

Copy link
Copy Markdown

This hardens the Node archive updater by validating release metadata before generation and consistently quoting generated Starlark strings.

Changes:

  • Validate Node.js versions from the release index with a conservative x.y.z allowlist.
  • Validate archive suffixes, distroless architecture names, and SHA256 values before writing Starlark.
  • Quote generated Starlark string values instead of interpolating raw metadata into string literals.
  • Fail fast if an expected checksum cannot be fetched.
  • Use synchronous generated-file writes so write errors propagate.
  • Add a small regression test for invalid metadata values and Starlark string escaping.

Local checks run:

node knife.d/update_node_archives_test.js
node --check knife.d/update_node_archives.js
node --check knife.d/update_node_archives_test.js
bash -n knife
git diff --check

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces input validation and sanitization when updating Node.js archives, including regex checks for versions, architectures, and checksums, escaping Starlark strings, and switching to synchronous file writing. It also adds a unit test suite for these validations. The review feedback highlights a critical issue where calculateChecksum does not validate the HTTP response status code, which could result in hashing error pages on non-200 responses, and provides a code suggestion to reject the promise for non-200 status codes.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

};
} catch (error) {
console.error(error);
throw new Error(`Failed to fetch checksum for ${url}: ${error}`);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The calculateChecksum function (defined around line 52) does not validate the HTTP response status code (res.statusCode).\n\nIf the server returns a non-200 status code (such as 404 Not Found or 500 Internal Server Error), https.get will still succeed without triggering the 'error' event. As a result, calculateChecksum will calculate and return the SHA256 hash of the error response body (e.g., the HTML error page) instead of rejecting. This defeats the "fail fast" behavior intended by this PR, as invalid checksums will be silently generated and written to the Starlark file.\n\nTo fix this, please update calculateChecksum to reject the promise if res.statusCode !== 200. For example:\n\njavascript\nconst calculateChecksum = (url) => {\n return new Promise((resolve, reject) => {\n https\n .get(url, (res) => {\n if (res.statusCode !== 200) {\n reject(new Error(\"Request failed with status code \" + res.statusCode));\n return;\n }\n const hash = crypto.createHash(\"sha256\");\n res.on(\"data\", (data) => {\n hash.update(data);\n });\n res.on(\"end\", () => {\n resolve(hash.digest(\"hex\"));\n });\n })\n .on(\"error\", (err) => {\n reject(\"Error downloading file: \" + err.message);\n });\n });\n};\n

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant