feat(component): Add deterministic component fingerprints #47
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| // Copyright (c) Microsoft Corporation. | ||
| // Licensed under the MIT License. | ||
|
|
||
| // Package fingerprint computes deterministic identity fingerprints for components. | ||
| // A fingerprint captures all resolved build inputs so that changes to any input | ||
| // (config fields, spec content, overlay files, distro context, upstream refs, or | ||
| // Affects commit count) produce a different fingerprint. | ||
| // | ||
| // The primary entry point is [ComputeIdentity], which takes a resolved | ||
| // [projectconfig.ComponentConfig] and additional context, and returns a | ||
| // [ComponentIdentity] containing the overall fingerprint hash plus a breakdown | ||
| // of individual input hashes for debugging. | ||
| package fingerprint | ||
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,170 @@ | ||||||||||||||||
| // Copyright (c) Microsoft Corporation. | ||||||||||||||||
| // Licensed under the MIT License. | ||||||||||||||||
|
|
||||||||||||||||
| package fingerprint | ||||||||||||||||
|
|
||||||||||||||||
| import ( | ||||||||||||||||
| "crypto/sha256" | ||||||||||||||||
| "encoding/hex" | ||||||||||||||||
| "fmt" | ||||||||||||||||
| "io" | ||||||||||||||||
| "sort" | ||||||||||||||||
| "strconv" | ||||||||||||||||
|
|
||||||||||||||||
| "github.com/microsoft/azure-linux-dev-tools/internal/global/opctx" | ||||||||||||||||
| "github.com/microsoft/azure-linux-dev-tools/internal/projectconfig" | ||||||||||||||||
| "github.com/microsoft/azure-linux-dev-tools/internal/utils/fileutils" | ||||||||||||||||
| "github.com/mitchellh/hashstructure/v2" | ||||||||||||||||
| ) | ||||||||||||||||
|
|
||||||||||||||||
| // hashstructureTagName is the struct tag name used by hashstructure to determine | ||||||||||||||||
| // field inclusion. Fields tagged with `fingerprint:"-"` are excluded. | ||||||||||||||||
| const hashstructureTagName = "fingerprint" | ||||||||||||||||
|
|
||||||||||||||||
| // ComponentIdentity holds the computed fingerprint for a single component plus | ||||||||||||||||
| // a breakdown of individual input hashes for debugging. | ||||||||||||||||
| type ComponentIdentity struct { | ||||||||||||||||
| // Fingerprint is the overall SHA256 hash combining all inputs. | ||||||||||||||||
| Fingerprint string `json:"fingerprint"` | ||||||||||||||||
| // Inputs provides the individual input hashes that were combined. | ||||||||||||||||
| Inputs ComponentInputs `json:"inputs"` | ||||||||||||||||
| } | ||||||||||||||||
|
|
||||||||||||||||
| // ComponentInputs contains the individual input hashes that comprise a component's | ||||||||||||||||
| // fingerprint. | ||||||||||||||||
| type ComponentInputs struct { | ||||||||||||||||
| // ConfigHash is the hash of the resolved component config fields (uint64 from hashstructure). | ||||||||||||||||
| ConfigHash uint64 `json:"configHash"` | ||||||||||||||||
|
Comment on lines +36 to +37
|
||||||||||||||||
| // ConfigHash is the hash of the resolved component config fields (uint64 from hashstructure). | |
| ConfigHash uint64 `json:"configHash"` | |
| // ConfigHash is a cryptographic digest (for example, SHA256 over a canonical encoding) | |
| // of the resolved component config fields. | |
| ConfigHash string `json:"configHash"` |
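The suggestion above swaps the `uint64` value for a string digest. A minimal sketch of what "SHA256 over a canonical encoding" could look like follows. It assumes JSON as the canonical encoding (`encoding/json` emits map keys in sorted order, so the bytes do not depend on map iteration order); `demoConfig` and `configHash` are hypothetical names for illustration, not part of this PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// demoConfig is a hypothetical stand-in for the resolved component config.
type demoConfig struct {
	Name    string            `json:"name"`
	Defines map[string]string `json:"defines"`
}

// configHash returns a hex-encoded SHA-256 digest of the JSON encoding of cfg.
// JSON is an assumption here; any deterministic encoding would serve.
func configHash(cfg demoConfig) (string, error) {
	encoded, err := json.Marshal(cfg)
	if err != nil {
		return "", fmt.Errorf("encoding config: %w", err)
	}

	sum := sha256.Sum256(encoded)

	return hex.EncodeToString(sum[:]), nil
}

func main() {
	// Same logical content, different map insertion order.
	first, _ := configHash(demoConfig{Name: "demo", Defines: map[string]string{"a": "1", "b": "2"}})
	second, _ := configHash(demoConfig{Name: "demo", Defines: map[string]string{"b": "2", "a": "1"}})
	// One value changed.
	changed, _ := configHash(demoConfig{Name: "demo", Defines: map[string]string{"a": "1", "b": "3"}})

	fmt.Println("same content, same digest:", first == second)
	fmt.Println("changed content, same digest:", first == changed)
}
```

A string digest also survives a JSON round trip unchanged, whereas a large `uint64` can lose precision in consumers that parse JSON numbers as floating point.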
Copilot
AI
Mar 31, 2026
`writeField` uses a newline-delimited `label=value\n` encoding without length-prefixing or escaping the value. If any value can contain `\n` (or other delimiters), different logical inputs can produce the same byte stream and therefore the same fingerprint. To make the encoding unambiguous, consider length-prefixing (or otherwise escaping) the value as well (e.g., `len(label):label=len(value):value`).
| // Use label=value\n format. Length-prefixing the label prevents | |
| // collisions between field names that are prefixes of each other. | |
| fmt.Fprintf(writer, "%d:%s=%s\n", len(label), label, value) | |
| // Use len(label):label=len(value):value\n format. Length-prefixing both the | |
| // label and the value prevents collisions even when values contain delimiters | |
| // such as '=', ':' or '\n'. | |
| fmt.Fprintf(writer, "%d:%s=%d:%s\n", len(label), label, len(value), value) |
Copilot
AI
Mar 31, 2026
`writeField` ignores the error returned by `fmt.Fprintf`. With `errcheck` enabled in this repo, this will fail linting. Please handle the error (or redesign `writeField` so writes cannot fail) and propagate/record failures appropriately.
Package docs state that a fingerprint captures "spec content" and "upstream refs" as part of the computed fingerprint. In the current implementation, spec/upstream identity appears to come only from `IdentityOptions.SourceIdentity` (and config hashing), while `ComputeIdentity` itself only hashes config + overlay source file contents. Please align the package documentation with the actual inputs, or extend `ComputeIdentity` to hash the additional claimed inputs directly.