Backport 2206 to release/0.4.z#2297
Backport 2206 to release/0.4.z#2297jcrossley3 wants to merge 6 commits intoguacsec:release/0.4.zfrom
Conversation
Specifically files, machine learning models, and cryptographic assets. We've introduced place holders for the latter 2 until migrations for new tables are added in a subsequent commit. This indirectly fixes guacsec#2205 because it prevents cryptographic assets from being displayed as packages for CBOM's, but there's not yet a service API that would allow you to fetch those assets. That'll come in a separate commit. (cherry picked from commit 06364ad)
(cherry picked from commit cb0107b)
Also made the use of the `refs` field a bit more clear and reduced some cloning. (cherry picked from commit 6c84171)
Also a bit of renaming and commentary (cherry picked from commit 948f9aa)
Weirdly, I had to revert the tc2758 test expectation. Still not sure why. (cherry picked from commit 44fab35)
Reviewer's GuideBackports extended CycloneDX SBOM ingestion support for additional component types (files, machine-learning models, cryptographic assets), refactors component creation into an owning ComponentCreator that manages references per-node, updates package reference handling APIs, and adjusts tests and formatting for the 0.4.z branch. Sequence diagram for CycloneDX SBOM ingestion with ComponentCreatorsequenceDiagram
participant Caller
participant Creator as CycloneDxCreator
participant ComponentCreator
participant Processors as ProcessorList
participant DB as Database
Caller->>Creator: create_components(db, processors)
Creator->>ComponentCreator: new(sbom_id, components.len)
loop for each component
Creator->>ComponentCreator: add_component(component)
alt component has cpe/purl
ComponentCreator->>ComponentCreator: add_cpe/add_purl(node_id, parsed)
end
alt component has pedigree ancestors
loop for each ancestor
ComponentCreator->>ComponentCreator: add_component(ancestor)
ComponentCreator->>ComponentCreator: add_relation(ancestor_id, AncestorOf, node_id)
end
end
alt component has variants
loop for each variant
ComponentCreator->>ComponentCreator: add_component(variant)
ComponentCreator->>ComponentCreator: add_relation(node_id, Variant, variant_id)
end
end
end
loop for each relation tuple
Creator->>ComponentCreator: add_relation(left, rel, right)
end
Creator->>ComponentCreator: post_process(processors)
ComponentCreator->>Processors: PostContext.run(processors)
Creator->>ComponentCreator: validate()
ComponentCreator->>ComponentCreator: build References from packages, files, models, crypto
ComponentCreator->>ComponentCreator: relationships.validate(sources)
Creator->>ComponentCreator: create(db)
ComponentCreator->>DB: licenses.create
ComponentCreator->>DB: purls.create
ComponentCreator->>DB: cpes.create
ComponentCreator->>DB: packages.create
ComponentCreator->>DB: files.create
ComponentCreator->>DB: models.create
ComponentCreator->>DB: crypto.create
ComponentCreator->>DB: relationships.create
ComponentCreator-->>Creator: Result
Creator-->>Caller: Result
Class diagram for updated SBOM component ingestion (CycloneDX)classDiagram
class Creator {
+sbom_id Uuid
+components Vec~Component~
+relations Vec~(String, Relationship, String)~
+create_components(db, processors) Result
}
class ComponentCreator {
-cpes CpeCreator
-purls PurlCreator
-licenses LicenseCreator
-packages PackageCreator
-files FileCreator
-models MachineLearningModelCreator
-crypto CryptographicAssetCreator
-relationships RelationshipCreator~CycloneDxProcessor~
-refs HashMap~String, Vec~PackageReference~~
+new(sbom_id, capacity) ComponentCreator
+add_component(comp) Result
+add_relation(left, rel, right) void
+post_process(processors) void
+validate() Result
+create(db) Result
-add_cpe(node_id, cpe) void
-add_purl(node_id, purl) void
-add_license(node_id, license) Uuid
}
class PackageCreator {
-sbom_id Uuid
+new(sbom_id) PackageCreator
+with_capacity(sbom_id, capacity) PackageCreator
+add(node_info, refs_iter, checksums_iter) void
}
class FileCreator {
+new(sbom_id) FileCreator
+with_capacity(sbom_id, capacity) FileCreator
+add(node_id, name, checksums_iter) void
+create(db) Result
}
class MachineLearningModelCreator {
-nodes NodeCreator
+new(sbom_id) MachineLearningModelCreator
+with_capacity(sbom_id, capacity) MachineLearningModelCreator
+add(node_id, name, checksums_iter) void
+create(db) Result
}
class CryptographicAssetCreator {
-nodes NodeCreator
+new(sbom_id) CryptographicAssetCreator
+with_capacity(sbom_id, capacity) CryptographicAssetCreator
+add(node_id, name, checksums_iter) void
+create(db) Result
}
class NodeCreator {
+new(sbom_id) NodeCreator
+with_capacity(sbom_id, capacity) NodeCreator
+add(node_id, name, checksums_iter) void
+create(db) Result
+references() Iterator~&str~
}
class PackageReference {
<<enumeration>>
Cpe(Uuid)
Purl(Purl)
}
class ComponentType {
<<enumeration>>
Application
Framework
Library
Container
Platform
OperatingSystem
Device
DeviceDriver
Firmware
File
MachineLearningModel
Data
CryptographicAsset
}
class RelationshipCreator_CycloneDxProcessor_ {
+new(sbom_id, CycloneDxProcessor) RelationshipCreator_CycloneDxProcessor_
+relate(left, rel, right) void
+rels Vec~Relationship~
+externals Vec~Relationship~
+validate(sources) Result
+create(db) Result
}
class PostContext {
+cpes CpeCreator*
+purls PurlCreator*
+packages PackageCreator*
+relationships Vec~Relationship~*
+externals Vec~Relationship~*
+run(processors) void
}
Creator --> ComponentCreator : uses
ComponentCreator --> PackageCreator : owns
ComponentCreator --> FileCreator : owns
ComponentCreator --> MachineLearningModelCreator : owns
ComponentCreator --> CryptographicAssetCreator : owns
ComponentCreator --> RelationshipCreator_CycloneDxProcessor_ : owns
ComponentCreator --> ComponentType : classifies
ComponentCreator --> PackageReference : maps node to
MachineLearningModelCreator --> NodeCreator : wraps
CryptographicAssetCreator --> NodeCreator : wraps
PackageCreator ..> PackageReference : uses
PostContext ..> CpeCreator
PostContext ..> PurlCreator
PostContext ..> PackageCreator
PostContext ..> RelationshipCreator_CycloneDxProcessor_
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
ComponentCreator::add_component, theconst EMPTY: Vec<PackageReference> = vec![];pattern won’t work asvec!isn’t const; consider using something likeself.refs.get(&node_id).into_iter().flatten()instead of anEMPTYfallback to avoid the need for a const Vec. - The new
ComponentType::from_str(&comp.type_)handling returns an error for unknown or unexpected component types, which changes ingestion behavior; consider whether it’s preferable to log and skip unsupported types (or fall back to a default) rather than failing the entire SBOM on an unrecognized type. - In
MachineLearningModelCreator::addandCryptographicAssetCreator::add, you can drop thenode_id.clone()when callingself.nodes.addsincenode_idis already owned and immediately moved, avoiding an unnecessary clone.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `ComponentCreator::add_component`, the `const EMPTY: Vec<PackageReference> = vec![];` pattern won’t work as `vec!` isn’t const; consider using something like `self.refs.get(&node_id).into_iter().flatten()` instead of an `EMPTY` fallback to avoid the need for a const Vec.
- The new `ComponentType::from_str(&comp.type_)` handling returns an error for unknown or unexpected component types, which changes ingestion behavior; consider whether it’s preferable to log and skip unsupported types (or fall back to a default) rather than failing the entire SBOM on an unrecognized type.
- In `MachineLearningModelCreator::add` and `CryptographicAssetCreator::add`, you can drop the `node_id.clone()` when calling `self.nodes.add` since `node_id` is already owned and immediately moved, avoiding an unnecessary clone.
## Individual Comments
### Comment 1
<location path="modules/ingestor/src/graph/sbom/cyclonedx.rs" line_range="415" />
<code_context>
+ match ty {
+ // We treat all these types as "packages"
+ Application | Framework | Library | Container | OperatingSystem => {
+ const EMPTY: Vec<PackageReference> = vec![];
+ self.packages.add(
+ NodeInfoParam {
</code_context>
<issue_to_address>
**issue (bug_risk):** Using `vec![]` in a `const` context will not compile; consider a different way to represent an empty reference set.
`vec!` cannot be used in a `const` because it’s not `const fn`, so this definition won’t compile. You can drop the `const` and fall back with something like `self.refs.get(&node_id).into_iter().flatten()` or `self.refs.get(&node_id).map(|v| v.iter()).unwrap_or_else(|| [].iter())`. Alternatively, define a static empty slice (`static EMPTY: [PackageReference; 0] = [];`) and use `EMPTY.iter()` as the fallback.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| match ty { | ||
| // We treat all these types as "packages" | ||
| Application | Framework | Library | Container | OperatingSystem => { | ||
| const EMPTY: Vec<PackageReference> = vec![]; |
There was a problem hiding this comment.
issue (bug_risk): Using vec![] in a const context will not compile; consider a different way to represent an empty reference set.
vec! cannot be used in a const because it’s not const fn, so this definition won’t compile. You can drop the const and fall back with something like self.refs.get(&node_id).into_iter().flatten() or self.refs.get(&node_id).map(|v| v.iter()).unwrap_or_else(|| [].iter()). Alternatively, define a static empty slice (static EMPTY: [PackageReference; 0] = [];) and use EMPTY.iter() as the fallback.
To address https://redhat.atlassian.net/browse/TC-3091
Summary by Sourcery
Extend SBOM ingestion to handle additional CycloneDX component types and centralize component creation and validation while adjusting reference handling.
New Features:
Bug Fixes:
Enhancements:
Build:
Tests: