feat: Add metadata support to custom detectors#4625
feat: Add metadata support to custom detectors#4625effortlessdevsec wants to merge 7 commits intotrufflesecurity:mainfrom
Conversation
|
You need to run the command |
camgunz
left a comment
There was a problem hiding this comment.
This is a good idea! Requesting changes for:
- running
make protosas Kashif suggested - adding at least 2 tests:
- an empty
metadatafield adds nothing toExtraData - a populated
metadatafield adds expected values toExtraData
- an empty
You could also move the allocation of the map inside the if metadata... block (maybe also add a len(metadata) > 0 check?), but I don't think it makes a huge difference.
- Add metadata field to CustomRegex proto message - Copy metadata from detector config to ExtraData in results - Only allocate ExtraData map when metadata exists (optimization) - Add tests for empty and populated metadata scenarios - Run make protos to regenerate proto code
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| if metadata := c.GetMetadata(); metadata != nil { | ||
| for key, value := range metadata { | ||
| result.ExtraData[key] = value | ||
| } |
There was a problem hiding this comment.
Metadata keys silently overwritten by system-set keys
Low Severity
User-provided metadata is copied into ExtraData first, but the keys name (always set at line 200 in FromData) and response (set at line 329 on successful verification) will silently overwrite any same-named metadata entries. Users who configure metadata with those keys will experience silent data loss with no warning.
Additional Locations (1)
There was a problem hiding this comment.
Stale comment
Security Review: Metadata Key Collision with Internal
ExtraDataFieldsSeverity: Low | Category: Data Integrity / Result Spoofing
This PR introduces a
metadatamap on custom detector configurations that is copied intoExtraDataon every result. The implementation has one concrete concern related to key collision with internally-managedExtraDatakeys.Finding: User-supplied metadata keys can shadow reserved internal keys
The code copies user-provided metadata into
ExtraDatabefore internal keys are assigned. The internal key"name"is always overwritten afterward (line 202), so it is protected. However, the"response"key is only set in the verification success path (line 333). This means:
- A config with
metadata: {"response": "200 OK"}would cause unverified results to contain a"response"entry inExtraDatathat looks like it came from webhook verification.- Downstream consumers (dashboards, SIEM integrations) that inspect
ExtraData["response"]to infer verification status could be misled.Mitigating factors:
- The
Verifiedfield on theResultstruct is the authoritative indicator of verification, not ExtraData.- The config author is typically the same operator running the scan, limiting the attack surface.
- In team/shared-config environments, the trust boundary between config authors and result consumers could make this more relevant.
Remediation
Consider adding a blocklist check that rejects or warns on metadata keys that collide with internally-used names (
"name","response"), or copy metadata after all internal keys are set so internal values always take precedence.
| if metadata := c.GetMetadata(); len(metadata) > 0 { | ||
| result.ExtraData = make(map[string]string, len(metadata)) | ||
| for key, value := range metadata { | ||
| result.ExtraData[key] = value |
There was a problem hiding this comment.
Low: Metadata keys can collide with reserved internal ExtraData keys.
User-supplied metadata is copied into ExtraData here with no key restrictions. The "name" key is safe because it's unconditionally overwritten later (line 202). However, "response" is only set inside the verification success path (line 333). A config like metadata: {"response": "200 OK"} would persist in unverified results, potentially confusing downstream consumers.
Consider either:
- Rejecting reserved keys (
"name","response") during config validation orNewWebhookCustomRegex. - Prefixing user metadata keys (e.g.,
meta.environment). - Copying metadata after internal keys are set, with internal keys taking precedence.
var reservedKeys = map[string]struct{}{"name": {}, "response": {}}
for key, value := range metadata {
if _, reserved := reservedKeys[key]; reserved {
continue // or return an error
}
result.ExtraData[key] = value
}| } | ||
| } | ||
|
|
||
| // no validation rules for Metadata |
There was a problem hiding this comment.
Note: No validation rules are generated for Metadata. While protobuf map fields don't have built-in size constraints, consider adding application-level validation in NewWebhookCustomRegex to bound the number of metadata entries and key/value lengths. Unbounded metadata could increase memory usage proportionally to the number of scan results.
There was a problem hiding this comment.
Security Review: Low-Severity Concerns
No critical or high-severity vulnerabilities found. One low-severity design concern is noted below regarding metadata key validation.
Metadata Key Collision with Reserved Internal Keys (Low)
The metadata map from YAML configuration is copied into ExtraData without validating keys against internally reserved names ("name", "response").
"name"— Safe. Always overwritten inFromDataafter metadata is set, so a user-suppliedmetadata["name"]value cannot persist."response"— Concern. The"response"key is only set insidecreateResultswhen webhook verification succeeds (HTTP 200). If a config setsmetadata["response"], that value persists into the final result whenever verification is not attempted (verify=false) or all verification endpoints fail. Whileresult.Verifiedremainsfalsein these cases (so programmatic consumers should not be misled), theExtraData["response"]field is rendered in plain-text output, JSON output, and GitHub Actions annotations, and could mislead human reviewers or integrations that inspectExtraDatakeys.
Threat model context: Since the config file is loaded from a local path by the person running trufflehog (config.Read), this is a trusted input. The risk is limited to accidental misconfiguration or future changes to config loading (e.g., remote/shared configs).
Suggested remediation: Consider either (a) rejecting reserved keys ("name", "response") during validation in NewWebhookCustomRegex, or (b) applying metadata after internal keys are set so internal values always win. Option (a) is simpler and more explicit.
|
|
||
| // Copy metadata from detector configuration to ExtraData | ||
| if metadata := c.GetMetadata(); len(metadata) > 0 { | ||
| result.ExtraData = make(map[string]string, len(metadata)) |
There was a problem hiding this comment.
Low: No validation on metadata keys against reserved ExtraData keys.
Metadata is copied into ExtraData before internal keys are set. The "name" key is safely overwritten later in FromData, but "response" is only set on successful webhook verification. A config with metadata["response"] would inject a value that persists in unverified results, potentially confusing downstream consumers.
Consider adding validation in NewWebhookCustomRegex to reject reserved keys:
reservedKeys := map[string]struct{}{"name": {}, "response": {}}
for key := range pb.Metadata {
if _, ok := reservedKeys[key]; ok {
return nil, fmt.Errorf("metadata key %q is reserved", key)
}
}| } | ||
| } | ||
|
|
||
| // no validation rules for Metadata |
There was a problem hiding this comment.
Note: The generated validation has no rules for Metadata. While protobuf map fields are inherently unbounded, consider adding application-level validation (key count, key/value length limits) in NewWebhookCustomRegex to prevent excessively large metadata from inflating result payloads.
|
@camgunz Thanks! Glad you liked the suggestions. I've added the changes—please have a look. |




Feature Request: Add Metadata Support to Custom Detectors
Summary
Add support for a
metadatafield in custom detector configurations that automatically populates theExtraDatafield in detector results. This would allow users to attach custom key-value metadata directly in the YAML configuration without requiring code changes.Use Case
Currently, custom detectors can only set metadata programmatically in the code. Users should be able to define metadata directly in their custom detector YAML configuration files, which would then be automatically included in the
ExtraDatafield of all results from that detector.This would be useful for:
environment: production)team: security)severity: high)Proposed Implementation
1. Proto Definition Update
Add a
metadatafield to theCustomRegexmessage inproto/custom_detectors.proto:2. Code Implementation
Update
pkg/custom_detectors/custom_detectors.goto copy metadata from the detector configuration toExtraDatawhen creating results:Example Usage
YAML Configuration
Result
All results from this detector would automatically include the metadata in
ExtraData:{ "DetectorName": "my-api-key-detector", "ExtraData": { "name": "my-api-key-detector", "environment": "production", "team": "security", "severity": "high", "rotation_guide": "https://example.com/rotate-api-keys", "custom_field": "any value" } }Note
Medium Risk
Adds a new user-configurable field to the custom detector protobuf and threads it into emitted results; low algorithmic risk but changes output shape and could impact downstream consumers relying on
ExtraDatacontents.Overview
Custom detector configs now accept a
metadatamap (proto fieldCustomRegex.metadata) and generated pb/validate code has been updated accordingly.CustomRegexWebhookresults now copy configured metadata intoResult.ExtraDatawhen creating results, while ensuringExtraDatais non-nil when adding the detectornameand when storing webhook verificationresponse. New tests cover empty vs populated metadata being present in emittedExtraData.Written by Cursor Bugbot for commit 8dd1ccc. This will update automatically on new commits. Configure here.