feat: Add metadata support to custom detectors #4625

Open
effortlessdevsec wants to merge 7 commits into trufflesecurity:main from effortlessdevsec:main

Conversation

@effortlessdevsec effortlessdevsec commented Dec 30, 2025

Feature Request: Add Metadata Support to Custom Detectors

Summary

Add support for a metadata field in custom detector configurations that automatically populates the ExtraData field in detector results. This would allow users to attach custom key-value metadata directly in the YAML configuration without requiring code changes.

Use Case

Currently, custom detectors can only set metadata programmatically in the code. Users should be able to define metadata directly in their custom detector YAML configuration files, which would then be automatically included in the ExtraData field of all results from that detector.

This would be useful for:

  • Adding environment tags (e.g., environment: production)
  • Adding team ownership information (e.g., team: security)
  • Adding severity levels (e.g., severity: high)
  • Adding rotation guides or documentation links
  • Adding any custom contextual information that should be associated with detected secrets

Proposed Implementation

1. Proto Definition Update

Add a metadata field to the CustomRegex message in proto/custom_detectors.proto:

message CustomRegex {
  // ... existing fields ...
  map<string, string> metadata = 12;
}

2. Code Implementation

Update pkg/custom_detectors/custom_detectors.go to copy metadata from the detector configuration to ExtraData when creating results:

// In createResults function
// Note: result.ExtraData must already be allocated here; writing to a nil map panics.
if metadata := c.GetMetadata(); metadata != nil {
    for key, value := range metadata {
        result.ExtraData[key] = value
    }
}
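
The copy step above can be sketched as a standalone program. This is only an illustration under stated assumptions: the `result` type and the `copyMetadata` helper are hypothetical stand-ins, not the actual TruffleHog types, and the nil-map guard is added here because writing to a nil map panics in Go.

```go
package main

import "fmt"

// result is a hypothetical stand-in for the detector result type;
// only the ExtraData field matters for this sketch.
type result struct {
	ExtraData map[string]string
}

// copyMetadata copies every configured metadata entry into r.ExtraData,
// allocating the map first if needed (writing to a nil map panics in Go).
func copyMetadata(r *result, metadata map[string]string) {
	if len(metadata) == 0 {
		return // nothing configured: leave ExtraData untouched
	}
	if r.ExtraData == nil {
		r.ExtraData = make(map[string]string, len(metadata))
	}
	for key, value := range metadata {
		r.ExtraData[key] = value
	}
}

func main() {
	var r result
	copyMetadata(&r, map[string]string{"team": "security", "severity": "high"})
	fmt.Println(r.ExtraData["team"], r.ExtraData["severity"]) // prints "security high"
}
```

Ranging over a nil or empty map is a no-op in Go, so the `len(metadata) == 0` check exists only to avoid allocating `ExtraData` when there is nothing to copy.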

Example Usage

YAML Configuration

detectors:
- name: my-api-key-detector
  keywords:
  - api
  - key
  regex:
    api_key: "your-regex-here"
  metadata:
    environment: "production"
    team: "security"
    severity: "high"
    rotation_guide: "https://example.com/rotate-api-keys"
    custom_field: "any value"

Result

All results from this detector would automatically include the metadata in ExtraData:

{
  "DetectorName": "my-api-key-detector",
  "ExtraData": {
    "name": "my-api-key-detector",
    "environment": "production",
    "team": "security",
    "severity": "high",
    "rotation_guide": "https://example.com/rotate-api-keys",
    "custom_field": "any value"
  }
}

Note

Medium Risk
Adds a new user-configurable field to the custom detector protobuf and threads it into emitted results; low algorithmic risk but changes output shape and could impact downstream consumers relying on ExtraData contents.

Overview
Custom detector configs now accept a metadata map (proto field CustomRegex.metadata) and generated pb/validate code has been updated accordingly.

CustomRegexWebhook results now copy configured metadata into Result.ExtraData when creating results, while ensuring ExtraData is non-nil when adding the detector name and when storing the webhook verification response. New tests cover empty vs. populated metadata being present in emitted ExtraData.

Written by Cursor Bugbot for commit 8dd1ccc. This will update automatically on new commits.

@effortlessdevsec effortlessdevsec requested a review from a team December 30, 2025 03:00
@effortlessdevsec effortlessdevsec requested review from a team as code owners December 30, 2025 03:00
@kashifkhan0771
Contributor

You need to run the command make protos after making changes in the .proto file.

Contributor

@camgunz camgunz left a comment

This is a good idea! Requesting changes for:

  • running make protos as Kashif suggested
  • adding at least 2 tests:
    • an empty metadata field adds nothing to ExtraData
    • a populated metadata field adds expected values to ExtraData

You could also move the allocation of the map inside the if metadata... block (maybe also add a len(metadata) > 0 check?), but I don't think it makes a huge difference.
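
The two requested cases can be expressed as a small table of expectations. The sketch below is a plain program rather than a `go test` file, and `buildExtraData` is a hypothetical helper standing in for the detector's result construction; the real tests would exercise the detector itself.

```go
package main

import (
	"fmt"
	"reflect"
)

// buildExtraData is a hypothetical helper standing in for the detector's
// result construction: it returns nil when no metadata is configured.
func buildExtraData(metadata map[string]string) map[string]string {
	if len(metadata) == 0 {
		return nil
	}
	extra := make(map[string]string, len(metadata))
	for k, v := range metadata {
		extra[k] = v
	}
	return extra
}

// runCases checks the two requested scenarios and returns the names of any
// that fail; an empty slice means both passed.
func runCases() []string {
	cases := []struct {
		name     string
		metadata map[string]string
		want     map[string]string
	}{
		{"empty metadata adds nothing", map[string]string{}, nil},
		{
			"populated metadata is copied",
			map[string]string{"team": "security"},
			map[string]string{"team": "security"},
		},
	}
	var failed []string
	for _, tc := range cases {
		if got := buildExtraData(tc.metadata); !reflect.DeepEqual(got, tc.want) {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println(len(runCases())) // prints 0 when both cases pass
}
```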

effortlessdevsec and others added 3 commits February 26, 2026 20:15
- Add metadata field to CustomRegex proto message
- Copy metadata from detector config to ExtraData in results
- Only allocate ExtraData map when metadata exists (optimization)
- Add tests for empty and populated metadata scenarios
- Run make protos to regenerate proto code

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

if metadata := c.GetMetadata(); metadata != nil {
    for key, value := range metadata {
        result.ExtraData[key] = value
    }

Metadata keys silently overwritten by system-set keys

Low Severity

User-provided metadata is copied into ExtraData first, but the keys name (always set at line 200 in FromData) and response (set at line 329 on successful verification) will silently overwrite any same-named metadata entries. Users who configure metadata with those keys will experience silent data loss with no warning.
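
The ordering problem described here can be reproduced in isolation. The following sketch assumes the flow the comment describes (metadata first, then the detector name); `assembleExtraData` is a hypothetical function written for illustration, not the code in `FromData`.

```go
package main

import "fmt"

// assembleExtraData mimics the ordering described above (an assumption about
// the flow, not the actual TruffleHog code): user metadata is copied first,
// then the detector name is written under the "name" key.
func assembleExtraData(detectorName string, metadata map[string]string) map[string]string {
	extra := make(map[string]string, len(metadata)+1)
	for k, v := range metadata {
		extra[k] = v
	}
	extra["name"] = detectorName // silently replaces any metadata["name"]
	return extra
}

func main() {
	extra := assembleExtraData("my-api-key-detector", map[string]string{
		"name": "custom-label",
		"team": "security",
	})
	fmt.Println(extra["name"]) // prints "my-api-key-detector"; "custom-label" is lost
	fmt.Println(extra["team"]) // prints "security"
}
```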

@cursor cursor Bot left a comment

Stale comment

Security Review: Metadata Key Collision with Internal ExtraData Fields

Severity: Low | Category: Data Integrity / Result Spoofing

This PR introduces a metadata map on custom detector configurations that is copied into ExtraData on every result. The implementation has one concrete concern related to key collision with internally-managed ExtraData keys.

Finding: User-supplied metadata keys can shadow reserved internal keys

The code copies user-provided metadata into ExtraData before internal keys are assigned. The internal key "name" is always overwritten afterward (line 202), so it is protected. However, the "response" key is only set in the verification success path (line 333). This means:

  • A config with metadata: {"response": "200 OK"} would cause unverified results to contain a "response" entry in ExtraData that looks like it came from webhook verification.
  • Downstream consumers (dashboards, SIEM integrations) that inspect ExtraData["response"] to infer verification status could be misled.

Mitigating factors:

  • The Verified field on the Result struct is the authoritative indicator of verification, not ExtraData.
  • The config author is typically the same operator running the scan, limiting the attack surface.
  • In team/shared-config environments, the trust boundary between config authors and result consumers could make this more relevant.

Remediation

Consider adding a blocklist check that rejects or warns on metadata keys that collide with internally-used names ("name", "response"), or copy metadata after all internal keys are set so internal values always take precedence.
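
As a rough sketch of the second remediation (copying metadata after internal keys are set), a merge that never replaces an existing key could look like this. `mergeMetadata` is a hypothetical helper, not part of the PR.

```go
package main

import "fmt"

// mergeMetadata adds user metadata to extra without replacing keys that are
// already present, so internally set values always take precedence.
func mergeMetadata(extra, metadata map[string]string) map[string]string {
	if extra == nil {
		extra = make(map[string]string, len(metadata))
	}
	for k, v := range metadata {
		if _, exists := extra[k]; !exists {
			extra[k] = v
		}
	}
	return extra
}

func main() {
	extra := map[string]string{"name": "my-api-key-detector"}
	extra = mergeMetadata(extra, map[string]string{"name": "spoofed", "team": "security"})
	fmt.Println(extra["name"], extra["team"]) // prints "my-api-key-detector security"
}
```

Note that this approach only protects keys that have actually been set by the time of the merge. On an unverified result no internal "response" value exists, so a user-supplied "response" entry would still be copied; rejecting reserved keys outright avoids that gap.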


Comment on lines +231 to +234
if metadata := c.GetMetadata(); len(metadata) > 0 {
    result.ExtraData = make(map[string]string, len(metadata))
    for key, value := range metadata {
        result.ExtraData[key] = value

Low: Metadata keys can collide with reserved internal ExtraData keys.

User-supplied metadata is copied into ExtraData here with no key restrictions. The "name" key is safe because it's unconditionally overwritten later (line 202). However, "response" is only set inside the verification success path (line 333). A config like metadata: {"response": "200 OK"} would persist in unverified results, potentially confusing downstream consumers.

Consider either:

  1. Rejecting reserved keys ("name", "response") during config validation or NewWebhookCustomRegex.
  2. Prefixing user metadata keys (e.g., meta.environment).
  3. Copying metadata after internal keys are set, with internal keys taking precedence.

var reservedKeys = map[string]struct{}{"name": {}, "response": {}}
for key, value := range metadata {
    if _, reserved := reservedKeys[key]; reserved {
        continue // or return an error
    }
    result.ExtraData[key] = value
}
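
The snippet above illustrates the first option. For comparison, the second option (prefixing) might be sketched as follows; the `meta.` prefix comes from the example in the list, while `metadataPrefix` and `prefixMetadata` are hypothetical names.

```go
package main

import "fmt"

// metadataPrefix is a hypothetical namespace for user-supplied keys.
const metadataPrefix = "meta."

// prefixMetadata returns a copy of metadata with every key namespaced, so
// user keys cannot collide with internal keys such as "name" or "response".
func prefixMetadata(metadata map[string]string) map[string]string {
	out := make(map[string]string, len(metadata))
	for k, v := range metadata {
		out[metadataPrefix+k] = v
	}
	return out
}

func main() {
	out := prefixMetadata(map[string]string{"response": "200 OK"})
	_, clash := out["response"]
	fmt.Println(out["meta.response"], clash) // prints "200 OK false"
}
```

The trade-off is that prefixing changes the key names consumers see, whereas rejecting reserved keys keeps user keys exactly as written in the YAML.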

}
}

// no validation rules for Metadata

Note: No validation rules are generated for Metadata. While protobuf map fields don't have built-in size constraints, consider adding application-level validation in NewWebhookCustomRegex to bound the number of metadata entries and key/value lengths. Unbounded metadata could increase memory usage proportionally to the number of scan results.
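
A minimal sketch of such application-level bounds is shown below. The limits and the `validateMetadata` function are assumptions chosen for illustration; the comment does not propose specific values, and where the check would be called from (for example `NewWebhookCustomRegex`) is left open.

```go
package main

import "fmt"

// Hypothetical limits; the review comment does not specify actual values.
const (
	maxMetadataEntries  = 32
	maxMetadataKeyLen   = 64
	maxMetadataValueLen = 512
)

// validateMetadata rejects metadata maps that exceed the entry count or the
// per-key / per-value byte-length limits defined above.
func validateMetadata(metadata map[string]string) error {
	if len(metadata) > maxMetadataEntries {
		return fmt.Errorf("metadata has %d entries, limit is %d", len(metadata), maxMetadataEntries)
	}
	for k, v := range metadata {
		if len(k) > maxMetadataKeyLen {
			return fmt.Errorf("metadata key %q exceeds %d bytes", k, maxMetadataKeyLen)
		}
		if len(v) > maxMetadataValueLen {
			return fmt.Errorf("metadata value for key %q exceeds %d bytes", k, maxMetadataValueLen)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateMetadata(map[string]string{"team": "security"})) // prints "<nil>"
}
```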

@cursor cursor Bot left a comment

Security Review: Low-Severity Concerns

No critical or high-severity vulnerabilities found. One low-severity design concern is noted below regarding metadata key validation.

Metadata Key Collision with Reserved Internal Keys (Low)

The metadata map from YAML configuration is copied into ExtraData without validating keys against internally reserved names ("name", "response").

  • "name" — Safe. Always overwritten in FromData after metadata is set, so a user-supplied metadata["name"] value cannot persist.
  • "response" — Concern. The "response" key is only set inside createResults when webhook verification succeeds (HTTP 200). If a config sets metadata["response"], that value persists into the final result whenever verification is not attempted (verify=false) or all verification endpoints fail. While result.Verified remains false in these cases (so programmatic consumers should not be misled), the ExtraData["response"] field is rendered in plain-text output, JSON output, and GitHub Actions annotations, and could mislead human reviewers or integrations that inspect ExtraData keys.

Threat model context: Since the config file is loaded from a local path by the person running trufflehog (config.Read), this is a trusted input. The risk is limited to accidental misconfiguration or future changes to config loading (e.g., remote/shared configs).

Suggested remediation: Consider either (a) rejecting reserved keys ("name", "response") during validation in NewWebhookCustomRegex, or (b) applying metadata after internal keys are set so internal values always win. Option (a) is simpler and more explicit.


Comment on lines +229 to +232

// Copy metadata from detector configuration to ExtraData
if metadata := c.GetMetadata(); len(metadata) > 0 {
    result.ExtraData = make(map[string]string, len(metadata))

Low: No validation on metadata keys against reserved ExtraData keys.

Metadata is copied into ExtraData before internal keys are set. The "name" key is safely overwritten later in FromData, but "response" is only set on successful webhook verification. A config with metadata["response"] would inject a value that persists in unverified results, potentially confusing downstream consumers.

Consider adding validation in NewWebhookCustomRegex to reject reserved keys:

reservedKeys := map[string]struct{}{"name": {}, "response": {}}
for key := range pb.Metadata {
    if _, ok := reservedKeys[key]; ok {
        return nil, fmt.Errorf("metadata key %q is reserved", key)
    }
}

}
}

// no validation rules for Metadata

Note: The generated validation has no rules for Metadata. While protobuf map fields are inherently unbounded, consider adding application-level validation (key count, key/value length limits) in NewWebhookCustomRegex to prevent excessively large metadata from inflating result payloads.

@effortlessdevsec
Author

@camgunz Thanks! Glad you liked the suggestions. I've added the changes; please have a look.
