@pavolloffay
Member

@pavolloffay pavolloffay commented Dec 15, 2025

Description

This is an alternative to PR #14261.

metadata.yaml:

schema:
  enabled: true

Notable implementation details

format: duration
  • JSON Schema "format": "duration" expects an ISO 8601 duration string (e.g., "PT5M").
  • This is not compatible with Go's time.Duration, which by default marshals to JSON as an integer number of nanoseconds (e.g., 300000000000).
  • To satisfy "type": "string", "format": "duration", you would need a custom wrapper around time.Duration that implements the json.Marshaler interface and converts the value to ISO 8601 format.

Solution

{
  "type": "string",
  "pattern": "^([+-]?(\\d+(\\.\\d*)?|\\.\\d+)(ns|us|µs|ms|s|m|h))+$",
  "example": "1h30m10s",
  "description": "A duration string (e.g., '10s', '1.5h'). Valid units: ns, us, ms, s, m, h."
}

mapstructure:",squash"
  • The generator detects squash in the tag and treats such fields the same way it treats embedded structs: it merges their properties into the parent schema instead of creating a nested object.

Open questions
  • How to handle required fields?
  • How to show default fields?
  • How to handle config declared in a different package (e.g. datadogexporter, pkg/datadog/config.go)?

Link to tracking issue

Fixes #9769
Fixes open-telemetry/opentelemetry-collector-contrib#42214
Implements: #13784

Testing

Documentation

@codecov

codecov bot commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 70.47354% with 106 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.04%. Comparing base (250a1ca) to head (db85700).

Files with missing lines                          Patch %   Lines
cmd/mdatagen/internal/schemagen/analyzer.go       67.94%    46 Missing and 21 partials ⚠️
cmd/mdatagen/internal/schemagen/generator.go      79.54%    15 Missing and 12 partials ⚠️
cmd/mdatagen/internal/command.go                  38.46%    7 Missing and 1 partial ⚠️
cmd/mdatagen/internal/samplereceiver/config.go    0.00%     4 Missing ⚠️

❌ Your patch check has failed because the patch coverage (70.47%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14288      +/-   ##
==========================================
- Coverage   92.23%   92.04%   -0.20%     
==========================================
  Files         676      679       +3     
  Lines       41649    42007     +358     
==========================================
+ Hits        38415    38665     +250     
- Misses       2196     2269      +73     
- Partials     1038     1073      +35     

Member Author

This is a good example for looking at the generated schema.

@pavolloffay pavolloffay force-pushed the ocb-component-schema-alternative branch from c7b9168 to a88d7b8 Compare December 16, 2025 15:03
@codspeed-hq

codspeed-hq bot commented Dec 17, 2025

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing pavolloffay:ocb-component-schema-alternative (db85700) with main (250a1ca)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 64 untouched benchmarks
⏩ 20 skipped benchmarks [1]

Footnotes

  1. 20 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@niwoerner
Member

I really like that this approach attempts to capture the required fields. This would be useful for validation. Unfortunately, I fear that the detection would produce a considerable number of false positives, because the omitempty tag probably can't be taken as a reliable source for that information: it is not set consistently across the components.

@jkoronaAtCisco

It seems that we are working on solving the same problem, but using slightly different methods.

I have started working on a script that generates config schemas based on Go structs using AST parsing. This is part of a larger plan to introduce schemas to the OpenTelemetry collector. You can find more details in my PRs:

I think it makes sense to start a discussion on the preferred solution.

@github-actions
Contributor

github-actions bot commented Jan 7, 2026

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 7, 2026
Member

@mx-psi mx-psi left a comment

I like this approach better, and I think this can be merged roughly as is.

@github-actions github-actions bot removed the Stale label Jan 8, 2026
@pavolloffay pavolloffay force-pushed the ocb-component-schema-alternative branch 2 times, most recently from 3fa491f to a166f90 Compare January 9, 2026 15:53
Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
@pavolloffay pavolloffay force-pushed the ocb-component-schema-alternative branch from a166f90 to f190d98 Compare January 9, 2026 15:57
Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
@dmitryax
Member

dmitryax commented Jan 9, 2026

Given that we’ve gone through many iterations, and that a similar solution has recently been merged (the schemagen tool), I’d like to propose a call to discuss the end result we want to have, align on a single approach, and avoid stepping on each other’s toes.

There are a few remaining points we need to clarify:

  1. Do we need to involve mdatagen?
  2. What should be the source of truth for the generated Go code, schema definitions, and documentation?
  3. Should we use references?

@pavolloffay @mx-psi @evan-bradley @jkoronaAtCisco how does that sound?

Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
Development

Successfully merging this pull request may close these issues.

  • Expose component configuration with a JSON schema
  • Improve otel collector configuration w/ JSON schema

6 participants