
Fix flaky metrics 7617 #7720

Open
chinmay3012 wants to merge 5 commits into jaegertracing:main from chinmay3012:fix-flaky-metrics-7617
Conversation

@chinmay3012
Contributor

Which problem is this PR solving?

Resolves #7617

Description of the changes

  • Update scripts/e2e/compare_metrics.py to support GLOBAL transient label patterns.
  • Add global suppression for otel_scope_version (normalizing it to the fixed string "version") to prevent spurious diffs when OpenTelemetry dependencies are upgraded.
  • Add global suppression for namespace and k8s_namespace_name (normalizing both to "namespace") to handle randomized namespaces in e2e tests.
  • Fix the logic in suppress_transient_labels so that these global patterns are applied correctly.
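
The idea behind the global patterns can be sketched as follows. This is only an illustration, not the code in scripts/e2e/compare_metrics.py: the names GLOBAL_TRANSIENT_LABELS and normalize_labels are hypothetical, and the config shape is modeled on the snippet quoted later in the review.

```python
import re

# Hypothetical config: label name -> regex and fixed replacement.
# The real structure in compare_metrics.py may differ.
GLOBAL_TRANSIENT_LABELS = {
    "otel_scope_version": {"pattern": r"^.*$", "replacement": "version"},
    "k8s_namespace_name": {"pattern": r"^.*$", "replacement": "namespace"},
    "namespace": {"pattern": r"^.*$", "replacement": "namespace"},
}


def normalize_labels(labels, per_metric_patterns=None):
    """Return a copy of `labels` with transient values replaced by fixed strings.

    Global patterns apply to every metric; per-metric patterns (if any)
    override them for the same label name.
    """
    patterns = dict(GLOBAL_TRANSIENT_LABELS)
    patterns.update(per_metric_patterns or {})

    normalized = {}
    for key, value in labels.items():
        rule = patterns.get(key)
        if rule is None:
            normalized[key] = value  # stable label, keep as-is
        else:
            normalized[key] = re.sub(rule["pattern"], rule["replacement"], value)
    return normalized


old = normalize_labels({"otel_scope_version": "0.63.0", "job": "jaeger"})
new = normalize_labels({"otel_scope_version": "0.64.0", "job": "jaeger"})
print(old == new)
```

With this shape, two label sets that differ only in a transient value compare equal after normalization, while any other label difference is preserved.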

How was this change tested?

Manually created a check script with dummy metric files containing different otel_scope_version values (e.g., 0.63.0 vs 0.64.0) and confirmed that they are now reported as identical.
Verified that files with actual differences (e.g. different metric values or label keys) are still correctly flagged as different.
Verified that randomized namespace labels are correctly normalized and ignored in comparisons.
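
A manual check along these lines could look like the sketch below. It is an assumption about what such a script might contain, not the one used for this PR: the metric lines are made-up dummy data, normalize_line is a hypothetical helper, and the label regex is a simplification that does not handle escaped quotes inside label values.

```python
import re

# Hypothetical mapping of transient label names to fixed replacement values.
TRANSIENT = {
    "otel_scope_version": "version",
    "namespace": "namespace",
    "k8s_namespace_name": "namespace",
}

# Matches key="value" pairs in a Prometheus-style text line (simplified).
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def normalize_line(line):
    """Replace the values of transient labels in one metric line."""
    def repl(match):
        key, value = match.group(1), match.group(2)
        return f'{key}="{TRANSIENT.get(key, value)}"'
    return LABEL_RE.sub(repl, line)


# Dummy metric lines, invented for illustration.
a = 'http_requests_total{otel_scope_version="0.63.0",code="200"} 5'
b = 'http_requests_total{otel_scope_version="0.64.0",code="200"} 5'
c = 'http_requests_total{otel_scope_version="0.64.0",code="500"} 5'
d = 'http_requests_total{otel_scope_version="0.64.0",code="200"} 6'

print(normalize_line(a) == normalize_line(b))  # differ only in version
print(normalize_line(a) == normalize_line(c))  # differ in a stable label
print(normalize_line(a) == normalize_line(d))  # differ in metric value
```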

@chinmay3012 chinmay3012 requested a review from a team as a code owner December 9, 2025 18:35
@chinmay3012 chinmay3012 requested a review from jkowall December 9, 2025 18:35
@dosubot dosubot Bot added the enhancement label Dec 9, 2025
@chinmay3012 chinmay3012 force-pushed the fix-flaky-metrics-7617 branch from 148ebf0 to ccd62cc Compare December 9, 2025 18:36
@codecov

codecov Bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.48%. Comparing base (4f2b5d3) to head (dab6854).
⚠️ Report is 199 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7720   +/-   ##
=======================================
  Coverage   95.48%   95.48%           
=======================================
  Files         316      316           
  Lines       16732    16732           
=======================================
  Hits        15977    15977           
  Misses        590      590           
  Partials      165      165           
Flag Coverage Δ
badger_v1 9.12% <ø> (ø)
badger_v2 1.32% <ø> (ø)
cassandra-4.x-v1-manual 13.30% <ø> (ø)
cassandra-4.x-v2-auto 1.31% <ø> (ø)
cassandra-4.x-v2-manual 1.31% <ø> (ø)
cassandra-5.x-v1-manual 13.30% <ø> (ø)
cassandra-5.x-v2-auto 1.31% <ø> (ø)
cassandra-5.x-v2-manual 1.31% <ø> (ø)
clickhouse 1.40% <ø> (ø)
elasticsearch-6.x-v1 16.89% <ø> (ø)
elasticsearch-7.x-v1 16.92% <ø> (ø)
elasticsearch-8.x-v1 17.07% <ø> (ø)
elasticsearch-8.x-v2 1.32% <ø> (-0.05%) ⬇️
elasticsearch-9.x-v2 1.32% <ø> (ø)
grpc_v1 8.08% <ø> (ø)
grpc_v2 1.32% <ø> (ø)
kafka-3.x-v2 1.32% <ø> (ø)
memory_v2 1.32% <ø> (ø)
opensearch-1.x-v1 16.96% <ø> (ø)
opensearch-2.x-v1 16.96% <ø> (ø)
opensearch-2.x-v2 1.32% <ø> (ø)
opensearch-3.x-v2 1.32% <ø> (ø)
query 1.32% <ø> (ø)
tailsampling-processor 0.52% <ø> (ø)
unittests 94.17% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

@github-actions

github-actions Bot commented Dec 9, 2025

Metrics Comparison Summary

ERROR: No summary files were generated. Expected at least 8 diff files from CI.

This indicates a failure in the E2E test execution or metrics collection process.

Comment thread cmd/query/app/query_parser.go Outdated
Comment thread scripts/e2e/compare_metrics.py Outdated
@chinmay3012 chinmay3012 force-pushed the fix-flaky-metrics-7617 branch from 64b510e to a2b20b2 Compare December 10, 2025 04:32
Signed-off-by: Chinmay Mehrotra <mehrotrachinmay6@gmail.com>
@chinmay3012 chinmay3012 force-pushed the fix-flaky-metrics-7617 branch from a2b20b2 to 3aadd9c Compare December 10, 2025 04:37
@chinmay3012
Contributor Author

Can you please review the PR?

Comment thread scripts/e2e/compare_metrics.py
Comment thread scripts/e2e/compare_metrics.py Outdated
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. You may re-open it if you need more time.

@github-actions github-actions Bot added the stale The issue/PR has become stale and may be auto-closed label Feb 16, 2026
Signed-off-by: Chinmay Mehrotra <mehrotrachinmay4@gmail.com>
Copilot AI review requested due to automatic review settings February 16, 2026 03:12
@chinmay3012 chinmay3012 force-pushed the fix-flaky-metrics-7617 branch from 300ac34 to b68c3a7 Compare February 16, 2026 03:12
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes flaky metrics comparison in CI by normalizing transient labels that change between test runs. The issue was caused by randomized namespaces in e2e tests and OpenTelemetry version labels appearing in metric labels, causing spurious differences in metric comparisons.

Changes:

  • Added GLOBAL transient label patterns to normalize otel_scope_version, k8s_namespace_name, and namespace labels across all metrics
  • Modified suppress_transient_labels function to apply GLOBAL patterns to all metrics regardless of metric name
  • Removed outdated example comments from the configuration

Comment thread scripts/e2e/compare_metrics.py Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Chinmay Mehrotra <88617477+chinmay3012@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 16, 2026 03:18
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


Comment on lines +17 to +25
'pattern': r'.*',
'replacement': 'version'
},
'k8s_namespace_name': {
'pattern': r'.*',
'replacement': 'namespace'
},
'namespace': {
'pattern': r'.*',

Copilot AI Feb 16, 2026

Using regex pattern .* with re.sub will also match the empty string at the end, producing duplicated replacements (e.g., versionversion / namespacenamespace). Use an anchored pattern like ^.*$, a .+ pattern, or pass count=1 to re.sub so the label is normalized to a single fixed value as intended.

Suggested change
- 'pattern': r'.*',
- 'replacement': 'version'
- },
- 'k8s_namespace_name': {
- 'pattern': r'.*',
- 'replacement': 'namespace'
- },
- 'namespace': {
- 'pattern': r'.*',
+ 'pattern': r'^.*$',
+ 'replacement': 'version'
+ },
+ 'k8s_namespace_name': {
+ 'pattern': r'^.*$',
+ 'replacement': 'namespace'
+ },
+ 'namespace': {
+ 'pattern': r'^.*$',
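
The behavior described in this comment can be reproduced with a few lines of Python. The sketch below assumes Python 3.7 or later, where re.sub also replaces an empty match adjacent to a previous non-empty match; the helper name normalize is hypothetical and used only for this demonstration.

```python
import re


def normalize(pattern, value, **kwargs):
    """Replace every match of `pattern` in `value` with the fixed string 'version'."""
    return re.sub(pattern, "version", value, **kwargs)


value = "0.63.0"

# Unanchored: '.*' matches the whole string, then matches the empty string
# at the end once more, so the replacement is emitted twice.
unanchored = normalize(r".*", value)

# Three ways to get a single replacement, as suggested in the review comment.
anchored = normalize(r"^.*$", value)        # anchored at both ends
at_least_one = normalize(r".+", value)      # cannot match the empty string
counted = normalize(r".*", value, count=1)  # stop after the first match

print(unanchored)    # versionversion
print(anchored)      # version
print(at_least_one)  # version
print(counted)       # version
```

One difference between the alternatives is worth keeping in mind: for an empty label value, `^.*$` still produces "version", whereas `.+` leaves the empty string unchanged.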

@chinmay3012
Contributor Author

@yurishkuro I have responded to the comments and am open to further review if any changes or clarification are needed.

@github-actions github-actions Bot removed the stale The issue/PR has become stale and may be auto-closed label Feb 23, 2026
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. You may re-open it if you need more time.

@github-actions github-actions Bot added the stale The issue/PR has become stale and may be auto-closed label Apr 27, 2026

Labels

enhancement stale The issue/PR has become stale and may be auto-closed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ci]: Metrics comparison is flaky

3 participants