feat: ROX-35431: add Go memory lifecycle panels to Central dashboard#344

Merged: ebensh merged 4 commits into master from ebenshet/ROX-35431-go-memory-dashboard on Jul 3, 2026

@ebensh ebensh commented Jun 29, 2026

Add 6 new panels to the Go Metrics row in rhacs-central.json:

  • Allocation & Free Rate (Panel 200)
  • Live Heap Objects (Panel 201)
  • Heap Memory Breakdown - stacked area (Panel 202)
  • GC Pressure with NextGC threshold (Panel 203)
  • Go Heap vs Container Memory (Panel 204)
  • Scavenger Effectiveness (Panel 205)

These panels visualize the full Go heap memory lifecycle, from allocation through GC to the return of memory to the OS, and map each stage to its Prometheus metrics. They let engineers diagnose memory leaks, GC pressure, and OOM proximity at per-pod granularity with 10s resolution.

Ticket: ROX-35431

Created with Claude Code

Screenshot of the dashboard loaded into prod Grafana on an IBM instance: [image]

@ebensh ebensh requested a review from a team as a code owner June 29, 2026 15:19
@ebensh ebensh requested review from GrimmiMeloni and removed request for a team June 29, 2026 15:19
@ebensh ebensh changed the title feat: add Go memory lifecycle panels to Central dashboard feat: ROX-35431: add Go memory lifecycle panels to Central dashboard Jun 29, 2026

coderabbitai Bot commented Jun 29, 2026

Review details
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9f13b886-c02e-42fa-91de-1848ff4c2803

📥 Commits

Reviewing files that changed from the base of the PR and between fe29ce5 and 824d78b.

📒 Files selected for processing (1)
  • resources/prometheus/federation-config.yaml
📝 Walkthrough

The rhacs-central.json Grafana dashboard updates several Go Metrics panels with new PromQL targets, series overrides, axis units, and panel styling. Heap, GC, scavenger, and container-memory visualizations are reworked with new series definitions and layout changes.

Changes

Dashboard panel updates

| Layer / File(s) | Summary |
|---|---|
| Allocation and free rate panel / `resources/grafana/sources/rhacs-central.json` | Adds malloc, free, and allocated-bytes rate series, and moves allocated bytes to the right axis with Bps units. |
| Live heap objects panel / `resources/grafana/sources/rhacs-central.json` | Adds heap object and average object size series with updated legends and refIds. |
| Heap memory breakdown panel / `resources/grafana/sources/rhacs-central.json` | Plots live, retained, and released heap bytes and updates stacking, fill opacity, tooltip sorting, and series colors. |
| GC pressure panel / `resources/grafana/sources/rhacs-central.json` | Replaces the series with heap alloc bytes, next GC threshold bytes, and GC cycle rate, plus dashed-line and right-axis overrides. |
| Go heap vs container memory panel / `resources/grafana/sources/rhacs-central.json` | Compares heap sys bytes, total Go sys bytes, process RSS, and container working set with updated legends and byte units. |
| Scavenger effectiveness panel / `resources/grafana/sources/rhacs-central.json` | Plots heap idle, released, and retained bytes and changes the retained series styling. |
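
As an illustration of the derived series mentioned in the summaries above, here is a minimal Python sketch that computes two of them from one pod's raw gauge values. The average object size follows the `go_memstats_heap_alloc_bytes / go_memstats_heap_objects` expression quoted later in this thread. Treating "retained" as heap idle minus heap released is an assumption about how the panel derives it, and the `go_memstats_heap_idle_bytes` and `go_memstats_heap_released_bytes` names are assumed from the standard Go collector rather than taken from this PR. The function name `derive_heap_series` is hypothetical.

```python
def derive_heap_series(sample: dict) -> dict:
    """Compute derived heap values from one pod's raw gauge readings."""
    alloc = sample["go_memstats_heap_alloc_bytes"]
    objects = sample["go_memstats_heap_objects"]
    # Assumed metric names (standard Go collector, not quoted in the PR).
    idle = sample["go_memstats_heap_idle_bytes"]
    released = sample["go_memstats_heap_released_bytes"]
    return {
        # Assumption: retained = idle memory not yet returned to the OS.
        "retained_bytes": idle - released,
        # Guard against division by zero when no objects are live.
        "avg_object_size": alloc / objects if objects else 0.0,
    }


# Made-up sample values, for illustration only.
derived = derive_heap_series({
    "go_memstats_heap_alloc_bytes": 1000,
    "go_memstats_heap_objects": 50,
    "go_memstats_heap_idle_bytes": 300,
    "go_memstats_heap_released_bytes": 100,
})
```

In the dashboard itself this arithmetic would live in PromQL; the sketch only makes the relationships between the series explicit.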

Estimated code review effort: 3 (Moderate) | ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: adding Go memory lifecycle panels to the Central dashboard. |
| Description check | ✅ Passed | The description accurately matches the changeset by listing the new Go Metrics panels and their purpose. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
resources/grafana/sources/rhacs-central.json (1)

2251-2391: 🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Regenerate resources/grafana/generated/dashboards/rhacs-central.yaml
The committed dashboard artifact is missing the new Go metrics panels, so Grafana will keep using the stale dashboard until the generated file is rebuilt and committed.
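
One way to catch this kind of drift before review is a small consistency check. The sketch below is hypothetical: the repository's actual generator and the exact layout of the generated YAML are not shown in this thread, so it only assumes that each panel title from the source JSON appears verbatim somewhere in the generated file. The function names are illustrative.

```python
import json


def panel_titles(panels: list) -> list:
    """Collect panel titles, descending into rows that nest their own panels."""
    titles = []
    for panel in panels:
        if panel.get("title"):
            titles.append(panel["title"])
        titles.extend(panel_titles(panel.get("panels", [])))
    return titles


def missing_from_generated(source_json: str, generated_text: str) -> list:
    """Return source panel titles that do not appear in the generated artifact."""
    dashboard = json.loads(source_json)
    # Assumption: a plain substring match is enough to detect a missing panel.
    return [t for t in panel_titles(dashboard.get("panels", []))
            if t not in generated_text]


# Tiny in-memory stand-ins for the source JSON and a stale generated file.
source = json.dumps({"panels": [{"title": "Go Metrics", "type": "row", "panels": [
    {"title": "Live Heap Objects"}, {"title": "Scavenger Effectiveness"}]}]})
stale = 'title: "Go Metrics"\n... "Live Heap Objects" ...'
missing = missing_from_generated(source, stale)
```

Here `missing` lists the one panel absent from the stale text. A real check would read both files from disk and fail CI when the list is non-empty.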

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@resources/grafana/sources/rhacs-central.json` around lines 2251 - 2391, The
committed Grafana artifact is stale and missing the new Go metrics panel defined
in the rhacs-central dashboard JSON. Regenerate and commit the matching
generated dashboard output from the same source panel changes so the generated
artifact stays in sync with the dashboard definition; use the rhacs-central
dashboard source and its generated dashboard file as the key symbols to locate
the update.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b9e7d8fc-5293-491a-ba60-3d7c20bb6ec6

📥 Commits

Reviewing files that changed from the base of the PR and between d867e60 and 3ff14a4.

📒 Files selected for processing (1)
  • resources/grafana/sources/rhacs-central.json

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
resources/grafana/sources/rhacs-central.json (1)

1610-1641: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Preserve pod identity in the new Go panels.

These targets use fixed legends like "mallocs/sec" and "live objects", so multi-pod Central series become indistinguishable in Grafana. That undercuts the PR goal of per-pod leak/GC/OOM diagnosis. Aggregate by the pod-identifying label and include it in legendFormat.

Example pattern
- "expr": "rate(go_memstats_mallocs_total{namespace=\"rhacs-$instance_id\",job=\"central\"}[1m])",
+ "expr": "sum by (pod) (rate(go_memstats_mallocs_total{namespace=\"rhacs-$instance_id\",job=\"central\"}[1m]))",
...
- "legendFormat": "mallocs/sec",
+ "legendFormat": "{{pod}} mallocs/sec",
- "expr": "go_memstats_heap_alloc_bytes{namespace=\"rhacs-$instance_id\",job=\"central\"} / go_memstats_heap_objects{namespace=\"rhacs-$instance_id\",job=\"central\"}",
+ "expr": "sum by (pod) (go_memstats_heap_alloc_bytes{namespace=\"rhacs-$instance_id\",job=\"central\"}) / sum by (pod) (go_memstats_heap_objects{namespace=\"rhacs-$instance_id\",job=\"central\"})",
...
- "legendFormat": "avg object size",
+ "legendFormat": "{{pod}} avg object size",

Also applies to: 1751-1768, 1907-1938, 2063-2094, 2187-2232, 2342-2373
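
The same rewrite could be applied mechanically across the listed ranges. The following Python sketch is an illustration, not the change that was merged: `preserve_pod_identity` is a hypothetical helper, and it assumes each target holds a single expression that can safely be wrapped in `sum by (pod) (...)`.

```python
import copy


def preserve_pod_identity(target: dict, label: str = "pod") -> dict:
    """Return a copy of a Grafana target that aggregates by `label`
    and includes that label in the legend."""
    updated = copy.deepcopy(target)
    updated["expr"] = f"sum by ({label}) ({target['expr']})"
    updated["legendFormat"] = "{{" + label + "}} " + target["legendFormat"]
    return updated


target = {
    "expr": 'rate(go_memstats_mallocs_total{namespace="rhacs-$instance_id",job="central"}[1m])',
    "legendFormat": "mallocs/sec",
    "refId": "A",
}
result = preserve_pod_identity(target)
```

Ratio expressions such as the average object size need each operand wrapped separately, as in the example pattern above, so this helper should not be applied to them as-is.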

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@resources/grafana/sources/rhacs-central.json` around lines 1610 - 1641, The
new Go panels in the Grafana dashboard are using fixed legend strings, which
collapses multiple Central pods into indistinguishable series. Update the
affected query definitions in the dashboard JSON so the PromQL aggregates
include the pod-identifying label and the `legendFormat` references that label,
using the existing panel/query blocks around the Go memory/GC metrics to
preserve per-pod identity across all listed sections.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 80502a01-e618-46a0-8506-64dc45c5a552

📥 Commits

Reviewing files that changed from the base of the PR and between 3ff14a4 and fe29ce5.

⛔ Files ignored due to path filters (1)
  • resources/grafana/generated/dashboards/rhacs-central.yaml is excluded by !**/generated/**
📒 Files selected for processing (1)
  • resources/grafana/sources/rhacs-central.json

@ebensh ebensh requested review from kurlov and removed request for GrimmiMeloni July 2, 2026 10:40
@ebensh ebensh enabled auto-merge July 2, 2026 13:23
@ebensh ebensh merged commit c31cfe3 into master Jul 3, 2026
2 checks passed
@ebensh ebensh deleted the ebenshet/ROX-35431-go-memory-dashboard branch July 3, 2026 12:58