api,metrics: add changefeed operation history (#5095) #5105

Open
ti-chi-bot wants to merge 1 commit into pingcap:release-8.5 from ti-chi-bot:cherry-pick-5095-to-release-8.5

@ti-chi-bot
Member

This is an automated cherry-pick of #5095

What problem does this PR solve?

Issue Number: close #5087

What is changed and how it works?

  • Add structured audit logs for public changefeed mutation APIs: create, update, pause, resume, and delete.
  • Keep a bounded in-memory metric history for the latest 100 operations so the default Prometheus dashboard can show recent investigation context without requiring a log datasource.
  • Add a Changefeed Operation History table panel with operation time, result, username, non-sensitive details, and error summary.
  • Include operation-specific summaries such as changed update fields, resume checkpoint overwrite state, and delete pre-state/checkpoint context.

Check List

Tests

  • Unit test
  • Manual test
[Screenshot: CleanShot 2026-05-19 at 14 41 43@2x]

Questions

Will it cause performance regression or break compatibility?

No compatibility change. The dashboard-facing metric cache is bounded to the latest 100 operations to avoid unbounded cardinality growth.
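A minimal sketch of the bounding idea, assuming a simple first-in, first-out policy. The `boundedHistory` type and its `onEvict` hook are hypothetical. In a real exporter the hook is where the evicted label set would be dropped from the gauge, for example with `GaugeVec.DeleteLabelValues` from client_golang.

```go
package main

import "fmt"

// boundedHistory keeps at most limit keys, oldest first, and reports each
// eviction so the caller can delete the matching metric series.
type boundedHistory struct {
	limit   int
	entries []string
	onEvict func(key string)
}

func newBoundedHistory(limit int, onEvict func(string)) *boundedHistory {
	return &boundedHistory{limit: limit, onEvict: onEvict}
}

// add appends a key and evicts the oldest keys until the bound holds again.
func (h *boundedHistory) add(key string) {
	h.entries = append(h.entries, key)
	for len(h.entries) > h.limit {
		evicted := h.entries[0]
		h.entries = h.entries[1:]
		if h.onEvict != nil {
			h.onEvict(evicted)
		}
	}
}

func main() {
	var evicted []string
	h := newBoundedHistory(3, func(k string) { evicted = append(evicted, k) })
	for i := 1; i <= 5; i++ {
		h.add(fmt.Sprintf("event-%d", i))
	}
	fmt.Println(h.entries, evicted)
	// Prints: [event-3 event-4 event-5] [event-1 event-2]
}
```

Without the eviction step, every distinct username or details string would leave a time series behind for the lifetime of the process.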

Do you need to update user documentation, design documentation or monitoring documentation?

The Grafana dashboard is updated in this PR. No separate user or design documentation change is required.

Release note

Add a Changefeed Operation History panel to help investigate recent user-initiated changefeed operations.

Summary by CodeRabbit

Release Notes

  • New Features
    • Changefeed operations (create, update, pause, resume, delete) are now recorded and tracked with metrics, capturing success/failure status, authenticated user, operation timing, and relevant details.
    • Added "Changefeed Operation History" dashboard panels to Grafana for monitoring recent changefeed operations across all environments.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot ti-chi-bot added the do-not-merge/hold, lgtm, release-note, size/XXL, and type/cherry-pick-for-release-8.5 labels May 20, 2026
@ti-chi-bot

ti-chi-bot Bot commented May 20, 2026

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

  1. It must first be LGTMed and approved by the reviewers.
  2. For pull requests to TiDB-x branches, it must have no failed tests.
  3. AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot

ti-chi-bot Bot commented May 20, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lidezhu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot
Member Author

@wlwilliamx This PR has conflicts, so I have put it on hold.
Please resolve them, or ask others to resolve them, then comment /unhold to remove the hold label.

@ti-chi-bot

ti-chi-bot Bot commented May 20, 2026

@ti-chi-bot: If you want to know how to resolve the conflicts, please read the guide in the TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c9cda6d3-94c6-452f-afad-bfd2fc1adb5d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new ChangefeedOperationMiddleware to track user-initiated changefeed mutations for logging and Grafana dashboards, including a bounded in-memory store and new Prometheus metrics. However, the PR contains critical issues: several files, including metric definitions and Grafana JSON configurations, contain unresolved git conflict markers that will cause compilation and parsing errors. Additionally, the middleware references an undefined constant ClientVersionHeader, and it is recommended to use seconds instead of milliseconds for Prometheus gauge values to follow industry best practices.

Comment thread pkg/metrics/changefeed.go
Comment on lines +76 to +98
<<<<<<< HEAD
=======
	// ChangefeedErrorInfoGauge records the current warning or failed reason and its occurrence time
	// for each changefeed.
	ChangefeedErrorInfoGauge = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Namespace: "ticdc",
			Subsystem: "owner",
			Name:      "changefeed_error_info",
			Help:      "The current warning or failed reason and occurrence time of changefeeds",
		}, []string{getKeyspaceLabel(), "changefeed", "state", "error_time", "code", "message"})

	// ChangefeedOperationTimeGauge records a bounded set of recent user initiated
	// changefeed operation timestamps for the Grafana investigation panel.
	ChangefeedOperationTimeGauge = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Namespace: "ticdc",
			Subsystem: "owner",
			Name:      "changefeed_operation_time",
			Help:      "Recent user initiated changefeed operation timestamps in Unix milliseconds",
		}, []string{getKeyspaceLabel(), "changefeed", "operation", "result", "username", "details", "error", "event_id"})

>>>>>>> 3a652c164 (api,metrics: add changefeed operation history (#5095))

critical

This file contains git conflict markers (<<<<<<< HEAD, =======, >>>>>>>). This will cause a compilation error. It appears the cherry-pick was not resolved correctly before committing.

	// ChangefeedErrorInfoGauge records the current warning or failed reason and its occurrence time
	// for each changefeed.
	ChangefeedErrorInfoGauge = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Namespace: "ticdc",
			Subsystem: "owner",
			Name:      "changefeed_error_info",
			Help:      "The current warning or failed reason and occurrence time of changefeeds",
		}, []string{getKeyspaceLabel(), "changefeed", "state", "error_time", "code", "message"})

	// ChangefeedOperationTimeGauge records a bounded set of recent user initiated
	// changefeed operation timestamps for the Grafana investigation panel.
	ChangefeedOperationTimeGauge = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Namespace: "ticdc",
			Subsystem: "owner",
			Name:      "changefeed_operation_time",
			Help:      "Recent user initiated changefeed operation timestamps in Unix milliseconds",
		}, []string{getKeyspaceLabel(), "changefeed", "operation", "result", "username", "details", "error", "event_id"})

Comment thread pkg/metrics/changefeed.go
Comment on lines +125 to +129
<<<<<<< HEAD
=======
	registry.MustRegister(ChangefeedErrorInfoGauge)
	registry.MustRegister(ChangefeedOperationTimeGauge)
>>>>>>> 3a652c164 (api,metrics: add changefeed operation history (#5095))

critical

This section contains git conflict markers. Please remove them to ensure the code compiles and correctly registers the new metrics.

	registry.MustRegister(ChangefeedErrorInfoGauge)
	registry.MustRegister(ChangefeedOperationTimeGauge)
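
Leftover markers like the ones flagged above can be caught mechanically before a cherry-pick is pushed. The following is a small sketch rather than tooling from this repository: `findConflictMarkers` is a hypothetical helper that reports the 1-based line numbers starting with a git conflict marker.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// findConflictMarkers returns the 1-based numbers of lines that begin with a
// git conflict marker. It uses bufio.Scanner's default token size, so very
// long lines would need a larger buffer.
func findConflictMarkers(src string) []int {
	var hits []int
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		line := sc.Text()
		if strings.HasPrefix(line, "<<<<<<< ") ||
			line == "=======" ||
			strings.HasPrefix(line, ">>>>>>> ") {
			hits = append(hits, n)
		}
	}
	return hits
}

func main() {
	src := "var (\n<<<<<<< HEAD\n=======\n\tx = 1\n>>>>>>> 3a652c164 (msg)\n)\n"
	fmt.Println(findConflictMarkers(src)) // prints [2 3 5]
}
```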

Comment on lines +4594 to +4907
<<<<<<< HEAD
=======
},
{
"datasource": "${DS_TEST-CLUSTER}",
"description": "Current warning or failed reason of each changefeed. The metric message is normalized to a single line and truncated to 256 characters.",
"fieldConfig": {
"defaults": {
"custom": {
"align": null,
"filterable": false
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "namespace"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "changefeed"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.width",
"value": 100
}
]
},
{
"matcher": {
"id": "byName",
"options": "code"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "error_time"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 26
},
"id": 62010,
"options": {
"showHeader": true,
"sortBy": []
},
"pluginVersion": "7.5.17",
"targets": [
{
"expr": "max by (namespace, changefeed, state, code, error_time, message) (ticdc_owner_changefeed_error_info{k8s_cluster=\"$k8s_cluster\", tidb_cluster=\"$tidb_cluster\", namespace=~\"$namespace\", changefeed=~\"$changefeed\"})",
"format": "time_series",
"instant": true,
"refId": "A"
}
],
"title": "Changefeed Error Details",
"transformations": [
{
"id": "labelsToFields",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Metric": true,
"Time": true,
"Value": true,
"__name__": true
},
"indexByName": {
"namespace": 0,
"changefeed": 1,
"state": 2,
"error_time": 3,
"code": 4,
"message": 5
},
"renameByName": {}
}
}
],
"type": "table"
},
{
"datasource": "${DS_TEST-CLUSTER}",
"description": "Recent user initiated changefeed mutations retained in memory on the coordinator for oncall investigation. Use TiCDC logs for durable history beyond the latest 100 operations.",
"fieldConfig": {
"defaults": {
"custom": {
"align": null,
"filterable": false
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "namespace"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "changefeed"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "operation_time"
},
"properties": [
{
"id": "custom.width",
"value": 180
},
{
"id": "unit",
"value": "dateTimeAsIso"
}
]
},
{
"matcher": {
"id": "byName",
"options": "operation"
},
"properties": [
{
"id": "custom.width",
"value": 100
}
]
},
{
"matcher": {
"id": "byName",
"options": "result"
},
"properties": [
{
"id": "custom.width",
"value": 90
}
]
},
{
"matcher": {
"id": "byName",
"options": "username"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "details"
},
"properties": [
{
"id": "custom.width",
"value": 320
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 34
},
"id": 62042,
"options": {
"showHeader": true,
"sortBy": [
{
"displayName": "operation_time",
"desc": true
}
]
},
"pluginVersion": "7.5.17",
"targets": [
{
"expr": "max by (namespace, changefeed, operation, result, username, details, error, event_id) (ticdc_owner_changefeed_operation_time{k8s_cluster=\"$k8s_cluster\", tidb_cluster=\"$tidb_cluster\", namespace=~\"$namespace\", changefeed=~\"$changefeed\"})",
"format": "time_series",
"instant": true,
"refId": "A"
}
],
"title": "Changefeed Operation History",
"transformations": [
{
"id": "labelsToFields",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Metric": true,
"Time": true,
"__name__": true,
"event_id": true
},
"indexByName": {
"namespace": 0,
"changefeed": 1,
"Value": 2,
"operation": 3,
"result": 4,
"username": 5,
"details": 6,
"error": 7
},
"renameByName": {
"Value": "operation_time"
}
}
}
],
"type": "table"
>>>>>>> 3a652c164 (api,metrics: add changefeed operation history (#5095))

critical

The Grafana dashboard JSON contains git conflict markers (e.g., lines 4594, 4595, 4907). This makes the JSON invalid and will prevent the dashboard from being loaded. Please resolve the conflicts and remove the markers.
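
A quick way to confirm this class of breakage is to run the dashboard file through a JSON validity check. The sketch below uses `json.Valid` from Go's standard library. `isValidDashboard` and the sample inputs are made up for illustration, and a real check would read the dashboard file from disk.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidDashboard reports whether the bytes parse as JSON. It does not
// check Grafana's schema, only syntactic validity.
func isValidDashboard(b []byte) bool {
	return json.Valid(b)
}

func main() {
	clean := []byte(`{"panels":[{"type":"table"}]}`)
	conflicted := []byte("{\"panels\":[\n<<<<<<< HEAD\n=======\n{\"type\":\"table\"}\n>>>>>>> 3a652c164\n]}")

	fmt.Println(isValidDashboard(clean), isValidDashboard(conflicted)) // prints true false
}
```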

Comment on lines +4594 to +4907
<<<<<<< HEAD
=======
},
{
"datasource": "${DS_TEST-CLUSTER}",
"description": "Current warning or failed reason of each changefeed. The metric message is normalized to a single line and truncated to 256 characters.",
"fieldConfig": {
"defaults": {
"custom": {
"align": null,
"filterable": false
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "keyspace_name"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "changefeed"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.width",
"value": 100
}
]
},
{
"matcher": {
"id": "byName",
"options": "code"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "error_time"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 26
},
"id": 62010,
"options": {
"showHeader": true,
"sortBy": []
},
"pluginVersion": "7.5.17",
"targets": [
{
"expr": "max by (keyspace_name, changefeed, state, code, error_time, message) (ticdc_owner_changefeed_error_info{k8s_cluster=\"$k8s_cluster\", sharedpool_id=\"$tidb_cluster\", keyspace_name=~\"$keyspace_name\", changefeed=~\"$changefeed\"})",
"format": "time_series",
"instant": true,
"refId": "A"
}
],
"title": "Changefeed Error Details",
"transformations": [
{
"id": "labelsToFields",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Metric": true,
"Time": true,
"Value": true,
"__name__": true
},
"indexByName": {
"keyspace_name": 0,
"changefeed": 1,
"state": 2,
"error_time": 3,
"code": 4,
"message": 5
},
"renameByName": {}
}
}
],
"type": "table"
},
{
"datasource": "${DS_TEST-CLUSTER}",
"description": "Recent user initiated changefeed mutations retained in memory on the coordinator for oncall investigation. Use TiCDC logs for durable history beyond the latest 100 operations.",
"fieldConfig": {
"defaults": {
"custom": {
"align": null,
"filterable": false
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "keyspace_name"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "changefeed"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "operation_time"
},
"properties": [
{
"id": "custom.width",
"value": 180
},
{
"id": "unit",
"value": "dateTimeAsIso"
}
]
},
{
"matcher": {
"id": "byName",
"options": "operation"
},
"properties": [
{
"id": "custom.width",
"value": 100
}
]
},
{
"matcher": {
"id": "byName",
"options": "result"
},
"properties": [
{
"id": "custom.width",
"value": 90
}
]
},
{
"matcher": {
"id": "byName",
"options": "username"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "details"
},
"properties": [
{
"id": "custom.width",
"value": 320
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 34
},
"id": 62042,
"options": {
"showHeader": true,
"sortBy": [
{
"displayName": "operation_time",
"desc": true
}
]
},
"pluginVersion": "7.5.17",
"targets": [
{
"expr": "max by (keyspace_name, changefeed, operation, result, username, details, error, event_id) (ticdc_owner_changefeed_operation_time{k8s_cluster=\"$k8s_cluster\", sharedpool_id=\"$tidb_cluster\", keyspace_name=~\"$keyspace_name\", changefeed=~\"$changefeed\"})",
"format": "time_series",
"instant": true,
"refId": "A"
}
],
"title": "Changefeed Operation History",
"transformations": [
{
"id": "labelsToFields",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Metric": true,
"Time": true,
"__name__": true,
"event_id": true
},
"indexByName": {
"keyspace_name": 0,
"changefeed": 1,
"Value": 2,
"operation": 3,
"result": 4,
"username": 5,
"details": 6,
"error": 7
},
"renameByName": {
"Value": "operation_time"
}
}
}
],
"type": "table"
>>>>>>> 3a652c164 (api,metrics: add changefeed operation history (#5095))

critical

The Grafana dashboard JSON contains git conflict markers. This will break the JSON structure and prevent it from being imported correctly.

Comment on lines +2337 to +2650
<<<<<<< HEAD
=======
},
{
"datasource": "${DS_TEST-CLUSTER}",
"description": "Current warning or failed reason of each changefeed. The metric message is normalized to a single line and truncated to 256 characters.",
"fieldConfig": {
"defaults": {
"custom": {
"align": null,
"filterable": false
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "keyspace_name"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "changefeed"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.width",
"value": 100
}
]
},
{
"matcher": {
"id": "byName",
"options": "code"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "error_time"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 26
},
"id": 62010,
"options": {
"showHeader": true,
"sortBy": []
},
"pluginVersion": "7.5.17",
"targets": [
{
"expr": "max by (keyspace_name, changefeed, state, code, error_time, message) (ticdc_owner_changefeed_error_info{k8s_cluster=\"$k8s_cluster\", tidb_cluster=\"$tidb_cluster\", keyspace_name=~\"$keyspace_name\", changefeed=~\"$changefeed\"})",
"format": "time_series",
"instant": true,
"refId": "A"
}
],
"title": "Changefeed Error Details",
"transformations": [
{
"id": "labelsToFields",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Metric": true,
"Time": true,
"Value": true,
"__name__": true
},
"indexByName": {
"keyspace_name": 0,
"changefeed": 1,
"state": 2,
"error_time": 3,
"code": 4,
"message": 5
},
"renameByName": {}
}
}
],
"type": "table"
},
{
"datasource": "${DS_TEST-CLUSTER}",
"description": "Recent user initiated changefeed mutations retained in memory on the coordinator for oncall investigation. Use TiCDC logs for durable history beyond the latest 100 operations.",
"fieldConfig": {
"defaults": {
"custom": {
"align": null,
"filterable": false
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "keyspace_name"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "changefeed"
},
"properties": [
{
"id": "custom.width",
"value": 180
}
]
},
{
"matcher": {
"id": "byName",
"options": "operation_time"
},
"properties": [
{
"id": "custom.width",
"value": 180
},
{
"id": "unit",
"value": "dateTimeAsIso"
}
]
},
{
"matcher": {
"id": "byName",
"options": "operation"
},
"properties": [
{
"id": "custom.width",
"value": 100
}
]
},
{
"matcher": {
"id": "byName",
"options": "result"
},
"properties": [
{
"id": "custom.width",
"value": 90
}
]
},
{
"matcher": {
"id": "byName",
"options": "username"
},
"properties": [
{
"id": "custom.width",
"value": 120
}
]
},
{
"matcher": {
"id": "byName",
"options": "details"
},
"properties": [
{
"id": "custom.width",
"value": 320
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 34
},
"id": 62042,
"options": {
"showHeader": true,
"sortBy": [
{
"displayName": "operation_time",
"desc": true
}
]
},
"pluginVersion": "7.5.17",
"targets": [
{
"expr": "max by (keyspace_name, changefeed, operation, result, username, details, error, event_id) (ticdc_owner_changefeed_operation_time{k8s_cluster=\"$k8s_cluster\", tidb_cluster=\"$tidb_cluster\", keyspace_name=~\"$keyspace_name\", changefeed=~\"$changefeed\"})",
"format": "time_series",
"instant": true,
"refId": "A"
}
],
"title": "Changefeed Operation History",
"transformations": [
{
"id": "labelsToFields",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Metric": true,
"Time": true,
"__name__": true,
"event_id": true
},
"indexByName": {
"keyspace_name": 0,
"changefeed": 1,
"Value": 2,
"operation": 3,
"result": 4,
"username": 5,
"details": 6,
"error": 7
},
"renameByName": {
"Value": "operation_time"
}
}
}
],
"type": "table"
>>>>>>> 3a652c164 (api,metrics: add changefeed operation history (#5095))

critical

The Grafana dashboard JSON contains git conflict markers. This will break the JSON structure and prevent it from being imported correctly.

zap.String("username", username),
zap.String("ip", c.ClientIP()),
zap.String("userAgent", c.Request.UserAgent()),
zap.String("clientVersion", c.Request.Header.Get(ClientVersionHeader)),

high

The constant ClientVersionHeader is used but not defined in this file or imported. If it is defined in pkg/api, it should be referenced as api.ClientVersionHeader.

Suggested change
- zap.String("clientVersion", c.Request.Header.Get(ClientVersionHeader)),
+ zap.String("clientVersion", c.Request.Header.Get(api.ClientVersionHeader)),

// The dashboard only needs a recent investigation window. Keep this cache
// bounded so user names and detail strings do not become unbounded metric
// cardinality over long-running clusters.
metrics.ChangefeedOperationTimeGauge.WithLabelValues(labels.labelValues()...).Set(float64(operationTime.UnixMilli()))

medium

Prometheus conventions recommend seconds as the base unit for time values, including timestamps stored in gauges. Note, however, that Grafana's dateTimeAsIso unit interprets the value as Unix milliseconds, so the current millisecond value matches the panel configuration. If the gauge is switched to seconds, the panel query must also scale the value (for example, multiply by 1000), and the metric's Help text, which currently says "Unix milliseconds", must be updated to match.

Suggested change
- metrics.ChangefeedOperationTimeGauge.WithLabelValues(labels.labelValues()...).Set(float64(operationTime.UnixMilli()))
+ metrics.ChangefeedOperationTimeGauge.WithLabelValues(labels.labelValues()...).Set(float64(operationTime.Unix()))


Labels

  • do-not-merge/cherry-pick-not-approved
  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • lgtm
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
  • type/cherry-pick-for-release-8.5: This PR is cherry-picked to release-8.5 from a source PR.
