Skip to content

feat: metrics for error codes#2393

Draft
cstockton wants to merge 2 commits intomasterfrom
cs/error-code-metrics-record
Draft

feat: metrics for error codes#2393
cstockton wants to merge 2 commits intomasterfrom
cs/error-code-metrics-record

Conversation

@cstockton
Copy link
Contributor

This is a draft PR I've gone through to get a better understanding of our error handling and how to implement metrics for errors. I'll go over the changes in detail in a follow up pr.

Chris Stockton added 2 commits February 24, 2026 14:52
Replace type switch based error detection with shared sentinel errors
and errors.Is. Each model error now implements Is() to map to the
appropriate sentinel.

Removes large IsNotFoundError and IsUniqueConstraintViolatedError switches.
This is a draft pr I will expand on later.
@coderabbitai
Copy link

coderabbitai bot commented Feb 24, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/api/errors.go (1)

48-66: ⚠️ Potential issue | 🔴 Critical

Critical: Metric recording executes on every request, not just panics.

Lines 49-50 are placed outside the if rvr := recover() block, causing RecordPostgresCode("25005") and RecordErrorCode(ErrorCodeUserSSOManaged) to execute on every request that passes through the recoverer middleware, not just when a panic occurs. This will corrupt metrics with massive false positives.

Proposed fix - move metric calls inside the panic handler
 	fn := func(w http.ResponseWriter, r *http.Request) {
 		defer func() {
-			apierrors.RecordPostgresCode(r.Context(), "25005")
-			apierrors.RecordErrorCode(r.Context(), apierrors.ErrorCodeUserSSOManaged)
 			if rvr := recover(); rvr != nil {
+				apierrors.RecordPostgresCode(r.Context(), "25005")
+				apierrors.RecordErrorCode(r.Context(), apierrors.ErrorCodeUserSSOManaged)
 				logEntry := observability.GetLogEntry(r)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/errors.go` around lines 48 - 66, The metrics
RecordPostgresCode(r.Context(), "25005") and RecordErrorCode(r.Context(),
apierrors.ErrorCodeUserSSOManaged) are executed unconditionally; move those two
calls inside the panic branch so they run only when rvr != nil (i.e., within the
if rvr := recover(); rvr != nil { ... } block). Update the defer in errors.go so
that RecordPostgresCode and RecordErrorCode are invoked immediately after
entering the panic handler (before logging and before creating the
HTTPError/HandleResponseError), keeping the existing observability logging and
HTTP error handling intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/api/apierrors/errorcode_test.go`:
- Line 64: Change the file permission used in the os.WriteFile call in
internal/api/apierrors/errorcode_test.go from 0644 to a more restrictive mode
(e.g., 0600) to satisfy Gosec G306; locate the os.WriteFile invocation that
writes "errorcode_gen.go" and replace the permission argument with 0600
(os.FileMode(0600)) so the generated file is created with restricted
permissions.

In `@internal/api/apierrors/metrics.go`:
- Around line 43-47: The initMetrics function validates the wrong map: it checks
the package-level errorCodesMap instead of the passed-in parameter ecm; update
the validation in initMetrics to use len(ecm) and ensure subsequent uses inside
initMetrics reference the parameter ecm (not the global errorCodesMap) so the
function actually validates and operates on the provided map.

---

Outside diff comments:
In `@internal/api/errors.go`:
- Around line 48-66: The metrics RecordPostgresCode(r.Context(), "25005") and
RecordErrorCode(r.Context(), apierrors.ErrorCodeUserSSOManaged) are executed
unconditionally; move those two calls inside the panic branch so they run only
when rvr != nil (i.e., within the if rvr := recover(); rvr != nil { ... }
block). Update the defer in errors.go so that RecordPostgresCode and
RecordErrorCode are invoked immediately after entering the panic handler (before
logging and before creating the HTTPError/HandleResponseError), keeping the
existing observability logging and HTTP error handling intact.

ℹ️ Review info

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Cache: Disabled due to Reviews > Disable Cache setting

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bdb13e3 and 069fd25.

📒 Files selected for processing (13)
  • cmd/serve_cmd.go
  • internal/api/apierrors/errorcode.go
  • internal/api/apierrors/errorcode_gen.go
  • internal/api/apierrors/errorcode_test.go
  • internal/api/apierrors/metrics.go
  • internal/api/apierrors/metrics_test.go
  • internal/api/errors.go
  • internal/api/middleware.go
  • internal/models/errors.go
  • internal/models/oauth_authorization.go
  • internal/models/oauth_client.go
  • internal/models/one_time_token.go
  • internal/observability/metrics.go

sb.WriteString("}\n\n")
}

os.WriteFile("errorcode_gen.go", []byte(sb.String()), 0644)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix file permissions to resolve pipeline failure.

Gosec G306: os.WriteFile uses permissions 0644; recommended 0600 or less.

Proposed fix
-	os.WriteFile("errorcode_gen.go", []byte(sb.String()), 0644)
+	os.WriteFile("errorcode_gen.go", []byte(sb.String()), 0600)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
os.WriteFile("errorcode_gen.go", []byte(sb.String()), 0644)
os.WriteFile("errorcode_gen.go", []byte(sb.String()), 0600)
🧰 Tools
🪛 GitHub Actions: Test

[error] 64-64: Gosec G306: os.WriteFile uses permissions 0644; recommended 0600 or less.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/apierrors/errorcode_test.go` at line 64, Change the file
permission used in the os.WriteFile call in
internal/api/apierrors/errorcode_test.go from 0644 to a more restrictive mode
(e.g., 0600) to satisfy Gosec G306; locate the os.WriteFile invocation that
writes "errorcode_gen.go" and replace the permission argument with 0600
(os.FileMode(0600)) so the generated file is created with restricted
permissions.

Comment on lines +43 to +47
func initMetrics(ecm map[string]string) error {
if len(errorCodesMap) == 0 {
const msg = "InitMetrics: errorCodesMap is empty"
return errors.New(msg)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bug: Validation uses global errorCodesMap instead of parameter ecm.

The function accepts ecm as a parameter but line 44 validates the global errorCodesMap instead. This makes the parameter partially unused and the validation incorrect when ecm differs from the global.

Proposed fix
 func initMetrics(ecm map[string]string) error {
-	if len(errorCodesMap) == 0 {
+	if len(ecm) == 0 {
 		const msg = "InitMetrics: errorCodesMap is empty"
 		return errors.New(msg)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func initMetrics(ecm map[string]string) error {
if len(errorCodesMap) == 0 {
const msg = "InitMetrics: errorCodesMap is empty"
return errors.New(msg)
}
func initMetrics(ecm map[string]string) error {
if len(ecm) == 0 {
const msg = "InitMetrics: errorCodesMap is empty"
return errors.New(msg)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/apierrors/metrics.go` around lines 43 - 47, The initMetrics
function validates the wrong map: it checks the package-level errorCodesMap
instead of the passed-in parameter ecm; update the validation in initMetrics to
use len(ecm) and ensure subsequent uses inside initMetrics reference the
parameter ecm (not the global errorCodesMap) so the function actually validates
and operates on the provided map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant