Skip to content

Truncate response errors vs swallowing if too large#433

Merged
selzoc merged 1 commit into
mainfrom
truncate-perform-errors
May 29, 2026
Merged

Truncate response errors vs swallowing if too large#433
selzoc merged 1 commit into
mainfrom
truncate-perform-errors

Conversation

@selzoc
Copy link
Copy Markdown
Member

@selzoc selzoc commented May 21, 2026

We found a case where the prepare step was running a vm out of disk, and there were many tar errors. The large amount of errors resulted in the agent returning "Response exceeded maximum allowed length" with no details, leading us to have to get onto the failing vm and examine the agent logs for more details.

This commit will attempt to truncate the error message to fit inside the max message length such that we can surface the error to the bosh task logs.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds UTF-8 boundary-aware truncation for exception response messages when marshaled JSON exceeds maxResponseLength. Introduces a truncationMarker constant and truncateExceptionResponse which probes JSON overhead with a marker, computes the remaining byte budget for the exception message, truncates the message at UTF-8 boundaries preserving head and tail around the marker while re-marshaling to account for escaping, and returns a modified exceptionResponse only if it fits. marshalResponse now attempts truncation for exception responses returned by response.Shorten() before falling back to the existing oversize error. Ginkgo tests cover behavior, UTF-8 correctness, escaping, fallback, and UnlimitedResponseLength.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: truncating error responses instead of returning a generic length-exceeded message when responses are too large.
Description check ✅ Passed The description is directly related to the changeset, explaining the problem (large errors causing generic messages) and the solution (truncating errors to fit within limits).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch truncate-perform-errors

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 21, 2026
Alphasite
Alphasite previously approved these changes May 21, 2026
Comment thread handler/perform_handler_with_json.go Outdated
@selzoc selzoc dismissed stale reviews from Alphasite and coderabbitai[bot] via feaf4eb May 21, 2026 22:56
@selzoc selzoc force-pushed the truncate-perform-errors branch from a590649 to feaf4eb Compare May 21, 2026 22:56
Alphasite
Alphasite previously approved these changes May 21, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@handler/perform_handler_with_json_test.go`:
- Around line 93-113: Add a test to verify behavior when error messages contain
many JSON-escapable characters (quotes/backslashes) by calling
PerformHandlerWithJSON with a NewExceptionResponse whose error string is
strings.Repeat(`"`, maxLen) and assert the returned respBytes length is <=
maxLen; if truncation can't account for escaping, ensure the handler falls back
to a generic error payload instead of producing invalid/oversized JSON. Locate
the test near the existing UTF-8 truncation spec (the It block using
PerformHandlerWithJSON and NewExceptionResponse) and mirror its structure,
asserting either successful truncated JSON or the generic fallback response
while keeping utf8/json validity checks where appropriate.

In `@handler/perform_handler_with_json.go`:
- Around line 79-93: The truncation logic for excResp.Exception.Message assumes
byte-for-byte mapping to JSON, but json.Marshal can expand escapable characters;
fix by iteratively trimming and re-marshaling until the marshaled respJSON
length <= maxResponseLength: start with contentBytes (currently derived from
maxMsgBytes), slice msg to that length ensuring utf8.ValidString, set
excResp.Exception.Message = msg+truncatedSuffix, json.Marshal(excResp) and if
len(respJSON) > maxResponseLength reduce contentBytes (e.g., halve or decrement)
and repeat; ensure the loop breaks when contentBytes <= 0 and falls back to the
generic error path as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7fe17c39-891c-4781-bce0-6fcae68aed94

📥 Commits

Reviewing files that changed from the base of the PR and between a590649 and feaf4eb.

📒 Files selected for processing (2)
  • handler/perform_handler_with_json.go
  • handler/perform_handler_with_json_test.go

Comment thread handler/perform_handler_with_json_test.go
Comment thread handler/perform_handler_with_json.go Outdated
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 22, 2026
@github-project-automation github-project-automation Bot moved this from Waiting for Changes | Open for Contribution to Pending Merge | Prioritized in Foundational Infrastructure Working Group May 22, 2026
@selzoc
Copy link
Copy Markdown
Member Author

selzoc commented May 22, 2026

Marking as draft until I can test this on an actual vm

@selzoc selzoc marked this pull request as draft May 22, 2026 18:45
@selzoc selzoc force-pushed the truncate-perform-errors branch from 9f80ff9 to 7976a1e Compare May 28, 2026 22:26
@selzoc
Copy link
Copy Markdown
Member Author

selzoc commented May 28, 2026

I tested this with an actual vm, and the before looks like:

...
Task 132943 | 14:24:11 | Updating instance dummy-jammy: dummy-jammy/47a9d2de-705b-4544-bb28-362f49ea106d (0) (canary) (00:00:27)
                       L Error: Response exceeded maximum allowed length
Task 132943 | 14:24:38 | Error: Response exceeded maximum allowed length

Task 132943 Started  Thu May 28 14:21:45 UTC 2026
Task 132943 Finished Thu May 28 14:24:38 UTC 2026
Task 132943 Duration 00:02:53
Task 132943 error

When I replace the agent with the changes from this pr, we instead get:

Task 132948 | 14:26:59 | Updating instance dummy-jammy: dummy-jammy/47a9d2de-705b-4544-bb28-362f49ea106d (0) (canary) (00:00:28)
                       L Error: Action Failed get_task: Task ddc71c29-ae61-463a-6d66-53aa235f1659 result: Preparing apply spec: Preparing package disk-filler: Installing package directory: Decompressing package files: Shelling out to tar: Running command: 'tar --no-same-owner -xf /var/vcap/data/tmp/bosh-blobstore-externalBlobstore-Get3748154701 -C /var/vcap/data/packages/disk-filler/fe13631f68bfc68a0bea8fc6a356c29a319b58c7', stdout: '', stderr: 'tar: ./data/dir_127/file_12705: Wrote only 4096 of 10240 bytes
tar: ./data/dir_127/file_12702: Cannot write: No space left on device
tar: ./data/dir_127/file_12755: Cannot write: No space left on device
...
tar: ./data/dir_115/file_11540: Cannot open: No such file or directory
tar: ./data/dir_115: Cannot mkdir: No space left on device
tar: ./data/dir_115/file_11528: 
...[truncated]...
dir_051/file_05127: Cannot open: No such file or directory
tar: ./data/dir_051: Cannot mkdir: No space left on device
tar: ./data/dir_051/file_05108: Cannot open: No such file or directory
tar: ./data/dir_051: Cannot mkdir: No space left on device
...
tar: Exiting with failure status due to previous errors
': exit status 2

Task 132948 Started  Thu May 28 14:26:57 UTC 2026
Task 132948 Finished Thu May 28 14:27:27 UTC 2026
Task 132948 Duration 00:00:30
Task 132948 error

@selzoc selzoc marked this pull request as ready for review May 28, 2026 22:32
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
handler/perform_handler_with_json.go (1)

152-168: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Avoid recomputing response.Shorten() and re-marshaling the truncated candidate.

response.Shorten() is invoked at Line 152 and again at Line 160, and on the truncation path the candidate is marshaled inside truncateExceptionResponse and then marshaled a second time at Line 162. On this large-response path Shorten() operates on the already-oversized payload, so the duplicate call is the more meaningful waste. Capture the shortened response once and reuse it.

♻️ Capture Shorten() result once
 	if len(respJSON) > maxResponseLength {
-		respJSON, err = json.Marshal(response.Shorten())
+		shortened := response.Shorten()
+		respJSON, err = json.Marshal(shortened)
 		if err != nil {
 			logger.Error(mbusHandlerLogTag, "Failed to marshal response: %s", err.Error())
 			return respJSON, bosherr.WrapError(err, "Marshalling JSON response")
 		}
-	}
 
-	if len(respJSON) > maxResponseLength {
-		if excResp, ok := response.Shorten().(exceptionResponse); ok {
-			if truncated, ok := truncateExceptionResponse(excResp, maxResponseLength); ok {
-				respJSON, err = json.Marshal(truncated)
-				if err == nil && len(respJSON) <= maxResponseLength {
-					logger.Warn(mbusHandlerLogTag, "Response too large, exception message truncated to fit within %d bytes", maxResponseLength)
-					return respJSON, nil
-				}
-			}
-		}
+		if len(respJSON) > maxResponseLength {
+			if excResp, ok := shortened.(exceptionResponse); ok {
+				if truncated, ok := truncateExceptionResponse(excResp, maxResponseLength); ok {
+					respJSON, err = json.Marshal(truncated)
+					if err == nil && len(respJSON) <= maxResponseLength {
+						logger.Warn(mbusHandlerLogTag, "Response too large, exception message truncated to fit within %d bytes", maxResponseLength)
+						return respJSON, nil
+					}
+				}
+			}
 
-		respJSON, err = BuildErrorWithJSON(responseMaxLengthErrMsg, logger)
-		if err != nil {
-			logger.Error(mbusHandlerLogTag, "Failed to build 'max length exceeded' response: %s", err.Error())
-			return respJSON, bosherr.WrapError(err, "Building error")
+			respJSON, err = BuildErrorWithJSON(responseMaxLengthErrMsg, logger)
+			if err != nil {
+				logger.Error(mbusHandlerLogTag, "Failed to build 'max length exceeded' response: %s", err.Error())
+				return respJSON, bosherr.WrapError(err, "Building error")
+			}
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@handler/perform_handler_with_json.go` around lines 152 - 168, The code calls
response.Shorten() multiple times and re-marshals the same data, wasting CPU and
risking inconsistent results; change the flow to call response.Shorten() once
(store it in a local variable, e.g. shortened := response.Shorten()), use
shortened for the initial json.Marshal and for the large-response branch, pass
shortened into truncateExceptionResponse and only marshal the final truncated
value once before returning; update references in the large-response logic
(exceptionResponse type assertion, truncateExceptionResponse, and the subsequent
json.Marshal and logger.Warn) to use the stored shortened variable and avoid
recomputing Shorten().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@handler/perform_handler_with_json.go`:
- Around line 152-168: The code calls response.Shorten() multiple times and
re-marshals the same data, wasting CPU and risking inconsistent results; change
the flow to call response.Shorten() once (store it in a local variable, e.g.
shortened := response.Shorten()), use shortened for the initial json.Marshal and
for the large-response branch, pass shortened into truncateExceptionResponse and
only marshal the final truncated value once before returning; update references
in the large-response logic (exceptionResponse type assertion,
truncateExceptionResponse, and the subsequent json.Marshal and logger.Warn) to
use the stored shortened variable and avoid recomputing Shorten().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e8404c4e-8ddd-45ec-b5ae-c1dcb0ae3504

📥 Commits

Reviewing files that changed from the base of the PR and between 9f80ff9 and 7976a1e.

📒 Files selected for processing (2)
  • handler/perform_handler_with_json.go
  • handler/perform_handler_with_json_test.go

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 28, 2026
coderabbitai[bot]
coderabbitai Bot previously requested changes May 28, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@handler/perform_handler_with_json.go`:
- Around line 82-86: The early-return in truncateExceptionResponse that returns
excResp when len(msg) <= maxMsgBytes can wrongly bypass truncation because JSON
escaping may still push the marshaled size over the limit; modify
truncateExceptionResponse to verify the actual marshaled size (e.g.,
json.Marshal of excResp or its message field) fits before returning and only
short-circuit when the marshaled bytes <= maxMsgBytes, otherwise fall through to
the truncation loop; preserve the panic-safety guard by clamping contentBytes to
min(len(msg), contentBytes) before entering the loop so we never attempt
negative slicing of excResp.Exception.Message.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0db4a1cb-638c-477b-9e7e-94279b4eb332

📥 Commits

Reviewing files that changed from the base of the PR and between 7976a1e and b59777e.

📒 Files selected for processing (2)
  • handler/perform_handler_with_json.go
  • handler/perform_handler_with_json_test.go

Comment thread handler/perform_handler_with_json.go
@github-project-automation github-project-automation Bot moved this from Pending Merge | Prioritized to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group May 28, 2026
@aramprice aramprice requested a review from Copilot May 28, 2026 23:03
aramprice
aramprice previously approved these changes May 28, 2026
@aramprice aramprice requested review from a team, benjaminguttmann-avtq and lnguyen and removed request for a team May 28, 2026 23:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Surfaces oversized exception responses to BOSH task logs by truncating the exception message (preserving head and tail around a ...[truncated]... marker) to fit within the NATS 1 MiB response cap, instead of replacing the entire response with the opaque "Response exceeded maximum allowed length" message.

Changes:

  • Adds truncateExceptionResponse helper that splits the message into a head and tail around a truncation marker, with an iterative re-marshal loop to account for JSON escape expansion and UTF-8 rune boundaries.
  • Wires the new truncation step into marshalResponse after Shorten() for exceptionResponse only; falls back to the existing generic error if truncation cannot fit.
  • Adds Ginkgo tests covering normal responses, head/tail preservation, size cap adherence, JSON-escapable input, multi-byte UTF-8 input, unlimited mode, and the small-limit fallback path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
handler/perform_handler_with_json.go Adds truncateExceptionResponse and integrates it into marshalResponse between Shorten() and the generic max-length fallback.
handler/perform_handler_with_json_test.go New test file exercising truncation behavior, UTF-8 safety, escape expansion, fallback, and unlimited-length pass-through.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread handler/perform_handler_with_json.go Outdated
// next iteration converges toward the target size.
newContentBytes := contentBytes * maxLength / len(candidateJSON)
if newContentBytes >= contentBytes {
contentBytes-- // guard against floating-point rounding stalls
We found a case where the prepare step was running a vm out of disk, and
there were many tar errors.  The large amount of errors resulted in the
agent returning "Response exceeded maximum allowed length" with no
details, leading us to have to get onto the failing vm and examine the
agent logs for more details.

This commit will attempt to truncate the error message to fit inside the
max message length such that we can surface the error to the bosh task
logs.

ai-assisted=yes
[TNZ-88995]
@github-project-automation github-project-automation Bot moved this from Waiting for Changes | Open for Contribution to Pending Merge | Prioritized in Foundational Infrastructure Working Group May 29, 2026
@selzoc selzoc merged commit 162531a into main May 29, 2026
18 checks passed
@github-project-automation github-project-automation Bot moved this from Pending Merge | Prioritized to Done in Foundational Infrastructure Working Group May 29, 2026
@selzoc selzoc deleted the truncate-perform-errors branch May 29, 2026 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

4 participants