Skip to content

fix(eventhubs): rethrow last exception when retries are exhausted#7131

Open
j7nw4r wants to merge 1 commit into
Azure:mainfrom
j7nw4r:fix/7130-retry-operation-rethrow
Open

fix(eventhubs): rethrow last exception when retries are exhausted#7131
j7nw4r wants to merge 1 commit into
Azure:mainfrom
j7nw4r:fix/7130-retry-operation-rethrow

Conversation

@j7nw4r
Copy link
Copy Markdown
Member

@j7nw4r j7nw4r commented May 20, 2026

Summary

Fixes #7130.

RetryOperation::Execute silently swallowed the final exception when every retry attempt of ProducerClient::Send threw: WasLastAttempt is checked against retryCount before the catch increments it, so the value never reaches MaxRetries inside the catch and the else { throw; } branch is unreachable. Control falls through to return false;, and ProducerClient::Send ignores that value — the batch is dropped without any error surfaced to the caller.

This change:

  • Captures std::current_exception() in each catch block of RetryOperation::Execute.
  • Clears the captured exception when a later attempt completes without throwing, so a transient failure followed by a clean false return does not surface a stale exception.
  • Rethrows the captured exception when the retry loop terminates with retries exhausted.
  • Adds a defense-in-depth check in ProducerClient::Send: if Execute ever does return false, throw an EventHubsException instead of dropping the batch silently.
  • Adds regression tests covering: rethrow of EventHubsException after exhausted retries, rethrow of std::exception, immediate throw on fatal exceptions (no retry), success after a transient exception, and the cleared-stale-exception case where a later attempt returns false cleanly.
  • CHANGELOG entry under 1.0.0-beta.12 Unreleased.

Test plan

  • cmake --build build --target azure-messaging-eventhubs-test succeeds.
  • azure-messaging-eventhubs-test --gtest_filter=RetryOperationTest.* passes (11/11).
  • Full CI green on the PR (Azure pipelines + GitHub Actions).
  • Reviewer spot-check of the captured-exception lifecycle in RetryOperation::Execute and the new defense-in-depth branch in ProducerClient::Send.

RetryOperation::Execute returned false instead of rethrowing the
captured exception when every retry attempt threw. WasLastAttempt
is checked against retryCount before the catch increments it, so
the value never reaches MaxRetries inside the catch and the
"else { throw; }" branch is unreachable. ProducerClient::Send
ignored the return value, so a fully failed Send appeared to
succeed and silently dropped the batch.

Capture std::current_exception() in each catch, clear it when a
later attempt completes without throwing, and rethrow it when the
retry loop terminates. As defense in depth, ProducerClient::Send
now throws when Execute reports failure.

Fixes Azure#7130.
Copilot AI review requested due to automatic review settings May 20, 2026 19:51
@j7nw4r j7nw4r requested review from axisc, hmlam and sjkwak as code owners May 20, 2026 19:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a reliability issue in azure-messaging-eventhubs where RetryOperation::Execute() could exhaust retries and then silently return false, causing ProducerClient::Send() to drop a batch without surfacing the final failure (issue #7130).

Changes:

  • Track and rethrow the most recent exception in RetryOperation::Execute() when the retry loop terminates due to exhaustion.
  • Add a defense-in-depth check in ProducerClient::Send() to throw if RetryOperation::Execute() ever reports failure via a false return.
  • Add regression unit tests for retry exhaustion behavior (EventHubsException and std::exception paths) and for transient-failure recovery behavior; add a CHANGELOG entry.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
sdk/eventhubs/azure-messaging-eventhubs/test/ut/retry_operation_test.cpp Adds regression tests to validate exception rethrow on retry exhaustion and correct behavior after transient failures.
sdk/eventhubs/azure-messaging-eventhubs/src/retry_operation.cpp Captures the last thrown exception and rethrows it after the retry loop ends.
sdk/eventhubs/azure-messaging-eventhubs/src/producer_client.cpp Adds a fallback exception if Execute() returns false so sends can’t silently fail.
sdk/eventhubs/azure-messaging-eventhubs/CHANGELOG.md Documents the bug fix under the current Unreleased section.

Comment on lines 31 to 35
int retryCount = 0;
// Capture the most recent exception so a retries-exhausted fallthrough can
// rethrow it instead of silently returning false. See issue #7130.
std::exception_ptr lastException;
while (retryCount < m_retryOptions.MaxRetries)
Comment on lines 39 to 47
try
{
bool result = operation();

if (ShouldRetry(result, retryCount, retryAfter))
{
lastException = nullptr;
retryCount++;
std::this_thread::sleep_for(retryAfter);
Comment on lines +130 to +131
throw Azure::Messaging::EventHubs::EventHubsException(
"ProducerClient::Send failed after exhausting all retry attempts.");
Comment on lines +165 to +183
auto opts = LocalTest::MakeFastRetryOptions(3);
Azure::Messaging::EventHubs::_detail::RetryOperation retryOp(opts);

int callCount = 0;
auto throwsThenReturnsFalse = [&callCount]() -> bool {
++callCount;
if (callCount == 1)
{
throw Azure::Messaging::EventHubs::EventHubsException("first attempt fails");
}
return false;
};

// Second attempt returns false cleanly. ShouldRetry(false=response, retryCount=1)
// returns true, so the loop retries; on attempt 3 the loop terminates and Execute
// returns false. The exception from attempt 1 must not be rethrown.
EXPECT_NO_THROW({ EXPECT_FALSE(retryOp.Execute(throwsThenReturnsFalse)); });
EXPECT_EQ(opts.MaxRetries, callCount);
}
@j7nw4r
Copy link
Copy Markdown
Member Author

j7nw4r commented May 20, 2026

/azp run cpp - eventhubs - ci

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RetryOperation::Execute swallows the final exception → ProducerClient::Send silently fails

2 participants