RetryOperation::Execute swallows the final exception → ProducerClient::Send silently fails #7130

@vifani

Library and version

azure-messaging-eventhubs on main

Summary

When every retry attempt of ProducerClient::Send fails, the call returns normally — no exception, no data delivered. The root cause is an off-by-one in RetryOperation::Execute (retry_operation.cpp).

Root cause: WasLastAttempt is never true inside the catch

WasLastAttempt(attempt) returns attempt >= MaxRetries. In Execute, the attempt value handed to ShouldRetry is retryCount as it stands before the catch increments it:

int retryCount = 0;
while (retryCount < m_retryOptions.MaxRetries)   // N iterations
{
  try { ... }
  catch (EventHubsException const& e) {
    if (ShouldRetry(IsFatalException(e), retryCount, retryAfter)) {
      retryCount++;
      std::this_thread::sleep_for(retryAfter);
    } else {
      throw;        // <-- the only path that surfaces the failure
    }
  }
  // same shape for catch (std::exception)
}
return false;        // <-- final exception is lost here

For MaxRetries = N, retryCount inside the catch only ever takes the values 0, 1, ..., N-1, never a value >= N. The exhaustion check therefore never fires, ShouldRetry keeps returning true, and the else { throw; } branch is unreachable for retry exhaustion. After the final iteration the while condition fails and control falls through to return false. The caller (ProducerClient::Send) ignores that return value, so the failure is silently dropped.
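The off-by-one can be reproduced without the SDK. The following is a minimal sketch, assuming the loop shape quoted above; ExecuteBuggy and the free function WasLastAttempt are hypothetical stand-ins, not the library's actual code, and the sleep and fatal-exception check are left out.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the check described in the issue:
// an attempt counts as the last one when attempt >= maxRetries.
bool WasLastAttempt(int attempt, int maxRetries) { return attempt >= maxRetries; }

// Simplified model of the reported loop (assumption: no sleeping and no
// fatal-exception handling). Returns true on success and false when the
// loop falls through after exhausting its iterations.
bool ExecuteBuggy(int maxRetries, std::function<void()> const& operation)
{
  assert(maxRetries >= 0);
  int retryCount = 0;
  while (retryCount < maxRetries)
  {
    try
    {
      operation();
      return true;
    }
    catch (std::exception const&)
    {
      // retryCount < maxRetries always holds here because of the loop
      // condition, so WasLastAttempt is never true at this point.
      if (!WasLastAttempt(retryCount, maxRetries))
      {
        retryCount++;
      }
      else
      {
        throw; // never reached
      }
    }
  }
  return false; // the last exception has already been discarded
}
```

Calling ExecuteBuggy(3, op) with an op that always throws runs op three times and then returns false without propagating anything, which matches the behavior reported for Send.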

This affects both ENABLE_UAMQP and ENABLE_RUST_AMQP send paths.

Minimal reproduction

#include <azure/messaging/eventhubs.hpp>
#include <azure/identity.hpp>
#include <chrono>
#include <exception>
#include <iostream>
#include <memory>

using namespace Azure::Messaging::EventHubs;

int main()
{
  ProducerClientOptions options;
  options.RetryOptions.MaxRetries = 3;
  options.RetryOptions.RetryDelay = std::chrono::seconds(1);

  // Use a host that will resolve/connect-fail at send time, or a valid
  // namespace but an event hub name that doesn't exist. Any condition that
  // makes MessageSender::Send return a non-Ok MessageSendStatus on every
  // attempt reproduces the bug.
  ProducerClient producer(
      "nonexistent-hub.servicebus.windows.net",
      "does-not-exist",
      std::make_shared<Azure::Identity::DefaultAzureCredential>(),
      options);

  auto batch = producer.CreateBatch();
  batch.TryAdd(Models::EventData{"hello"});

  auto start = std::chrono::steady_clock::now();
  try
  {
    producer.Send(batch);
    auto elapsed = std::chrono::steady_clock::now() - start;
    std::cout << "Send returned normally after "
              << std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count()
              << " ms — BUG: no exception, batch was not delivered.\n";
  }
  catch (std::exception const& e)
  {
    std::cout << "Send threw (expected): " << e.what() << "\n";
  }
}

Expected vs. actual

  • Expected: Send throws EventHubsException describing the last failure.
  • Actual: Send returns normally after ~3 s of backoff sleep (0 + 1 + 2 s between the three attempts); the batch is not delivered.

Suggested fix

In RetryOperation::Execute, capture the in-flight exception in each catch (e.g. with std::current_exception()) and, when the while loop exits because retries are exhausted, rethrow it with std::rethrow_exception(...) instead of executing return false;. As defense in depth, also have ProducerClient::Send check the return value of retryOp.Execute(...) and throw if it is false.
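One possible shape for that fix, sketched outside the SDK. ExecuteWithRethrow and SendChecked are hypothetical names used only for illustration; the real Execute also sleeps between attempts and distinguishes fatal exceptions, which this sketch omits.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Sketch of the suggested fix: remember the most recent failure and
// rethrow it once the retry budget is used up.
bool ExecuteWithRethrow(int maxRetries, std::function<void()> const& operation)
{
  assert(maxRetries >= 0);
  std::exception_ptr lastError; // empty until an attempt fails
  int retryCount = 0;
  while (retryCount < maxRetries)
  {
    try
    {
      operation();
      return true;
    }
    catch (std::exception const&)
    {
      lastError = std::current_exception(); // keep the in-flight exception
      retryCount++;
    }
  }
  if (lastError)
  {
    std::rethrow_exception(lastError); // surface the final failure
  }
  return false; // only reachable when no attempt was made at all
}

// Defense in depth on the caller side (hypothetical wrapper): treat a
// false return as an error instead of ignoring it.
void SendChecked(int maxRetries, std::function<void()> const& operation)
{
  if (!ExecuteWithRethrow(maxRetries, operation))
  {
    throw std::runtime_error("send did not complete");
  }
}
```

With this shape, an always-failing operation and a budget of 3 propagates the third exception to the caller, and a budget of 0 makes SendChecked throw rather than return as if the send had succeeded.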

Note for users

There is no client-side configuration workaround: MaxRetries = 0 skips the actual send entirely (while (0 < 0) never executes the lambda), and any MaxRetries >= 1 hits the swallow path.
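To make both halves of that claim concrete, here is a small model that counts how often an always-failing send would run for a given MaxRetries. It assumes only the loop condition and catch shape quoted earlier; Outcome and RunAlwaysFailingSend are hypothetical names, not SDK types.

```cpp
#include <cassert>
#include <stdexcept>

struct Outcome
{
  int invocations = 0;        // how many times the send step ran
  bool errorSurfaced = false; // whether the caller would see an exception
};

// Models the reported loop with a send step that fails on every attempt.
Outcome RunAlwaysFailingSend(int maxRetries)
{
  assert(maxRetries >= 0);
  Outcome outcome;
  int retryCount = 0;
  try
  {
    while (retryCount < maxRetries)
    {
      try
      {
        ++outcome.invocations;
        throw std::runtime_error("send failed");
      }
      catch (std::exception const&)
      {
        if (retryCount >= maxRetries)
        {
          throw; // exhaustion branch, never taken inside the loop
        }
        ++retryCount;
      }
    }
  }
  catch (std::exception const&)
  {
    outcome.errorSurfaced = true;
  }
  return outcome;
}
```

In this model, MaxRetries = 0 gives zero invocations, MaxRetries = 1 and MaxRetries = 3 give one and three invocations respectively, and errorSurfaced stays false in every case.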
