ox AdaptiveRetry stops retrying on Left when shouldPayFailureCost is false #442

@haskiindahouse

What

Project

ox (com.softwaremill.ox)

Category

logic-error

Severity

logic-error (retry incorrectly stops instead of continuing for free)

Versions

All ox versions with AdaptiveRetry

Reproducer

// In AdaptiveRetry.retryWithErrorMode, the afterAttempt callback:

// Left (error) case:
case Left(value) =>
  if config.resultPolicy.isWorthRetrying(value) then
    if shouldPayFailureCost(Left(value)) then ScheduleStop(!tokenBucket.tryAcquire(failureCost))
    else ScheduleStop.Yes  // BUG: stops retrying instead of continuing for free
  else ScheduleStop.Yes

// Right (non-success) case:
case Right(value) =>
  // ... (success check omitted)
  else if shouldPayFailureCost(Right(value)) then ScheduleStop(!tokenBucket.tryAcquire(failureCost))
  else ScheduleStop.No  // CORRECT: continues retrying for free

To trigger: call retryWithErrorMode or retryEither with a shouldPayFailureCost function that returns false for a Left error. Retrying stops after the first such error, even though:

  1. isWorthRetrying returned true (the error is retryable)
  2. shouldPayFailureCost returned false (a free retry was intended)

Expected behavior

When shouldPayFailureCost returns false, the retry should continue without consuming tokens — matching the Right case behavior (ScheduleStop.No).

Actual behavior

ScheduleStop.Yes (stop = true) is returned, terminating the retry immediately. This contradicts the documentation: "Penalty is paid only if it is decided to retry operation."
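
The effect on the attempt count can be illustrated with a toy retry loop. This is a self-contained model, not ox's scheduler: runAttempts and its parameters are hypothetical names, and the loop only assumes that a "stop" decision made after a failed attempt ends the retry.

```scala
// Toy retry loop: every attempt fails, and `stopAfterFailure` plays the role
// of the afterAttempt decision (true = stop, false = continue).
// Hypothetical helper for illustration only; not part of ox.
def runAttempts(maxAttempts: Int, stopAfterFailure: () => Boolean): Int =
  var attempts = 0
  var stop = false
  while attempts < maxAttempts && !stop do
    attempts += 1
    stop = stopAfterFailure()
  attempts

// Current Left branch: a "stop" decision after the first failure.
val attemptsWithStop = runAttempts(5, () => true)

// Expected behavior: a "continue" decision until the schedule is exhausted.
val attemptsWithContinue = runAttempts(5, () => false)
```

Under this model the first call performs a single attempt and the second performs all five, which mirrors the expected/actual gap described in this report.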

Root cause

File: core/src/main/scala/ox/resilience/AdaptiveRetry.scala, line 79:

else ScheduleStop.Yes  // should be ScheduleStop.No

Fix

Change line 79 from:

else ScheduleStop.Yes

to:

else ScheduleStop.No
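
As a rough sketch of the decision the Left branch would make after this change, the logic can be modeled without any ox dependency. Stop, Bucket, and afterLeft are hypothetical stand-ins (for ScheduleStop, the token bucket, and the Left case of afterAttempt respectively), and the bucket here is a simplified assumption rather than ox's implementation.

```scala
// Toy model of the Left-branch decision. None of these names are ox's API.
enum Stop:
  case Yes, No

// Minimal token bucket: acquiring succeeds only while enough tokens remain.
final class Bucket(private var tokens: Int):
  def tryAcquire(n: Int): Boolean =
    if tokens >= n then
      tokens -= n
      true
    else false

// Decision after a failed (Left) attempt, with the proposed fix applied.
def afterLeft(worthRetrying: Boolean, shouldPay: Boolean, bucket: Bucket, failureCost: Int): Stop =
  if !worthRetrying then Stop.Yes                          // not retryable: stop
  else if !shouldPay then Stop.No                          // free retry: continue (currently stops)
  else if bucket.tryAcquire(failureCost) then Stop.No      // cost paid: continue
  else Stop.Yes                                            // bucket exhausted: stop
```

With this shape, an empty bucket no longer matters when the caller opts out of paying: afterLeft(true, false, Bucket(0), 5) yields Stop.No, while the paying path still stops once the bucket cannot cover failureCost.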

Duplicate check

  • Searched ox GitHub issues for "shouldPayFailureCost", "AdaptiveRetry", "ScheduleStop" — no results related to this specific logic error

Found by

Lane 2, Iteration 6, Strategy B3 (Error Handling Review)

Verified: YES

Evidence: Lines 75-88 of AdaptiveRetry.scala:

case Left(value) =>
  if config.resultPolicy.isWorthRetrying(value) then
    if shouldPayFailureCost(Left(value)) then ScheduleStop(!tokenBucket.tryAcquire(failureCost))
    else ScheduleStop.Yes     // <-- line 79: STOPS retrying
  else ScheduleStop.Yes
case Right(value) =>
  ...
  else if shouldPayFailureCost(Right(value)) then ScheduleStop(!tokenBucket.tryAcquire(failureCost))
  else ScheduleStop.No        // <-- line 88: CONTINUES retrying

Confirmed: asymmetry between Left and Right cases. When shouldPayFailureCost returns false, the Left case returns ScheduleStop.Yes (stop) while the Right case returns ScheduleStop.No (continue). The Left case should also return ScheduleStop.No to continue for free.

Minimal runnable reproducer

Save the following as repro.scala, then run it with scala-cli run --server=false repro.scala:

//> using scala 3.6.4
//> using dep com.softwaremill.ox::core:1.0.4

// Bug 020: ox AdaptiveRetry stops retrying on Left when shouldPayFailureCost returns false.
//
// In AdaptiveRetry.retryWithErrorMode the `Left` branch returns ScheduleStop.Yes when
// `shouldPayFailureCost(Left(_))` is false — incorrectly terminating retry. The symmetric
// `Right` branch returns ScheduleStop.No (continues for free). Per the docs:
//   "Penalty is paid only if it is decided to retry operation."
// So when we explicitly say "don't pay", we should still retry — for free.

import ox.resilience.*
import ox.scheduling.Schedule
import scala.concurrent.duration.*

@main def repro(): Unit =
  val ar = AdaptiveRetry.default // TokenBucket(500), failureCost=5, successReward=1

  // Allow up to 5 attempts (1 initial + 4 retries) with no delay.
  val schedule = Schedule.immediate.maxAttempts(5)

  // Count how many times the operation actually runs.
  var attempts = 0
  val op: () => Either[String, Int] = () =>
    attempts += 1
    Left(s"transient-error-$attempts")

  // shouldPayFailureCost = false  =>  we explicitly opt out of paying the cost.
  // Expected: retry continues for free (5 attempts), like the Right branch does.
  // Actual (bug): retry stops after the first Left (1 attempt).
  val result: Either[String, Int] =
    ar.retryEither(schedule, (_: Either[String, Int]) => false)(op())

  val expectedAttempts = 5
  val actualAttempts = attempts

  println(s"shouldPayFailureCost = (_ => false)")
  println(s"schedule = Schedule.immediate.maxAttempts(5)")
  println(s"result = $result")
  println(s"expected attempts = $expectedAttempts (free retries, like the Right branch)")
  println(s"actual   attempts = $actualAttempts")
  if actualAttempts < expectedAttempts then
    println(s"BUG REPRODUCED: AdaptiveRetry stopped on Left after $actualAttempts attempt(s) " +
            s"even though shouldPayFailureCost returned false (free retry was intended).")
  else
    println("No bug observed.")

Verified locally:

  1. cross-vendor adversarial gate (codex prover + claude skeptic + 2 independent judges, distinct judge_id by vendor) returned accepted
  2. scala-cli run --server=false on the repro above reproduces the symptom (exit code / output mismatch)

AI-assisted report. If I missed context or it's intended behavior, sorry — happy to close. I batch-audited my older filings yesterday and self-closed 8 false positives, so I'm trying to be conservative now.
