Clarify whether retry-after delays should respect retry max_delay

### Please read this first

- **Have you read the docs?** Yes. The runner-managed retry docs say `backoff` is the default delay strategy when the policy retries without returning an explicit delay, and `retry_policies.retry_after()` uses the retry-after delay. That is why I am filing this to confirm the intended boundary.
- **Have you searched for related issues?** Yes. I searched for related `retry_after`, `Retry-After`, `max_delay`, and model retry issues/PRs and did not find a direct match.

### Describe the bug

A very large provider `retry_after` hint is used as the runner-managed retry sleep delay even when `ModelRetryBackoffSettings.max_delay` is much smaller. This can make an SDK-managed retry wait for an unexpectedly long time. It is unclear whether `max_delay` is intended to cap only exponential backoff delays or all runner-managed retry sleeps, including retry-after hints.

### Debug information

- Agents SDK version: `0.17.0`
- Python version: Python 3.12.1
- Revision: `683b6e79`

### Repro steps

Run this script from the repository root:

```python
import asyncio
from unittest.mock import patch

import httpx
from openai import APIConnectionError

from agents.retry import (
    ModelRetryAdvice,
    ModelRetryBackoffSettings,
    ModelRetrySettings,
    retry_policies,
)
from agents.run_internal.model_retry import get_response_with_retry


async def main():
    sleeps = []

    async def fake_sleep(delay):
        sleeps.append(delay)

    async def rewind():
        pass

    async def get_response():
        raise APIConnectionError(
            message="temporary failure",
            request=httpx.Request("POST", "https://example.com"),
        )

    with patch("asyncio.sleep", fake_sleep):
        try:
            await get_response_with_retry(
                get_response=get_response,
                rewind=rewind,
                retry_settings=ModelRetrySettings(
                    max_retries=1,
                    backoff=ModelRetryBackoffSettings(max_delay=1.0, jitter=False),
                    policy=retry_policies.retry_after(),
                ),
                get_retry_advice=lambda _request: ModelRetryAdvice(retry_after=999999.0),
                previous_response_id=None,
                conversation_id=None,
            )
        except APIConnectionError:
            pass

    print(sleeps)


asyncio.run(main())
```

Actual result:

```text
[999999.0]
```

### Expected behavior

Please confirm the intended behavior. If `max_delay` is meant to cap every runner-managed retry sleep, the retry-after delay should be capped at `1.0` in this repro. If retry-after hints are intentionally allowed to exceed `max_delay`, the docs should probably call out that `max_delay` only applies when no policy returns an explicit delay.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify whether retry-after delays should respect retry max_delay #3266

Please read this first

Describe the bug

Debug information

Repro steps

Expected behavior

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Clarify whether retry-after delays should respect retry max_delay #3266

Description

Please read this first

Describe the bug

Debug information

Repro steps

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions