diff --git a/architecture/client.md b/architecture/client.md index 118d3bb..1f38bf6 100644 --- a/architecture/client.md +++ b/architecture/client.md @@ -31,8 +31,19 @@ Client(httpx2_client=httpx2.Client(trust_env=False)) AsyncClient(httpx2_client=httpx2.AsyncClient(trust_env=False)) ``` -## Bounded error bodies (`max_error_body_bytes`) +## Bounded response bodies (`max_response_body_bytes`) -Both `Client` and `AsyncClient` accept `max_error_body_bytes: int | None = None`. The default (`None`) is backward-compatible: error bodies are read without a size limit. +Both `Client` and `AsyncClient` accept `max_response_body_bytes: int | None = None`. The default (`None`) is unbounded; a non-`None` value below `1` is rejected with `ValueError` at construction. The cap is **status-agnostic** (a `200` trips it the same as a `500`) and counts **decoded** bytes — the actual in-memory footprint, and the only measure that catches a compression bomb (a 133-byte gzip body decoding to 100 KB). -When set, `stream()` raises `ResponseTooLargeError` on a 4xx/5xx response whose declared `Content-Length` header exceeds the cap — before the body is read. Responses without a declared `Content-Length` (chunked transfer) are still read unbounded: a hard mid-read cap would require httpx2 private API, which this project forbids. +The cap bounds memory that httpware buffers on your behalf, at two sites: + +- **The non-streaming terminal** (`send()` and the per-verb helpers). When a cap is set, the terminal switches from `httpx2.send(request)` to `send(request, stream=True)` and accumulates decoded bytes through the shared `_read_capped` helper, failing fast with `ResponseTooLargeError` the moment the cap is crossed. When the cap is `None`, the terminal keeps the plain buffered `send()` fast path — zero streaming overhead. +- **`stream()`'s internal error pre-read** — the 4xx/5xx body httpware reads so `exc.response.content` works is routed through the same `_read_capped`. 
**User-driven `stream()` iteration is never capped** — you chose streaming to own that memory. + +The declared `Content-Length` is used only as an *early reject* (if even the compressed size already exceeds the cap, fail before reading a byte); it is never an early accept, so the accumulator always runs — chunked and bomb bodies are caught, not waved through. `ResponseTooLargeError.reason` is `"declared"` or `"streamed"` accordingly. Entirely public httpx2 API — no private access. + +**Bodiless responses bypass the cap.** Responses that carry no message body — to a `HEAD` request, or with status `204`/`304` — buffer nothing, so the cap never applies to them even when they declare a large `Content-Length` (`HEAD` legitimately echoes the entity length). These are returned unchanged, preserving their original headers. + +**Rebuilt headers.** The accumulator yields the *decoded* body, so the rebuilt Response drops the wire-encoding headers (`Content-Encoding`, `Transfer-Encoding`, and the now-incorrect compressed `Content-Length`); httpx2 recomputes `Content-Length` from the buffered content. Carrying `Content-Encoding` forward would make httpx2 re-decode already-decoded bytes and raise. + +**Caveat:** on the capped path the buffered response is rebuilt via the public `httpx2.Response(content=...)` constructor, which does not carry `.elapsed` (httpx2 only sets it on its own buffered `send()`). Clients that set a cap and read `response.elapsed` will find it absent; the `None`-cap fast path preserves it. diff --git a/architecture/errors.md b/architecture/errors.md index c516bb5..152d95b 100644 --- a/architecture/errors.md +++ b/architecture/errors.md @@ -18,7 +18,7 @@ The error-mapping table (what `httpx2` exception maps to which `httpware` except The "no `__init__` override" rule scopes only to `StatusError` subclasses. 
Non-status `ClientError` subclasses — `DecodeError`, `MissingDecoderError`, `BulkheadFullError`, `RetryBudgetExhaustedError`, `CircuitOpenError`, `ResponseTooLargeError` — deliberately define `__init__` with keyword-only fields. -`ResponseTooLargeError` is raised from `stream()` when `max_error_body_bytes` is set and a 4xx/5xx response's declared `Content-Length` exceeds the cap. It is a non-status `ClientError`; it does not carry a `StatusError`-style positional `response` and is not in `STATUS_TO_EXCEPTION`. +`ResponseTooLargeError` is raised when `max_response_body_bytes` is set and a response body would exceed the cap — status-agnostic (a `200` can trip it), counting **decoded** bytes. It fires from the non-streaming terminal (`send()`) and from `stream()`'s internal error pre-read; user-driven `stream()` iteration is never capped. The `reason` field discriminates the two trip modes: `"declared"` (the declared `Content-Length` already exceeds the cap, rejected before any byte is read — `content_length` holds it) and `"streamed"` (the decoded body crossed the cap mid-read, the chunked or compression-bomb case, where the true size is unknown by design). It is a non-status `ClientError`; it does not carry a `StatusError`-style positional `response` and is not in `STATUS_TO_EXCEPTION`. Because it is neither a `StatusError`, `NetworkError`, nor `TimeoutError`, it is not retried and does not count toward the circuit breaker. 
## Security: request headers are reachable via `exc.response.request` diff --git a/planning/changes/2026-06-23.03-response-body-cap/design.md b/planning/changes/2026-06-23.03-response-body-cap/design.md new file mode 100644 index 0000000..5e361d0 --- /dev/null +++ b/planning/changes/2026-06-23.03-response-body-cap/design.md @@ -0,0 +1,231 @@ +--- +status: shipped +date: 2026-06-23 +slug: response-body-cap +summary: Replace error-only max_error_body_bytes with a status-agnostic, decoded-byte max_response_body_bytes cap enforced by a streaming capped-accumulator terminal. +supersedes: null +superseded_by: null +pr: 78 +outcome: Shipped via #78 — max_error_body_bytes removed (breaking, pre-1.0) for status-agnostic max_response_body_bytes, enforced at the non-streaming terminal and stream()'s error pre-read via a shared _read_capped accumulator counting decoded bytes (catches compression bombs); Content-Length kept as early-reject only. ResponseTooLargeError gained a declared/streamed reason; >=1 validated. Not retried / not breaker-counted; cap-wins on over-cap retryable 5xx. None-cap keeps the plain send() fast path (.elapsed preserved). 756 tests, 100% coverage. Promoted into architecture/client.md + errors.md; release note 0.15.0. +--- + +# Design: Status-agnostic response-body cap + +## Summary + +Replace the shipped `max_error_body_bytes` knob with a status-agnostic +`max_response_body_bytes` cap that actually bounds memory on the non-streaming +path. Today the cap only fires inside `stream()`, only on 4xx/5xx, and only as a +declared-`Content-Length` pre-check — so a non-streaming `send()` buffers the +whole body before httpware ever gets control, and even `stream()` reads chunked +or compression-bombed error bodies unbounded. 
The new design routes both the +internal terminal and `stream()`'s error pre-read through a single shared +`_read_capped` helper that streams the response, accumulates **decoded** bytes +against the cap, and fails fast with `ResponseTooLargeError` the moment the cap +is crossed. Entirely public httpx2 API — no `httpx2._`. Off by default (`None`). + +## Motivation + +The 2026-06-14 deep audit flagged (Medium) that `max_error_body_bytes` is not a +real cap: for a non-streaming `send()`, `httpx2.Client.send(request)` buffers the +entire body into memory before httpware reaches the decode seam, so there is no +enforcement point at all on the hot path. The existing guard lives only at +`stream()` entry and only rejects when `Content-Length` is declared. + +Two concrete holes: + +1. **The success path is unprotected and is the larger surface.** A typed + `send(response_model=X)` against a `200` with a multi-GB body exhausts the + heap. Memory exhaustion has no status code; an error-only cap bolts the + smaller door and leaves the bigger one open. +2. **Compression bombs defeat the `Content-Length` pre-check.** Verified: a + 133-byte gzip body decodes to 100,000 bytes (`aiter_bytes()` yields the + *decoded* stream; `Content-Length` reports the *compressed* 133). Real bombs + run ~1000:1. A header pre-check waves these straight through. + +Feasibility — the reason this was deferred — is resolved: the audit feared a +true mid-read cap needs httpx2 private API. It does not. `httpx2.{Async,}Client` +expose `send(request, stream=True)`, `Response.aiter_bytes()/iter_bytes()`, and a +public `Response(content=...)` constructor. That is the whole mechanism. + +## Non-goals + +- **A request-body cap.** This bounds response bodies only. +- **Capping user-driven `stream()` iteration.** When the caller iterates chunks + themselves they own the memory; capping it would defeat `stream()`. +- **A general per-connection limit.** That is httpx2's `limits`; orthogonal. 
+- **Reporting the true oversized body size.** When the accumulator trips we stop + at the first chunk over the line and do not know (and will not fabricate) the + total. +- **Preserving `.elapsed` on the capped path.** See Risk; an inherent cost of + rebuilding the `Response` via public API. + +## Design + +### 1. One knob: `max_response_body_bytes` (replaces `max_error_body_bytes`) + +Both `Client` and `AsyncClient` take `max_response_body_bytes: int | None = None`. +`None` (default) is unbounded — backward-compatible behavior. The old +`max_error_body_bytes` is **deleted outright** (no deprecation shim — acceptable +pre-1.0). Construction validates `>= 1` and raises +`ValueError("max_response_body_bytes must be >= 1")`, matching the +`failure_threshold` idiom in `circuit_breaker.py`. `0`/negative are rejected; +`None` is the only way to disable. + +### 2. The cap counts decoded bytes; `Content-Length` is early-reject only + +The accumulator counts what `aiter_bytes()` yields (decoded / decompressed), +because decoded size is the actual memory footprint and is the only thing that +stops a compression bomb. The declared `Content-Length` header (the *compressed* +size) is used **only** as an early reject — if even the compressed size already +exceeds the cap, the decoded body certainly will, so we fail before reading a +byte. It is **never** an early accept: a small/absent `Content-Length` says +nothing about decoded size, so the accumulator always runs regardless. + +### 3. Shared capped reader + pure accumulator core + +A pure core, trivially property-testable: + +```python +def _accumulate_capped(chunks: Iterable[bytes], cap: int) -> bytes: + buf = bytearray() + for chunk in chunks: + buf += chunk + if len(buf) > cap: + raise _CapExceeded(read=len(buf)) # internal signal + return bytes(buf) +``` + +`bytearray` grown in place (no transient list + `b"".join` double allocation), +one `bytes()` at the end. 
A sync `_read_capped` and async `_read_capped_async` +wrap it with the early reject and the `Response` rebuild: + +```python +async def _read_capped_async(response, cap, request) -> httpx2.Response: + cl = _parse_content_length(response.headers.get("content-length")) + if cl is not None and cl > cap: + raise ResponseTooLargeError(status_code=response.status_code, limit=cap, + content_length=cl, reason="declared") + try: + content = _accumulate_capped_sync_over(response.aiter_bytes(), cap) # async variant + except _CapExceeded: + raise ResponseTooLargeError(status_code=response.status_code, limit=cap, + content_length=cl, reason="streamed") + return httpx2.Response(status_code=response.status_code, headers=response.headers, + content=content, request=request, + extensions=_safe_extensions(response.extensions), + history=response.history) +``` + +`_read_capped` takes a *Response*, not a client — so it is agnostic to whether +the response came from the request-based terminal `send(stream=True)` or +`stream()`'s method+url path. It never closes the stream; the caller owns +lifecycle. `_safe_extensions` copies `http_version`/`reason_phrase` and drops the +now-stale `network_stream` (the buffered Response never uses it). + +### 4. Terminal: branch on `cap is None` + +`_terminal` keeps the plain fast path when the cap is off, so non-cap users pay +zero streaming overhead and keep `.elapsed`. Only when a cap is set does it +stream and route through `_read_capped`, owning the stream lifecycle: + +```python +async def _terminal(self, request): + async with _httpx2_exception_mapper(): + if self._max_response_body_bytes is None: + response = await self._httpx2_client.send(request) # unchanged fast path + else: + resp = await self._httpx2_client.send(request, stream=True) + try: + response = await _read_capped_async(resp, self._max_response_body_bytes, request) + finally: + await resp.aclose() + _raise_on_status_error(response) + return response +``` + +### 5. 
`stream()`: error pre-read routed through the same helper + +`stream()`'s existing 4xx/5xx pre-read (`await response.aread()`, guarded today +by the `Content-Length`-only check) is replaced by `_read_capped`. The user-driven +success path is untouched. This is the only place `stream()` itself buffers, so +the cap reaches it there and nowhere else; `exc.response.content` still works, +now bounded, and chunked/bombed error bodies are caught instead of waved through. + +### 6. `ResponseTooLargeError` gains an explicit `reason` + +Status-agnostic now (`status_code` can be `200`). Two trip modes carry different +information, so the discriminator is explicit rather than inferred: + +- `limit: int` — the cap (always known). +- `status_code: int` — always known; distinguishes a 200 trip from a 5xx. +- `content_length: int | None` — the server's *declared* header, nullable, + informational only. +- `reason: typing.Literal["declared", "streamed"]` — `"declared"` = early reject + on `Content-Length`; `"streamed"` = accumulator crossed the cap (the + bomb/chunked case). No `bytes_read`/"actual size" — never measured, never + fabricated. + +Stays a non-status `ClientError` with the existing `__init__` + `__reduce__` +(per the `errors.md` rule). Message reads correctly per mode. + +### 7. Resilience interaction (falls out of the hierarchy, no special-casing) + +Because `ResponseTooLargeError` is a `ClientError` (not +`StatusError`/`NetworkError`/`TimeoutError`): + +- **Retry** (`_RETRYABLE_EXCEPTIONS`): not retryable — an over-cap body recurs; + retrying wastes bandwidth. +- **Circuit breaker**: not a counted failure — hits `except BaseException`, slot + released, neither success nor failure recorded. Cannot trip the breaker. +- **Bulkhead**: releases its slot normally. 
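The retry bullet above can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in: the exception classes only imitate the hierarchy the design describes, and `send_with_retry` and `RETRYABLE` are illustrative names, not httpware's retry middleware or its `_RETRYABLE_EXCEPTIONS` tuple.

```python
from collections.abc import Callable


class ClientError(Exception):
    """Stand-in base class (assumption: mirrors the documented hierarchy)."""


class NetworkError(ClientError):
    """Stand-in for a retryable transport failure."""


class ResponseTooLargeError(ClientError):
    """Stand-in for the cap trip; deliberately not a NetworkError."""


# Only these types are retried; anything else propagates immediately.
RETRYABLE: tuple[type[BaseException], ...] = (NetworkError,)


def send_with_retry(terminal: Callable[[], bytes], max_attempts: int = 3) -> bytes:
    """Call `terminal`, retrying only exceptions listed in RETRYABLE."""
    for attempt in range(1, max_attempts + 1):
        try:
            return terminal()
        except RETRYABLE:
            if attempt == max_attempts:
                raise  # budget exhausted, surface the last retryable error
    raise AssertionError("unreachable: loop either returns or raises")
```

Since the cap error sits outside the retryable tuple, it escapes on the first attempt with no special-casing, which is the "falls out of the hierarchy" property the section relies on.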
+ +**Cap-wins / fail-hard:** an otherwise-retryable 5xx whose body exceeds the cap +trips `_read_capped` before status classification, so it surfaces as +`ResponseTooLargeError` (non-retryable) rather than the `StatusError`. Accepted: +the cap is a hard memory-safety limit, retrying would re-fetch the same giant +body, and producing the `StatusError` would require the very buffering we are +refusing. A pathological case (transient error carrying a multi-GB body); a user +who sets a cap is explicitly refusing it. + +## Testing + +- **Pure core — Hypothesis** (`tests/test_capped_read_props.py`): over arbitrary + chunk partitions of a body × arbitrary cap, `_accumulate_capped` raises iff + `len(body) > cap` and returns `body` byte-for-byte otherwise (chunk-boundary + independence — the one subtle invariant). +- **Integration (`MockTransport`, sync + async parity):** within-cap passes; + exactly-at-cap passes (boundary); declared `Content-Length` over cap → + `reason="declared"`, zero bytes read; chunked / no `Content-Length` over cap → + `reason="streamed"`; gzip bomb (133 → 100 K) → `reason="streamed"`; + empty/204/HEAD pass; `ValueError` on `cap < 1`. +- **Resilience:** retry does not retry a `ResponseTooLargeError`; breaker does + not trip; an over-cap retryable 5xx surfaces as `ResponseTooLargeError`. +- **`stream()`:** error pre-read is bounded (declared + streamed); user-driven + success streaming is never capped. +- `just lint && just test` green; coverage preserved. + +## Risk + +- **`.elapsed` dropped on the capped path** (likely × low). Rebuilding the + `Response` via public API loses `.elapsed`, which httpx2 only sets on its own + buffered send. Only affects clients that set a cap *and* read `.elapsed`. + Mitigation: the `cap is None` fast path preserves it for everyone else; + document the caveat in `architecture/client.md`. +- **Breaking removal of `max_error_body_bytes`** (certain × low). A shipped, + exported, documented param disappears. 
Acceptable pre-1.0; called out in + release notes. No silent behavior change — the name is gone, construction + fails loudly if still passed. +- **Stale `extensions` on the rebuilt Response** (unlikely × low). Mitigated by + `_safe_extensions` dropping `network_stream`. +- **Streaming-path overhead vs `send()`** (certain × low). Only paid when a cap + is set; the fast path is untouched. + +## Operations + +None — no out-of-repo steps. + +## Out of scope + +- Deprecation shim for `max_error_body_bytes` (deleted, not aliased). +- Request-body caps, per-connection limits, capping user-driven `stream()`. diff --git a/planning/changes/2026-06-23.03-response-body-cap/plan.md b/planning/changes/2026-06-23.03-response-body-cap/plan.md new file mode 100644 index 0000000..060315d --- /dev/null +++ b/planning/changes/2026-06-23.03-response-body-cap/plan.md @@ -0,0 +1,298 @@ +--- +status: shipped +date: 2026-06-23 +slug: response-body-cap +spec: response-body-cap +pr: 78 +--- + +# response-body-cap — implementation plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use +> superpowers:subagent-driven-development (recommended) or +> superpowers:executing-plans to implement this plan task-by-task. Steps +> use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Replace error-only `max_error_body_bytes` with a status-agnostic, +decoded-byte `max_response_body_bytes` cap enforced by a shared streaming +capped-accumulator on both the terminal and `stream()`'s error pre-read. + +**Spec:** [`design.md`](./design.md) + +**Branch:** `feat/response-body-cap` + +**Commit strategy:** Per-task commits. TDD: each behavioral task writes the +failing test first, then the implementation. + +--- + +### Task 1: `ResponseTooLargeError` gains `reason` + +**Files:** +- Modify: `src/httpware/errors.py` +- Modify: `tests/test_errors.py` (or the suite that covers `ResponseTooLargeError`) + +Make the error status-agnostic-aware with an explicit trip-mode discriminator. +No client wiring yet. 
+ +- [ ] **Step 1: Write failing tests** + + Assert `ResponseTooLargeError(status_code=200, limit=10, content_length=None, + reason="streamed")` constructs, exposes all four fields, and round-trips through + `pickle` (exercises `__reduce__`). Add a `reason="declared"` case. Assert the + message text differs sensibly per `reason`. Run: `just test tests/test_errors.py` + — red. + +- [ ] **Step 2: Add the field** + + Add `reason: typing.Literal["declared", "streamed"]` to the class body and + `__init__` (keyword-only), thread it into the message and `__reduce__` / + `_reconstruct_response_too_large`. Keep it a non-status `ClientError`. Run: + `just test tests/test_errors.py` — green. + +- [ ] **Step 3: Commit** + + ```bash + git add src/httpware/errors.py tests/test_errors.py + git commit -m "feat: add reason discriminator to ResponseTooLargeError + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 2: Pure `_accumulate_capped` core + Hypothesis property test + +**Files:** +- Modify: `src/httpware/client.py` +- Create: `tests/test_capped_read_props.py` + +The one subtle invariant — chunk-boundary independence — isolated behind a pure +function before any I/O wiring. + +- [ ] **Step 1: Write the property test (red)** + + In `tests/test_capped_read_props.py`, use Hypothesis to draw a body (`bytes`) + and a partition into chunks, plus a `cap >= 1`. Assert: `_accumulate_capped` + returns `body` byte-for-byte when `len(body) <= cap`, and raises `_CapExceeded` + when `len(body) > cap` — independent of how the body is split. Annotate test + args. Run: `just test tests/test_capped_read_props.py` — red (symbols absent). + +- [ ] **Step 2: Implement the core** + + Add module-level `class _CapExceeded(Exception)` (carries `read: int`) and + `def _accumulate_capped(chunks: Iterable[bytes], cap: int) -> bytes` using a + `bytearray` grown in place, raising `_CapExceeded(read=len(buf))` the moment + `len(buf) > cap`. 
Run: `just test tests/test_capped_read_props.py` — green. + +- [ ] **Step 3: Commit** + + ```bash + git add src/httpware/client.py tests/test_capped_read_props.py + git commit -m "feat: add pure _accumulate_capped core with property test + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 3: `_read_capped` sync/async wrappers + `_safe_extensions` + +**Files:** +- Modify: `src/httpware/client.py` +- Modify: `tests/test_client.py` (or a focused `tests/test_capped_read.py`) + +Wrap the core with the `Content-Length` early reject and the `Response` rebuild. +Helpers take a `Response`, not a client; they never close the stream. + +- [ ] **Step 1: Write failing unit tests** + + Build streaming responses via `MockTransport` + `httpx2.{Async,}Client` and call + `_read_capped` / `_read_capped_async` directly (or through a thin harness): + within-cap returns a buffered `Response` with byte-identical `.content`; + declared `Content-Length > cap` raises `reason="declared"` having read zero; + chunked over-cap raises `reason="streamed"`; gzip bomb (133 → 100 K) raises + `reason="streamed"`; rebuilt `Response.extensions` has no `network_stream` but + keeps `http_version`. Run — red. + +- [ ] **Step 2: Implement** + + Add `_safe_extensions(ext)` (copy, preserve `http_version`/`reason_phrase`, drop + `network_stream`), then `_read_capped` (sync, `iter_bytes`) and + `_read_capped_async` (async, `aiter_bytes`). Each: parse `Content-Length` via + `_parse_content_length`, early-reject → `ResponseTooLargeError(reason="declared")`; + feed the byte iterator to `_accumulate_capped`, `except _CapExceeded` → + `ResponseTooLargeError(reason="streamed")`; else rebuild + `httpx2.Response(status_code=…, headers=…, content=…, request=…, + extensions=_safe_extensions(…), history=…)`. Run — green. 
+ +- [ ] **Step 3: Commit** + + ```bash + git add src/httpware/client.py tests/ + git commit -m "feat: add shared _read_capped streaming accumulator + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 4: Rename param, validate, branch the terminal (both clients) + +**Files:** +- Modify: `src/httpware/client.py` +- Modify: `tests/test_client.py` + +Swap `max_error_body_bytes` → `max_response_body_bytes` on `AsyncClient` and +`Client`; delete the old name entirely; wire the terminal. + +- [ ] **Step 1: Write failing tests** + + For both clients: `ValueError` when `max_response_body_bytes < 1` (test `0` and + `-1`); a non-streaming `send()` against an over-cap body raises + `ResponseTooLargeError` (declared and streamed); within-cap `send()` returns + normally with intact `.content`; `max_response_body_bytes=None` leaves behavior + unchanged. Run — red. + +- [ ] **Step 2: Implement** + + Rename the ctor param + `self._max_*` attr on both clients; add the `>= 1` + validation raising `ValueError("max_response_body_bytes must be >= 1")`. In + `_terminal` / sync terminal: branch on `is None` — keep plain `send(request)` + fast path; else `send(request, stream=True)` inside `try/finally: aclose()`, + routed through `_read_capped[_async]`. Keep `_raise_on_status_error` after. + Run — green. + +- [ ] **Step 3: Commit** + + ```bash + git add src/httpware/client.py tests/test_client.py + git commit -m "feat!: replace max_error_body_bytes with max_response_body_bytes + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 5: Route `stream()` error pre-read through `_read_capped` + +**Files:** +- Modify: `src/httpware/client.py` +- Modify: `tests/test_client.py` (streaming cases) + +Replace the `Content-Length`-only block + `await response.aread()` in both +`stream()` methods with the shared helper; leave user-driven streaming uncapped. 
+ +- [ ] **Step 1: Write failing tests** + + In `stream()`: an over-cap 4xx/5xx error body raises `ResponseTooLargeError` + (declared and streamed, incl. a chunked/no-`Content-Length` case); a within-cap + error still raises the `StatusError` with `exc.response.content` populated; a + user iterating a large **2xx** body is never capped. Sync + async. Run — red. + +- [ ] **Step 2: Implement** + + In each `stream()` error branch (`400 <= status < 600`): replace the guard + + `aread()` with `capped = _read_capped[_async](response, cap, response.request)` + then `_raise_on_status_error(capped)`. Only when `cap is not None`; otherwise + keep the existing unbounded `aread()`. Do not touch the success `yield`. Run — + green. + +- [ ] **Step 3: Commit** + + ```bash + git add src/httpware/client.py tests/test_client.py + git commit -m "feat: bound stream() error pre-read via _read_capped + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 6: Resilience-interaction tests + +**Files:** +- Modify: `tests/` (retry + circuit-breaker suites) + +Lock the fall-out behavior so a future refactor can't silently make +`ResponseTooLargeError` retryable or breaker-counting. + +- [ ] **Step 1: Write tests (expect green)** + + With a retry middleware wrapping an over-cap response: assert exactly one + terminal attempt and `ResponseTooLargeError` propagates (not retried). With a + circuit breaker: assert a cap trip records neither success nor failure and never + opens the breaker. Assert an over-cap **retryable 5xx** surfaces as + `ResponseTooLargeError`, not the `StatusError` (cap-wins). Run — green (no prod + code change expected; if red, the hierarchy assumption broke — stop and + reconcile with the spec). 
+ +- [ ] **Step 2: Commit** + + ```bash + git add tests/ + git commit -m "test: lock ResponseTooLargeError resilience semantics + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 7: Docs, deferred cleanup, index, release notes + +**Files:** +- Modify: `architecture/client.md`, `architecture/errors.md` +- Modify: `planning/deferred.md` +- Modify: `planning/changes/README.md` (generated — via `just index`) +- Create: `planning/releases/.md` (if a release is cut) +- Modify: `design.md`/`plan.md` frontmatter (`status: shipped`, `pr`, `outcome`) + +Promote conclusions into the living architecture docs and retire the deferred +item. + +- [ ] **Step 1: Architecture docs** + + Rewrite `architecture/client.md` "Bounded error bodies" → "Bounded response + bodies": status-agnostic, decoded-byte, bomb-aware, `Content-Length` + early-reject-only, `stream()` interaction, `cap is None` fast path, and the + `.elapsed` caveat. Update the `ResponseTooLargeError` entry in + `architecture/errors.md` (new `reason`, status-agnostic semantics). + +- [ ] **Step 2: Retire the deferred item** + + Remove the "Non-streaming hard response-body cap" bullet from + `planning/deferred.md`. + +- [ ] **Step 3: Regenerate the index** + + ```bash + just index + ``` + +- [ ] **Step 4: Set ship frontmatter + commit** + + Set `status: shipped` + `pr` + `outcome` on both bundle files. Add release + notes if a version is cut (note the breaking `max_error_body_bytes` removal). + + ```bash + git add architecture/ planning/ + git commit -m "docs: promote response-body cap into architecture; retire deferred item + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 8: Full verification + +- [ ] **Step 1: Lint + full suite** + + ```bash + just lint && just test + ``` + + Confirm green and coverage preserved. Grep guard: + `grep -rE 'httpx2\._' src/httpware/` returns nothing; + `grep -rn 'max_error_body_bytes' src/ architecture/` returns nothing. 
+ +- [ ] **Step 2: Open the PR** per `finishing-a-development-branch`. diff --git a/planning/deferred.md b/planning/deferred.md index 608d205..4624f45 100644 --- a/planning/deferred.md +++ b/planning/deferred.md @@ -16,7 +16,3 @@ As of 0.7.0, all planned epics (3, 4, 5, 6) are closed — see the [change Index - **Count-based window variant** (`window_type="count"`) — time-based + `minimum_calls` already covers the fixed-sample-size rationale, and count-based adds a real staleness downside for HTTP health detection (a low-traffic "last N calls" window can reflect outcomes from minutes ago). Polly v8 *removed* count-based; Hystrix and Envoy are time-based. For a spiky low-volume backend, a longer `window_seconds` + `minimum_calls` is the better tool. Revisit only on concrete Resilience4j-parity demand. - **Slow-call-rate dimension** — Resilience4j-only, and redundant with `AsyncTimeout`. - -### Documentation - -- **Non-streaming hard response-body cap** (2026-06-14 deep audit, Medium) — for a non-streaming `send()`, httpx2 buffers the whole body before httpware reaches the decode seam, so a true cap needs a streaming-with-capped-accumulator rework of the Seam-A terminal. The current `max_error_body_bytes` guard only applies at `stream()` entry and only when `Content-Length` is declared. Revisit trigger: the Seam-A terminal is next reworked, or a concrete large-response abuse is reported. (`src/httpware/client.py`) diff --git a/planning/releases/0.15.0.md b/planning/releases/0.15.0.md new file mode 100644 index 0000000..7146fcf --- /dev/null +++ b/planning/releases/0.15.0.md @@ -0,0 +1,62 @@ +# httpware 0.15.0 — status-agnostic response-body cap (`max_response_body_bytes`) + +**Minor release. Contains one breaking change** (pre-1.0): the opt-in +`max_error_body_bytes` parameter is replaced by `max_response_body_bytes`. 
+ +This release turns the error-only body guard into a real, status-agnostic memory +cap that is actually enforced on the non-streaming `send()` path and against +compression bombs. + +## Breaking change + +`max_error_body_bytes` is **removed** and replaced by `max_response_body_bytes` +on both `Client` and `AsyncClient`. There is no compatibility alias — passing the +old keyword raises `TypeError`. + +```python +# before +client = AsyncClient(max_error_body_bytes=1_000_000) +# after +client = AsyncClient(max_response_body_bytes=1_000_000) +``` + +`None` (the default) remains unbounded. A non-`None` value below `1` is now +rejected with `ValueError` at construction. + +## What changed and why + +The old `max_error_body_bytes` only fired inside `stream()`, only on 4xx/5xx, and +only as a declared-`Content-Length` pre-check. For a non-streaming `send()`, +httpx2 buffered the whole body before httpware got control, so the hot path had +no cap at all — and a small compressed body could decode to something enormous +(a 133-byte gzip body decodes to 100 KB; real bombs run ~1000:1) and slip past a +header check entirely. + +`max_response_body_bytes`: + +- Is **status-agnostic** — a `200` is capped the same as a `500`. Memory + exhaustion has no status code, and the success path is the larger surface. +- Counts **decoded** bytes (the in-memory footprint), so compression bombs are + caught. +- Is enforced at the non-streaming terminal (`send()` and the per-verb helpers) + via a streaming capped-accumulator, and on `stream()`'s internal error + pre-read. **User-driven `stream()` iteration is never capped.** +- Fails fast with `ResponseTooLargeError`, which now carries a `reason` field: + `"declared"` (declared `Content-Length` over the cap, rejected before a byte is + read) or `"streamed"` (the decoded body crossed the cap mid-read). 
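To make the decoded-byte point concrete, here is a standard-library-only sketch, assuming a gzip body of repeated zero bytes as the "bomb". `accumulate_capped` mirrors the shape of the accumulator described in these notes, while `CapExceeded` and `decoded_chunks` are hypothetical names introduced for the illustration; none of this is the httpware implementation.

```python
import gzip
import io
from collections.abc import Iterable, Iterator


class CapExceeded(Exception):
    """Hypothetical signal: decoded bytes crossed the cap mid-read."""

    def __init__(self, *, read: int) -> None:
        self.read = read
        super().__init__(f"decoded body exceeded cap after {read} bytes")


def decoded_chunks(compressed: bytes, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield the gzip-decoded body incrementally, one bounded chunk at a time."""
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as stream:
        while chunk := stream.read(chunk_size):
            yield chunk


def accumulate_capped(chunks: Iterable[bytes], cap: int) -> bytes:
    """Concatenate chunks, raising as soon as the running total exceeds `cap`."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        if len(buf) > cap:
            raise CapExceeded(read=len(buf))
    return bytes(buf)


payload = b"\x00" * 100_000
compressed = gzip.compress(payload)  # tiny on the wire, large once decoded

try:
    accumulate_capped(decoded_chunks(compressed), cap=10_000)
except CapExceeded as exc:
    print(f"stopped early after {exc.read} decoded bytes")
```

A header check would compare `len(compressed)` against the cap and let the body through; counting what the decoder yields is what stops the read early.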
+ +The declared `Content-Length` is kept only as an early reject (never an early +accept), so chunked and bomb bodies are always run through the accumulator. + +## Semantics + +- `ResponseTooLargeError` is a non-status `ClientError`: it is **not retried** and + does **not** count toward the circuit breaker. +- An otherwise-retryable 5xx whose body exceeds the cap surfaces as + `ResponseTooLargeError` (cap-wins / fail-hard), not the status error — retrying + would only re-fetch the oversized body. +- On the capped path the buffered response is rebuilt via the public + `httpx2.Response(content=...)` constructor and therefore has no `.elapsed`. The + default (`None`-cap) fast path keeps plain `send()` and preserves `.elapsed`. + +All public API is honored — no httpx2 private access. diff --git a/src/httpware/client.py b/src/httpware/client.py index a3e086b..a6b52bf 100644 --- a/src/httpware/client.py +++ b/src/httpware/client.py @@ -2,7 +2,7 @@ import contextlib import typing -from collections.abc import AsyncIterator, Iterator, Sequence +from collections.abc import AsyncIterator, Iterator, Mapping, Sequence from http import HTTPStatus import httpx2 @@ -32,6 +32,15 @@ ) +_MAX_RESPONSE_BODY_BYTES_INVALID = "max_response_body_bytes must be >= 1" + + +def _validate_max_response_body_bytes(cap: int | None) -> None: + """Reject a non-None cap below 1. None means unbounded (the default).""" + if cap is not None and cap < 1: + raise ValueError(_MAX_RESPONSE_BODY_BYTES_INVALID) + + def _parse_content_length(raw: str | None) -> int | None: """Return a non-negative int Content-Length, or None for missing/garbage. Never raises.""" if raw is None: @@ -43,6 +52,123 @@ def _parse_content_length(raw: str | None) -> int | None: return value if value >= 0 else None +class _CapExceeded(Exception): # noqa: N818 — internal control-flow signal, not a user-facing error + """Internal signal: decoded bytes crossed the cap mid-read. 
Carries bytes read so far.""" + + def __init__(self, *, read: int) -> None: + self.read = read + super().__init__(f"decoded body exceeded cap after {read} bytes") + + +def _accumulate_capped(chunks: typing.Iterable[bytes], cap: int) -> bytes: + """Concatenate `chunks`, raising `_CapExceeded` the moment the running total exceeds `cap`. + + Counts decoded bytes (the in-memory footprint). Grown in a single bytearray + so there is no transient list-plus-join double allocation. + """ + buf = bytearray() + for chunk in chunks: + buf += chunk + if len(buf) > cap: + raise _CapExceeded(read=len(buf)) + return bytes(buf) + + +def _safe_extensions(extensions: Mapping[str, typing.Any]) -> dict[str, typing.Any]: + """Copy response extensions, dropping the now-stale `network_stream`. + + The rebuilt buffered Response never touches its network stream, so carrying a + consumed/closed one wholesale is sloppy. `http_version`/`reason_phrase` and + any other keys are preserved. + """ + return {key: value for key, value in extensions.items() if key != "network_stream"} + + +# Headers describing the wire encoding of the body. The accumulator yields the +# DECODED body, so these no longer apply; httpx2 recomputes content-length from +# the buffered content. Carrying content-encoding forward makes httpx2 try to +# re-decode already-decoded bytes and raise. +_WIRE_BODY_HEADERS = ("content-encoding", "content-length", "transfer-encoding") +_BODILESS_STATUS = frozenset({HTTPStatus.NO_CONTENT, HTTPStatus.NOT_MODIFIED}) # 204, 304 + + +def _buffered_headers(headers: httpx2.Headers) -> httpx2.Headers: + """Copy `headers`, stripping wire-encoding headers stale after decoding+buffering.""" + out = httpx2.Headers(headers) + for name in _WIRE_BODY_HEADERS: + if name in out: + del out[name] + return out + + +def _response_has_body(method: str, status_code: int) -> bool: + """Whether a response carries a message body (RFC 9110 §6.4.1). 
+ + HEAD responses and 204/304 never have a body regardless of a declared + Content-Length, so they must never trip the cap. + """ + return method.upper() != "HEAD" and status_code not in _BODILESS_STATUS + + +def _read_capped(response: httpx2.Response, cap: int, request: httpx2.Request) -> httpx2.Response: + """Buffer a streaming sync `response` under `cap` decoded bytes; return a buffered Response. + + Raises `ResponseTooLargeError` (reason="declared") if the declared + Content-Length already exceeds `cap` — before any byte is read — and + (reason="streamed") if the decoded body crosses `cap` mid-read. Does not + close `response`; the caller owns the stream lifecycle. + """ + if not _response_has_body(request.method, response.status_code): + response.read() # empty body; preserve the original response (and its headers) + return response + content_length = _parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > cap: + raise ResponseTooLargeError( + status_code=response.status_code, limit=cap, content_length=content_length, reason="declared" + ) + try: + content = _accumulate_capped(response.iter_bytes(), cap) + except _CapExceeded: + raise ResponseTooLargeError( + status_code=response.status_code, limit=cap, content_length=content_length, reason="streamed" + ) from None + return httpx2.Response( + status_code=response.status_code, + headers=_buffered_headers(response.headers), + content=content, + request=request, + extensions=_safe_extensions(response.extensions), + history=response.history, + ) + + +async def _read_capped_async(response: httpx2.Response, cap: int, request: httpx2.Request) -> httpx2.Response: + """Async mirror of `_read_capped` (counts decoded bytes from `aiter_bytes`).""" + if not _response_has_body(request.method, response.status_code): + await response.aread() # empty body; preserve the original response (and its headers) + return response + content_length = 
_parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > cap: + raise ResponseTooLargeError( + status_code=response.status_code, limit=cap, content_length=content_length, reason="declared" + ) + buf = bytearray() + async for chunk in response.aiter_bytes(): + buf += chunk + if len(buf) > cap: + raise ResponseTooLargeError( + status_code=response.status_code, limit=cap, content_length=content_length, reason="streamed" + ) + return httpx2.Response( + status_code=response.status_code, + headers=_buffered_headers(response.headers), + content=bytes(buf), + request=request, + extensions=_safe_extensions(response.extensions), + history=response.history, + ) + + def _build_default_decoders() -> tuple[ResponseDecoder, ...]: """Construct the default decoder tuple based on installed extras. @@ -94,7 +220,7 @@ class AsyncClient: _decoders: tuple[ResponseDecoder, ...] _user_middleware: tuple[AsyncMiddleware, ...] _dispatch: AsyncNext - _max_error_body_bytes: int | None + _max_response_body_bytes: int | None def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call API self, @@ -109,8 +235,9 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call httpx2_client: httpx2.AsyncClient | None = None, decoders: Sequence[ResponseDecoder] | None = None, middleware: Sequence[AsyncMiddleware] = (), - max_error_body_bytes: int | None = None, + max_response_body_bytes: int | None = None, ) -> None: + _validate_max_response_body_bytes(max_response_body_bytes) if httpx2_client is not None: forwarded = { "base_url": base_url, @@ -148,12 +275,20 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call self._decoder_resolver = _DecoderResolver(self._decoders) self._user_middleware = tuple(middleware) self._dispatch = compose_async(self._user_middleware, self._terminal) - self._max_error_body_bytes = max_error_body_bytes + self._max_response_body_bytes = 
max_response_body_bytes async def _terminal(self, request: httpx2.Request) -> httpx2.Response: + cap = self._max_response_body_bytes try: async with _httpx2_exception_mapper(): - response = await self._httpx2_client.send(request) + if cap is None: + response = await self._httpx2_client.send(request) + else: + streaming = await self._httpx2_client.send(request, stream=True) + try: + response = await _read_capped_async(streaming, cap, request) + finally: + await streaming.aclose() except RuntimeError as exc: if self._httpx2_client.is_closed: raise TransportError(str(exc)) from exc @@ -1015,16 +1150,13 @@ async def stream( # noqa: PLR0913, C901 — mirrors httpx2 per-method signature async with _httpx2_exception_mapper(), self._httpx2_client.stream(method, url, **kwargs) as response: if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx - if self._max_error_body_bytes is not None: - content_length = _parse_content_length(response.headers.get("content-length")) - if content_length is not None and content_length > self._max_error_body_bytes: - raise ResponseTooLargeError( - status_code=response.status_code, - limit=self._max_error_body_bytes, - content_length=content_length, - ) - await response.aread() # pre-read body so exc.response.content works - _raise_on_status_error(response) + cap = self._max_response_body_bytes + if cap is None: + await response.aread() # pre-read body so exc.response.content works + _raise_on_status_error(response) + else: + # Bound the error pre-read; raises ResponseTooLargeError when over cap. + _raise_on_status_error(await _read_capped_async(response, cap, response.request)) yield response async def __aenter__(self) -> typing.Self: @@ -1060,7 +1192,7 @@ class Client: _decoders: tuple[ResponseDecoder, ...] _user_middleware: tuple[Middleware, ...] 
_dispatch: Next - _max_error_body_bytes: int | None + _max_response_body_bytes: int | None def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call API self, @@ -1075,8 +1207,9 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call httpx2_client: httpx2.Client | None = None, decoders: Sequence[ResponseDecoder] | None = None, middleware: Sequence[Middleware] = (), - max_error_body_bytes: int | None = None, + max_response_body_bytes: int | None = None, ) -> None: + _validate_max_response_body_bytes(max_response_body_bytes) if httpx2_client is not None: forwarded = { "base_url": base_url, @@ -1114,12 +1247,20 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call self._decoder_resolver = _DecoderResolver(self._decoders) self._user_middleware = tuple(middleware) self._dispatch = compose(self._user_middleware, self._terminal) - self._max_error_body_bytes = max_error_body_bytes + self._max_response_body_bytes = max_response_body_bytes def _terminal(self, request: httpx2.Request) -> httpx2.Response: + cap = self._max_response_body_bytes try: with _httpx2_exception_mapper_sync(): - response = self._httpx2_client.send(request) + if cap is None: + response = self._httpx2_client.send(request) + else: + streaming = self._httpx2_client.send(request, stream=True) + try: + response = _read_capped(streaming, cap, request) + finally: + streaming.close() except RuntimeError as exc: if self._httpx2_client.is_closed: raise TransportError(str(exc)) from exc @@ -2003,14 +2144,11 @@ def stream( # noqa: PLR0913, C901 — mirrors httpx2 per-method signatures; kwa with _httpx2_exception_mapper_sync(), self._httpx2_client.stream(method, url, **kwargs) as response: if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx - if self._max_error_body_bytes is not None: - content_length = _parse_content_length(response.headers.get("content-length")) - if content_length 
is not None and content_length > self._max_error_body_bytes: - raise ResponseTooLargeError( - status_code=response.status_code, - limit=self._max_error_body_bytes, - content_length=content_length, - ) - response.read() # pre-read body so exc.response.content works - _raise_on_status_error(response) + cap = self._max_response_body_bytes + if cap is None: + response.read() # pre-read body so exc.response.content works + _raise_on_status_error(response) + else: + # Bound the error pre-read; raises ResponseTooLargeError when over cap. + _raise_on_status_error(_read_capped(response, cap, response.request)) yield response diff --git a/src/httpware/errors.py b/src/httpware/errors.py index d14f255..a539aee 100644 --- a/src/httpware/errors.py +++ b/src/httpware/errors.py @@ -15,7 +15,7 @@ import builtins from collections.abc import Mapping -from typing import Any +from typing import Any, Literal import httpx2 @@ -320,33 +320,51 @@ def _reconstruct_response_too_large( status_code: int, limit: int, content_length: int | None, + reason: 'Literal["declared", "streamed"]', ) -> "ResponseTooLargeError": - return cls(status_code=status_code, limit=limit, content_length=content_length) + return cls(status_code=status_code, limit=limit, content_length=content_length, reason=reason) class ResponseTooLargeError(ClientError): - """Raised when an error response body exceeds the client's max_error_body_bytes cap. - - Fires from stream() on a 4xx/5xx whose declared Content-Length exceeds the - configured cap, BEFORE the body is read — so the oversized body is never - buffered. Only raised when max_error_body_bytes is set (opt-in). + """Raised when a response body exceeds the client's max_response_body_bytes cap. + + Status-agnostic: fires on any non-streaming send() and on stream()'s internal + error pre-read, counting DECODED bytes. Only raised when + max_response_body_bytes is set (opt-in). 
`reason` discriminates the two trip + modes: + + - "declared": the response's declared Content-Length already exceeds the cap, + so the body is rejected BEFORE a byte is read (`content_length` holds it). + - "streamed": the decoded body crossed the cap mid-read (the chunked or + compression-bomb case); `content_length` is whatever the server declared + and is unrelated to the cap. The true oversized size is unknown by design. """ status_code: int limit: int content_length: int | None + reason: Literal["declared", "streamed"] - def __init__(self, *, status_code: int, limit: int, content_length: int | None) -> None: + def __init__( + self, + *, + status_code: int, + limit: int, + content_length: int | None, + reason: Literal["declared", "streamed"], + ) -> None: self.status_code = status_code self.limit = limit self.content_length = content_length - super().__init__( - f"error response body too large: status={status_code} " - f"content_length={content_length} exceeds max_error_body_bytes={limit}" - ) + self.reason = reason + if reason == "declared": + detail = f"declared content_length={content_length} exceeds max_response_body_bytes={limit}" + else: + detail = f"decoded body exceeded max_response_body_bytes={limit}" + super().__init__(f"response body too large: status={status_code} {detail}") def __reduce__(self) -> tuple[Any, ...]: return ( _reconstruct_response_too_large, - (type(self), self.status_code, self.limit, self.content_length), + (type(self), self.status_code, self.limit, self.content_length, self.reason), ) diff --git a/tests/test_capped_read.py b/tests/test_capped_read.py new file mode 100644 index 0000000..66cec47 --- /dev/null +++ b/tests/test_capped_read.py @@ -0,0 +1,241 @@ +"""Unit tests for the shared _read_capped wrappers (sync + async). 
+ +Drive real streaming responses through MockTransport, then hand the streaming +Response to _read_capped / _read_capped_async directly — exercising the +Content-Length early reject, the decoded-byte accumulator, the rebuilt Response, +and extension sanitisation, independent of client wiring. +""" + +import gzip +from collections.abc import AsyncIterator + +import httpx2 +import pytest + +from httpware.client import _read_capped, _read_capped_async +from httpware.errors import ResponseTooLargeError + + +def _sync_stream(handler: object, method: str = "GET") -> tuple[httpx2.Client, httpx2.Response]: + client = httpx2.Client(transport=httpx2.MockTransport(handler)) # ty: ignore[invalid-argument-type] + request = client.build_request(method, "https://example.test/x") + return client, client.send(request, stream=True) + + +async def _async_stream(handler: object, method: str = "GET") -> tuple[httpx2.AsyncClient, httpx2.Response]: + client = httpx2.AsyncClient(transport=httpx2.MockTransport(handler)) # ty: ignore[invalid-argument-type] + request = client.build_request(method, "https://example.test/x") + return client, await client.send(request, stream=True) + + +# ---- sync ---- + + +def test_read_capped_returns_buffered_response_within_cap() -> None: + body = b"hello world" + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client, resp = _sync_stream(handler) + try: + out = _read_capped(resp, 1000, resp.request) + assert out.content == body + assert out.status_code == 200 # noqa: PLR2004 — mirrors handler + assert "network_stream" not in out.extensions + finally: + resp.close() + client.close() + + +def test_read_capped_declared_content_length_over_cap() -> None: + body = b"x" * 200 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(500, content=body) + + client, resp = _sync_stream(handler) + try: + with pytest.raises(ResponseTooLargeError) as 
caught: + _read_capped(resp, 10, resp.request) + assert caught.value.reason == "declared" + assert caught.value.content_length == 200 # noqa: PLR2004 — len(body) + assert caught.value.limit == 10 # noqa: PLR2004 — cap above + finally: + resp.close() + client.close() + + +def test_read_capped_streamed_over_cap_chunked_no_content_length() -> None: + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=(c for c in (b"a" * 50, b"b" * 50))) + + client, resp = _sync_stream(handler) + try: + with pytest.raises(ResponseTooLargeError) as caught: + _read_capped(resp, 10, resp.request) + assert caught.value.reason == "streamed" + assert caught.value.content_length is None + finally: + resp.close() + client.close() + + +def test_read_capped_within_cap_gzip_returns_decoded_content() -> None: + # Regression: rebuilt Response must not re-decompress already-decoded content. + raw = gzip.compress(b"A" * 500) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-encoding": "gzip"}, content=raw) + + client, resp = _sync_stream(handler) + try: + out = _read_capped(resp, 1_000_000, resp.request) + assert out.content == b"A" * 500 # decoded, not re-gzipped/crashed + assert "content-encoding" not in out.headers # stale wire header dropped + assert out.headers["content-length"] == "500" # recomputed from decoded content + finally: + resp.close() + client.close() + + +def test_read_capped_head_with_large_declared_length_not_rejected() -> None: + # Regression: a bodiless HEAD response buffers nothing and must not trip the cap. 
+ def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-length": "50000000"}) + + client, resp = _sync_stream(handler, method="HEAD") + try: + out = _read_capped(resp, 1000, resp.request) + assert out.content == b"" + assert out.headers["content-length"] == "50000000" # entity length preserved for HEAD + finally: + resp.close() + client.close() + + +def test_read_capped_gzip_bomb_trips_on_decoded_bytes() -> None: + raw = gzip.compress(b"A" * 100_000) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-encoding": "gzip"}, content=raw) + + client, resp = _sync_stream(handler) + try: + with pytest.raises(ResponseTooLargeError) as caught: + _read_capped(resp, 1000, resp.request) + assert caught.value.reason == "streamed" # compressed CL (small) passed; decoded tripped + finally: + resp.close() + client.close() + + +def test_read_capped_exact_cap_passes() -> None: + body = b"x" * 10 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client, resp = _sync_stream(handler) + try: + assert _read_capped(resp, 10, resp.request).content == body + finally: + resp.close() + client.close() + + +def test_read_capped_empty_body_passes() -> None: + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(204) + + client, resp = _sync_stream(handler) + try: + assert _read_capped(resp, 1, resp.request).content == b"" + finally: + resp.close() + client.close() + + +# ---- async ---- + + +async def test_read_capped_async_returns_buffered_response_within_cap() -> None: + body = b"hello world" + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client, resp = await _async_stream(handler) + try: + out = await _read_capped_async(resp, 1000, resp.request) + assert out.content 
== body + assert "network_stream" not in out.extensions + finally: + await resp.aclose() + await client.aclose() + + +async def test_read_capped_async_within_cap_gzip_returns_decoded_content() -> None: + raw = gzip.compress(b"A" * 500) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-encoding": "gzip"}, content=raw) + + client, resp = await _async_stream(handler) + try: + out = await _read_capped_async(resp, 1_000_000, resp.request) + assert out.content == b"A" * 500 + assert "content-encoding" not in out.headers + assert out.headers["content-length"] == "500" + finally: + await resp.aclose() + await client.aclose() + + +async def test_read_capped_async_head_with_large_declared_length_not_rejected() -> None: + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-length": "50000000"}) + + client, resp = await _async_stream(handler, method="HEAD") + try: + out = await _read_capped_async(resp, 1000, resp.request) + assert out.content == b"" + assert out.headers["content-length"] == "50000000" + finally: + await resp.aclose() + await client.aclose() + + +async def test_read_capped_async_declared_over_cap() -> None: + body = b"x" * 200 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(500, content=body) + + client, resp = await _async_stream(handler) + try: + with pytest.raises(ResponseTooLargeError) as caught: + await _read_capped_async(resp, 10, resp.request) + assert caught.value.reason == "declared" + assert caught.value.content_length == 200 # noqa: PLR2004 — len(body) + finally: + await resp.aclose() + await client.aclose() + + +async def test_read_capped_async_streamed_over_cap() -> None: + async def body() -> AsyncIterator[bytes]: + yield b"a" * 50 + yield b"b" * 50 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, 
content=body()) + + client, resp = await _async_stream(handler) + try: + with pytest.raises(ResponseTooLargeError) as caught: + await _read_capped_async(resp, 70, resp.request) # trips on the second 50-byte chunk + assert caught.value.reason == "streamed" + finally: + await resp.aclose() + await client.aclose() diff --git a/tests/test_capped_read_props.py b/tests/test_capped_read_props.py new file mode 100644 index 0000000..f08d50a --- /dev/null +++ b/tests/test_capped_read_props.py @@ -0,0 +1,55 @@ +"""Hypothesis property tests for the pure _accumulate_capped core. + +The one subtle invariant of the response-body cap is chunk-boundary +independence: the accumulator must behave identically no matter how the decoded +body is split into chunks. It must raise _CapExceeded iff the total decoded +length exceeds the cap, and otherwise return the body byte-for-byte. +""" + +import pytest +from hypothesis import given +from hypothesis import strategies as st + +from httpware.client import _accumulate_capped, _CapExceeded + + +def _partition(body: bytes, sizes: list[int]) -> list[bytes]: + """Split `body` into chunks following `sizes` (remainder becomes a final chunk).""" + chunks: list[bytes] = [] + pos = 0 + for size in sizes: + if pos >= len(body): + break + chunks.append(body[pos : pos + size]) + pos += size + if pos < len(body): + chunks.append(body[pos:]) + return chunks + + +@given( + body=st.binary(max_size=2048), + sizes=st.lists(st.integers(min_value=1, max_value=64), max_size=64), + cap=st.integers(min_value=1, max_value=4096), +) +def test_accumulate_capped_chunk_boundary_independence(body: bytes, sizes: list[int], cap: int) -> None: + chunks = _partition(body, sizes) + if len(body) > cap: + with pytest.raises(_CapExceeded) as caught: + _accumulate_capped(chunks, cap) + assert caught.value.read > cap + else: + assert _accumulate_capped(chunks, cap) == body + + +@given(body=st.binary(min_size=2, max_size=512)) +def 
test_accumulate_capped_trips_at_one_below_length(body: bytes) -> None: + cap = len(body) - 1 + with pytest.raises(_CapExceeded): + _accumulate_capped([body], cap) + + +@given(body=st.binary(max_size=512)) +def test_accumulate_capped_passes_at_exact_length(body: bytes) -> None: + cap = max(1, len(body)) + assert _accumulate_capped([body], cap) == body diff --git a/tests/test_client_body_cap.py b/tests/test_client_body_cap.py new file mode 100644 index 0000000..fe9d086 --- /dev/null +++ b/tests/test_client_body_cap.py @@ -0,0 +1,206 @@ +"""max_response_body_bytes — non-streaming send() cap + construction validation. + +Covers both clients: the terminal buffers under the cap and fails fast with +ResponseTooLargeError when a response body (any status) exceeds it. stream() +coverage lives in tests/test_client_stream*.py. +""" + +import gzip +from collections.abc import AsyncIterator + +import httpx2 +import pytest + +from httpware import AsyncClient, Client +from httpware.errors import ResponseTooLargeError + + +def _sync(handler: object, cap: int | None) -> Client: + return Client( + httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), # ty: ignore[invalid-argument-type] + max_response_body_bytes=cap, + ) + + +def _async(handler: object, cap: int | None) -> AsyncClient: + return AsyncClient( + httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), # ty: ignore[invalid-argument-type] + max_response_body_bytes=cap, + ) + + +# ---- construction validation ---- + + +@pytest.mark.parametrize("bad", [0, -1]) +def test_async_rejects_cap_below_one(bad: int) -> None: + with pytest.raises(ValueError, match="max_response_body_bytes must be >= 1"): + AsyncClient(max_response_body_bytes=bad) + + +@pytest.mark.parametrize("bad", [0, -1]) +def test_sync_rejects_cap_below_one(bad: int) -> None: + with pytest.raises(ValueError, match="max_response_body_bytes must be >= 1"): + Client(max_response_body_bytes=bad) + + +# ---- sync send() ---- + + +def 
test_sync_send_within_cap_returns_response() -> None: + body = b"hello world" + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client = _sync(handler, 1000) + request = client.build_request("GET", "https://example.test/x") + assert client.send(request).content == body + client.close() + + +def test_sync_send_over_cap_declared_on_success() -> None: + body = b"x" * 200 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client = _sync(handler, 10) + request = client.build_request("GET", "https://example.test/x") + with pytest.raises(ResponseTooLargeError) as caught: + client.send(request) + assert caught.value.reason == "declared" + assert caught.value.status_code == 200 # noqa: PLR2004 — status-agnostic: a 200 trips + client.close() + + +def test_sync_send_over_cap_streamed_gzip_bomb() -> None: + raw = gzip.compress(b"A" * 100_000) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-encoding": "gzip"}, content=raw) + + client = _sync(handler, 1000) + request = client.build_request("GET", "https://example.test/x") + with pytest.raises(ResponseTooLargeError) as caught: + client.send(request) + assert caught.value.reason == "streamed" + client.close() + + +def test_sync_send_within_cap_gzip_returns_decoded() -> None: + raw = gzip.compress(b"A" * 500) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-encoding": "gzip"}, content=raw) + + client = _sync(handler, 1_000_000) + request = client.build_request("GET", "https://example.test/x") + assert client.send(request).content == b"A" * 500 # not re-decompressed/crashed + client.close() + + +def test_sync_head_large_declared_length_not_rejected() -> None: + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return 
httpx2.Response(200, headers={"content-length": "50000000"}) + + client = _sync(handler, 1000) + request = client.build_request("HEAD", "https://example.test/x") + response = client.send(request) + assert response.content == b"" + assert response.headers["content-length"] == "50000000" + client.close() + + +def test_sync_send_none_cap_unbounded() -> None: + body = b"x" * 10_000 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client = _sync(handler, None) + request = client.build_request("GET", "https://example.test/x") + assert client.send(request).content == body + client.close() + + +# ---- async send() ---- + + +async def test_async_send_within_cap_returns_response() -> None: + body = b"hello world" + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client = _async(handler, 1000) + request = client.build_request("GET", "https://example.test/x") + assert (await client.send(request)).content == body + await client.aclose() + + +async def test_async_send_over_cap_declared() -> None: + body = b"x" * 200 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client = _async(handler, 10) + request = client.build_request("GET", "https://example.test/x") + with pytest.raises(ResponseTooLargeError) as caught: + await client.send(request) + assert caught.value.reason == "declared" + await client.aclose() + + +async def test_async_send_over_cap_streamed_chunked() -> None: + async def body() -> AsyncIterator[bytes]: + yield b"a" * 50 + yield b"b" * 50 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body()) + + client = _async(handler, 70) + request = client.build_request("GET", "https://example.test/x") + with pytest.raises(ResponseTooLargeError) as caught: + await client.send(request) + assert 
caught.value.reason == "streamed" + assert caught.value.content_length is None + await client.aclose() + + +async def test_async_send_within_cap_gzip_returns_decoded() -> None: + raw = gzip.compress(b"A" * 500) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-encoding": "gzip"}, content=raw) + + client = _async(handler, 1_000_000) + request = client.build_request("GET", "https://example.test/x") + assert (await client.send(request)).content == b"A" * 500 + await client.aclose() + + +async def test_async_head_large_declared_length_not_rejected() -> None: + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, headers={"content-length": "50000000"}) + + client = _async(handler, 1000) + request = client.build_request("HEAD", "https://example.test/x") + response = await client.send(request) + assert response.content == b"" + assert response.headers["content-length"] == "50000000" + await client.aclose() + + +async def test_async_send_none_cap_unbounded() -> None: + body = b"x" * 10_000 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(200, content=body) + + client = _async(handler, None) + request = client.build_request("GET", "https://example.test/x") + assert (await client.send(request)).content == body + await client.aclose() diff --git a/tests/test_client_stream.py b/tests/test_client_stream.py index 3847eb6..b8f689c 100644 --- a/tests/test_client_stream.py +++ b/tests/test_client_stream.py @@ -347,12 +347,12 @@ def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 return httpx2.Response(500, content=body) client = AsyncClient( - httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_error_body_bytes=10 + httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_response_body_bytes=10 ) with pytest.raises(ResponseTooLargeError) as 
caught: async with client.stream("GET", "https://example.test/x"): pytest.fail("unreachable") - assert caught.value.limit == 10 # noqa: PLR2004 — mirrors max_error_body_bytes above + assert caught.value.limit == 10 # noqa: PLR2004 — mirrors max_response_body_bytes above assert caught.value.content_length == 200 # noqa: PLR2004 — len(body) above await client.aclose() @@ -364,7 +364,7 @@ def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 return httpx2.Response(404, content=body) client = AsyncClient( - httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_error_body_bytes=1000 + httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_response_body_bytes=1000 ) with pytest.raises(NotFoundError) as caught: async with client.stream("GET", "https://example.test/x"): @@ -387,6 +387,58 @@ def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 await client.aclose() +async def test_stream_error_pre_read_streamed_over_cap() -> None: + async def body() -> typing.AsyncIterator[bytes]: + yield b"a" * 50 + yield b"b" * 50 + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(500, content=body()) # chunked: no Content-Length + + client = AsyncClient( + httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_response_body_bytes=70 + ) + with pytest.raises(ResponseTooLargeError) as caught: + async with client.stream("GET", "https://example.test/x"): + pytest.fail("unreachable") + assert caught.value.reason == "streamed" + assert caught.value.content_length is None + await client.aclose() + + +async def test_stream_error_pre_read_within_cap_gzip_decoded() -> None: + import gzip # noqa: PLC0415 — local to this regression test + + raw = gzip.compress(b"boom" * 50) + + def handler(request: httpx2.Request) -> httpx2.Response: # noqa: ARG001 + return httpx2.Response(500, headers={"content-encoding": "gzip"}, content=raw) + + client = 
AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_response_body_bytes=1_000_000
+    )
+    with pytest.raises(InternalServerError) as caught:
+        async with client.stream("GET", "https://example.test/x"):
+            pytest.fail("unreachable")
+    assert caught.value.response.content == b"boom" * 50  # decoded, not re-decompressed
+    await client.aclose()
+
+
+async def test_stream_user_driven_success_body_not_capped() -> None:
+    body = b"x" * 100_000
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(200, content=body)
+
+    client = AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_response_body_bytes=10
+    )
+    async with client.stream("GET", "https://example.test/x") as response:
+        chunks = [chunk async for chunk in response.aiter_bytes()]
+    assert b"".join(chunks) == body  # user-driven streaming is never capped
+    await client.aclose()
+
+
 @pytest.mark.parametrize(
     ("raw", "expected"),
     [(None, None), ("123", 123), ("abc", None), ("-5", None), ("0", 0)],
diff --git a/tests/test_client_stream_sync.py b/tests/test_client_stream_sync.py
index 53c62c2..25df725 100644
--- a/tests/test_client_stream_sync.py
+++ b/tests/test_client_stream_sync.py
@@ -314,10 +314,10 @@ def test_stream_raises_response_too_large_when_over_cap_sync() -> None:
     def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
         return httpx2.Response(500, content=body)
 
-    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_error_body_bytes=10)
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_response_body_bytes=10)
     with pytest.raises(ResponseTooLargeError) as caught, client.stream("GET", "https://example.test/x"):
         pytest.fail("unreachable")
-    assert caught.value.limit == 10  # noqa: PLR2004 — mirrors max_error_body_bytes above
+    assert caught.value.limit == 10  # noqa: PLR2004 — mirrors
max_response_body_bytes above
     assert caught.value.content_length == 200  # noqa: PLR2004 — len(body) above
     client.close()
 
@@ -328,7 +328,7 @@ def test_stream_reads_error_body_when_under_cap_sync() -> None:
     def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
         return httpx2.Response(404, content=body)
 
-    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_error_body_bytes=1000)
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_response_body_bytes=1000)
     with pytest.raises(NotFoundError) as caught, client.stream("GET", "https://example.test/x"):
         pytest.fail("unreachable")
     assert caught.value.response.content == body
@@ -346,3 +346,45 @@ def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
         pytest.fail("unreachable")
     assert caught.value.response.content == body
     client.close()
+
+
+def test_stream_error_pre_read_streamed_over_cap_sync() -> None:
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(500, content=(c for c in (b"a" * 50, b"b" * 50)))  # chunked: no Content-Length
+
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_response_body_bytes=70)
+    with pytest.raises(ResponseTooLargeError) as caught, client.stream("GET", "https://example.test/x"):
+        pytest.fail("unreachable")
+    assert caught.value.reason == "streamed"
+    assert caught.value.content_length is None
+    client.close()
+
+
+def test_stream_error_pre_read_within_cap_gzip_decoded_sync() -> None:
+    import gzip  # noqa: PLC0415 — local to this regression test
+
+    raw = gzip.compress(b"boom" * 50)
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(500, headers={"content-encoding": "gzip"}, content=raw)
+
+    client = Client(
+        httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_response_body_bytes=1_000_000
+    )
+    with
pytest.raises(InternalServerError) as caught, client.stream("GET", "https://example.test/x"):
+        pytest.fail("unreachable")
+    assert caught.value.response.content == b"boom" * 50
+    client.close()
+
+
+def test_stream_user_driven_success_body_not_capped_sync() -> None:
+    body = b"x" * 100_000
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(200, content=body)
+
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_response_body_bytes=10)
+    with client.stream("GET", "https://example.test/x") as response:
+        chunks = list(response.iter_bytes())
+    assert b"".join(chunks) == body  # user-driven streaming is never capped
+    client.close()
diff --git a/tests/test_errors.py b/tests/test_errors.py
index 2a087ae..fbb9fe3 100644
--- a/tests/test_errors.py
+++ b/tests/test_errors.py
@@ -433,18 +433,36 @@ def test_status_error_message_masks_query_secret() -> None:
 
 
 def test_response_too_large_error_fields_and_message() -> None:
-    exc = ResponseTooLargeError(status_code=500, limit=1024, content_length=2048)
+    exc = ResponseTooLargeError(status_code=500, limit=1024, content_length=2048, reason="declared")
     assert exc.status_code == 500  # noqa: PLR2004 — literal mirrors construction above
     assert exc.limit == 1024  # noqa: PLR2004 — literal mirrors construction above
     assert exc.content_length == 2048  # noqa: PLR2004 — literal mirrors construction above
+    assert exc.reason == "declared"
     assert "1024" in str(exc)
     assert "2048" in str(exc)
 
 
+def test_response_too_large_error_status_agnostic_streamed() -> None:
+    exc = ResponseTooLargeError(status_code=200, limit=10, content_length=None, reason="streamed")
+    assert exc.status_code == 200  # noqa: PLR2004 — literal mirrors construction above
+    assert exc.content_length is None
+    assert exc.reason == "streamed"
+    assert "10" in str(exc)
+
+
+def test_response_too_large_error_message_differs_by_reason() -> None:
+    declared =
ResponseTooLargeError(status_code=500, limit=10, content_length=2048, reason="declared")
+    streamed = ResponseTooLargeError(status_code=500, limit=10, content_length=None, reason="streamed")
+    assert str(declared) != str(streamed)
+    assert "2048" in str(declared)
+    assert "2048" not in str(streamed)
+
+
 def test_response_too_large_error_pickle_round_trip() -> None:
-    exc = ResponseTooLargeError(status_code=503, limit=10, content_length=None)
+    exc = ResponseTooLargeError(status_code=503, limit=10, content_length=None, reason="streamed")
     restored = pickle.loads(pickle.dumps(exc))  # noqa: S301 — round-tripping our own exception
     assert isinstance(restored, ResponseTooLargeError)
     assert restored.status_code == 503  # noqa: PLR2004 — literal mirrors construction above
     assert restored.limit == 10  # noqa: PLR2004 — literal mirrors construction above
     assert restored.content_length is None
+    assert restored.reason == "streamed"
diff --git a/tests/test_resilience_body_cap.py b/tests/test_resilience_body_cap.py
new file mode 100644
index 0000000..eadaaee
--- /dev/null
+++ b/tests/test_resilience_body_cap.py
@@ -0,0 +1,70 @@
+"""Resilience interaction with max_response_body_bytes.
+
+ResponseTooLargeError is a non-status ClientError, so it must fall outside the
+retry/circuit-breaker failure classifications. These tests lock that behavior so
+a future refactor can't silently make a cap trip retryable or breaker-counting.
+"""
+
+import httpx2
+import pytest
+
+from httpware import AsyncClient, CircuitState, ResponseTooLargeError
+from httpware.middleware.resilience.circuit_breaker import AsyncCircuitBreaker
+from httpware.middleware.resilience.retry import AsyncRetry
+
+
+class _CountingHandler:
+    """Mock transport that counts calls and always returns the same response."""
+
+    def __init__(self, status: int, body: bytes) -> None:
+        self.status = status
+        self.body = body
+        self.calls = 0
+
+    def __call__(self, request: httpx2.Request) -> httpx2.Response:
+        self.calls += 1
+        return httpx2.Response(self.status, content=self.body, request=request)
+
+
+def _client(handler: _CountingHandler, *, middleware: list[object], cap: int) -> AsyncClient:
+    return AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)),
+        middleware=middleware,  # ty: ignore[invalid-argument-type]
+        max_response_body_bytes=cap,
+    )
+
+
+async def test_response_too_large_is_not_retried() -> None:
+    handler = _CountingHandler(200, b"x" * 200)
+    client = _client(handler, middleware=[AsyncRetry()], cap=10)
+    request = client.build_request("GET", "https://example.test/x")
+    with pytest.raises(ResponseTooLargeError):
+        await client.send(request)
+    assert handler.calls == 1  # not retried — a single terminal attempt
+    await client.aclose()
+
+
+async def test_over_cap_retryable_5xx_surfaces_as_too_large_not_retried() -> None:
+    # 503 is retryable, but the cap trips first: cap-wins / fail-hard.
+    handler = _CountingHandler(503, b"x" * 200)
+    client = _client(handler, middleware=[AsyncRetry()], cap=10)
+    request = client.build_request("GET", "https://example.test/x")
+    with pytest.raises(ResponseTooLargeError) as caught:
+        await client.send(request)
+    assert caught.value.status_code == 503  # noqa: PLR2004 — the retryable status, surfaced not retried
+    assert handler.calls == 1
+    await client.aclose()
+
+
+async def test_response_too_large_does_not_trip_circuit_breaker() -> None:
+    # failure_threshold=1: one real failure would open the circuit; a cap trip must not.
+    handler = _CountingHandler(500, b"x" * 200)
+    breaker = AsyncCircuitBreaker(failure_threshold=1)
+    client = _client(handler, middleware=[breaker], cap=10)
+    request = client.build_request("GET", "https://example.test/x")
+    for _ in range(3):
+        with pytest.raises(ResponseTooLargeError):
+            await client.send(request)
+    assert breaker.state is CircuitState.CLOSED  # neither success nor failure recorded
+    assert handler.calls == 3  # noqa: PLR2004 — breaker never opened, every call reached the transport
+    await client.aclose()