Stream Anthropic responses uncompressed #771
Open
xymbol wants to merge 1 commit into
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
            main     #771     +/-
Coverage    87.05%   87.06%
Files       119      119
Lines       5594     5596     +2
Branches    1407     1407
Hits        4870     4872     +2
Misses      724      724
8e01ac9 to e400fd4
Net::HTTP auto-inflates the upstream gzip stream, so SSE chunks stay buffered until Cloudflare flushes its deflate state. This turns ~100 events into 2 bursts and pushes first-chunk arrival from ~1s to ~15s on a 22s response. Set Accept-Encoding: identity on streaming requests. Non-streaming responses keep gzip.
e400fd4 to 5f7a121
What this does
Net::HTTP requests gzip and auto-inflates by default. Cloudflare gzips Anthropic's SSE with infrequent deflate flushes, batching chunk delivery into 2 bursts and pushing first-chunk arrival to ~15s on a 22s response.
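The buffering effect described above can be reproduced locally with Ruby's Zlib bindings. This is a minimal sketch under stated assumptions, not Cloudflare's or Net::HTTP's actual code path: `decodable_bytes_per_event` is a hypothetical helper, the events are made-up SSE-shaped strings, and a raw zlib stream stands in for gzip framing. It compresses each event with a chosen flush mode and reports how many bytes a receiver could inflate immediately after that event.

```ruby
require "zlib"

# Hypothetical helper (illustration only): compress each event with the given
# flush mode and report how many bytes the receiver can decode right after
# each event is written.
def decodable_bytes_per_event(events, flush)
  deflater = Zlib::Deflate.new
  inflater = Zlib::Inflate.new
  events.map do |event|
    compressed = deflater.deflate(event, flush)
    # If the compressor emitted nothing, there is nothing to decode yet.
    compressed.empty? ? 0 : inflater.inflate(compressed).bytesize
  end
end

# Made-up SSE-shaped payloads; real Anthropic events are larger.
events = Array.new(5) { |i| "data: {\"index\": #{i}}\n\n" }

# Without a flush, small events sit inside the compressor's internal buffer.
unflushed = decodable_bytes_per_event(events, Zlib::NO_FLUSH)

# With a sync flush after every event, each one is decodable on arrival.
flushed = decodable_bytes_per_event(events, Zlib::SYNC_FLUSH)

puts "no flush:   #{unflushed.inspect}"
puts "sync flush: #{flushed.inspect}"
```

When the compressor flushes rarely, the client-side inflater has nothing to hand to the SSE parser until the next flush. That is consistent with the bursty delivery reported here.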
Setting Accept-Encoding: identity on streaming requests bypasses Net::HTTP's inflater. Scoped to Anthropic streaming; non-streaming responses still benefit from gzip.
Measured on claude-haiku-4-5, 1500-word completion. Sparkline: each char = 1s, digit = chunks delivered, _ = zero.
9____984528454645464555__92
Type of change
Scope check
Quality check
[:method, :uri], and Net::HTTP still inflates a recorded gzipped response on its own. Re-recording with rake vcr:record[anthropic] is optional and not included here; the only load-bearing diff would be the Accept-Encoding request header value.
spec/ruby_llm/providers/anthropic/streaming_spec.rb asserts the request header (fails without the fix, passes with).
AI-generated code
API changes
Related
anthropics/anthropic-sdk-ruby#182: the official Anthropic Ruby SDK has the same Net::HTTP auto-inflate bug and applies the same one-header fix (Accept-Encoding: identity) on its streaming endpoints.
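The one-header fix can be sketched with plain Net::HTTP. This is an illustration under assumptions, not the diff in this PR or in the SDK: `build_messages_request` is a hypothetical helper, and the URL is used only to construct request objects (no request is sent). It shows that Net::HTTP advertises gzip by default, and that setting Accept-Encoding explicitly turns off its automatic decoding for that request.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (illustration only): build a POST request and opt out of
# compression when the caller intends to stream the response.
def build_messages_request(uri, stream:)
  request = Net::HTTP::Post.new(uri)
  # Setting Accept-Encoding explicitly tells Net::HTTP the caller handles
  # content encoding itself, so it stops auto-inflating the response body.
  request["Accept-Encoding"] = "identity" if stream
  request
end

# Assumed endpoint, used here only to construct the request objects.
uri = URI("https://api.anthropic.com/v1/messages")

streaming = build_messages_request(uri, stream: true)
buffered  = build_messages_request(uri, stream: false)

puts "streaming Accept-Encoding: #{streaming['Accept-Encoding']}"
puts "streaming decode_content:  #{streaming.decode_content}"
puts "buffered Accept-Encoding:  #{buffered['Accept-Encoding']}"
puts "buffered decode_content:   #{buffered.decode_content}"
```

Keeping the override conditional on `stream:` mirrors the scoping described here: only streaming requests give up compression, while ordinary responses still use gzip.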