Environment
- hackney 4.4.5, h2 0.10.2, Erlang/OTP 28.4, Elixir 1.19.5 (also seen via Tesla.Adapter.Hackney and HTTPoison 3.0.0, but the behavior is in hackney/h2)
- Peer: a standard HTTP/2 server over TLS (ALPN negotiates h2)
Since upgrading hackney from 1.x to 4.x, POST/PUT requests with a body larger than ~1 MB over HTTP/2 intermittently fail with {error, send_buffer_full}. The same requests worked over HTTP/1.1 (hackney 1.x) and work again in 4.x with {protocols, [http1]}. It reproduces on real uploads: multipart documents and XML/JSON with base64 attachments (~1–5 MB).
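For anyone hitting this in the meantime, the workaround mentioned above can be sketched as follows. This is a minimal sketch: the module and function names are hypothetical, and the only hackney-specific part is the {protocols, [http1]} option from this report, passed through hackney:request/5.

```erlang
-module(upload_http1_workaround).
-export([post/3]).

%% Workaround sketch: pin the request to HTTP/1.1 so the body never goes
%% through the HTTP/2 send buffer. Module and function names are hypothetical.
post(Url, Headers, Body) ->
    %% {protocols, [http1]} is the option reported to avoid the failure in 4.x.
    Options = [{protocols, [http1]}],
    hackney:request(post, Url, Headers, Body, Options).
```

This gives up HTTP/2 for the affected uploads, so it is a stopgap rather than a fix.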
Possible cause
h2 caps a stream's send buffer at ?MAX_SEND_BUFFER_BYTES (1 MB) and returns {error, send_buffer_full} as intentional backpressure ([h2_connection.erl:69 + :2577]). hackney streams the body with the non-blocking send and propagates that error, aborting the upload. It neither waits for WINDOW_UPDATE frames to drain the buffer nor uses h2's blocking h2:send_data/5 (#{block => Timeout}) variant ([hackney_conn.erl stream_body_fun_h2, ~:3020]).
So whenever hackney feeds chunks faster than the peer opens its window (normal for a multi-MB body against the default 64 KB initial window), the buffer crosses 1 MB and the request dies.
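To make the arithmetic concrete, here is a toy model of that race. It is not h2's actual logic: the module name, the per-chunk drain parameter, and the exact overflow check are assumptions for illustration. Only the 1 MB cap and the send_buffer_full error come from the description above.

```erlang
-module(send_buffer_model).
-export([upload/3]).

%% 1 MB cap, as described in this report.
-define(MAX_SEND_BUFFER_BYTES, 1048576).

%% upload(BodySize, ChunkSize, DrainPerChunk) simulates enqueueing a body in
%% ChunkSize pieces while the peer's flow-control window lets DrainPerChunk
%% bytes leave the buffer between two enqueues.
%% Returns {ok, BytesAccepted} or {error, send_buffer_full, BytesAccepted}.
upload(BodySize, ChunkSize, DrainPerChunk)
  when BodySize >= 0, ChunkSize > 0, DrainPerChunk >= 0 ->
    loop(BodySize, ChunkSize, DrainPerChunk, 0, 0).

loop(0, _Chunk, _Drain, _Buffered, Sent) ->
    {ok, Sent};
loop(Remaining, Chunk, Drain, Buffered, Sent) ->
    N = min(Chunk, Remaining),
    case Buffered + N > ?MAX_SEND_BUFFER_BYTES of
        true ->
            %% Non-blocking send: the error surfaces and the upload aborts.
            {error, send_buffer_full, Sent};
        false ->
            %% Chunk accepted; the peer drains part of the buffer afterwards.
            Buffered1 = max(0, Buffered + N - Drain),
            loop(Remaining - N, Chunk, Drain, Buffered1, Sent + N)
    end.
```

In this model, a 3 MB body sent in 64 KB chunks with only 16 KB drained per chunk fails partway through, while the same body completes when the drain keeps pace with the chunk size.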
Suggested fix
I think h2 already provides the primitive: h2_connection:send_data/5 with #{block => Timeout} "parks the caller until the buffer drains, returning ok, or {error, timeout}" (https://github.com/benoitc/erlang_h2/blob/0.10.2/README.md#L253).
Both hackney body-send paths (one-shot and stream_body_fun_h2) funnel through the single h2_send_data/4 helper, so switching that one call to the blocking form should fix all of them:
%% deps/hackney/src/hackney_conn.erl
%% NOTE: SendTimeout is a placeholder; it is not bound in h2_send_data/4 as written.
 h2_send_data(H2Conn, StreamId, Bin, EndStream) ->
     try
-        h2_connection:send_data(H2Conn, StreamId, Bin, EndStream)
+        h2_connection:send_data(H2Conn, StreamId, Bin, EndStream, #{block => SendTimeout})
     catch
         exit:{ExitReason, _} -> {error, {closed, ExitReason}};
         exit:ExitReason -> {error, {closed, ExitReason}}
     end.
I am not very versed in Erlang yet, so I hope this is correct.
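Since SendTimeout has to come from somewhere, one possible shape is to pass it in explicitly. The following is only a sketch under assumptions: the module name and the 5-arity helper are hypothetical, and the h2_connection:send_data/5 call with #{block => Timeout} is taken from the README quote above rather than verified against the h2 source.

```erlang
-module(h2_send_sketch).
-export([h2_send_data/5]).

%% Hypothetical variant of hackney's h2_send_data/4 helper that takes the
%% timeout as an extra argument instead of relying on an unbound variable.
h2_send_data(H2Conn, StreamId, Bin, EndStream, SendTimeout) ->
    try
        %% Assumed API, as quoted from the h2 README in this report: the caller
        %% is parked until the buffer drains, and the call returns ok or
        %% {error, timeout}.
        h2_connection:send_data(H2Conn, StreamId, Bin, EndStream,
                                #{block => SendTimeout})
    catch
        exit:{ExitReason, _} -> {error, {closed, ExitReason}};
        exit:ExitReason -> {error, {closed, ExitReason}}
    end.
```

Callers would then need to decide where the timeout value comes from (for example an existing request timeout option) and how to treat {error, timeout}, which this sketch simply returns to the caller.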