Skip to content

OnMessage applies the inbound HTTP/1 half-close 5s deadline to OUTBOUND connections — delays EOF delivery when response data + peer-close coalesce in one read cycle #48

Description

@mwfj

Summary

ConnectionHandler::OnMessage() (server/connection_handler.cc) applies an inbound-only HTTP/1 half-close heuristic to all connections, including established outbound (client-role / upstream-pool) connections. When an outbound peer writes its response body and then closes the TCP connection such that the body bytes and the EOF (read() == 0) are drained in the same OnMessage read cycle, the EOF is not delivered to the connection's borrower — instead a 5-second fallback deadline is armed. The borrower (e.g. the upstream proxy codec) therefore observes the close, and finalizes the response, up to 5 s late.

Affected code

server/connection_handler.cc, ConnectionHandler::OnMessage(), the peer_closed block (≈ lines 546–597 on main):

        if (output_bf_.Size() > 0) {
            client_channel_->EnableWriteMode();
        } else if (callback_ran) {                 // line 573 — fires for INBOUND *and* OUTBOUND
            // ... inbound HTTP/1 half-close heuristic ...
            if (!has_deadline_) {
                SetDeadline(std::chrono::steady_clock::now() +
                            std::chrono::seconds(5));
            }
        } else {
            ForceClose();
        }

The else if (callback_ran) branch is intended for the inbound case documented just above it — "HTTP/1 clients are allowed to half-close the write side (shutdown(SHUT_WR)) while waiting for the response" — so the server must not close, and arms a bounded fallback deadline. That is correct for inbound connections, but it is applied unconditionally to outbound connections too.

Root cause

The branch does not distinguish inbound from outbound connections. The file already establishes the invariant (comment near line 503): "Inbound connections always have connect_state_ == NONE"; outbound connections are CONNECTED. The half-close heuristic only makes sense for inbound (NONE) — for an outbound connection there is no "in-flight request still being answered." A peer read-close on an outbound connection is a definitive end-of-stream, and the connection should close promptly so the close path delivers the EOF to the borrower (the pool relays it as the empty-bytes on_message_callback, which drives the upstream codec's EOF finalization).

Whether this branch is reached depends on read-cycle coalescing:

  • If the response bytes and the FIN are drained in separate OnMessage cycles, the EOF cycle has callback_ran == false → the final else { ForceClose(); } fires → EOF delivered immediately. (Correct.)
  • If they coalesce into one cycle (body bytes appended, then read() == 0 in the same drain loop), callback_ran == true → the 5 s deadline is armed → EOF delivery delayed up to 5 s. (Bug.)

Coalescing is timing- and platform-dependent: on Linux/epoll under load the body + FIN very frequently arrive together (reproduces deterministically in CI); on macOS/kqueue they more often split across cycles (the bug is intermittent / usually masked).

Impact

Any outbound/upstream response whose completion is EOF-delimited — no Content-Length and no Transfer-Encoding: chunked (HTTP/1.0 responses, Connection: close responses, and streaming/SSE upstreams that end by closing the socket) — can have its finalization delayed by up to 5 s whenever the body and FIN coalesce in one read. For a streaming upstream that drops mid-stream, the proxy's terminal/disconnect signal to the downstream client is delayed the same way. Length-delimited and chunked responses are unaffected (they complete via on_message_complete, not via the peer_closed path).

Reproduction

  1. Configure a proxy route to an upstream.
  2. Have the upstream return an EOF-delimited HTTP response (no Content-Length, no chunked — e.g. Connection: close), write the full body, then immediately close() the socket so the body bytes and FIN land in a single read drain on the proxy's outbound connection (reliable on Linux/loopback).
  3. Observe that the proxy finalizes the response ~5 s late (the inbound fallback deadline) instead of promptly on EOF.

Proposed fix

Gate the inbound half-close heuristic to inbound connections only; outbound connections fall through to ForceClose():

-        } else if (callback_ran) {
+        } else if (callback_ran && connect_state_ == ConnectState::NONE) {
             // INBOUND only. ... arm the 5s fallback deadline ...
             if (!has_deadline_) {
                 SetDeadline(std::chrono::steady_clock::now() +
                             std::chrono::seconds(5));
             }
         } else {
-            // No callback ran ... nothing to wait for.
+            // No callback ran, OR an OUTBOUND connection whose peer read-closed
+            // after delivering its response (data + EOF coalesced into one read).
+            // A peer close on an outbound connection is a definitive end-of-stream —
+            // close now so the close path delivers the EOF to the borrower
+            // immediately, rather than waiting out the inbound-only fallback deadline.
             ForceClose();
         }

This relies only on the existing connect_state_ invariant, leaves the output_bf_.Size() > 0 flush-then-close branch untouched, and does not change any inbound behavior (inbound is always NONE). ForceClose() here runs after on_message_callback returns — identical to the existing sibling else { ForceClose(); } path.

Notes

Found while integrating reactor_server_cpp as the foundation of a downstream agent gateway: a streaming (SSE) upstream that dropped mid-stream failed to surface its synthesized terminal within the client's read budget on Linux CI — deterministically — while passing on macOS. Verified downstream that the one-line gate fixes it (full test suite green under gcc and ASan/UBSan; no regression to the HTTP/1 proxy, HTTP/2 upstream, keep-alive, or inbound paths).

Metadata

Metadata

Assignees

Labels

No labels
No labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions