fix: avoid TSaslClientTransport reuse to eliminate server-side SASL noise#3357
Conversation
|
Hi @mnzpk and @kevinjqliu — this addresses the server-side noise that #1747 / #1941 left in place. Your review would be much appreciated when you have time. |
Fokko
left a comment
There was a problem hiding this comment.
This seems reasonable to me, since it is much simpler :)
|
@Fokko Thanks for the review! Could you help merge this when you have a moment? |
|
@barking-code Sure thing, I wanted to give others also the oppertunity to jump in :) Thanks for fixing this |
|
thank you for fixing this @barking-code. I dont have a local setup to test Hive with SASL, so the previous solution was best effort 😆 |
|
Hi @kevinjqliu — I ran it against a kerberized HMS we have. Opened/closed the client 5 times on both branches and watched the HMS server log:
Client side worked in both cases (main recovers via the existing |
Related: #1744, #1747, #1941
Rationale for this change
#1747 / #1941 caught the
TSaslClientTransportreopen failure on the client side, but the doomedopen()attempt still completes a TCP handshake before aborting, which leaves aTTransportExceptionevent in the HMS server log per_HiveClientre-entry. In production this produces a steady stream of server-side error events matched by HMS alert filters.TSaslClientTransportis single-use by design: itsSASLClientis disposed onclose()and there is no API to reset it. Reusing the same transport across context-manager entries was the original misuse. This PR keeps the existing_HiveClient.__init__setup intact — the transport is still created in__init__so_HiveClientcan be used without a context manager (per @kevinjqliu's note in #1747) — and reinitializes the transport beforeopen()whenever the client has been used before:The
try/exceptretry from #1747 / #1941 is removed because the failure mode it covered (reopening a disposed SASL session) cannot occur on a fresh transport.The public behaviour of
_HiveClient(with hive_client as c:returns a workingClient,__exit__closes it) is unchanged.Are these changes tested?
test_kerberized_client_uses_fresh_transport_on_reuseasserts that the transport instance differs between two consecutive context-manager entriestest_create_hive_client_with_kerberos_using_context_managercontinues to passAre there any user-facing changes?
No.