Add on_poll callback for real-time query monitoring (#723)#724
Merged
Conversation
Add an optional on_poll callback to BaseCursor/Connection, invoked once per poll iteration with the current execution object. Wired at the cursor/connection level so it propagates via the existing **kwargs chain (no execute() changes; avoids adding to the execute() duplication tracked in #691). Covers all poll loops: BaseCursor.__poll (sync + thread-pool async), SparkBaseCursor.__poll, and the native-async aio loops. For Spark the callback receives the per-poll calculation status, so the signature is Callable[[AthenaQueryExecution | AthenaCalculationExecutionStatus], None]. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a "Query polling callback" section to docs/usage.md covering on_poll: connection- and cursor-level configuration, the synchronous callback contract, per-iteration invocation including the terminal state, async cursor usage, and the Spark calculation-status payload. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the on_start_query_execution field from the five synchronous cursors (Cursor, PandasCursor, ArrowCursor, PolarsCursor, S3FSCursor) into BaseCursor.__init__, mirroring on_poll, so both connection-level callbacks live in one place. The per-execute() override and its invocation stay in each synchronous cursor (the broader execute() kwargs consolidation is tracked in #691). Async/aio/Spark cursors inherit the field but do not invoke it, as they return the query id immediately through their execution model. Behavior is unchanged; a clarifying comment documents this. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
WHAT
Add an optional
on_pollcallback that is invoked once per poll iteration with the current execution object, enabling real-time query monitoring (e.g. live progress in Jupyter). Closes #723.on_pollparameter onBaseCursor.__init__andConnection/connect(...)(andconnection.cursor(...)). DefaultNone→ no behavior/perf change when unused.**kwargschain intoBaseCursor.__init__, so no per-cursorexecute()changes are needed — which also avoids adding to theexecute()kwargs duplication tracked in Refactor shared execute() kwargs handling across cursor implementations #691.get_query_execution/get_calculation_execution_status, before the terminal-state check, so the final state is delivered too.BaseCursor.__poll:BaseCursor.__poll— sync + thread-pool async cursors (default/dict/pandas/arrow/polars/s3fs)SparkBaseCursor.__poll— Spark calculationsaionative-async loops (AioBaseCursor,AioSparkCursor)Callable[[AthenaQueryExecution | AthenaCalculationExecutionStatus], None](exposed aspyathena.common.OnPollCallback).Also: centralize
on_start_query_executionstorageWhile adding
on_polltoBaseCursor, the siblingon_start_query_executionfield was still declared/stored in each of the five synchronous cursors (Cursor,PandasCursor,ArrowCursor,PolarsCursor,S3FSCursor). To keep the two callbacks symmetric, its storage is moved intoBaseCursor.__init__(mirroringon_poll). The per-execute()override and its invocation stay in each synchronous cursor — the broaderexecute()kwargs consolidation remains #691's scope. Async/aio/Spark cursors inherit the field but do not invoke it (they return the query id immediately); a comment documents this. Behavior is unchanged.Usage:
WHY
There is currently no public hook into the polling loop, so interactive environments cannot observe query state, elapsed time, or data scanned while a query runs. Requested in #723.
Tests
on_pollfires once per iteration in order incl. the terminal state;Noneis a no-op; Spark calculation poll variant.SELECT 1): connection-level, cursor-level, and async (AsyncCursor, matching the issue example).on_start_query_executioncallback tests pass unchanged after the storage move.make lint(ruff + mypy) and the callback/on_polltests pass locally.Refs #691.