feat(query): 429 retry + truncation auto-follow (#688)#111
Conversation
75a78d2 to
9f01c24
Compare
| result = api.get_result(result_id) | ||
| last_status = result.status | ||
| if last_status == "ready": | ||
| return result | ||
| if last_status == "failed": | ||
| raise ResultFailedError( | ||
| result_id=result_id, error_message=result.error_message | ||
| ) |
There was a problem hiding this comment.
Blocking: the failed branch is dead code against the real backend, so a failed result won't surface as ResultFailedError.
Per the server contract (and results_api.py's response map), a failed result is returned as HTTP 409, and ApiClient.response_deserialize (api_client.py:326-331) raises ApiException for any non-2xx status. So api.get_result(result_id) raises on a failed result — it never returns a GetResultResponse with status == "failed". This branch can only be reached if the server ever returns 200/202 with a failed body, which the contract says it doesn't.
The result: a real failed result propagates a raw ApiException (status 409) out of wait_for_result / query(), instead of the documented ResultFailedError. test_autofollow_failed_result_raises masks this because its stub returns a status="failed" object rather than raising a 409 ApiException like the generated client does.
Catch the 409 and translate it (the deserialized body is on exc.data):
while True:
try:
result = api.get_result(result_id)
except ApiException as exc:
if exc.status == 409:
data = getattr(exc, "data", None)
raise ResultFailedError(
result_id=result_id,
error_message=getattr(data, "error_message", None),
) from exc
raise
last_status = result.status
if last_status == "ready":
return resultand update the test stub to raise a 409 ApiException so it reflects real behavior.
| delay = policy.base_backoff_s | ||
| last_status = "pending" | ||
| while True: | ||
| result = api.get_result(result_id) |
There was a problem hiding this comment.
Blocking: the readiness poll undermines the bounded-memory guard it exists to enforce.
get_result(result_id) is called here with no limit, and per the contract limit is unbounded by default — so for a ready result this materializes the entire row set into the GetResultResponse on every poll. The max_auto_rows / max_auto_bytes guards in _materialize_full / _fetch_all_rows only run after wait_for_result returns, so an oversized result is fully pulled into client memory before the guard can trip. That defeats the module's stated guarantee that "an unbounded result is never silently pulled into client memory," and it also re-fetches everything redundantly (_fetch_all_rows pages it again from offset 0).
Poll with limit=0 to fetch only the status (0 is valid — both must be non-negative):
result = api.get_result(result_id, limit=0)_authoritative_total already takes the count from the preview / query-run record, so the poll doesn't need any rows.
There was a problem hiding this comment.
Review
Blocking Issues
-
wait_for_resultfailed-result handling is dead against the real backend (hotdata/query.py:366-369). A failed result is returned as HTTP 409, and the generated client raisesApiExceptionon any non-2xx (api_client.py:326). Soget_resultnever returns astatus == "failed"object — a real failed result escapes as a raw 409ApiExceptioninstead of the documentedResultFailedError. The test passes only because its stub returns a failed object rather than raising 409. -
Readiness poll undermines the bounded-memory guard (
hotdata/query.py:362). The poll callsget_resultwith nolimit, which is unbounded by default, so areadyresult is fully materialized into memory on every poll — beforemax_auto_rows/max_auto_bytescan trip. The guards run only afterwait_for_resultreturns, so an oversized result is pulled into client RAM (and then re-fetched page by page). Poll withlimit=0.
Action Required
- Catch
ApiExceptionwith status 409 inwait_for_resultand translate it toResultFailedError(error message is onexc.data.error_message); updatetest_autofollow_failed_result_raisesto raise a 409ApiExceptionso it reflects the generated client's actual behavior. - Pass
limit=0on the readiness poll so it fetches status only, not the full result set.
There was a problem hiding this comment.
Both prior blocking issues are resolved:
-
409 →
ResultFailedError(query.py:366-380): the failed-result branch now catches the HTTP 409ApiExceptionthe generated client actually raises and translates it, pullingerror_messageoffexc.data. Verifiedresults_api.pymaps 409 →GetResultResponseandapi_client.pydeserializes the body intoexc.data, so this works against the real backend, not just the stub. The test stub was updated to raise a real 409. -
limit=0readiness poll (query.py:368): the poll now fetches status only, so a ready oversized result is no longer pulled into memory before the size guards run. Confirmedlimitisge=0-validated and serialized when non-None.
No new blocking issues.
SDK support for the #640 bounded-memory query contract (#688): an enhanced
hotdata.query.QueryApithat transparently retries HTTP 429 (OVERLOADED) and auto-follows truncated results, made the default client so the obvious import path gets the safe behavior.Included
Retry-After(uncapped — the server value is authoritative), else exponential backoff + jitter under a deadline budget;ServerOverloadedErroron exhaustion.max_auto_rows(1M) andmax_auto_bytes(64 MiB). Readiness poll useslimit=0(status only) so it never pulls the result into RAM before the guards run.ResultFailedError; sharedResultErrorbase for all lifecycle errors.from hotdata import QueryApi/ResultsApinow resolve to the enhanced clients (raw generated ones remain athotdata.api.*), kept across regen byscripts/patch_query_exports.py.unitjob runs all non-integration tests on every PR (they previously never ran in CI).Notes
Closes #688.