feat: add dry run to the read_gbq function by antoineeripret · Pull Request #979 · googleapis/python-bigquery-pandas

antoineeripret · 2025-11-06T09:32:23Z

This change lets users run a dry-run query through the read_gbq function. With dry run enabled, the function returns the amount of data the query would process (in GB) instead of a pd.DataFrame.

shuoweil · 2025-11-07T18:20:36Z

@antoineeripret Could you please check the failed tests? Thanks a lot.

antoineeripret · 2025-11-10T08:19:33Z

@shuoweil, I've added a new commit with some changes to fix the tests. I ran nox -s unit-3.10 and got 0 failures. Thank you!

shuoweil · 2025-11-11T18:16:02Z

lint / lint (pull_request)

Hi @antoineeripret, could you please look at the failed check? It should be a quick fix. Thanks a lot.

antoineeripret · 2025-11-12T23:41:36Z

Hi @shuoweil, the last commit should fix it. Got the following on my local env:

python -m black --check docs pandas_gbq tests noxfile.py setup.py
All done! ✨ 🍰 ✨
45 files would be left unchanged.

sycai · 2025-11-13T18:57:08Z

pandas_gbq/gbq_connector.py

+                # we need to get it from the query result
+                # For query_and_wait_via_client_library, the RowIterator should have job set
+                raise ValueError("Cannot access QueryJob from RowIterator for dry_run")
+            return query_job.total_bytes_processed / 1024**3


Could we simply return query_job.total_bytes_processed without further processing?

Reasons:

1. total_bytes_processed is an integer, which is more precise than a float.

2. For small tables (1-10 MB in size), converting the size to GB makes the result less readable.

3. Returning the size in bytes aligns better with the behavior of the BigQuery Python client.

Generally speaking, we want the caller of this function to perform unit conversions.
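
To illustrate the point about caller-side conversion, here is a minimal sketch. The helper name `format_bytes` is hypothetical and not part of pandas-gbq; the sketch only assumes the dry run yields an integer byte count.

```python
def format_bytes(num_bytes: int) -> str:
    """Render an integer byte count using binary units (hypothetical helper)."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            # Whole bytes need no decimals; larger units keep two.
            return f"{num_bytes} B" if unit == "B" else f"{value:.2f} {unit}"
        value /= 1024


# A small dry-run result, chosen only for illustration.
small = 320
print(small / 1024**3)      # the GB float is hard to read at this scale
print(format_bytes(small))  # the caller picks a readable unit instead
print(format_bytes(5 * 1024**2))
```

Keeping the raw integer in the return value means no precision is lost, and each caller can choose the unit that suits its table sizes.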

@sycai, good call! I thought only about my own usage and didn't consider the bigger picture here. I'll commit the change. :)

sycai

Thank you! I think we should be good to go once the doc and tests are updated.

pandas_gbq/gbq.py

tests/unit/test_gbq.py

antoineeripret · 2025-11-18T07:06:39Z

@sycai: updated :)

shuoweil · 2025-11-18T18:48:11Z

@antoineeripret I believe the lint check is failing. Could you please fix it? It still fails with the new commit.

antoineeripret · 2025-11-26T12:09:04Z

@shuoweil, fixed with the last commit (I had left a trailing space in a file).

shuoweil · 2025-12-03T19:45:56Z

@antoineeripret Could you please take a look at the failed test cases? They seem to be related to your change.
tests/system/test_gbq.py::TestReadGBQIntegration::test_read_gbq_with_dry_run

Thanks a lot.

shuoweil

@antoineeripret Please check the failed testcase tests/system/test_gbq.py::TestReadGBQIntegration::test_read_gbq_with_dry_run

antoine-eripret-docplanner · 2025-12-22T07:16:12Z

@shuoweil: Can I have the details of what failed? I can't read the Kokoro results (permission denied). Thank you!

sycai · 2025-12-29T20:27:49Z

Hey @antoineeripret I took the liberty to directly enhance your code.

Now a dry run returns more stats, such as the table schema and creation time, in addition to the total bytes processed. The results are aggregated into a pandas Series like this:

bigquerySchema         [SchemaField('my_col', 'TIMESTAMP', 'NULLABLE'...
projectId                                                     project-id
location                                                              US
jobType                                                            QUERY
dispatchedSql                                   SELECT * FROM `my_table`
destinationTable          {'projectId': 'project-id', 'datasetId': '_...
useLegacySql                                                       False
referencedTables          [{'projectId': 'project-id', 'datasetId': '...
totalBytesProcessed                                                  320
cacheHit                                                           False
statementType                                                     SELECT
creationTime                            2025-12-29 19:36:24.972000+00:00
dtype: object

I will make sure all the tests pass this time.

Also tagging @shuoweil and @tswast, as this aligns more closely with our BigFrames code: https://github.com/googleapis/python-bigquery-dataframes/blob/69fa7f404cde5a202df68ed21a6faeac98fb1e4d/bigframes/session/dry_runs.py#L112
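
As a rough sketch of how a caller might use these stats, the snippet below checks the estimated bytes against a budget before running the real query. It assumes the keyword argument is named `dry_run` and that the result can be indexed by `totalBytesProcessed`, as in the Series above. The names `fetch_dry_run_stats` and `within_budget` and the 1 GiB limit are hypothetical, not part of pandas-gbq.

```python
from typing import Any, Mapping


def fetch_dry_run_stats(sql: str, project_id: str) -> Any:
    """Run a dry run through pandas-gbq (needs credentials; not called below)."""
    import pandas_gbq  # imported lazily so the rest of the sketch runs without it

    # Assumption: `dry_run=True` returns the stats Series instead of a DataFrame.
    return pandas_gbq.read_gbq(sql, project_id=project_id, dry_run=True)


def within_budget(stats: Mapping[str, Any], max_bytes: int) -> bool:
    """Return True if the estimated scan size does not exceed max_bytes."""
    return int(stats["totalBytesProcessed"]) <= max_bytes


# A plain dict stands in for the Series here; both support stats["key"] lookup.
example_stats = {
    "totalBytesProcessed": 320,
    "cacheHit": False,
    "statementType": "SELECT",
}
print(within_budget(example_stats, max_bytes=1024**3))
```

With real credentials, the dict could be replaced by the result of `fetch_dry_run_stats(...)`, and the full query would run only when the check passes.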

antoineeripret · 2025-12-30T07:37:19Z

> Hey @antoineeripret I took the liberty to directly enhance your code.
> [...]

Amazing improvement, thanks for adding your input and knowledge, @sycai! The output is better than what I had envisioned!

Thank you!

PR created by the Librarian CLI to initialize a release. Merging this PR will auto trigger a release.

Librarian Version: v0.7.0
Language Image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:c8612d3fffb3f6a32353b2d1abd16b61e87811866f7ec9d65b59b02eb452a620

<details><summary>pandas-gbq: 0.33.0</summary>

## [0.33.0](v0.32.0...v0.33.0) (2026-01-05)

### Features

* add dry run to the read_gbq function (#979) ([516f986](516f986f))

</details>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

antoineeripret added 2 commits November 6, 2025 10:21

feat: add dry run to the read_gbq function

d268a62

return the cost (in GB) if dry run is set to True

13fbf92

antoineeripret requested review from a team as code owners November 6, 2025 09:32

antoineeripret requested review from Linchin and sycai November 6, 2025 09:32

blunderbuss-gcf bot assigned GarrettWu Nov 6, 2025

product-auto-label bot added size: s Pull request size is small. api: bigquery Issues related to the googleapis/python-bigquery-pandas API. labels Nov 6, 2025

antoineeripret changed the title ~~Add dry run~~ feat: add dry run to the read_gbq function Nov 6, 2025

GarrettWu assigned shuoweil and unassigned GarrettWu Nov 6, 2025

shuoweil added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 7, 2025

yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 7, 2025

updates to fix test

adcfc7b

product-auto-label bot added size: m Pull request size is medium. and removed size: s Pull request size is small. labels Nov 10, 2025

fix lint

e9f4c00

sycai reviewed Nov 13, 2025

Remove unit conversion

a171ff4

antoineeripret requested a review from sycai November 16, 2025 10:48

sycai reviewed Nov 17, 2025

pandas_gbq/gbq.py (outdated, resolved)

tests/unit/test_gbq.py (outdated, resolved)

fix docs

8207a47

antoineeripret requested a review from sycai November 18, 2025 07:06

modify doc to use int instead of float + remove trailing space

171a6f5

antoineeripret dismissed sycai’s stale review via 171a6f5 November 26, 2025 12:08

sycai previously approved these changes Dec 2, 2025

shuoweil self-requested a review December 2, 2025 19:43

shuoweil approved these changes Dec 2, 2025

shuoweil added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 3, 2025

yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 3, 2025

shuoweil requested changes Dec 3, 2025

shuoweil and others added 2 commits December 22, 2025 19:36

Merge branch 'main' into add_dry_run

78edab9

enrich dry run result with more stats

cc2be7c

sycai dismissed their stale review via cc2be7c December 29, 2025 20:23

product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Dec 29, 2025

shuoweil self-requested a review December 29, 2025 21:52

sycai added 2 commits December 29, 2025 22:48

increase test coverage

bdffe92

simplify logic

9b09fdb

sycai added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 30, 2025

yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 30, 2025

sycai approved these changes Dec 30, 2025

sycai mentioned this pull request Dec 30, 2025

Ability to handle a dry_run googleapis/google-cloud-python#14485

Open

shuoweil approved these changes Dec 30, 2025

shuoweil merged commit 516f986 into googleapis:main Dec 30, 2025
25 of 26 checks passed

tswast mentioned this pull request Jan 5, 2026

chore: librarian release pull request: 20260105T185010Z #1010

Merged


6 participants
