feat(python): add Python bindings for ggsql #74
Merged: georgestagg merged 22 commits into posit-dev:main from cpsievert:feature/python-e2e-tests on Jan 29, 2026
Changes from 12 commits
Commits (all by cpsievert):
0d6d157 feat(python): add ggsql-python package with PyO3 bindings
994062e Add dev dependency group
9b3370d fix: exclude ggsql-python from default cargo build
a7b331c style: format ggsql-python
0f725c2 refactor: use pyproject.toml extras for CI dependencies
757da46 fix: Python CI workflow improvements
a374a66 fix: handle narwhals DataFrames and use correct global data key
ba4633e feat: use narwhals for DataFrame conversion
f0cf267 fix: correct wheel path in Python CI workflow
f19ecc4 chore: drop pyarrow dependency
9a46b9a fix: restore pyarrow dependency required by pyo3-polars
cbc08e2 fix: commit tree-sitter generated files for Windows CI
f951bfc Add a basic .gitignore
4b4e7aa feat(python): return altair.Chart from render()
23053a2 fix(python): add runtime validation for writer parameter
abc2ff6 refactor(python): consolidate tests and focus on Python logic
67cc299 refactor(python): rename render() to render_altair()
0d0d5f3 refactor(python): remove pyarrow dependency, use IPC for data transfer
e351fab style(python): fix Rust formatting
9c974d5 docs: add Python bindings section to CLAUDE.md
8ffdc1a fix: add tree-sitter-cli to CI workflows for Windows compatibility
d0bb9fe fix(ci): skip doc tests to avoid linker memory issues
@@ -0,0 +1,90 @@

    name: Python

    on:
      push:
        paths: ['ggsql-python/**', '.github/workflows/python.yml']
      pull_request:
        paths: ['ggsql-python/**', '.github/workflows/python.yml']

    jobs:
      test:
        strategy:
          fail-fast: false
          matrix:
            os: [ubuntu-latest, macos-latest, windows-latest]
            python: ['3.10', '3.11', '3.12', '3.13']
        runs-on: ${{ matrix.os }}
        steps:
          - uses: actions/checkout@v4

          - uses: actions/setup-python@v5
            with:
              python-version: ${{ matrix.python }}

          - name: Install Rust
            uses: dtolnay/rust-toolchain@stable

          - name: Rust cache
            uses: Swatinem/rust-cache@v2
            with:
              workspaces: ggsql-python
              shared-key: ${{ matrix.os }}-python

          - name: Build wheel
            uses: PyO3/maturin-action@v1
            with:
              working-directory: ggsql-python
              command: build
              args: --release
              sccache: true

          - name: Install wheel and test
            shell: bash
            run: |
              pip install --find-links target/wheels/ ggsql[test]
              pytest ggsql-python/tests/test_ggsql.py -v

      e2e-test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          - uses: actions/setup-python@v5
            with:
              python-version: '3.13'

          - name: Install Rust
            uses: dtolnay/rust-toolchain@stable

          - name: Rust cache
            uses: Swatinem/rust-cache@v2
            with:
              workspaces: ggsql-python
              shared-key: ubuntu-latest-python

          - name: Build wheel
            uses: PyO3/maturin-action@v1
            with:
              working-directory: ggsql-python
              command: build
              args: --release
              sccache: true

          - name: Install wheel and E2E dependencies
            shell: bash
            run: pip install --find-links target/wheels/ ggsql[test,e2e]

          - name: Run E2E tests
            shell: bash
            run: pytest ggsql-python/tests/test_altair_e2e.py -v

      lint:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          - name: Check Rust formatting
            run: cargo fmt --package ggsql-python -- --check

          - name: Clippy
            run: cargo clippy --package ggsql-python -- -D warnings
@@ -0,0 +1,19 @@

    [package]
    name = "ggsql-python"
    version = "0.1.0"
    edition = "2021"
    license = "MIT"
    description = "Python bindings for ggsql"

    [lib]
    name = "_ggsql"
    crate-type = ["cdylib"]

    [dependencies]
    pyo3 = { version = "0.26", features = ["extension-module"] }
    pyo3-polars = { version = "0.25", features = ["dtype-decimal", "dtype-struct"] }
    polars.workspace = true
    ggsql = { path = "../src", default-features = false, features = ["vegalite"] }

    [features]
    default = []
@@ -0,0 +1,156 @@

    # ggsql

    Python bindings for [ggsql](https://github.com/georgestagg/ggsql), a SQL extension for declarative data visualization.

    This package provides a thin wrapper around the Rust `ggsql` crate, enabling Python users to render Vega-Lite visualizations from polars DataFrames using ggsql's VISUALISE syntax.

    ## Installation

    ### From PyPI (when published)

    ```bash
    pip install ggsql
    ```

    ### From source

    Building from source requires:
    - Rust toolchain (install via [rustup](https://rustup.rs/))
    - Python 3.10+
    - [maturin](https://github.com/PyO3/maturin)

    ```bash
    # Clone the monorepo
    git clone https://github.com/georgestagg/ggsql.git
    cd ggsql/ggsql-python

    # Create a virtual environment
    python -m venv .venv
    source .venv/bin/activate # or `.venv\Scripts\activate` on Windows

    # Install build dependencies
    pip install maturin

    # Build and install in development mode
    maturin develop

    # Or build a wheel
    maturin build --release
    pip install target/wheels/ggsql-*.whl
    ```

    ## Usage

    ```python
    import ggsql
    import polars as pl

    # Split a ggSQL query into SQL and VISUALISE portions
    sql, viz = ggsql.split_query("""
    SELECT date, revenue, region FROM sales
    WHERE year = 2024
    VISUALISE date AS x, revenue AS y, region AS color
    DRAW line
    LABEL title => 'Sales Trends'
    """)

    # Execute SQL with your preferred tool
    import duckdb
    df = duckdb.sql(sql).pl()

    # Render DataFrame + VISUALISE spec to Vega-Lite JSON
    vegalite_json = ggsql.render(df, viz)
    ```

    ### Mapping styles

    The `render()` function supports various mapping styles:

    ```python
    df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30], "category": ["A", "B", "A"]})

    # Explicit mapping
    ggsql.render(df, "VISUALISE x AS x, y AS y DRAW point")

    # Implicit mapping (column name = aesthetic name)
    ggsql.render(df, "VISUALISE x, y DRAW point")

    # Wildcard mapping (map all matching columns)
    ggsql.render(df, "VISUALISE * DRAW point")

    # With color encoding
    ggsql.render(df, "VISUALISE x, y, category AS color DRAW point")
    ```

    ## API

    ### `split_query(query: str) -> tuple[str, str]`

    Split a ggSQL query into SQL and VISUALISE portions.

    **Parameters:**
    - `query`: The full ggSQL query string

    **Returns:**
    - Tuple of `(sql_portion, visualise_portion)`

    **Raises:**
    - `ValueError`: If the query cannot be parsed

    ### `render(df, viz, *, writer="vegalite") -> str`

    Render a DataFrame with a VISUALISE specification.

    **Parameters:**
    - `df`: Any narwhals-compatible DataFrame (polars, pandas, etc.). LazyFrames are collected automatically.
    - `viz`: The VISUALISE specification string
    - `writer`: Output format, currently only `"vegalite"` is supported

    **Returns:**
    - JSON string containing the Vega-Lite specification

    **Raises:**
    - `ValueError`: If the spec cannot be parsed or rendered

    ## Development

    ### Keeping in sync with the monorepo

    The `ggsql-python` package is part of the [ggsql monorepo](https://github.com/georgestagg/ggsql) and depends on the Rust `ggsql` crate via a path dependency. When the Rust crate is updated, you may need to rebuild:

    ```bash
    cd ggsql-python

    # Rebuild after Rust changes
    maturin develop

    # If tree-sitter grammar changed, clean and rebuild
    cd .. && cargo clean -p tree-sitter-ggsql && cd ggsql-python
    maturin develop
    ```

    ### Running tests

    ```bash
    # Install test dependencies
    pip install pytest altair

    # Run unit tests
    pytest tests/test_ggsql.py -v

    # Run E2E tests with altair
    pytest tests/test_altair_e2e.py -v

    # Run all tests
    pytest tests/ -v
    ```

    ## Requirements

    - Python >= 3.10
    - polars >= 1.0
    - narwhals >= 2.15

    ## License

    MIT
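
The README in this diff documents `render()` as returning the Vega-Lite specification as a JSON string, so a caller would normally parse it before doing anything else with it. The sketch below shows that step using only the standard library. `SAMPLE_SPEC` is a hand-written stand-in rather than real ggsql output, and `summarize_spec` is a hypothetical helper, so treat the field layout as an assumption.

```python
import json

# Hand-written stand-in for the string returned by ggsql.render();
# real output will likely contain more fields and may be laid out differently.
SAMPLE_SPEC = """
{
  "mark": "point",
  "encoding": {
    "x": {"field": "x", "type": "quantitative"},
    "y": {"field": "y", "type": "quantitative"}
  }
}
"""


def summarize_spec(spec_json: str) -> dict:
    """Parse a Vega-Lite JSON string and report its mark and encoding channels."""
    spec = json.loads(spec_json)
    encoding = spec.get("encoding", {})
    return {"mark": spec.get("mark"), "channels": sorted(encoding)}


summary = summarize_spec(SAMPLE_SPEC)
print(summary)
```

With the real package installed, the same helper could be handed the return value of `ggsql.render(df, viz)` instead of `SAMPLE_SPEC`.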
@@ -0,0 +1,37 @@

    [build-system]
    requires = ["maturin>=1.4"]
    build-backend = "maturin"

    [project]
    name = "ggsql"
    version = "0.1.0"
    description = "SQL extension for declarative data visualization"
    readme = "README.md"
    requires-python = ">=3.10"
    license = { text = "MIT" }
    keywords = ["sql", "visualization", "vega-lite", "grammar-of-graphics"]
    classifiers = [
        "Programming Language :: Rust",
        "Programming Language :: Python :: Implementation :: CPython",
    ]
    dependencies = [
        "narwhals>=2.15.0",
        "polars>=1.0",
        "pyarrow>=14.0",
    ]

    [project.optional-dependencies]
    test = ["pytest>=7.0"]
    e2e = ["altair>=5.0"]
    dev = ["maturin>=1.4"]

    [tool.maturin]
    features = ["pyo3/extension-module"]
    python-source = "python"
    module-name = "ggsql._ggsql"

    [dependency-groups]
    dev = [
        "maturin>=1.11.5",
        "pytest>=9.0.2",
    ]
@@ -0,0 +1,48 @@

    from __future__ import annotations
    from typing import Literal

    import narwhals as nw
    from narwhals.typing import IntoFrame

    from ggsql._ggsql import split_query, render as _render

    __all__ = ["split_query", "render"]
    __version__ = "0.1.0"


    def render(
        df: IntoFrame,
        viz: str,
        *,
        writer: Literal["vegalite"] = "vegalite",
    ) -> str:
        """Render a DataFrame with a VISUALISE spec.

        Parameters
        ----------
        df
            Data to visualize. Accepts polars, pandas, or any narwhals-compatible
            DataFrame. LazyFrames are collected automatically.
        viz
            VISUALISE spec string (e.g., "VISUALISE x, y DRAW point")
        writer
            Output format. Currently only "vegalite" supported.

        Returns
        -------
        str
            Vega-Lite JSON specification.
        """

        df = nw.from_native(df, pass_through=True)

        if isinstance(df, nw.LazyFrame):
            df = df.collect()

        if not isinstance(df, nw.DataFrame):
            raise TypeError("df must be a narwhals DataFrame or compatible type")

        # Should be safe as long as we take polars dependency
        pl_df = df.to_polars()

        return _render(pl_df, viz, writer=writer)
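
Note that the `Literal["vegalite"]` annotation on `writer` is only enforced by static type checkers; at this point in the diff nothing on the Python side rejects other values at call time. A later commit in this PR (23053a2) is described as adding runtime validation for the parameter. The sketch below shows one way such a check could look. `SUPPORTED_WRITERS` and `validate_writer` are hypothetical names, and this is not the PR's actual implementation.

```python
# Hypothetical registry of output formats; the diff only mentions "vegalite".
SUPPORTED_WRITERS = ("vegalite",)


def validate_writer(writer: str) -> str:
    """Return `writer` unchanged if supported, otherwise raise ValueError."""
    if writer not in SUPPORTED_WRITERS:
        supported = ", ".join(repr(w) for w in SUPPORTED_WRITERS)
        raise ValueError(
            f"Unsupported writer {writer!r}; expected one of: {supported}"
        )
    return writer


print(validate_writer("vegalite"))
```

Calling a check like this at the top of `render()` would turn a typo such as `writer="vega"` into an immediate Python-side error instead of relying on whatever the Rust layer does with it.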
@@ -0,0 +1 @@

    # PEP 561 marker file
This change was needed since `tree-sitter generate` was failing on Windows. Claude offered a few different ways to fix this, but recommended this approach:
If the issue is simply that tree sitter is not installed in CI, I'd rather not check in the generated files and instead install it on Windows using something like https://github.com/tree-sitter/setup-action/tree/master.
Those generated files can get very large as grammars get complex, and I'd like to avoid the noise if we can.
I do recall having to do a little 'nvm' song and dance to get the installation working on Windows, but I've forgotten the details already.
Done in 8ffdc1a
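
For anyone reproducing the Windows failure locally, a quick way to tell whether the problem is simply a missing CLI is to look for it on `PATH` before building. A minimal sketch, assuming the executable is named `tree-sitter` (the helper name is illustrative and not from the PR):

```python
import shutil


def find_tree_sitter_cli(name: str = "tree-sitter"):
    """Return the full path of the CLI if it is on PATH, otherwise None."""
    return shutil.which(name)


cli_path = find_tree_sitter_cli()
if cli_path is None:
    print("tree-sitter CLI not found on PATH")
else:
    print(f"tree-sitter CLI found at {cli_path}")
```

If this reports the CLI as missing, installing it in CI (as suggested in the thread) is the likelier fix than checking in generated files.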