Commit e30a71a

cpsievert and claude committed
feat(python): add ggsql-python package with PyO3 bindings
Adds Python bindings for ggsql via PyO3/maturin, enabling Python users to render Vega-Lite visualizations from polars DataFrames.

Features:
- split_query(): Split a ggSQL query into SQL and VISUALISE parts
- render(): Render a DataFrame with a VISUALISE spec to Vega-Lite JSON
- Supports explicit, implicit, and wildcard mappings
- Works with Python polars >= 1.0

Includes:
- Unit tests and E2E tests with altair
- GitHub Actions workflow for Python 3.10-3.14
- README with installation and development guide

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 66e154e commit e30a71a

10 files changed

Lines changed: 770 additions & 1 deletion

.github/workflows/python.yml

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+name: Python
+
+on:
+  push:
+    paths: ['ggsql-python/**', '.github/workflows/python.yml']
+  pull_request:
+    paths: ['ggsql-python/**', '.github/workflows/python.yml']
+
+jobs:
+  test:
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, macos-latest, windows-latest]
+        python: ['3.10', '3.11', '3.12', '3.13', '3.14']
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python }}
+
+      - name: Install Rust
+        uses: dtolnay/rust-toolchain@stable
+
+      - name: Build wheel
+        uses: PyO3/maturin-action@v1
+        with:
+          working-directory: ggsql-python
+          command: build
+          args: --release
+
+      - name: Install wheel and test
+        shell: bash
+        run: |
+          pip install ggsql-python/target/wheels/*.whl
+          pip install pytest polars
+          pytest ggsql-python/tests/test_ggsql.py -v
+
+  e2e-test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.13'
+
+      - name: Install Rust
+        uses: dtolnay/rust-toolchain@stable
+
+      - name: Build wheel
+        uses: PyO3/maturin-action@v1
+        with:
+          working-directory: ggsql-python
+          command: build
+          args: --release
+
+      - name: Install wheel and E2E dependencies
+        shell: bash
+        run: |
+          pip install ggsql-python/target/wheels/*.whl
+          pip install pytest polars altair
+
+      - name: Run E2E tests
+        shell: bash
+        run: |
+          pytest ggsql-python/tests/test_altair_e2e.py -v
+
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Check Rust formatting
+        run: cargo fmt --package ggsql-python -- --check
+
+      - name: Clippy
+        run: cargo clippy --package ggsql-python -- -D warnings
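
As a rough check of how far the `test` job's matrix fans out, the sketch below enumerates the OS and Python combinations in plain Python. The two lists are copied from the workflow; the variable names are illustrative only, and GitHub Actions performs this expansion itself.

```python
from itertools import product

# Values copied from the `matrix` block of the `test` job.
OS_LIST = ["ubuntu-latest", "macos-latest", "windows-latest"]
PYTHON_VERSIONS = ["3.10", "3.11", "3.12", "3.13", "3.14"]

# Each (os, python) pair becomes one job. With `fail-fast: false`,
# a failure in one job does not cancel the others.
jobs = [
    {"os": os_name, "python": version}
    for os_name, version in product(OS_LIST, PYTHON_VERSIONS)
]

print(len(jobs))
for job in jobs[:3]:
    print(job)
```

The `e2e-test` and `lint` jobs are not part of the matrix, so each of them runs once per trigger.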

Cargo.toml

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@
 members = [
     "tree-sitter-ggsql",
     "src",
-    "ggsql-jupyter"
+    "ggsql-jupyter",
+    "ggsql-python"
 ]
 resolver = "2"
 

ggsql-python/Cargo.toml

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+[package]
+name = "ggsql-python"
+version = "0.1.0"
+edition = "2021"
+license = "MIT"
+description = "Python bindings for ggsql"
+
+[lib]
+name = "_ggsql"
+crate-type = ["cdylib"]
+
+[dependencies]
+pyo3 = { version = "0.26", features = ["extension-module"] }
+pyo3-polars = { version = "0.25", features = ["dtype-decimal", "dtype-struct"] }
+polars.workspace = true
+ggsql = { path = "../src", default-features = false, features = ["vegalite"] }
+
+[features]
+default = []

ggsql-python/README.md

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
+# ggsql
+
+Python bindings for [ggsql](https://github.com/georgestagg/ggsql), a SQL extension for declarative data visualization.
+
+This package provides a thin wrapper around the Rust `ggsql` crate, enabling Python users to render Vega-Lite visualizations from polars DataFrames using ggsql's VISUALISE syntax.
+
+## Installation
+
+### From PyPI (when published)
+
+```bash
+pip install ggsql
+```
+
+### From source
+
+Building from source requires:
+- Rust toolchain (install via [rustup](https://rustup.rs/))
+- Python 3.10+
+- [maturin](https://github.com/PyO3/maturin)
+
+```bash
+# Clone the monorepo
+git clone https://github.com/georgestagg/ggsql.git
+cd ggsql/ggsql-python
+
+# Create a virtual environment
+python -m venv .venv
+source .venv/bin/activate # or `.venv\Scripts\activate` on Windows
+
+# Install build dependencies
+pip install maturin
+
+# Build and install in development mode
+maturin develop
+
+# Or build a wheel
+maturin build --release
+pip install target/wheels/ggsql-*.whl
+```
+
+## Usage
+
+```python
+import ggsql
+import polars as pl
+
+# Split a ggSQL query into SQL and VISUALISE portions
+sql, viz = ggsql.split_query("""
+    SELECT date, revenue, region FROM sales
+    WHERE year = 2024
+    VISUALISE date AS x, revenue AS y, region AS color
+    DRAW line
+    LABEL title => 'Sales Trends'
+""")
+
+# Execute SQL with your preferred tool
+import duckdb
+df = duckdb.sql(sql).pl()
+
+# Render DataFrame + VISUALISE spec to Vega-Lite JSON
+vegalite_json = ggsql.render(df, viz)
+```
+
+### Mapping styles
+
+The `render()` function supports various mapping styles:
+
+```python
+df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30], "category": ["A", "B", "A"]})
+
+# Explicit mapping
+ggsql.render(df, "VISUALISE x AS x, y AS y DRAW point")
+
+# Implicit mapping (column name = aesthetic name)
+ggsql.render(df, "VISUALISE x, y DRAW point")
+
+# Wildcard mapping (map all matching columns)
+ggsql.render(df, "VISUALISE * DRAW point")
+
+# With color encoding
+ggsql.render(df, "VISUALISE x, y, category AS color DRAW point")
+```
+
+## API
+
+### `split_query(query: str) -> tuple[str, str]`
+
+Split a ggSQL query into SQL and VISUALISE portions.
+
+**Parameters:**
+- `query`: The full ggSQL query string
+
+**Returns:**
+- Tuple of `(sql_portion, visualise_portion)`
+
+**Raises:**
+- `ValueError`: If the query cannot be parsed
+
+### `render(df, viz, *, writer="vegalite") -> str`
+
+Render a DataFrame with a VISUALISE specification.
+
+**Parameters:**
+- `df`: A `polars.DataFrame` or `polars.LazyFrame` (LazyFrames are collected automatically)
+- `viz`: The VISUALISE specification string
+- `writer`: Output format, currently only `"vegalite"` is supported
+
+**Returns:**
+- JSON string containing the Vega-Lite specification
+
+**Raises:**
+- `ValueError`: If the spec cannot be parsed or rendered
+
+## Development
+
+### Keeping in sync with the monorepo
+
+The `ggsql-python` package is part of the [ggsql monorepo](https://github.com/georgestagg/ggsql) and depends on the Rust `ggsql` crate via a path dependency. When the Rust crate is updated, you may need to rebuild:
+
+```bash
+cd ggsql-python
+
+# Rebuild after Rust changes
+maturin develop
+
+# If tree-sitter grammar changed, clean and rebuild
+cd .. && cargo clean -p tree-sitter-ggsql && cd ggsql-python
+maturin develop
+```
+
+### Running tests
+
+```bash
+# Install test dependencies
+pip install pytest altair
+
+# Run unit tests
+pytest tests/test_ggsql.py -v
+
+# Run E2E tests with altair
+pytest tests/test_altair_e2e.py -v
+
+# Run all tests
+pytest tests/ -v
+```
+
+## Requirements
+
+- Python >= 3.10
+- polars >= 1.0
+- pyarrow >= 12 (required for Arrow FFI)
+
+## License
+
+MIT
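
The three mapping styles described in the README (explicit, implicit, wildcard) can be pictured with a small pure-Python sketch. This is not ggsql's implementation, which lives in the Rust crate and is not shown in this commit. The function `resolve_mappings` and the `KNOWN_AESTHETICS` set are hypothetical names introduced only for illustration, and the assumption that a wildcard maps every column whose name equals an aesthetic name is inferred from the README comment "map all matching columns".

```python
from __future__ import annotations

# Hypothetical aesthetic names; the real set recognised by ggsql
# is not listed in this commit.
KNOWN_AESTHETICS = {"x", "y", "color"}


def resolve_mappings(items: list[str], columns: list[str]) -> dict[str, str]:
    """Return {aesthetic: column} for mapping items such as
    ["x", "category AS color"] or ["*"] (illustrative helper only)."""
    mappings: dict[str, str] = {}
    for raw in items:
        item = raw.strip()
        if item == "*":
            # Wildcard: assumed to map each column named like an aesthetic.
            for column in columns:
                if column in KNOWN_AESTHETICS:
                    mappings.setdefault(column, column)
        elif " AS " in item:
            # Explicit: "<column> AS <aesthetic>".
            column, aesthetic = (part.strip() for part in item.split(" AS ", 1))
            mappings[aesthetic] = column
        else:
            # Implicit: the column name doubles as the aesthetic name.
            mappings[item] = item
    return mappings


columns = ["x", "y", "category"]
explicit = resolve_mappings(["x AS x", "y AS y"], columns)
implicit = resolve_mappings(["x", "y"], columns)
wildcard = resolve_mappings(["*"], columns)
with_color = resolve_mappings(["x", "y", "category AS color"], columns)

print(explicit, implicit, wildcard, with_color, sep="\n")
```

Under these assumptions the first three calls resolve to the same mapping for the README's example frame, and only the last one adds a `color` entry.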

ggsql-python/pyproject.toml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+[build-system]
+requires = ["maturin>=1.4,<2.0"]
+build-backend = "maturin"
+
+[project]
+name = "ggsql"
+version = "0.1.0"
+description = "SQL extension for declarative data visualization"
+readme = "README.md"
+requires-python = ">=3.10"
+license = { text = "MIT" }
+keywords = ["sql", "visualization", "vega-lite", "grammar-of-graphics"]
+classifiers = [
+    "Programming Language :: Rust",
+    "Programming Language :: Python :: Implementation :: CPython",
+]
+dependencies = ["polars>=1.0", "pyarrow>=12"]
+
+[project.optional-dependencies]
+dev = ["pytest>=7.0", "maturin>=1.4"]
+e2e = ["altair>=5.0"]
+
+[tool.maturin]
+features = ["pyo3/extension-module"]
+python-source = "python"
+module-name = "ggsql._ggsql"
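
The `[tool.maturin]` settings above tie together the Python package directory and the compiled extension. The sketch below works through that relationship with values copied from `pyproject.toml` and `ggsql-python/Cargo.toml`. It reflects my understanding of maturin's mixed Rust/Python layout rather than anything maturin exposes as an API, and the variable names are illustrative.

```python
from pathlib import PurePosixPath

# Values copied from [tool.maturin] in pyproject.toml.
python_source = "python"
module_name = "ggsql._ggsql"

# Value copied from [lib] in ggsql-python/Cargo.toml.
cargo_lib_name = "_ggsql"

# "ggsql._ggsql" splits into the pure-Python package and the native submodule.
package, _, extension = module_name.rpartition(".")

# The pure-Python sources are expected under <python-source>/<package>/.
package_dir = PurePosixPath(python_source) / package.replace(".", "/")

# The last component of module-name should match the Rust library name,
# so that `from ggsql._ggsql import ...` finds the compiled module.
names_match = extension == cargo_lib_name

print(package_dir)
print(extension, names_match)
```

This is consistent with the `py.typed` marker in the commit living at `ggsql-python/python/ggsql/py.typed`.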
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+from __future__ import annotations
+from typing import Literal
+
+import polars as pl
+
+from ggsql._ggsql import split_query, render as _render
+
+__all__ = ["split_query", "render"]
+__version__ = "0.1.0"
+
+
+def render(
+    df: "pl.DataFrame | pl.LazyFrame",
+    viz: str,
+    *,
+    writer: Literal["vegalite"] = "vegalite",
+) -> str:
+    """Render a DataFrame with a VISUALISE spec.
+
+    Parameters
+    ----------
+    df : polars.DataFrame | polars.LazyFrame
+        Data to visualize. LazyFrames are collected automatically.
+    viz : str
+        VISUALISE spec string (e.g., "VISUALISE x, y DRAW point")
+    writer : Literal["vegalite"]
+        Output format. Currently only "vegalite" supported.
+
+    Returns
+    -------
+    str
+        Vega-Lite JSON specification.
+    """
+    if isinstance(df, pl.LazyFrame):
+        df = df.collect()
+
+    return _render(df, viz, writer=writer)
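
The wrapper above follows a common pattern: normalize the input on the Python side, then delegate to the compiled function. The sketch below reproduces that pattern with stand-in classes so it runs without polars or the compiled extension. `EagerFrame`, `LazyFrame`, and `_native_render` are hypothetical stand-ins, not polars or ggsql APIs, and the returned string is a placeholder rather than a Vega-Lite spec.

```python
from __future__ import annotations

from typing import Callable


class EagerFrame:
    """Stand-in for polars.DataFrame (illustration only)."""

    def __init__(self, rows: list[int]) -> None:
        self.rows = rows


class LazyFrame:
    """Stand-in for polars.LazyFrame: holds a deferred computation."""

    def __init__(self, build: Callable[[], list[int]]) -> None:
        self._build = build

    def collect(self) -> EagerFrame:
        # Materialize the deferred computation into an eager frame.
        return EagerFrame(self._build())


def _native_render(df: EagerFrame, viz: str, *, writer: str) -> str:
    """Stand-in for the compiled `_render`; accepts eager frames only."""
    if not isinstance(df, EagerFrame):
        raise TypeError("expected an eager frame")
    return f"{writer}:{len(df.rows)} rows:{viz}"


def render(df: EagerFrame | LazyFrame, viz: str, *, writer: str = "vegalite") -> str:
    # Collect lazy input first so the native layer sees a single type.
    if isinstance(df, LazyFrame):
        df = df.collect()
    return _native_render(df, viz, writer=writer)


spec = "VISUALISE x, y DRAW point"
eager_out = render(EagerFrame([1, 2, 3]), spec)
lazy_out = render(LazyFrame(lambda: [1, 2, 3]), spec)
print(eager_out == lazy_out)
```

Keeping the type check in Python means the native function only has to handle one input type.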

ggsql-python/python/ggsql/py.typed

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# PEP 561 marker file
