Commit dd2f77c

feat(codegen/go): add opt-in iter.Seq2 companions for :many queries
Implement emit_iterators PoC from docs/emit-iterators-prd.md: lazy iter.Seq2[T, error] methods alongside existing slice APIs, with global or explicit_only scope and :stream / :many:stream query annotations. Includes stdlib and pgx templates, config options, end-to-end testdata, and codegen integration tests.
1 parent ecec179 commit dd2f77c

28 files changed

Lines changed: 1110 additions & 34 deletions

docs/emit-iterators-prd.md

Lines changed: 363 additions & 0 deletions
@@ -0,0 +1,363 @@
1+
# PRD: Opt-in Iterator Generation for `:many` Queries
2+
3+
**Status:** Proposal
4+
**Target repo:** [sqlc-dev/sqlc](https://github.com/sqlc-dev/sqlc)
5+
**Related issues:** [#720](https://github.com/sqlc-dev/sqlc/issues/720), [#4464](https://github.com/sqlc-dev/sqlc/issues/4464), [#4108](https://github.com/sqlc-dev/sqlc/issues/4108)
6+
**Related PR (closed, reference only):** [#3631](https://github.com/sqlc-dev/sqlc/pull/3631)
7+
**Author:** Community proposal (reviving stalled discussion)
8+
**Last updated:** 2026-06-14
9+
10+
---
11+
12+
## 1. Problem Statement
13+
14+
sqlc generates excellent type-safe Go code, but its default API for `:many` queries **always materializes the full result set into a slice**:
15+
16+
```go
17+
func (q *Queries) ListAuthors(ctx context.Context) ([]Author, error)
18+
```
19+
20+
For large result sets (exports, sync jobs, ETL pipelines, backfills), this forces **O(n) heap allocation** even when the caller only needs to process rows one at a time. Alternatives today:
21+
22+
| Workaround | Drawback |
23+
|------------|----------|
24+
| Manual paging with `LIMIT`/`OFFSET` | Extra query complexity; offset cost at scale; not always expressible |
25+
| Fork sqlc or post-process generated code | Maintenance burden; loses upstream improvements |
26+
| Skip sqlc for streaming paths | Loses type safety on the hot path |
27+
28+
Go 1.23 shipped **range-over-function iterators** (`iter.Seq`, `iter.Seq2`). sqlc maintainers [noted in #720](https://github.com/sqlc-dev/sqlc/issues/720) that this unblocks native iterator generation. As of June 2026, **no implementation has merged**; [PR #3631](https://github.com/sqlc-dev/sqlc/pull/3631) was closed without merge after API design remained unresolved.
29+
30+
---
31+
32+
## 2. Goals
33+
34+
1. **Zero breaking changes** — existing `:many` → `[]T` APIs remain the default.
35+
2. **Opt-in streaming** — callers choose slice vs iterator via config or query annotation.
36+
3. **Native Go 1.23 idioms** — generate `iter.Seq2[T, error]` (primary) with optional alternate styles.
37+
4. **Lazy evaluation** — query execution begins on first iteration, not at method call (configurable).
38+
5. **Correct resource lifecycle** — `rows.Close()` on normal completion, early break, panic (via `defer`), and error paths.
39+
6. **Incremental rollout** — Go + `database/sql` first; pgx/stdlib variants and other languages follow.
40+
41+
## 3. Non-Goals (v1)
42+
43+
- Replacing or changing default `:many` behavior.
44+
- Automatic streaming for `:one`, `:exec`, or `:copyfrom`.
45+
- Memory pooling / object reuse (future enhancement; see #3631 discussion).
46+
- Server-side PostgreSQL cursors (`DECLARE`/`FETCH`) — separate feature ([#1517](https://github.com/sqlc-dev/sqlc/issues/1517)).
47+
- Python/Kotlin generators in the initial PR (coordinate separately; see #4464).
48+
49+
---
50+
51+
## 4. Proposed Configuration
52+
53+
### 4.1 `sqlc.yaml` options
54+
55+
```yaml
56+
version: "2"
57+
sql:
58+
- schema: schema.sql
59+
queries: queries.sql
60+
engine: postgresql
61+
gen:
62+
go:
63+
package: db
64+
out: internal/db
65+
66+
# --- Iterator options (all opt-in, defaults shown) ---
67+
68+
emit_iterators: false
69+
# When true, generate a streaming companion method for each :many query.
70+
71+
iterator_scope: global
72+
# global — all :many queries get a streaming method
73+
# explicit_only — only queries annotated with :many:stream (or :stream)
74+
75+
iterator_method_prefix: "Iter"
76+
# ListAuthors → IterAuthors
77+
# Set to "Stream" for StreamAuthors if preferred.
78+
79+
iterator_style: seq2
80+
# seq2 — iter.Seq2[T, error] (recommended default)
81+
# callback — EachAuthors(ctx, func(Author) error) error
82+
# rows — *AuthorsRows with Next()/Scan()/Close()/Err() (legacy #720 style)
83+
84+
iterator_start: lazy
85+
# lazy — DB query runs on first iteration step (recommended)
86+
# eager — DB query runs at method call; returns (seq, error) or (*Rows, error)
87+
```
88+
89+
### 4.2 Query-level override (optional, for `iterator_scope: explicit_only`)
90+
91+
```sql
92+
-- name: ListAuthors :many:stream
93+
SELECT id, name, bio FROM authors ORDER BY name;
94+
```
95+
96+
Alternatively, a dedicated query kind (as proposed in #4464):
97+
98+
```sql
99+
-- name: StreamAuthors :stream
100+
SELECT id, name, bio FROM authors ORDER BY name;
101+
```
102+
103+
**Recommendation:** support **both** `iterator_scope: global` and `iterator_scope: explicit_only` (the latter driven by the `:many:stream` / `:stream` annotations) so teams can choose DX vs fine-grained control.
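
The interaction between `emit_iterators`, `iterator_scope`, and the query annotation can be sketched as a small decision function. This is only an illustration under stated assumptions: `shouldEmitIterator` is a hypothetical name that is not taken from the commit, and it assumes that `emit_iterators: false` disables iterator generation even for annotated queries and that an unset scope behaves like `global`.

```go
package main

import "fmt"

// shouldEmitIterator is a hypothetical helper (not from the sqlc codebase).
// It reports whether a :many query should get a streaming companion method.
//
// Assumptions made for this sketch:
//   - emit_iterators: false turns the feature off entirely.
//   - An empty or unknown scope is treated like "global".
func shouldEmitIterator(emitIterators bool, scope string, annotated bool) bool {
	if !emitIterators {
		return false
	}
	switch scope {
	case "explicit_only":
		// Only queries marked :many:stream or :stream qualify.
		return annotated
	default:
		// "global": every :many query qualifies.
		return true
	}
}

func main() {
	fmt.Println(shouldEmitIterator(true, "global", false))
	fmt.Println(shouldEmitIterator(true, "explicit_only", false))
	fmt.Println(shouldEmitIterator(true, "explicit_only", true))
	fmt.Println(shouldEmitIterator(false, "global", true))
}
```

Keeping the decision in one place would make the opt-out regression check (no output change when `emit_iterators` is false) straightforward to reason about.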
104+
105+
---
106+
107+
## 5. Generated API
108+
109+
### 5.1 Default: `seq2` + `lazy` (recommended)
110+
111+
**SQL (unchanged):**
112+
113+
```sql
114+
-- name: ListAuthors :many
115+
SELECT id, name, bio FROM authors ORDER BY name;
116+
```
117+
118+
**Generated Go:**
119+
120+
```go
121+
import "iter"
122+
123+
const listAuthors = `-- name: ListAuthors :many
124+
SELECT id, name, bio FROM authors ORDER BY name
125+
`
126+
127+
// Existing — unchanged
128+
func (q *Queries) ListAuthors(ctx context.Context) ([]Author, error) {
129+
// ... current implementation ...
130+
}
131+
132+
// New — opt-in via emit_iterators
133+
func (q *Queries) IterAuthors(ctx context.Context) iter.Seq2[Author, error] {
134+
return func(yield func(Author, error) bool) {
135+
rows, err := q.db.QueryContext(ctx, listAuthors)
136+
if err != nil {
137+
yield(Author{}, err)
138+
return
139+
}
140+
defer rows.Close()
141+
142+
for rows.Next() {
143+
var i Author
144+
if err := rows.Scan(&i.ID, &i.Name, &i.Bio); err != nil {
145+
yield(Author{}, err)
146+
return
147+
}
148+
if !yield(i, nil) {
149+
return // early break; defer closes rows
150+
}
151+
}
152+
if err := rows.Err(); err != nil {
153+
yield(Author{}, err)
154+
}
155+
}
156+
}
157+
```
158+
159+
**Caller usage:**
160+
161+
```go
162+
for author, err := range q.IterAuthors(ctx) {
163+
if err != nil {
164+
return fmt.Errorf("list authors: %w", err)
165+
}
166+
if err := process(author); err != nil {
167+
return err
168+
}
169+
}
170+
return nil
171+
```
172+
173+
**Properties:**
174+
175+
- **Lazy:** no DB round-trip until `range` begins.
176+
- **No wrapper type** for the common case — aligns with Kyle's [later preference](https://github.com/sqlc-dev/sqlc/pull/3631) for `for x, err := range q.Method(ctx)`.
177+
- **Errors in-band** via `Seq2` — familiar Go 1.23 pattern.
178+
- **`break` / `return` safe:** `defer rows.Close()` runs on all exit paths.
179+
180+
### 5.2 Alternate: `seq2` + `eager`
181+
182+
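For callers who want query errors surfaced before iteration begins. The trade-off is that `rows` is only closed once the returned sequence is ranged over, so an iterator that is never consumed keeps its connection checked out: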
183+
184+
```go
185+
func (q *Queries) IterAuthors(ctx context.Context) (iter.Seq2[Author, error], error) {
186+
rows, err := q.db.QueryContext(ctx, listAuthors)
187+
if err != nil {
188+
return nil, err
189+
}
190+
return func(yield func(Author, error) bool) {
191+
defer rows.Close()
192+
// ... same loop ...
193+
}, nil
194+
}
195+
```
196+
197+
### 5.3 Alternate: `callback` style
198+
199+
Sugar for callers who prefer a single error return:
200+
201+
```go
202+
func (q *Queries) EachAuthors(ctx context.Context, fn func(Author) error) error {
203+
for author, err := range q.IterAuthors(ctx) {
204+
if err != nil {
205+
return err
206+
}
207+
if err := fn(author); err != nil {
208+
return err
209+
}
210+
}
211+
return nil
212+
}
213+
```
214+
215+
**Note:** `Each*` can be generated optionally or left as a one-liner at call sites. Generating **both** `Iter*` and `Each*` for every query adds API surface without much benefit — recommend **`seq2` only** in v1, with `callback` as an opt-in `iterator_style`.
216+
217+
### 5.4 Alternate: `rows` style (compatibility with #720 / #3631)
218+
219+
For teams migrating from manual `sql.Rows` patterns:
220+
221+
```go
222+
type IterAuthorsRows struct { /* rows, err */ }
223+
func (q *Queries) IterAuthors(ctx context.Context) *IterAuthorsRows
224+
func (r *IterAuthorsRows) All() iter.Seq2[Author, error] // or Rows(), Items()
225+
func (r *IterAuthorsRows) Err() error
226+
func (r *IterAuthorsRows) Close() error
227+
```
228+
229+
Useful when callers need a lazy start together with a separate `Err()` check instead of in-band errors; it carries more boilerplate than `seq2`.
230+
231+
---
232+
233+
## 6. Design Decisions & Rationale
234+
235+
| Decision | Choice | Rationale |
236+
|----------|--------|-----------|
237+
| Break `:many` default? | **No** | Maintainer consensus (#720, #4464, Kyle) |
238+
| Keyword vs yaml flag | **Both** | Global flag for DX; `:stream` for explicit control |
239+
| Primary iterator type | `iter.Seq2[T, error]` | Go 1.23 stdlib idiom; Kyle referenced [Thibaut Rousseau's iterator post](https://blog.thibaut-rousseau.com/blog/writing-testing-a-paginated-api-iterator/) |
240+
| Lazy vs eager default | **Lazy** | Avoids dangling queries if iterator is never consumed; matches pierrre/sgielen feedback in #3631 |
241+
| Method naming | `Iter*` default, `Stream*` configurable | `Iter` matches PR #3631; `Stream` matches Kyle's early examples and #4464 |
242+
| Generate 3 methods per query? | **No (v1)** | `List*` + `Iter*` sufficient; `Each*` is optional sugar |
243+
| Min Go version | **1.23+** when `emit_iterators: true` | Required for `iter` package; document in release notes |
244+
| pgx vs database/sql | **database/sql first** | Match existing codegen paths; pgx in follow-up |
245+
246+
---
247+
248+
## 7. Error Handling Semantics
249+
250+
### 7.1 `seq2` lazy mode
251+
252+
| Event | Behavior |
253+
|-------|----------|
254+
| Query fails | First `yield(zero, err)`; iteration ends |
255+
| Scan fails | `yield(zero, err)`; iteration ends |
256+
| `rows.Err()` after loop | Final `yield(zero, err)` |
257+
| Caller `break` / `yield` returns false | Loop stops; `defer rows.Close()` |
258+
| Panic in caller loop body | `defer rows.Close()` still runs |
259+
260+
### 7.2 Close-on-break concern (from #3631)
261+
262+
[gbarr noted](https://github.com/sqlc-dev/sqlc/pull/3631) that without `defer rows.Close()` inside the iterator closure, a recovered panic leaves the connection mid-fetch. **All generated iterators MUST use `defer rows.Close()`** inside the `Seq2` closure.
263+
264+
### 7.3 Context cancellation
265+
266+
Callers may cancel via `ctx`. Behavior depends on driver:
267+
268+
- `database/sql`: `rows.Next()` may block until cancel (driver-dependent).
269+
- **v1:** pass `ctx` to `QueryContext`; document that full cancel propagation requires driver support.
270+
- **Future:** optional `select { case <-ctx.Done(): ... }` in loop (as MatthiasKunnen uses with pgx).
271+
272+
---
273+
274+
## 8. Parameterized Queries
275+
276+
Iterators work identically for parameterized `:many` queries:
277+
278+
```sql
279+
-- name: ListAuthorsByIDs :many
280+
SELECT id, name, bio FROM authors WHERE id = ANY($1::int[]);
281+
```
282+
283+
```go
284+
func (q *Queries) IterAuthorsByIDs(ctx context.Context, ids []int32) iter.Seq2[Author, error]
285+
func (q *Queries) ListAuthorsByIDs(ctx context.Context, ids []int32) ([]Author, error)
286+
```
287+
288+
Same SQL constant, same prepared statement wiring — only the result consumption differs.
289+
290+
---
291+
292+
## 9. Implementation Plan
293+
294+
### Phase 1 — Design sign-off (this proposal)
295+
296+
- [ ] Post proposal to #720; cross-link #4464
297+
- [ ] Maintainer confirmation on: naming, lazy default, global vs explicit scope
298+
- [ ] Agree v1 scope: Go + `database/sql` + PostgreSQL example
299+
300+
### Phase 2 — PoC PR
301+
302+
- [ ] Add config parsing in `internal/codegen/golang/opts`
303+
- [ ] Extend `:many` code generation in `internal/codegen/golang/query.go` (or templates)
304+
- [ ] Generate `Iter*` method alongside existing `List*`
305+
- [ ] End-to-end test in `examples/` (pattern from #3631)
306+
- [ ] Document in sqlc.dev docs
307+
308+
### Phase 3 — Expand coverage
309+
310+
- [ ] MySQL, SQLite engines
311+
- [ ] pgx/v5 driver variant
312+
- [ ] `iterator_style: rows` and `callback` if requested
313+
- [ ] Python generator (coordinate with borissmidt / sqlc-gen-python)
314+
315+
---
316+
317+
## 10. Testing Requirements
318+
319+
1. **Unit:** generated code compiles with `go test` under Go 1.23+.
320+
2. **Integration:** iterator returns all rows; early `break` closes rows (verify via connection pool or mock).
321+
3. **Error paths:** query error, scan error, `rows.Err()` — each yields exactly one error and stops.
322+
4. **Parity:** for same fixture, `List*` and collecting `Iter*` produce identical slices.
323+
5. **Opt-out:** `emit_iterators: false` produces byte-identical output to today (regression).
324+
325+
---
326+
327+
## 11. Open Questions for Maintainers
328+
329+
1. **Preferred method prefix:** `Iter` vs `Stream`?
330+
2. **Lazy default:** agree lazy is correct for v1?
331+
3. **Global flag vs `:stream` only:** ship both?
332+
4. **Eager mode:** worth exposing in v1 or defer?
333+
5. **pgx:** same PR or immediate follow-up?
334+
6. **Min Go version bump:** gate behind `emit_iterators` or raise global minimum?
335+
336+
---
337+
338+
## 12. One-Line Pitch
339+
340+
> sqlc generates type-safe Go that materializes every `:many` query into a slice; with Go 1.23, an opt-in `emit_iterators` flag can generate lazy `iter.Seq2[T, error]` companions with the same type safety, caller-side memory that holds one row at a time instead of the full result set, and zero breaking changes.
341+
342+
---
343+
344+
## Appendix A: Comparison with Closed PR #3631
345+
346+
| Aspect | PR #3631 | This proposal |
347+
|--------|----------|---------------|
348+
| Trigger | `:iter` query annotation | `emit_iterators` yaml + optional `:stream` |
349+
| API | Wrapper type + `Iterate()` + `Err()` | `iter.Seq2` direct range (default) |
350+
| Lazy start | Unclear / eager in PoC | Explicit `iterator_start: lazy` default |
351+
| Config surface | None | Full yaml options |
352+
| Status | Closed, not merged | — |
353+
354+
This proposal incorporates #3631's implementation lessons and resolves the API debates raised in its review thread.
355+
356+
## Appendix B: References
357+
358+
- [#720 — Ability to return an iterator on a "many" query](https://github.com/sqlc-dev/sqlc/issues/720)
359+
- [#4464 — add :stream keyword](https://github.com/sqlc-dev/sqlc/issues/4464)
360+
- [#4108 — low-level prepare/bind helpers for streaming](https://github.com/sqlc-dev/sqlc/issues/4108)
361+
- [PR #3631 — Ability to return an iterator on rows (closed)](https://github.com/sqlc-dev/sqlc/pull/3631)
362+
- [Go 1.23 — range-over-func](https://go.dev/doc/go1.23#range-over-function)
363+
- [Eli Bendersky — Ranging over functions in Go 1.23](https://eli.thegreenplace.net/2024/ranging-over-functions-in-go-123/)

internal/cmd/shim.go

Lines changed: 10 additions & 1 deletion
@@ -5,6 +5,7 @@ import (
55
"github.com/sqlc-dev/sqlc/internal/config"
66
"github.com/sqlc-dev/sqlc/internal/config/convert"
77
"github.com/sqlc-dev/sqlc/internal/info"
8+
"github.com/sqlc-dev/sqlc/internal/metadata"
89
"github.com/sqlc-dev/sqlc/internal/plugin"
910
"github.com/sqlc-dev/sqlc/internal/sql/catalog"
1011
)
@@ -156,7 +157,7 @@ func pluginQueries(r *compiler.Result) []*plugin.Query {
156157
Name: q.Metadata.Name,
157158
Cmd: q.Metadata.Cmd,
158159
Text: q.SQL,
159-
Comments: q.Metadata.Comments,
160+
Comments: pluginQueryComments(q.Metadata),
160161
Columns: columns,
161162
Params: params,
162163
Filename: q.Metadata.Filename,
@@ -166,6 +167,14 @@ func pluginQueries(r *compiler.Result) []*plugin.Query {
166167
return out
167168
}
168169

170+
func pluginQueryComments(md metadata.Metadata) []string {
171+
comments := md.Comments
172+
if md.Stream {
173+
comments = append(comments, metadata.StreamAnnotationComment)
174+
}
175+
return comments
176+
}
177+
169178
func pluginQueryColumn(c *compiler.Column) *plugin.Column {
170179
l := -1
171180
if c.Length != nil {
