Support DuckDB main (PEG parser): port expression FFI + parser-swap regression test#38
Merged
Merged
Conversation
DuckDB is replacing its legacy PostgreSQL parser with a PEG parser (opt-in in 1.5, the sole parser on main, default in 2.0). Yardstick integrates via allow_parser_override_extension="fallback", so when the override declines a query, DuckDB's parser handles it -- including surfacing syntax errors, whose shape differs between the legacy and PEG parsers. Add test/sql/peg_parser.test guarding the two behaviors most exposed to the swap: the parser_override rewrite path still works, and the fallback path still raises a parser-agnostic "Parser Error". The test runs under whichever parser the build uses, so it becomes PEG coverage automatically once the suite runs against a PEG-default DuckDB.
d3e920b to
1b7dcce
Compare
DuckDB main refactored the parsed-expression API: subclass fields are now
private (accessed via FunctionName()/GetArgumentsMutable()/LeftMutable()/...)
and names use a dedicated Identifier type instead of std::string. This broke
the duckdb-next-build CI lane (compile failure in yardstick_parser_ffi.cpp),
which has been red for a week.
Add a small compatibility shim, selected at compile time via
__has_include("duckdb/common/identifier.hpp") (matching the project's existing
__has_include pattern), and route all expression field access through it. The
old field-based API (DuckDB 1.5 and the pinned submodule) and the new
accessor-based API (main) are both supported, so the stable and next CI lanes
both build.
Behavior is unchanged: the shim is a pure access adaptation over the same tree
walks. Verified with -fsyntax-only against both DuckDB main and v1.5-variegata
headers (0 errors each).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef1898017b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Fixing the expression FFI let the next-build compile advance to the next DuckDB-main API change: parse_function_t now receives the post-PEG-failure token tail (const vector<SimpleToken>&) instead of the raw query string (yardstick_extension.hpp/.cpp). Yardstick does all of its rewriting in yardstick_parser_override, which still receives the full query string on both APIs, so on the new signature parse_function is a no-op fallback. The old string-based fallback is retained for DuckDB 1.5 via the same __has_include detection. Matching yardstick_parse to parse_function_t also fixes the function-pointer comparison in yardstick_bind. Verified with -fsyntax-only against both DuckDB main and v1.5-variegata (0 errors); all three extension translation units now compile against main.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
DuckDB is replacing its legacy PostgreSQL parser with a PEG parser (opt-in in 1.5, the sole parser on
main, default in 2.0). Two problems surfaced:duckdb-next-buildhas been red for ~a week — unrelated to PEG. DuckDB main refactored the parsed-expression API: subclass fields are now private (accessors instead) and names use a dedicatedIdentifiertype instead ofstd::string.src/yardstick_parser_ffi.cppaccessed those fields directly, so it no longer compiled against main.allow_parser_override_extension="fallback"; when the override declines a query, DuckDB's parser handles it — including surfacing syntax errors, whose shape differs between the legacy and PEG parsers.What this does
1. Ports the parser FFI to DuckDB main's expression API, keeping 1.5 working (
src/yardstick_parser_ffi.cpp).A small compat shim selected at compile time via
__has_include("duckdb/common/identifier.hpp")(matching the project's existing__has_includestyle) routes all expression field access through helpers. Both the old field-based API (1.5 + the pinned submodule) and the new accessor-based API (main) build. Behavior is unchanged — it's a pure access adaptation over the same tree walks.Validated locally with
-fsyntax-onlyagainst both DuckDBmainandv1.5-variegataheaders (0 errors each). This is what un-redsduckdb-next-build; because main is PEG-only, that lane then exercises the whole suite under PEG automatically.2. Adds
test/sql/peg_parser.test— parser-agnostic regression guard for the surface most exposed to the swap:parser_overriderewrite path still works (prefixed + unprefixedAGGREGATE,AT (ALL ...)).Parser Errorprefix (emitted by both parsers).It runs under whichever parser the build uses, so it becomes real PEG coverage on
duckdb-next-build.Validation
-fsyntax-onlyagainst DuckDB main and 1.5 headers: 0 errors both.duckdb-stable-build= 1.5,duckdb-next-build= main/PEG). Not built locally —make testneeds the submodule initialized and a full from-source compile.