new(citus): horizontally-scalable PostgreSQL extension #13132

Open
tannevaled wants to merge 11 commits into pkgxdev:main from tannevaled:new/citus

@tannevaled
Contributor

Summary

Adds Citus, the PostgreSQL extension that shards tables across worker nodes to provide horizontal scaling while keeping vanilla Postgres semantics. It is a common alternative to CockroachDB / Yugabyte for teams who want distributed Postgres without leaving Postgres itself.

  • Source build (PGXS) from upstream tarballs at github.com/citusdata/citus.
  • Citus 14.x supports PostgreSQL 16, 17, 18 (configure enforces this), so postgresql.org is pinned >=16<19.
  • Compression deps (lz4.org, facebook.com/zstd) and curl.se (anonymous stats) are wired explicitly to match upstream's default feature set.
  • make install-all prefix={{prefix}} redirects PGXS-derived install dirs into pantry's bottle prefix so extension artefacts (citus.so, control + SQL files) don't leak into the postgresql.org dependency tree.
  • No provides: — Citus ships no CLI; artefacts are loaded inside Postgres via CREATE EXTENSION citus after adding to shared_preload_libraries.
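
As a rough illustration of the loading step in the last bullet, the sketch below appends the preload setting to a configuration file. The helper name `enable_citus_preload` and the temporary file standing in for `postgresql.conf` are assumptions made for this example, not part of the package, and the sketch assumes the file has no existing `shared_preload_libraries` entry.

```shell
# Hypothetical helper: add the preload setting to a postgresql.conf-style file.
# Assumes the file does not already set shared_preload_libraries to something else.
enable_citus_preload() {
  conf_file=$1
  # Only append the line if no citus preload entry is present yet.
  if ! grep -q "^shared_preload_libraries.*citus" "$conf_file" 2>/dev/null; then
    printf "shared_preload_libraries = 'citus'\n" >> "$conf_file"
  fi
}

conf=$(mktemp)                 # stand-in for a real postgresql.conf
enable_citus_preload "$conf"
enable_citus_preload "$conf"   # second call changes nothing
cat "$conf"

# After a server restart, the extension is created per database, e.g.:
#   psql -c 'CREATE EXTENSION citus;'
```

The `psql` line is left as a comment because it needs a running server.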

Test plan

  • CI build passes on linux/x86-64
  • CI build passes on linux/aarch64
  • CI build passes on darwin/x86-64
  • CI build passes on darwin/aarch64
  • citus.so present under lib/postgresql/
  • citus.control present under share/postgresql/extension/

🤖 Generated with Claude Code

tannevaled and others added 11 commits May 31, 2026 18:23
Citus turns Postgres into a distributed SQL database by sharding tables
across worker nodes. Common alternative to CockroachDB / Yugabyte for
users who'd rather keep vanilla Postgres semantics (same SQL, same
protocol, same MVCC) and just add sharding.

Build is PGXS; pantry's postgresql.org provides pg_config. Citus 14.x
supports PG 16, 17, 18 (configure enforces this), so the dep is pinned
`>=16<19`. Compression deps (lz4, zstd) and libcurl (anonymous stats)
are enabled explicitly so the bottle matches upstream's default
feature set.
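
The `>=16<19` pin can be read as a simple range check on the PostgreSQL major version. The sketch below is only an illustration of that range; the helper name `pg_major_supported` is hypothetical, and the input strings imitate the `PostgreSQL <version>` form printed by `pg_config --version`.

```shell
# Hypothetical helper: succeed when a "PostgreSQL <version>" string names a
# major version inside the pinned range (16 <= major < 19).
pg_major_supported() {
  version_string=$1                      # e.g. "PostgreSQL 17.2"
  major=${version_string#PostgreSQL }    # drop the product name prefix
  major=${major%%[!0-9]*}                # keep only the leading digits
  [ -n "$major" ] && [ "$major" -ge 16 ] && [ "$major" -lt 19 ]
}

for v in "PostgreSQL 15.8" "PostgreSQL 17.2" "PostgreSQL 18beta1"; do
  if pg_major_supported "$v"; then
    echo "$v: inside the pinned range"
  else
    echo "$v: outside the pinned range"
  fi
done
```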

`make install-all prefix={{prefix}}` redirects PGXS-derived install
dirs into pantry's prefix so extension files don't leak into the
postgresql.org dependency's tree.

No `provides:` — Citus ships no CLI; artefacts are citus.so plus the
extension control/SQL files, loaded inside Postgres via
`CREATE EXTENSION citus`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Citus's configure copies CC from `$(pg_config --cc)` (so it matches the
PostgreSQL the extension links against). Pantry's postgresql.org bottle
was built with clang, but the pkgx linux toolchain ships gcc only — so
configure picked `clang`, then died with "C compiler cannot create
executables". Pin CC=cc to use the toolchain's default (gcc on linux,
clang on darwin) while still building hermetically.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PostgreSQL's bottle was compiled with clang, and Citus extensions
inherit CFLAGS via `pg_config --cflags`. Those flags include
clang-specific warnings (-Werror=unguarded-availability-new,
-Wno-cast-function-type-strict, …) that gcc-16 rejects outright. Add
llvm.org as a linux build-dep and set CC=clang so the extension
matches PG's compiler.

On darwin, brewkit's hermetic CPATH masks the Xcode SDK headers, so
Xcode clang's #include_next chain can't reach <stdio.h> / <inttypes.h>.
Set SDKROOT=$(xcrun --sdk macosx --show-sdk-path) — same pattern used
by github.com/99designs/aws-vault for the same situation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The pantry postgresql.org bottle was built with the macOS 15.2 SDK
(/Applications/Xcode.app/.../SDKs/MacOSX15.2.sdk), and pg_config bakes
that path into the -isysroot flag returned by `pg_config --cflags`.
On the current darwin runners only the newer Xcode SDK is installed,
so clang's -isysroot points at a directory that doesn't exist, and
stdio.h / math.h / inttypes.h all go missing.

Sed the stale -isysroot out of Makefile.global after configure runs;
the SDKROOT env var (already set above) then steers clang to the
correct, present-day SDK. No-op on linux, where -isysroot is absent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous attempt to sed the stale `-isysroot .../MacOSX15.2.sdk`
out of Makefile.global was a no-op — PGXS pulls that flag in from
pg_config at make time, not from the per-extension Makefile.global.

Use the PGXS-blessed extension hook instead: PG_CPPFLAGS is appended
to the compile line after PG's own CPPFLAGS, and clang resolves
multiple -isysroot occurrences left-to-right (last wins). Passing
`PG_CPPFLAGS="-isysroot $(xcrun --sdk macosx --show-sdk-path)"`
therefore overrides the baked-in 15.2 path with the SDK that's
actually installed on the runner.

Variable is empty on linux, so the make invocations are identical
across platforms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previous attempt set PG_CPPFLAGS, but PGXS prepends that to CPPFLAGS:
  override CPPFLAGS := $(PG_CPPFLAGS) $(CPPFLAGS)
So our -isysroot ended up *before* PG's stale -isysroot in the compile
line, and clang picked the stale 15.2 SDK (left-to-right resolution,
last wins).

PG_CFLAGS, by contrast, is appended last:
  override CFLAGS := $(CFLAGS) $(PG_CFLAGS)
Switching to PG_CFLAGS puts our fresh -isysroot after PG's, so clang
selects the SDK that actually exists on the runner.
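
The ordering argument above can be sketched with a small helper that reports which `-isysroot` value comes last in a flag string. This only illustrates the "last wins" reasoning; it does not model how PGXS assembles the final compile line. The helper name `last_isysroot` and both SDK paths are placeholders invented for the example, and the helper assumes paths without spaces.

```shell
# Hypothetical helper: print the argument of the last -isysroot in a flag string.
last_isysroot() {
  result=
  prev=
  for word in $1; do          # unquoted on purpose: split the flags on whitespace
    if [ "$prev" = "-isysroot" ]; then
      result=$word
    fi
    prev=$word
  done
  printf '%s\n' "$result"
}

stale="-isysroot /stale/MacOSX15.2.sdk"   # placeholder for the baked-in path
fresh="-isysroot /live/MacOSX.sdk"        # placeholder for the runner's SDK

echo "prepended: $(last_isysroot "$fresh $stale")"   # fresh flag first, stale last
echo "appended:  $(last_isysroot "$stale $fresh")"   # stale flag first, fresh last
```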

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PGXS reads PG's CFLAGS through pg_config at configure time and bakes
them into Citus's Makefile.global. The PG bottle was built with macOS
SDK 15.2, so -isysroot .../MacOSX15.2.sdk gets cemented in — but the
darwin runners only carry a newer SDK, so the path resolves to nothing
and stdio.h / inttypes.h / math.h all go missing.

Earlier attempts (PG_CPPFLAGS prepended → loses to PG; PG_CFLAGS
appended → still appeared before PG's own -isysroot in the final
compile line because of how PGXS interleaves CPPFLAGS and CFLAGS) all
left the stale SDK winning.

Switch to a build-local wrapper pg_config that pipes the real binary's
output through sed on darwin to substitute the live SDK path. Point
Citus's configure at the wrapper via PG_CONFIG. On linux the wrapper
passes through unchanged.
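
A minimal sketch of such a wrapper follows. So that it runs anywhere, it uses a fake `pg_config.real` in place of the real binary; that stand-in, the `LIVE_SDK` variable, and the SDK paths are all assumptions for illustration. On darwin the live path would come from `xcrun --sdk macosx --show-sdk-path`.

```shell
workdir=$(mktemp -d)

# Stand-in for the real pg_config (assumption, so the sketch is self-contained).
cat > "$workdir/pg_config.real" <<'EOF'
#!/bin/sh
case "$1" in
  --cflags) echo "-O2 -isysroot /stale/MacOSX15.2.sdk" ;;
  *)        echo "unhandled: $1" ;;
esac
EOF

# Wrapper: forward all arguments; when LIVE_SDK is set, rewrite every
# -isysroot path in the output, otherwise pass the output through untouched.
cat > "$workdir/pg_config" <<EOF
#!/bin/sh
if [ -n "\${LIVE_SDK:-}" ]; then
  "$workdir/pg_config.real" "\$@" | sed -E "s|-isysroot [^ ]+|-isysroot \${LIVE_SDK}|g"
else
  exec "$workdir/pg_config.real" "\$@"
fi
EOF
chmod +x "$workdir/pg_config.real" "$workdir/pg_config"

# darwin-like call (rewritten) and linux-like call (unchanged):
LIVE_SDK=/live/MacOSX.sdk "$workdir/pg_config" --cflags
LIVE_SDK= "$workdir/pg_config" --cflags

# configure would then be pointed at the wrapper, e.g.:
#   PG_CONFIG="$workdir/pg_config" ./configure
```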

Linux x64 already passed with the previous PG_CFLAGS commit, so this
keeps that fix conceptually intact while finally addressing darwin.
ARM64 hit a transient DNS error fetching the tarball; not a code bug.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The wrapper pg_config approach didn't help: PGXS pulls PG's CPPFLAGS
not from `pg_config --cppflags` at make time but by including PG's
installed Makefile.global on disk, which has the stale
`-isysroot .../MacOSX15.2.sdk` baked in.

Instead, read pg_config --cppflags ourselves, sed-rewrite the SDK
path on darwin, and pass the cleaned value as `make CPPFLAGS=...`.
PGXS does `override CPPFLAGS := -I. -I$(srcdir) $(CPPFLAGS)`,
which then *augments* our value rather than discarding it, so our
SDK fix survives and PGXS adds its own -I flags on top.
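
The rewrite step described here might look roughly like the sketch below. The helper name `rewrite_sysroot`, the sample flag string, and both SDK paths are placeholders; in the real build the input would come from `pg_config --cppflags` and the live path from `xcrun --sdk macosx --show-sdk-path`. The helper assumes paths that contain neither spaces nor `|`.

```shell
# Hypothetical helper: replace every -isysroot path in $1 with the path in $2.
rewrite_sysroot() {
  printf '%s\n' "$1" | sed -E "s|-isysroot [^ ]+|-isysroot $2|g"
}

# Placeholder for the output of `pg_config --cppflags`.
pg_cppflags="-isysroot /stale/MacOSX15.2.sdk -I/opt/include"
live_sdk=/live/MacOSX.sdk      # placeholder for the SDK present on the runner

fixed_cppflags=$(rewrite_sysroot "$pg_cppflags" "$live_sdk")

# Show the command line that would be handed to make.
echo "make CPPFLAGS=\"$fixed_cppflags\""
```

A flag string without `-isysroot` passes through unchanged, which matches the linux case where no rewrite is needed.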

Linux gets the unmodified pg_config --cppflags, so behaviour there is
identical to the previous (passing) PG_CFLAGS commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CPPFLAGS override fixed compilation, but the dylib link step still
picked up -isysroot .../MacOSX15.2.sdk from PG's LDFLAGS, so ld
couldn't find the System library. Read pg_config --ldflags, run the
same SDK-path rewrite on it, and pass it on the make command line
alongside CPPFLAGS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Build now succeeds on darwin (compile + link both green) — the failure
was in the test step, which only looked for citus.so. PGXS produces
citus.dylib on darwin (its $(DLSUFFIX) is .dylib there). Match either.
Drop the share/extension fallback path too: with `make install-all
prefix=...` the file always lands at share/postgresql/extension/.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous `ls ... | grep -q .` form returned non-zero when the
first arg didn't exist, before the second was even considered. Use
`test -f A -o -f B` instead — short-circuits cleanly and is the
same idiom the task brief suggested.
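
A minimal sketch of an either-suffix check is shown below. It uses two `[ -f ... ]` tests joined with `||`, a variation on the `test -f A -o -f B` form that gives the same result here. The helper name `has_citus_lib` and the temporary prefix are assumptions for illustration; the `lib/postgresql/` location is the one listed in the test plan.

```shell
# Hypothetical helper: succeed if either shared-library name exists under a prefix.
has_citus_lib() {
  libdir=$1/lib/postgresql
  [ -f "$libdir/citus.so" ] || [ -f "$libdir/citus.dylib" ]
}

prefix=$(mktemp -d)                            # stand-in for the install prefix
mkdir -p "$prefix/lib/postgresql"
touch "$prefix/lib/postgresql/citus.dylib"     # simulate a darwin install

if has_citus_lib "$prefix"; then
  echo "citus library found"
else
  echo "citus library missing"
fi
```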

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>