Releases: Query-farm/vgi-rpc-python
v0.16.1
Bug fixes
Linux idle-shutdown hang in serve_unix
_serve_unix_threaded armed an idle timer that closed the listening socket from another thread to interrupt the main accept() loop. On Linux this does not actually wake a blocked accept() — close() only decrements the fd refcount, leaving the syscall waiting. macOS happens to wake it, which is why this only manifested on Linux: launcher-spawned workers stayed alive past their idle window and never unlinked their socket file.
The accept loop now runs with a 0.5s socket timeout and an explicit shutdown_requested flag the timer sets instead of closing the listener. Accepted connections are reset to blocking so per-call reads behave unchanged. No public API change; affects only the AF_UNIX (run_server --unix / vgi_rpc.launcher.launch) code path.
Windows mypy errors in launcher tests
tests/test_launcher.py references socket.AF_UNIX and os.geteuid, neither of which appears in Windows type stubs. The tests were already runtime-skipped on Windows but still got parsed by mypy. Added sys.platform != "win32" guards so mypy narrows the POSIX-only attribute access on Windows. Tests remain skipped on Windows.
v0.16.0
Breaking: stream-state tokens now use XChaCha20-Poly1305 AEAD
HTTP stream-state tokens (the opaque continuation tokens carried in the vgi_rpc.stream_state#b64 Arrow custom-metadata field) are now sealed with XChaCha20-Poly1305 authenticated encryption instead of HMAC-SHA256-signed plaintext. The serialized StreamState is no longer readable by anything between client and server — proxies, browser history, log captures, and tcpdump output all see opaque ciphertext.
Wire format v4
b64( version=4(1B) || nonce(24B) || ciphertext+tag )
The plaintext payload (including the creation timestamp used for TTL enforcement) lives entirely inside the ciphertext. TTL is checked after decryption.
Principal binding moves to AAD
Previously, the master HMAC key was HKDF-style stretched per (domain, principal) so that a token minted for user A could not be replayed by user B. The same property now comes from binding (domain, principal) into AEAD associated data — cross-principal replay fails decryption atomically with no per-user key derivation, simplifying key rotation.
Breaking changes
- The public parameter
signing_keyhas been renamed totoken_keyonmake_wsgi_app,make_sync_client, and in examples / docs. Token bytes are no longer signed — they are sealed. - The token wire format is incompatible with v0.15.x. Clients and servers must upgrade together.
- The
VGI_SIGNING_KEYenvironment variable (referenced indocs/hosting.md) is renamed toVGI_TOKEN_KEY.
New dependency
pynacl >= 1.5 is now a required dependency of the vgi-rpc[http] extra.
Access-log records now capture decrypted state
The request_state and response_state fields in access-log records are now both the decrypted Arrow-IPC StreamState payload, regardless of wire envelope. Previously request_state happened to coincide with the on-wire token (readable under HMAC); under AEAD that would have been opaque ciphertext with no audit value. The two fields are now semantically symmetric. See docs/access-log-spec.md for the updated contract.
HTTP streaming compression middleware now uses ~50% less peak memory
The compression middleware previously held three copies of each response body simultaneously during zstd compression — the response stream's internal BytesIO, a body = stream.read() bytes copy, and the compressed output. For an N-thread WSGI server emitting M-byte Arrow IPC bodies, that was ~3·N·M of avoidable peak; under typical Kafka-consume tuning (64 MiB per tick × 16 Waitress threads) it added up to ~3 GiB on a 2 GiB Fly VM, OOMing the worker.
The new path drives zstandard.ZstdCompressor.stream_writer from 64 KiB reads off resp.stream, eliminating the body bytes copy. Seekable streams (which is what the framework's response builders always produce) get size=N passed through so the zstd frame header still carries content_size — preserving wire-format compatibility with existing clients. Non-seekable streams fall back to the buffer-then-compress path.
Per-thread peak during compression drops from stream_buf + body + compressed (~2× body) to stream_buf + compressed (~1× body + ~5% for zstd output).
v0.15.0
Workers can now learn which transport they are bound to
Workers (RPC service implementations) had no way to know which transport (pipe / http / unix) they were being served over. CallContext.transport_metadata only carried HTTP fields and was ambiguous for non-HTTP transports — an empty dict could mean "pipe" or "HTTP request without those headers." There was no startup hook either.
TransportKind and the on_serve_start hook
A new TransportKind enum (PIPE, HTTP, UNIX) is exposed via three knobs:
RpcServer.transport_kind— coarse identifier of the bound transport, populated once serving begins.RpcServer.transport_capabilities—frozenset[str]of capability flags, currently{"shm"}when bound to aShmPipeTransport.CallContext.kind— per-call view of the sameTransportKindfor methods that already acceptctx.
Implementations may opt into a ServeStartHook lifecycle method:
from vgi_rpc import CallContext, TransportKind
class MyServiceImpl:
def on_serve_start(self, kind: TransportKind) -> None:
if kind is TransportKind.HTTP:
self._cache = build_http_cache()
def fetch(self, key: str, ctx: CallContext) -> str:
if ctx.kind is TransportKind.HTTP and self._cache is not None:
return self._cache.get(key)
return load_from_disk(key)The hook is duck-typed (no base class needed); a ServeStartHook Protocol is exported for type-hinting.
Fork-safe HTTP firing
For pipe / unix transports the hook fires inside RpcServer.serve(transport). For HTTP it fires lazily on the first request handled in the current process, via a tiny one-shot Falcon middleware. Pre-fork WSGI servers (gunicorn, uwsgi) therefore run startup work in each child worker, not the master — per-process resources (DB pools, threads, file handles) are no longer fork-unsafely inherited. Subprocess workers report PIPE because they speak Arrow IPC over the parent's stdin/stdout.
Failure semantics
Hook exceptions propagate (and are logged via logging.getLogger("vgi_rpc.rpc").exception first) — a misconfigured worker dies loudly rather than serving in a broken state. Rebinding the same RpcServer to a different transport re-fires the hook with the new kind rather than raising, so test fixtures that exercise multiple transports against one server are supported.
SHM as a capability, not an enum value
Shared-memory availability is exposed via transport_capabilities, not the enum, so coarse transport-kind checks stay simple while workers that need zero-copy paths can still detect SHM:
def on_serve_start(self, kind: TransportKind) -> None:
if "shm" in self._server.transport_capabilities:
self._enable_zero_copy()v0.14.0
Sentry: richer user data and protocol-level tagging
Two improvements to the auto-attached Sentry dispatch hook. Server-only — no client cooperation required.
Standards-aligned set_user
Previously, the JWT sub claim was placed in user.username, leaving Sentry showing opaque IdP identifiers (4FySzCeE4zIYuvph49iD9tcJL0_zDWfpMqarUUPc1uA) where a human handle should be. Now auth.principal populates user.id (Sentry's canonical opaque identifier) and the decoded JWT claims feed user.username, user.email, user.name from standard OIDC claim names:
| Sentry user field | JWT claim |
|---|---|
id |
auth.principal (typically sub) |
username |
preferred_username |
email |
email |
name |
name |
SentryConfig gains a user_claim_map so non-standard IdPs (e.g. Auth0 namespaced claims like https://example.com/email) can override per-key. Static bearer tokens (no JWT, empty claims) populate only user.id — the same effective behavior as before but in the correct field.
Generic vgi.attach_id / vgi.transaction_id scope tags
Every dispatch now auto-tags vgi.attach_id and vgi.transaction_id on the Sentry scope when present. _extract_well_known walks one level into kwargs to handle the three protocol shapes:
- direct kwargs —
catalog_detach(attach_id=...) - request dataclasses —
bind(request=BindRequest(attach_id=...)) InitRequest.bind_callnesting —init(request=InitRequest(bind_call=BindRequest(attach_id=...)))
No per-method wiring required. Tag values are 12-char SHA-256 prefixes via the new public short_hash() helper — bytes and their .hex() form produce the same value, so Sentry's tag-value distribution UI stays bounded while same-input → same-output preserves cross-event correlation. Full hex remains in catalog breadcrumbs for direct lookup.
Public API additions
vgi_rpc.sentry.short_hash(value, *, length=12)— stable 12-char SHA-256 prefix, acceptsbytes | str | None. Reusable for any high-cardinality opaque ID you want to tag in Sentry without exhausting the tag-value distribution UI.SentryConfig.user_claim_map—Mapping[str, str]of Sentry user-field name → JWT claim name. Defaults to OIDC standard names.
Internal
_SentryDispatchHook.on_dispatch_startwalks request dataclasses forattach_id/transaction_idafter the existing tag/context logic. Hook signature unchanged.
v0.13.0
Sentry: useful traces, not just errors
Phase A of making Sentry traces actually useful for finding slow RPC calls. Three changes ship together — all server-only, no client cooperation required.
Transaction names
By default, vgi-rpc now overrides Sentry's WSGI-derived /{method} transaction name (a literal route template — the actual method name was getting lost) with rpc {method}. Transactions group by RPC method in the Performance dashboard. Disable with SentryConfig(set_transaction_name=False) if you have alerts pinned to the route names.
Searchable span attributes
Every dispatch attaches rpc.system, rpc.service, rpc.method, and rpc.method_type as span data on the active transaction's root span — searchable in Trace Explorer / Insights. Streams also carry rpc.stream_id (one uuid shared across all /init and /exchange HTTP turns of one logical call) on span data so you can find sibling turns in the UI.
Opt-in RPC parameter recording
SentryConfig(record_params=True) exposes kwargs as rpc.param.<k> span attributes — the executeScan(table='orders', predicate=...) use case:
span.op:rpc.server rpc.method:executeScan rpc.param.table:orders
-> chart p99(span.duration) GROUP BY rpc.param.predicate
Default off because kwargs may carry user data and Sentry's default scrubber matches key names only. New SentryConfig knobs:
| Field | Default | Purpose |
|---|---|---|
set_transaction_name |
True |
Override WSGI route-template names with rpc {method} |
record_params |
False |
Record kwargs as rpc.param.<k> span attributes |
tag_params |
() |
Operator-curated whitelist of params duplicated as scope tags for Issues filtering (low cardinality) |
param_redactor |
key-based default | Sanitiser; default strips password|token|secret|key|authorization keys. noop_redactor opts out |
max_param_value_bytes |
1024 |
Truncate string values; matches Relay's per-attribute cap |
Span attributes accept primitives only (str/bool/int/float and homogeneous lists thereof) — non-primitive kwargs are silently dropped to avoid Sentry ingestion errors. Free-text PII is the operator's responsibility; the default scrubber won't catch values like predicate=\"email = 'alice@x.com'\".
See docs/api/sentry.md for the full caveat list.
Internal contract change
_DispatchHook.on_dispatch_start Protocol gains a kwargs: Mapping[str, Any] parameter so observability backends can attach RPC parameters to traces. OTel hook accepts-and-ignores it for now (param recording on OTel is a separate follow-up). The hook is _-prefixed and internal — no public-API impact.
Other
fix(utils):empty_batchnow supports schemas with sparse and dense union fields. pyarrow'spa.array([], type=union)raisesArrowNotImplementedError, so empty union arrays are now built viaArray.from_buffersfrom empty children plus an empty type-codes buffer (and an empty offsets buffer for dense unions).docs(sentry): corrected misleading wording aboutSENTRY_DSN. The Sentry SDK does not auto-initialise on import;sentry_sdk.init()must be called explicitly. It will fall back to readingSENTRY_DSNfrom the environment only if you don't pass an explicitdsn=.
v0.12.0
Sentry: zero-config auto-attach
If sentry_sdk is initialised in the worker process (via sentry_sdk.init(), or by setting SENTRY_DSN so the SDK auto-inits), every RpcServer now wires default-config Sentry instrumentation automatically. No flag, no extra env var — sentry_sdk.is_initialized() is the signal of intent.
import sentry_sdk
sentry_sdk.init(dsn="https://...") # or just set SENTRY_DSN
from vgi_rpc import RpcServer
server = RpcServer(MyService, MyServiceImpl()) # Sentry already wiredThe check is gated on sentry_sdk already being importable in the process, so workers without vgi-rpc[sentry] pay nothing.
Customising
instrument_server_sentry() (and make_wsgi_app(sentry_config=...) / serve_http(sentry_config=...)) now use replace semantics: an explicit configured call strips any existing _SentryDispatchHook from the chain (preserving non-Sentry hooks like OTel) before installing the new one. So explicit config always wins, regardless of timing relative to auto-attach.
from vgi_rpc.sentry import SentryConfig, instrument_server_sentry
instrument_server_sentry(server, SentryConfig(custom_tags={"env": "prod"}, enable_performance=True))See docs/api/sentry.md for the full configuration reference.
v0.11.1
Performance
Cut subprocess worker startup time by ~55% (312ms → 140ms) and the full test suite by ~29% (132s → 94s).
- Lazy-load optional integrations (
vgi_rpc.http,vgi_rpc.sentry,vgi_rpc.otel,vgi_rpc.s3,vgi_rpc.gcs) via module-level__getattr__instead of importing them eagerly fromvgi_rpc/__init__.py. Cutsimport vgi_rpccost roughly in half (~250ms → ~115ms) for callers that don't need those extras — including every pipe/subprocess worker. - Test fixture service (Protocol, Impl, state classes) extracted from
tests.test_rpcintotests/_fixture_service.pyso subprocess fixtures don't drag pytest and conftest into worker import paths.
No public API changes. from vgi_rpc import http_connect, vgi_rpc.SentryConfig, from vgi_rpc import * all behave identically.
v0.11.0
Highlights
- HTTP response caps reworked: separate
max_response_bytes(HTTP body) andmax_externalized_response_bytes(external upload volume) knobs; worker-visible budgets onOutputCollector; strict-fail enforcement; pre-flight externalization check. Deprecatedmax_stream_response_bytesretained for one cycle. - Slim
__describe__v4 +protocol_hash: language-neutral 8-column wire format; SHA-256 protocol hash surfaced in describe metadata and access-log records. - Access-log spec: cross-language conformance (
access_log.schema.json), record truncation, size/time rotation,{pid}/{server_id}path placeholders, Vector + Fluent Bit shipper configs. - OAuth PKCE token-exchange proxy for SPA clients.
- HTTP dispatcher symmetry: extracted
_run_http_exchange_turn/_mint_continuation_token; refactored producer-stream dispatch. - Windows CI is green.
- Sequential Unix-socket listen backlog bumped to 16.
OutputCollector.emitselect-then-cast so projection works correctly.
See git log v0.10.0..v0.11.0 for the full diff.
v0.10.0
Highlights
- DESCRIBE_VERSION 4: slim language-neutral wire format with
protocol_hash. - Cross-language access-log conformance: Go, TypeScript, Java aligned (Rust pending).
- Shared memory transport hardening with new cross-language and header-format tests.
- Java benchmark suite for direct comparison against the Python results.
See the commit log for the full set of changes since v0.9.0.
v0.9.0: HTTP cookie support
HTTP cookie support for unary RPC methods
RPC methods served over the HTTP transport can now read incoming cookies and set/delete cookies on the response via CallContext.
API additions (vgi_rpc.rpc.CallContext)
ctx.cookies— read-onlyMapping[str, str]of incoming request cookies. Empty on non-HTTP transports.ctx.set_cookie(name, value, *, expires=None, max_age=None, domain=None, path=None, secure=None, http_only=True, same_site=None, partitioned=False)— queue aSet-Cookieon the HTTP response. Unary HTTP methods only.ctx.delete_cookie(name, *, path=None, domain=None)— queue an unset-cookie. Unary HTTP methods only.
Behavior
- Cookies queued before an exception are still emitted on the 4xx/5xx response.
- Calling
set_cookie/delete_cookiefrom a streaming method or non-HTTP transport raisesRuntimeError(surfaced to clients asRpcError). _AuthMiddlewarenow installs unconditionally, sotransport_metadata(remote_addr,user_agent,cookies) is available even on services without anauthenticatecallback.
Example
class Session(Protocol):
def login(self, user: str, password: str) -> str: ...
def whoami(self) -> str: ...
class SessionImpl:
def login(self, user: str, password: str, ctx: CallContext) -> str:
ctx.set_cookie("sid", generate_token(user), max_age=3600, http_only=True, secure=True, same_site="Lax")
return user
def whoami(self, ctx: CallContext) -> str:
return lookup_user(ctx.cookies.get("sid", ""))