Skip to content

feat(api): expose Reply-To as a first-class field on inbound payloads#81

Merged
jiashuoz merged 5 commits into
mainfrom
feat/reply-to-field
May 19, 2026
Merged

feat(api): expose Reply-To as a first-class field on inbound payloads#81
jiashuoz merged 5 commits into
mainfrom
feat/reply-to-field

Conversation

@jiashuoz
Copy link
Copy Markdown
Member

Summary

  • Adds reply_to: list[str] to the inbound webhook payload, REST list/detail responses, and the Python SDK's InboundEmail / AsyncInboundEmail / MessageSummary. Empty list when the header is absent — never falls back silently to sender.
  • New DB column messages.reply_to text[] (migration 005_reply_to.sql), populated server-side from Reply-To: using the same mail.ParseAddressList path as to/cc so display names get stripped uniformly. RFC 5322 § 3.6.2 allows multiple addresses, hence the list shape.
  • Existing displaySender (first Reply-To wins) behavior preserved — the webhook from field is unchanged for backwards compatibility. The new reply_to field is purely additive.

Motivating case

Granola sends meeting-summary emails with From: notifications@mail.granola.ai and Reply-To: <real-user>. Today consumers re-parse raw_message with stdlib email.message_from_bytes() to recover the intended correspondent — slow and bypasses e2a's signature trust path. With this change the field is structured and gated behind verify_signature().

DKIM-coverage decision

verify_signature() in this SDK is HMAC over body_hash, not real DKIM verification. Since Reply-To: lives inside raw_message, tampering breaks the HMAC — reply_to enjoys the same end-to-end integrity guarantee as to/cc. Picked option (a) from the design question: keep reply_to populated whenever present, no reply_to_signed flag. Rationale: a per-header DKIM-coverage check would require wiring data out of internal/emailauth into the SDK, which crosses the "no Authentication-Results / SPF / DMARC surface changes" line called out as out-of-scope. The trust limitation (upstream MITM can rewrite Reply-To if the original sender's DKIM doesn't cover it) is documented on the property docstring and in sdks/python/CHANGELOG.md. Callers doing high-stakes routing should also confirm is_verified and the sender domain.

Consumer awareness

  • Webhook payload schema — new optional reply_to: string[] key. Old SDKs that ignore unknown keys are unaffected; new SDKs talking to an old server simply see []. Additive in both directions.
  • DB migration005_reply_to.sql must run before the upgraded binary on existing deployments. ADD COLUMN IF NOT EXISTS is idempotent. Existing inbound rows have reply_to IS NULL and coerce to []string on read.
  • from hoisting unchanged — when Reply-To is present, from still equals Reply-To[0]. A separate change can revisit that conflation; this PR is scoped to adding the new field, not breaking the old one.

Test plan

  • make test-unit — passes
  • make test-integration — passes against local Postgres on :5433 (covers internal/identity and internal/agent tests that touch the new column and the new arg on CreateInboundMessage)
  • Python pytest sdks/python/tests/ — 178 passed, 0 failed. Adds 7 new tests covering: absent Reply-To → empty list, single, multi-address, display-name-stripped (server-normalized), gated-until-verified, HMAC-trust path, and unverified_payload surfacing
  • uvx mypy sdks/python/src — clean on new code (9 pre-existing errors unrelated to reply_to)
  • go build ./... + go vet ./... — clean
  • make test-e2e — pre-existing build break on origin/main (internal/e2e/e2e_test.go:71 compares p.Body.To []string to a string literal). Unrelated to this branch; verified by stashing my changes and reproducing on c66f3a7.

🤖 Generated with Claude Code

@jiashuoz jiashuoz force-pushed the feat/reply-to-field branch from 5d9ecb9 to eef6922 Compare May 19, 2026 00:30
jiashuoz and others added 5 commits May 18, 2026 18:28
Surface the parsed Reply-To: header as reply_to (list[str]) on the
webhook payload, the REST list/detail responses, and the Python SDK's
InboundEmail / AsyncInboundEmail / MessageSummary. Empty list when the
header is absent — callers decide whether to fall back to sender. Trust
path is e2a's HMAC over raw_message body_hash, same as to/cc; upstream
DKIM coverage of Reply-To is not separately surfaced (documented).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-running `make swagger` picks up the Reply-To docstring from
identity.Message (referenced transitively from API endpoints), and
`make generate-sdk` propagates `reply_to` into MessageDetail /
MessageSummary in the TS types and the identity.Message Pydantic model.

Fixes the failing "Swagger spec freshness" and "Generated code
freshness" checks on this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the cross-language parity gap left by the original Reply-To
implementation:

- TS SDK `InboundEmail` gains a `replyTo: string[]` gated getter
  mirroring Python's. Empty list when the header was absent; the SDK
  never silently falls back to `sender`. `unverifiedPayload` and the
  `WebhookPayload` interface gain matching fields.
- CLI `e2a read` displays a `Reply-To:` line when present, so the
  Granola-style notifications case is visible end-to-end.
- Go round-trip test for `reply_to` extends the existing to/cc round-
  trip test to cover `GetInboundMessage`, `GetMessageWithContent`,
  and `GetMessagesByAgent`. Each is a distinct SELECT that could
  drift independently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original PR updated the Python README and CHANGELOG but missed
the language-neutral docs/api.md (which describes the webhook payload
shape) and the TypeScript SDK README's InboundEmail property table.

OpenAPI spec is already fresh via the regen commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…at + cover GDPR export

Address four issues raised in review:

C2 — trust docstring was misleading. The HMAC binds a fixed set of
auth headers and body_hash = SHA-256(raw_message). It does NOT sign
the JSON envelope, and the SDK reads reply_to from the JSON envelope
rather than re-parsing raw_message. So tampering with the JSON
reply_to does not break verify_signature — the SDK's own
test_reply_to_trusted_through_body_hash test demonstrates this
(raw_message contains Reply-To: real-user, HMAC verifies, yet
email.reply_to == [] because the JSON omitted the key). Rewrite the
docstring on InboundEmail.reply_to + AsyncInboundEmail.reply_to + the
TS replyTo getter to be honest: TLS protects the JSON envelope, HMAC
+ body_hash protect the body bytes, and the same caveat applies to
to/cc/recipient/subject/body. CHANGELOG "Reply-To trust path" section
gets the same correction.

S2 — InboundEmail.__init__, AsyncInboundEmail.__init__, and
MessageSummary (dataclass) all gained a required reply_to kwarg/field
with no default. Public exports — third-party constructors (test
fixtures, custom adapters) broke with TypeError. Make reply_to
default to None / field(default_factory=list); positional shape on
MessageSummary is preserved by moving reply_to to the end of the
dataclass.

S4b — scanMessagesForUser (the GDPR Art.15 right-of-access export)
SELECT omitted reply_to, so users requesting their data didn't get
the Reply-To header e2a stored. Add reply_to to the SELECT + Scan.
Extend seedUserData + TestExportUserData to assert round-trip.

S1 — migration comment claimed "empty array (not NULL) when the
header was absent" but no DEFAULT clause meant existing rows stay
NULL and new nil-slice inserts also write NULL. Rewrite the comment
to explain the actual SDK-normalization layer; skip DEFAULT '{}' to
avoid a backfill scan on the messages table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jiashuoz jiashuoz force-pushed the feat/reply-to-field branch from c979693 to f333f51 Compare May 19, 2026 01:28
@jiashuoz jiashuoz merged commit cff2891 into main May 19, 2026
12 checks passed
jiashuoz added a commit that referenced this pull request May 19, 2026
* feat(migrate): embed migrations + add idempotent runner with tracker

Adds a startup migration runner so the e2a binary self-applies any
pending schema changes from migrations/*.sql. Closes the class of
incident hit yesterday: PR #81 merged migration 005_reply_to.sql but
nothing ran it on Cloud SQL, so every list_messages query 500'd on
prod after the new binary deployed.

Three components:

- migrations/embed.go — embeds the SQL files into the binary via
  embed.FS, exported as migrations.FS. Single source of truth: the
  same files used by `make migrate` (local dev) and by the runner
  (prod boot) and by the testutil (test setup).

- internal/identity/migrate.go — RunMigrations(ctx, pool, fsys, mode).
  Reads files sorted lexicographically (the 00N_ prefix enforces order),
  diffs against the schema_migrations tracker table (which the runner
  bootstraps itself — chicken-and-egg solved), applies pending ones
  each in a transaction. Three modes via E2A_MIGRATION_MODE:
    * auto (default): apply pending, fail on error
    * verify: refuse to apply, return error listing pending — for
      cautious operators who want a separate manual migration step
    * skip: no-op, for emergency surgery
  Tracker table CREATE IF NOT EXISTS lives in the runner, not in
  migrations/ — otherwise it'd need to bootstrap itself recursively.

- internal/identity/migrate_test.go — DB-backed tests per CLAUDE.md
  convention:
    * ParseMigrationMode: empty/auto/verify/skip + 3 invalid forms
    * AppliesPending: fresh stub migrations run, tracker records them
    * Idempotent: second run is a no-op, tracker stays single-recorded
    * FailingMigrationRollsBack: bad SQL → tracker NOT updated
    * VerifyModeRefusesToApply: pending → error, no table created
    * SkipModeIsNoop: no DB writes at all
    * OrdersByFilename: 003 references table from 001 — must apply
      in order or it errors
    * PartialState: a.sql pre-recorded → only b.sql applies

All tests pass against the local Postgres on :5433 (per CLAUDE.md
test-integration convention).

Wire-up into cmd/e2a/main.go lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrate): wire RunMigrations into binary startup

After DB ping, before any other DB use (including bootstrap mode),
the binary now runs identity.RunMigrations against migrations.FS.
This closes the gap that allowed PR #81 to ship a binary depending
on a column that the prod DB didn't have.

Mode is selected via E2A_MIGRATION_MODE env var:
  - auto (default, hosted prod): apply pending migrations
  - verify: refuse to apply, fail with a list of pending — for
    cautious operators who want a separate migration step
  - skip: log + proceed without touching schema (emergency override)

Invalid mode string fails loudly at boot — log.Fatalf rather than
silently defaulting, so a typo doesn't ship un-noticed.

Migration runs BEFORE bootstrap mode so `--bootstrap-email` benefits
from auto-apply too; otherwise a fresh self-hoster would hit the same
column-not-found error.

Existing identity suite still passes (6.2s, no regressions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(claude): note migration auto-apply + idempotence requirement

CLAUDE.md's "Key Conventions" section had a "Schema changes" bullet
warning about DB-backed tests but no mention of the migration system
itself. Added a "Migrations" bullet covering:

- Migration files MUST be idempotent (CREATE/ALTER ... IF NOT EXISTS)
- Migration files MUST be non-destructive on prod-sized tables
  (ALTER TABLE ADD COLUMN safe; ALTER COLUMN TYPE may rewrite a whole
  table — avoid on messages and usage_events)
- The binary embeds migrations/*.sql and auto-applies pending ones
  at startup against a schema_migrations tracker table
- E2A_MIGRATION_MODE knob: auto (default) / verify / skip

Verified e2e:
- bin/e2a against a fresh DB applies all 5 migrations and records
  them in schema_migrations
- restart with all applied → "[migrate] all 5 migrations applied",
  no-op — tracker works

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(migrate): pre-merge review fixes — lock, directive, orphan, test fixes

Addresses H1, H2, H3, M1, L1 from the second /review pass on #95.

H2 — Postgres session advisory lock. RunMigrations now acquires
pg_advisory_lock($e2aMigrationLockID) on a dedicated pooled
connection, holds for the entire body, and releases in a
fresh-context defer. Lock auto-releases on session disconnect, so a
crashed binary doesn't leave the DB stuck. Closes the rolling-
restart race where two concurrent binaries could double-run
migration 004's INSERT … WHERE NOT EXISTS backfill (which is
non-atomic at READ COMMITTED, no UNIQUE constraint to catch it).

H1 — `-- e2a:no-transaction` directive support. Migrations with the
marker in their leading comment block bypass BeginTx and Exec
directly on the connection. Required for any future migration using
CREATE INDEX CONCURRENTLY, REINDEX CONCURRENTLY, VACUUM, etc., which
Postgres refuses inside a transaction. Important caveat documented
in the source: such migrations must be single-statement (pgx's
simple protocol sends multi-statement scripts in an implicit
server-side transaction). Real-world use (CREATE INDEX CONCURRENTLY)
is always single-statement so this constraint is benign.

H3 — Fix broken assertion in TestRunMigrations_OrdersByFilename.
`errors.Is(rows.Err(), nil) && len(rowsSeen) != 2` silently skipped
the length check whenever rows.Err() was non-nil. Replaced with
proper rows.Err() check, length assertion, and per-row content
assertion to actually verify lex order produced the expected
inserts.

M1 — Orphan-record warning. After computing applied vs files, the
runner logs `WARN: <name> recorded in schema_migrations but not in
embedded migrations/` for any tracker row whose filename is gone.
Doesn't fail (legitimate squash-migration workflows might retire
files); surfaces silent drift to operators.

L1 — Per-migration timing log: `[migrate] applied <name> in <ms>`
plus a total-duration line. Lets operators see if a future
migration takes 20 minutes vs 20ms.

Also: improved error message on schema_migrations creation
failure to hint at the CREATE TABLE permission likely cause for
restricted DB users (L4 from review).

New tests:
- ConcurrentInvocations: 4 goroutines into RunMigrations on a fresh
  DB; asserts exactly one tracker row and one body insert (proves
  the advisory lock serializes correctly — without it, the test's
  not-quite-idempotent INSERT WHERE NOT EXISTS would produce
  multiple rows).
- NoTransactionDirective: VACUUM users (single statement) with the
  directive succeeds; recorded.
- NoTransactionDirective_RejectsInsideTx: same VACUUM without the
  directive fails inside BeginTx — confirms the directive is what
  makes the no-tx path work.
- OrphanedTrackerRecord: pre-planted orphan filename + empty FS
  produces a WARN log but doesn't fail the run.

Total: 12 tests passing (was 8). Full identity suite green in 7.3s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(migrate): third-audit fixes — multi-stmt detector, acquire timeout

Addresses H-A, H-B, M-A, L-C from the third /review pass on #95.

H-A — Detect multi-statement no-transaction migrations at runtime
and refuse with a clear error. Previously, a user writing
`SET maintenance_work_mem='1GB'; CREATE INDEX CONCURRENTLY …;`
with the directive would still get Postgres's confusing
"cannot run inside a transaction block" because pgx's simple
protocol sends multi-statement strings as one Query message
which the server wraps in an implicit transaction.

Implementation: `looksMultiStatement` strips line/block comments
and single-quoted strings via `stripSQLCommentsAndStrings`, then
checks for any remaining `;` after trimming trailing whitespace
and a final `;`. Not a full SQL parser — handles the obvious
shape; dollar-quoted strings and PL/pgSQL would need real
parsing but the directive isn't meant for those cases.

New tests:
- RejectsMultiStatement: stub with directive + two SELECTs is
  refused with "multiple statements" in error message, naming
  the file; tracker stays empty.
- AcceptsSemicolonInComment: single statement with ';' inside
  both a `--` comment line and a string literal must NOT trigger
  the detector — exercises the comment + string strippers.

H-B — Bound pool.Acquire with a 60s timeout. Previously called
with the parent ctx (Background from main), so on a rolling deploy
where the old container holds all 4 pgxpool conns the new binary
hangs at startup forever with no log line. Now wraps with
context.WithTimeout(ctx, acquireConnTimeout); error surfaces a
hint about the previous instance holding connections.

M-A — Log the advisory lock ID at acquire so an operator
investigating a stuck migration can grep for it:
  `[migrate] holding advisory lock <id> (e2a-migrations namespace)`

L-C — testutil/db.go's runMigrations was silently swallowing all
errors with "Tables may already exist, ignore from IF NOT EXISTS."
This hid genuine SQL bugs in new migrations during `go test`. Now
logs the error to stderr so a broken migration surfaces during
test runs even before the real RunMigrations path is hit. Doesn't
fail tests outright — preserves the original semantics of
tolerating re-application — but stops absorbing real errors.

Total: 14 tests passing (was 12). Full identity suite green in
6.4s. Build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant