73 commits
8d46f46
CCM-16073 - Enhanced callbacks
rhyscoxnhs Apr 14, 2026
cba6b10
CCM-16073 - Fixed lints
rhyscoxnhs Apr 17, 2026
11319b1
CCM-16073 - Fixed terraform
rhyscoxnhs Apr 17, 2026
d4b4f70
CCM-16073 - Fixed terraform
rhyscoxnhs Apr 17, 2026
dba2894
CCM-16073 - Attempt to trigger a fresh build
rhyscoxnhs Apr 17, 2026
3cd6e4d
CCM-16073 - Fixed terraform
rhyscoxnhs Apr 17, 2026
6810e97
updated gitignore
cgitim Apr 17, 2026
e46720c
updated vale acceptable words
cgitim Apr 17, 2026
0765ef1
updated docs for npm->pnpm changeover
cgitim Apr 17, 2026
5b76a87
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
f6c0532
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
a53ed73
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
4e59c1c
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
af98725
Lua unit tests
mjewildnhs Apr 17, 2026
0bbcd79
Add luacheck to pre-commit and fix issue
mjewildnhs Apr 17, 2026
b7297d7
luacheck in CI workflow
mjewildnhs Apr 17, 2026
e4cd754
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
c55139d
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
3cf99aa
CCM-16073 - PR feedback
rhyscoxnhs Apr 20, 2026
6262f2f
CCM-16073 - PR feedback
rhyscoxnhs Apr 21, 2026
7297c18
CCM-16073 - PR feedback
rhyscoxnhs Apr 21, 2026
0ab926d
CCM-16073 - PR feedback
rhyscoxnhs Apr 22, 2026
dc4343a
CCM-16073 - PR feedback
rhyscoxnhs Apr 22, 2026
b25a822
CCM-16073 - Integration test fixes (#152)
mjewildnhs Apr 22, 2026
91d993f
Set the SPKI hash for test client config
mjewildnhs Apr 21, 2026
d4d304c
CCM-16002 - Revised performance test implementation (#123)
rhyscoxnhs Apr 23, 2026
ebf9e81
CCM-16073 - ITs, metrics fix, log correlationId (#156)
mjewildnhs Apr 24, 2026
39d70f3
Fix DLQ on delivery
mjewildnhs Apr 27, 2026
4093022
CCM-16073 - Updated rate limiting behaviour (#158)
rhyscoxnhs Apr 29, 2026
7bcb460
Fix flakey retry policy tests
mjewildnhs Apr 29, 2026
0b5cdda
CCM-16073 - Addressed PR feedback
rhyscoxnhs Apr 30, 2026
8de4274
consistency and naming changes
cgitim Apr 30, 2026
e78cda1
CCM-16073 - Addressed PR feedback
rhyscoxnhs May 1, 2026
1455012
Update IT test assertion following observability changes
mjewildnhs May 1, 2026
070b3d7
Fix initial state when circuit breaker enabled
mjewildnhs May 1, 2026
c8207c0
Fix debug int test script
mjewildnhs May 1, 2026
624f706
Fix circuit breaker IT test assertion following observability changes
mjewildnhs May 1, 2026
5a7a5a7
remove shim for migrated logger
cgitim May 5, 2026
812d50e
removed dead src/config-cache
cgitim May 5, 2026
a8fcc47
CCM-16073 - Performance test changes and concurrency optimisation (#173)
mjewildnhs May 5, 2026
dfe1ed5
intent: generate idempotencyKey from attributes
cgitim May 6, 2026
768d110
CCM-16073 - Initial work on infra refactor (#177)
rhyscoxnhs May 6, 2026
9555f12
Merge branch 'main' into feature/CCM-16073
rhyscoxnhs May 6, 2026
75f1c81
Set appropriate resolutions for all metrics
mjewildnhs May 6, 2026
38595d1
Fix tf example comment
mjewildnhs May 6, 2026
425ce1c
linting fix
cgitim May 6, 2026
fa98f6e
CCM-16073 - Fixed build
rhyscoxnhs May 6, 2026
76384e1
CCM-16073 - PR feedback
rhyscoxnhs May 6, 2026
f524631
CCM-16073 - PR feedback
rhyscoxnhs May 6, 2026
6500c5d
CCM-16073 - PR feedback
rhyscoxnhs May 7, 2026
04c301f
CCM-16073 - PR feedback
rhyscoxnhs May 7, 2026
e08a5d2
Refactor/bolster rate limit unit tests
mjewildnhs May 7, 2026
fda5541
Fix unit test resolves deployment context with defaults' when AWS_PRO…
mjewildnhs May 7, 2026
508a1a6
Refactor metrics test to avoid repetition
mjewildnhs May 7, 2026
b1b0d88
Refactor consistency in http lambda env var test overrides
mjewildnhs May 7, 2026
ec872a6
Refactor http lambda handler admission denied duplication and added D…
mjewildnhs May 7, 2026
e996fbe
Refactor http lambda handler - improved readability of processTargetB…
mjewildnhs May 7, 2026
1a2a922
Merge branch 'main' into feature/CCM-16073
cgitim May 7, 2026
740f513
Improve http lambda error resilience
mjewildnhs May 7, 2026
cbf01fc
Fix flawed http lambda tests
mjewildnhs May 7, 2026
0011275
Better grouping on handler tests
mjewildnhs May 7, 2026
200875c
Fix tls agent test assertions
mjewildnhs May 7, 2026
cc35c24
Update subscription tool README
mjewildnhs May 7, 2026
b8ccd6f
Ensure all sub tool CLI options output when doing dry run
mjewildnhs May 7, 2026
1e9510a
Remove nosiy redis client logging
mjewildnhs May 7, 2026
95bb5ea
Flakey circuit breaker test fix
mjewildnhs May 7, 2026
95bdb23
Log if IT purge fails due do 1 being in progress
mjewildnhs May 7, 2026
3d3b112
Make all perf clients mtls
mjewildnhs May 7, 2026
aae59c1
CCM-16073 - Sonar fixes
rhyscoxnhs May 8, 2026
dc26e09
Update main README
mjewildnhs May 8, 2026
c5c19ec
Retry 401, 407, 409 - dlq negative retry-after
mjewildnhs May 8, 2026
e8cdc53
CCM-16073 - Terraform refactor
rhyscoxnhs May 8, 2026
9c3ac64
CCM-16073 - Terraform refactor
rhyscoxnhs May 8, 2026
3 changes: 2 additions & 1 deletion .github/actions/acceptance-tests/action.yaml
@@ -49,6 +49,7 @@ runs:
shell: bash
env:
PROJECT: nhs
COMPONENT: ${{ inputs.targetComponent }}
COMPONENT: cb
CLIENT_COMPONENT: cbc
run: |
make test-${{ inputs.testType }}
10 changes: 10 additions & 0 deletions .github/workflows/stage-2-test.yaml
@@ -125,6 +125,16 @@ jobs:
      - name: "Run linting"
        run: |
          make test-lint
  test-lua-lint:
    name: "Lua linting"
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - name: "Checkout code"
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
      - name: "Run luacheck"
        run: |
          make test-lua-lint
  test-typecheck:
    name: "Typecheck"
    runs-on: ubuntu-latest
3 changes: 2 additions & 1 deletion .gitignore
@@ -12,7 +12,7 @@ version.json

# Please, add your custom content below!

# dependencies
# Dependencies
node_modules
.node-version
*/node_modules
@@ -22,3 +22,4 @@ node_modules
dist
.DS_Store
.reports
*~
12 changes: 12 additions & 0 deletions .luarc.json
@@ -0,0 +1,12 @@
{
  "diagnostics": {
    "globals": [
      "KEYS",
      "ARGV",
      "redis",
      "cjson",
      "cmsgpack",
      "bit"
    ]
  }
}
18 changes: 9 additions & 9 deletions AGENTS.md
@@ -23,25 +23,25 @@ Agents should look for a nested `AGENTS.md` in or near these areas before making

## Root package.json – role and usage

The root `package.json` is the orchestration manifestgit co for this repo. It does not ship application code; it wires up shared dev tooling and delegates to workspace-level projects.
The root `package.json` is the orchestration manifest for this repo. It does not ship application code; it wires up shared dev tooling and delegates to workspace-level projects.

- Workspaces: Declares the set of npm workspaces (e.g. under `lambdas/`, `utils/`, `tests/`, `scripts/`). Agents should add a new workspace path here when introducing a new npm project.
- Scripts: Provides top-level commands that fan out across workspaces using `--workspaces` (lint, typecheck, unit tests) and project-specific runners (e.g. `lambda-build`).
- Workspaces: Declares the set of pnpm workspaces (e.g. under `lambdas/`, `utils/`, `tests/`, `scripts/`). Agents should add a new workspace path here when introducing a new pnpm project.
- Scripts: Provides top-level commands that fan out across workspaces using `pnpm -r` (lint, typecheck, unit tests) and project-specific runners (e.g. `lambda-build`).
- Dev tool dependencies: Centralises Jest, TypeScript, ESLint configurations and plugins to keep versions consistent across workspaces. Workspace projects should rely on these unless a local override is strictly needed.
- Overrides/resolutions: Pins transitive dependencies (e.g. Jest/react-is) to avoid ecosystem conflicts. Agents must not remove overrides without verifying tests across all workspaces.

Agent guidance:

- Before adding or removing a workspace, update the root `workspaces` array and ensure CI scripts still succeed with `npm run lint`, `npm run typecheck`, and `npm run test:unit` at the repo root.
- When adding repo-wide scripts, keep names consistent with existing patterns (e.g. `lint`, `lint:fix`, `typecheck`, `test:unit`, `lambda-build`) and prefer `--workspaces` fan-out.
- Before adding or removing a workspace, update the root `workspaces` array and ensure CI scripts still succeed with `pnpm run lint`, `pnpm run typecheck`, and `pnpm run test:unit` at the repo root.
- When adding repo-wide scripts, keep names consistent with existing patterns (e.g. `lint`, `lint:fix`, `typecheck`, `test:unit`, `lambda-build`) and prefer `pnpm -r` fan-out.
- Do not publish from the root. If adding a new workspace intended for publication, mark that workspace package as `private: false` and keep the root as private.
- Validate changes by running the repo pre-commit hooks: `make githooks-run`.

Success criteria for changes affecting the root `package.json`:

- `npm run lint`, `npm run typecheck`, and `npm run test:unit` pass at the repo root.
- Workspace discovery is correct (new projects appear under `npm run typecheck --workspaces`).
- No regression in lambda build tooling (`npm run lambda-build`).
- `pnpm run lint`, `pnpm run typecheck`, and `pnpm run test:unit` pass at the repo root.
- Workspace discovery is correct (new projects appear under `pnpm run typecheck -r`).
- No regression in lambda build tooling (`pnpm run lambda-build`).

## What Agents Can / Can’t Do

@@ -81,7 +81,7 @@ When proposing a change, agents should:

to catch formatting and basic lint issues. Domain specific checks will be defined in appropriate nested AGENTS.md files.

- Suggest at least one extra validation step (for example `npm test:unit` in a lambda, or triggering a specific workflow).
- Suggest at least one extra validation step (for example `pnpm run test:unit` in a lambda, or triggering a specific workflow).
- Any required follow-up activities which fall outside of the current task's scope should be clearly marked with a 'TODO: CCM-12345' comment. The human user should be prompted to create and provide a JIRA ticket ID to be added to the comment.

## Security & Safety
105 changes: 80 additions & 25 deletions README.md
@@ -4,14 +4,19 @@ Event-driven infrastructure for delivering NHS Notify callback notifications to

## Overview

The Client Callbacks infrastructure processes message and channel status events, applies client-specific subscription filters, and delivers callbacks to configured webhook endpoints. Events flow from the Shared Event Bus through an SQS queue, are transformed and filtered by a Lambda function, then routed to clients via per-client API Destination Target Rules.
The Client Callbacks infrastructure processes message and channel status events, applies client-specific subscription filters, and delivers callbacks to configured webhook endpoints. Events flow from the Shared Event Bus through an SQS queue, are transformed and filtered by a Lambda function, then routed to per-client delivery queues where dedicated HTTPS Client Lambdas handle webhook delivery with mTLS support, per-target rate limiting, and circuit breaking.

### Key Features

- **Event-Driven Architecture**: Consumes CloudEvents from the Shared Event Bus (`uk.nhs.notify...` namespace)
- **Event-Driven Architecture**: Consumes CloudEvents from the Shared Event Bus
- **Client Subscription Filtering**: Applies per-client rules for message status and channel status event types
- **Webhook Delivery**: EventBridge API Destinations with per-client configuration and retry policies
- **Failure Handling**: Per-client Dead Letter Queues
- **mTLS Webhook Delivery**: Per-client HTTPS Client Lambdas with mutual TLS and optional certificate pinning
- **Per-Target Rate Limiting**: Token bucket algorithm with configurable delivery rates per client target
- **Circuit Breaking**: Automatic throttling of consistently failing endpoints
- **Retry and Backoff**: Exponential backoff with jitter, configurable retry windows
- **Client Isolation**: Dedicated queues, Lambdas, and DLQs per client — one client's issues do not affect others
- **Event Archive**: 7-day event archive on the Callbacks Event Bus for operational replay
- **Failure Handling**: Per-client Dead Letter Queues for permanently failed deliveries
- **Backward Compatibility**: Maintains callback payload format compatibility with legacy Core domain implementation
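
For readers unfamiliar with the token bucket algorithm named in the Per-Target Rate Limiting bullet, a minimal in-memory sketch in TypeScript follows. It is not the code from this PR: `TokenBucketState`, `TokenBucketConfig`, and `tryAcquireToken` are hypothetical names, and the capacity and refill parameters are assumptions chosen for illustration. The `.luarc.json` added in this PR declares Redis script globals (`KEYS`, `ARGV`, `redis`), which suggests the real check may run as a script inside the state store; that is an inference, not something the README states.

```typescript
// Hypothetical sketch of a token bucket; names and parameters are
// illustrative assumptions, not taken from the repository.
interface TokenBucketState {
  tokens: number; // tokens currently available
  lastRefillMs: number; // timestamp of the last refill, in milliseconds
}

interface TokenBucketConfig {
  capacity: number; // maximum burst size
  refillPerSecond: number; // sustained delivery rate
}

// Refills the bucket for the time elapsed since the last call, then tries to
// take one token. Returns the new state and whether the delivery is admitted.
function tryAcquireToken(
  state: TokenBucketState,
  config: TokenBucketConfig,
  nowMs: number,
): { state: TokenBucketState; admitted: boolean } {
  const elapsedSeconds = Math.max(0, nowMs - state.lastRefillMs) / 1000;
  const refilled = Math.min(
    config.capacity,
    state.tokens + elapsedSeconds * config.refillPerSecond,
  );
  if (refilled >= 1) {
    return { state: { tokens: refilled - 1, lastRefillMs: nowMs }, admitted: true };
  }
  return { state: { tokens: refilled, lastRefillMs: nowMs }, admitted: false };
}
```

Keeping the function pure (state in, state out) makes the refill arithmetic easy to unit test; a shared store would need the same read-modify-write to happen atomically.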

## Table of Contents
@@ -23,6 +28,7 @@ The Client Callbacks infrastructure processes message and channel status events,
- [Architecture](#architecture)
- [Components](#components)
- [Event Flow](#event-flow)
- [Packages](#packages)
- [Setup](#setup)
- [Prerequisites](#prerequisites)
- [Configuration](#configuration)
@@ -37,23 +43,60 @@ The Client Callbacks infrastructure processes message and channel status events,
### Components

- **Shared Event Bus**: Cross-domain EventBridge bus receiving events from Core, Routing, and other NHS Notify domains
- **Callback Event Queue**: SQS queue subscribed to `uk.nhs.notify...` events via EventBridge Target Rule
- **Callback Event Queue**: SQS queue subscribed to `uk.nhs.notify.message.status.PUBLISHED...` / `uk.nhs.notify.channel.status.PUBLISHED...` events via EventBridge Target Rule
- **Transform & Filter Lambda**: Processes events, loads client configurations, applies subscription filters, and routes to Callbacks Event Bus
- **Callbacks Event Bus**: Domain-specific EventBridge bus for webhook orchestration
- **API Destination Target Rules**: Per-client rules invoking HTTPS endpoints with client-specific authentication
- **Client Config Storage**: S3 bucket storing client subscription configurations (status filters, webhook endpoints)
- **Per-Client Target DLQs**: SQS Dead Letter Queues for failed webhook deliveries (one per client target)
- **Callbacks Event Bus**: Domain-specific EventBridge bus for webhook orchestration, with a 7-day event archive for replay
- **Per-Client SQS Queues**: Dedicated queues per client, receiving events from the Callbacks Event Bus via per-subscription EventBridge rules
- **HTTPS Client Lambdas**: Per-client Lambda functions handling webhook delivery with mTLS, payload signing, rate limiting, and circuit breaking
- **Delivery State Store**: ElastiCache Serverless (Valkey) cluster storing per-target rate-limit token bucket and circuit-breaker state
- **Client Config Storage**: S3 bucket storing client subscription configurations (status filters, webhook endpoints, mTLS settings)
- **Per-Client DLQs**: SQS Dead Letter Queues for permanently failed webhook deliveries (one per client)
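
The circuit-breaker state mentioned for the Delivery State Store can be pictured as a small state machine. The sketch below is an illustration under stated assumptions, not the PR's implementation: the three phases, the consecutive-failure threshold, the cooldown, and the names `BreakerState`, `BreakerConfig`, `checkBreaker`, and `recordDeliveryResult` are all hypothetical.

```typescript
// Hypothetical circuit breaker sketch; phases, thresholds, and names are
// assumptions for illustration only.
type BreakerPhase = "closed" | "open" | "half-open";

interface BreakerState {
  phase: BreakerPhase;
  consecutiveFailures: number;
  openedAtMs: number; // when the breaker last opened
}

interface BreakerConfig {
  failureThreshold: number; // consecutive failures before opening
  cooldownMs: number; // how long to stay open before allowing a probe
}

// Decides whether a delivery attempt may proceed at `nowMs`.
function checkBreaker(
  state: BreakerState,
  config: BreakerConfig,
  nowMs: number,
): { state: BreakerState; allowed: boolean } {
  if (state.phase === "open") {
    if (nowMs - state.openedAtMs >= config.cooldownMs) {
      // Cooldown elapsed: let one probe request through.
      return { state: { ...state, phase: "half-open" }, allowed: true };
    }
    return { state, allowed: false };
  }
  return { state, allowed: true };
}

// Folds the outcome of a delivery attempt back into the breaker state.
function recordDeliveryResult(
  state: BreakerState,
  config: BreakerConfig,
  succeeded: boolean,
  nowMs: number,
): BreakerState {
  if (succeeded) {
    return { phase: "closed", consecutiveFailures: 0, openedAtMs: 0 };
  }
  const failures = state.consecutiveFailures + 1;
  if (state.phase === "half-open" || failures >= config.failureThreshold) {
    return { phase: "open", consecutiveFailures: failures, openedAtMs: nowMs };
  }
  return { ...state, consecutiveFailures: failures };
}
```

A failed probe in the half-open phase reopens the breaker immediately, so a consistently failing endpoint sees at most one request per cooldown period.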

### Event Flow

1. Status change events published to Shared Event Bus in `uk.nhs.notify...` namespace
1. Status change events published to Shared Event Bus in `uk.nhs.notify.message.status.PUBLISHED...` / `uk.nhs.notify.channel.status.PUBLISHED...` namespace
2. SQS Target Rule routes events to Callback Event Queue
3. EventBridge Pipe invokes Transform & Filter Lambda with event batches
4. Lambda loads client subscription configs from S3
5. Lambda applies client-specific filters (message status, channel status)
6. Matching events published to Callbacks Event Bus
7. API Destination Target Rules deliver callbacks to client webhook endpoints
8. Failed deliveries moved to per-client DLQs after retry exhaustion
7. Per-subscription EventBridge rules route events to per-client SQS queues
8. HTTPS Client Lambda processes batches from the per-client queue, applying rate limiting and circuit-breaker checks, signing payloads, and delivering via HTTPS with optional mTLS
9. Temporary failures are retried with exponential backoff; permanent failures are sent to per-client DLQs
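
Step 9 can be sketched as follows. This assumes the "full jitter" variant of exponential backoff and assumes that exhausting the retry window is what turns a temporary failure into a DLQ entry; neither detail is confirmed by the README. `BackoffConfig`, `backoffDelayMs`, and `planRetry` are hypothetical names.

```typescript
// Hypothetical retry planning sketch; the jitter variant, the retry-window
// rule, and all names are assumptions for illustration.
interface BackoffConfig {
  baseDelayMs: number; // delay ceiling for the first retry
  maxDelayMs: number; // upper bound on any single delay ceiling
  retryWindowMs: number; // total time a message may keep retrying
}

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead-letter" };

// Full jitter: pick a uniformly random delay in [0, min(max, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  config: BackoffConfig,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(
    config.maxDelayMs,
    config.baseDelayMs * Math.pow(2, attempt),
  );
  return Math.floor(random() * ceiling);
}

// Retries while the next attempt would still fall inside the retry window;
// otherwise gives up and routes the message to the dead letter queue.
function planRetry(
  attempt: number,
  firstAttemptAtMs: number,
  nowMs: number,
  config: BackoffConfig,
  random: () => number = Math.random,
): RetryDecision {
  const delayMs = backoffDelayMs(attempt, config, random);
  const elapsedMs = nowMs - firstAttemptAtMs;
  if (elapsedMs + delayMs > config.retryWindowMs) {
    return { action: "dead-letter" };
  }
  return { action: "retry", delayMs };
}
```

Passing the random source as a parameter keeps the delay calculation deterministic in tests, which may help with the kind of flaky retry-policy tests mentioned in the commit list.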

## Packages

The repository is organised as a pnpm workspace. Each package has its own `package.json`, build configuration, and tests.

### Lambdas

| Package | Path | Description |
| -------------------------------------------------- | ----------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `nhs-notify-client-transform-filter-lambda` | `lambdas/client-transform-filter-lambda/` | Processes inbound events, applies client subscription filters, and publishes matching events to the Callbacks Event Bus |
| `@nhs-notify-client-callbacks/https-client-lambda` | `lambdas/https-client-lambda/` | Per-client delivery Lambda — signs payloads, delivers via HTTPS with mTLS, handles retries, rate limiting, and circuit breaking |
| `nhs-notify-mock-webhook-lambda` | `lambdas/mock-webhook-lambda/` | Mock webhook endpoint for integration testing |
| `nhs-notify-perf-runner-lambda` | `lambdas/perf-runner-lambda/` | Performance test runner |

### Shared Libraries

| Package | Path | Description |
| -------------------------------------------------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------- |
| `@nhs-notify-client-callbacks/models` | `src/models/` | Shared TypeScript types and zod schemas for client configuration, callback targets, and delivery messages |
| `@nhs-notify-client-callbacks/logger` | `src/logger/` | Structured JSON logging utility |
| `@nhs-notify-client-callbacks/config-subscription-cache` | `src/config-subscription-cache/` | TTL-based in-memory cache for client subscription configs loaded from S3 |
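
As a rough picture of what a TTL-based in-memory cache like `config-subscription-cache` does, here is a minimal sketch. It does not reproduce the package's API: the `TtlCache` class, its constructor arguments, and the injectable clock are hypothetical, and loading from S3 on a cache miss is left out.

```typescript
// Hypothetical TTL cache sketch; the class shape is an assumption and does
// not mirror the real package.
interface CacheEntry<V> {
  value: V;
  expiresAtMs: number;
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();

  constructor(
    private readonly ttlMs: number,
    // Injectable clock so expiry can be tested without waiting.
    private readonly now: () => number = () => Date.now(),
  ) {}

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAtMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that expires `ttlMs` after the current time.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAtMs: this.now() + this.ttlMs });
  }
}
```

A caller would typically check `get` first and fall back to fetching the client's config from S3, then `set` the result, so that warm Lambda invocations avoid repeated S3 reads.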

### Tools

| Package | Path | Description |
| --------------------------------- | ---------------------------------------- | ------------------------------------------------------------------------------------------------- |
| `client-subscriptions-management` | `tools/client-subscriptions-management/` | CLI for managing client subscriptions, targets, mTLS certificates, and application mappings in S3 |

### Tests

| Package | Path | Description |
| ----------------------------------------------- | --------------------- | --------------------------------------------------------- |
| `nhs-notify-client-callbacks-integration-tests` | `tests/integration/` | Integration tests run against deployed AWS infrastructure |
| `@nhs-notify-client-callbacks/test-support` | `tests/test-support/` | Shared test helpers, fixtures, and mock configurations |

## Setup

@@ -102,32 +145,44 @@ make config
Run unit tests for Lambda functions:

```shell
npm test
pnpm test:unit
```

## Infrastructure
### Linting

Infrastructure is managed with Terraform under `infrastructure/terraform/`:
```shell
pnpm lint
```

- `components/`: Terraform components for different environments/accounts
- `modules/`: Reusable Terraform modules for callback infrastructure
### Type Checking

**Deploy infrastructure**:
```shell
pnpm typecheck
```

### Full Verification

Run lint, typecheck, and unit tests together:

```shell
cd infrastructure/terraform/components/<component>
terraform init
terraform plan
terraform apply
pnpm verify
```

## Infrastructure

Infrastructure is managed with Terraform under `infrastructure/terraform/`:

- `components/`: Terraform components for different environments/accounts
- `modules/`: Reusable Terraform modules for callback infrastructure

Key infrastructure modules:

- **callback-event-queue**: SQS queue and EventBridge Target Rule for Shared Event Bus subscription
- **transform-filter-lambda**: Lambda function with EventBridge Pipe trigger
- **callbacks-event-bus**: Domain-specific EventBridge bus
- **api-destinations**: Per-client API Destination Target Rules
- **client-config-storage**: S3 bucket for subscription configurations
- **callbacks-event-bus**: Domain-specific EventBridge bus with 7-day event archive
- **client-delivery**: Per-client SQS queue, HTTPS Client Lambda (VPC-attached), DLQ, and EventBridge rules
- **elasticache-delivery-state**: Shared ElastiCache Serverless (Valkey) cluster for rate-limit and circuit-breaker state
- **client-config-storage**: S3 bucket for subscription configurations (including mTLS and certificate pinning settings)

## Contributing

2 changes: 1 addition & 1 deletion docs/Makefile
@@ -8,7 +8,7 @@ h help:
@egrep '^\S|^$$' Makefile

install:
pnpm install
npm install
Comment thread
rhyscoxnhs marked this conversation as resolved.
bundle config set --local path vendor/bundle
bundle install

4 changes: 2 additions & 2 deletions docs/test-standards.md
@@ -104,7 +104,7 @@ AI must:
- Verify mock return types match the actual function return types.

7. **The "Test Execution" Mandate**:
- After creating or modifying a test, you MUST run it using the repo's test command - e.g. npm run test:unit
- After creating or modifying a test, you MUST run it using the repo's test command - e.g. pnpm run test:unit
- If the test fails due to incorrect imports, paths, or signatures, fix and re-run.
- Only report completion when the test passes (exit code 0) and test coverage checks also pass.
- See section 6.2 for the full self-correction loop requirements.
@@ -192,7 +192,7 @@ AI must:

When AI changes tests, it must:

- run all the tests in the npm workspace.
- run all the tests in the pnpm workspace.
- report exactly what it ran and whether it passed.

### 6.2 AI Self-Correction Loop
3 changes: 2 additions & 1 deletion eslint.config.mjs
@@ -28,6 +28,7 @@ export default defineConfig([
"**/test-results",
"**/playwright-report*",
"eslint.config.mjs",
"**/lua-transform.js",
]),

//imports
@@ -200,7 +201,7 @@ export default defineConfig([
},
},
{
files: ["**/utils/**", "tests/test-team/**", "tests/performance/helpers/**", "lambdas/**/src/**"],
files: ["**/utils/**", "tests/test-team/**", "tests/performance/helpers/**", "lambdas/**/src/**", "src/**/src/**"],
rules: {
"import-x/prefer-default-export": 0,
},