NVIDIA · sbaum1994 · Mar 1, 2026 · Mar 1, 2026
diff --git a/examples/function_samples/load_tester_supreme/README.md b/examples/function_samples/load_tester_supreme/README.md
@@ -1,21 +1,79 @@
-# Load Tester Supreme Sample
-This sample provides both HTTP and gRPC echo servers designed for load and throughput testing.
+# Load Tester Supreme
 
-Use `WORKER_COUNT` to control gRPC worker threads (default: `500`).
+A dual-protocol echo server (HTTP + gRPC) purpose-built for load and throughput
+testing of NVCF deployments. Use this instead of the simpler `grpc_echo_sample`
+for any real load testing — it ships with 500 gRPC worker threads (vs 10) and
+exposes tunable response behaviour.
 
-Request tuning fields:
-- `message`: payload content
-- `repeats`: number of response repeats
-- `delay`: delay between responses in seconds
+## What's included
+
+| Endpoint | Port | Description |
+|----------|------|-------------|
+| HTTP `/echo` | 8000 | JSON request/response and SSE streaming |
+| HTTP `/health` | 8000 | Health check (returns 200) |
+| HTTP `/ping-pong` | 8000 | Full-duplex HTTP/2 echo |
+| gRPC `Echo/EchoMessage` | 8001 | Unary gRPC echo |
+| gRPC `Echo/EchoMessageStreaming` | 8001 | Bidirectional streaming gRPC |
+
+### Request tuning fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `message` | string | — | Payload content to echo back |
+| `repeats` | int | 1 | Number of times to repeat the response |
+| `delay` | float | ~0 | Seconds to sleep between responses |
+| `size` | int | 0 | Generate a random string of this length instead of echoing `message` (HTTP only) |
+| `stream` | bool | false | Return SSE stream instead of single response (HTTP only, requires `Accept: text/event-stream`) |
+
+### Environment variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `WORKER_COUNT` | `500` | Number of gRPC server threads |
+
+## Prerequisites
+
+- Docker (with `buildx` for multi-platform builds)
+- Access to a function container registry
+
+## Build the container
 
-## Build the sample container
 ```bash
+cd examples/function_samples/load_tester_supreme
+
+# Build for linux/amd64 (matches most cloud instances)
+docker build --platform linux/amd64 -t load_tester_supreme .
+
+# Or build multi-platform
 docker buildx build --platform linux/amd64,linux/arm64 -t load_tester_supreme .
 ```
 
-To upload it to NGC refer to [here](https://docs.nvidia.com/cloud-functions/user-guide/latest/cloud-function/quickstart.html#clone-build-and-push-the-docker-image-to-ngc-private-registry)
+## Push to your registry
+
+The image must be pushed to whatever container registry your NVCF cluster has
+credentials for.
+
+**NGC (nvcr.io):**
+
+```bash
+# Authenticate
+echo "$NGC_API_KEY" | docker login nvcr.io -u '$oauthtoken' --password-stdin
+
+# Tag and push
+docker tag load_tester_supreme nvcr.io/<your-org>/load_tester_supreme:latest
+docker push nvcr.io/<your-org>/load_tester_supreme:latest
+```
+
+## Test locally
+
+Run the container:
+
+```bash
+docker run --rm -p 8000:8000 -p 8001:8001 load_tester_supreme
+```
+
+### HTTP
 
-## Invoke the HTTP endpoint locally
 ```bash
 curl --request POST \
   --url localhost:8000/echo \
@@ -27,7 +85,8 @@ curl --request POST \
 }'
 ```
 
-## Invoke the streaming HTTP endpoint locally
+### HTTP streaming
+
 ```bash
 curl --request POST \
   --url localhost:8000/echo \
@@ -41,9 +100,10 @@ curl --request POST \
 }'
 ```
 
-## Invoke the gRPC endpoint locally
+### gRPC
+
 ```bash
 grpcurl -plaintext \
   -d '{"message":"hello","repeats":3,"delay":0.01}' \
   localhost:8001 Echo/EchoMessage
-```
+```
diff --git a/examples/load-tests/README.md b/examples/load-tests/README.md
@@ -84,6 +84,7 @@ Both scripts require the [NGC CLI](https://org.ngc.nvidia.com/setup/installers/c
 | `HTTP_SUPREME_NVCF_URL` | HTTP endpoint URL. |
 | `NVCF_GRPC_URL` | gRPC endpoint URL. |
 | `GRPC_SUPREME_FUNCTION_ID` | Function ID for gRPC calls. |
+| `GRPC_SUPREME_FUNCTION_VERSION_ID` | Function version ID for gRPC calls |
 | `SENT_MESSAGE_SIZE` | Payload size in bytes. |
 | `RESPONSE_COUNT` | Number of response repeats. |
 | `RESPONSE_DELAY_TIME` | Delay between responses in seconds (optional). |