Summary
Currently, enabling the Prometheus/PrometheusSimple backend in cardano-testnet is not practical for multi-node testnets because all nodes share the same configuration. This causes port collisions since every node attempts to listen on the same PrometheusSimple endpoint (hardcoded to 0.0.0.0:12798). More broadly, there is no centralised logging — each node logs independently to its own stdout/file.
The relevant code is commented out in cardano-testnet/src/Testnet/Defaults.hs (line 306) with a note explaining the limitation and suggesting cardano-tracer as the proper solution.
Problem
- All testnet nodes share a single config, so there is no way to assign unique Prometheus ports per node — only single-node testnets can enable Prometheus without collisions.
- Each node logs independently (Katip file/stdout scribes). There is no unified view of traces across a multi-node testnet.
- Developers debugging or monitoring multi-node testnets lack an easy built-in metrics endpoint and centralised log aggregation.
Proposed solution: integrate cardano-tracer
cardano-tracer is purpose-built for this. It acts as a centralised aggregator that connects to multiple nodes and provides both unified logging and a single Prometheus endpoint with per-node sub-routes.
Architecture
    testnet nodes (N)                        cardano-tracer (1 process)
    +------------+                           +---------------------------+
    | node-spo1  |--\                        | Accepts on local socket   |
    | node-spo2  |-----> forwarder.sock      |                           |
    | node-spo3  |--/                        | Exposes:                  |
    +------------+                           |  - Prometheus :3200       |
                                             |  - Per-node log dirs      |
                                             +---------------------------+

    /tracer-logs
      /node-spo1/node.json
      /node-spo2/node.json
      /node-spo3/node.json
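As a rough illustration of the tracer side of this architecture, the sketch below renders a minimal tracer config as YAML text using only base. The record TracerSettings and the function renderTracerConfig are hypothetical names, not part of cardano-testnet, and the YAML keys (networkMagic, network, logging, hasPrometheus) are assumed from the cardano-tracer documentation and should be checked against the version in use.

```haskell
-- Hypothetical settings record; none of these names exist in cardano-testnet.
data TracerSettings = TracerSettings
  { tsNetworkMagic   :: Int       -- network magic of the testnet
  , tsSocketPath     :: FilePath  -- local socket the tracer accepts connections on
  , tsLogRoot        :: FilePath  -- root directory holding per-node log subdirectories
  , tsPrometheusPort :: Int       -- port of the single Prometheus endpoint
  }

-- Render the settings as YAML text.
-- Key names are an assumption based on the cardano-tracer docs.
-- 'show' on a FilePath yields a double-quoted string, which is valid YAML
-- for ordinary paths without unusual characters.
renderTracerConfig :: TracerSettings -> String
renderTracerConfig ts = unlines
  [ "networkMagic: " ++ show (tsNetworkMagic ts)
  , "network:"
  , "  tag: AcceptAt"
  , "  contents: " ++ show (tsSocketPath ts)
  , "logging:"
  , "- logRoot: " ++ show (tsLogRoot ts)
  , "  logMode: FileMode"
  , "  logFormat: ForMachine"
  , "hasPrometheus:"
  , "  epHost: 127.0.0.1"
  , "  epPort: " ++ show (tsPrometheusPort ts)
  ]
```

Because the tracer accepts connections on one socket, the same rendered config would serve a testnet of any size; only the socket path, log root, and port vary per testnet run.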
Implementation outline
- New CLI flag: add an --enable-tracer option to CardanoTestnetOptions, following the existing --enable-grpc / RpcSupport pattern in Testnet/Start/Types.hs.
- Spawn cardano-tracer as an auxiliary process, following the existing SubmitApi pattern in Testnet/SubmitApi.hs:
  - Generate a tracer config (AcceptAt on a local socket, logging to a per-testnet directory, Prometheus on a free port).
  - Spawn cardano-tracer --config <path> before starting nodes.
  - Register it for cleanup with the existing MonadResource / SIGINT handler infrastructure.
- Add tracer socket arg to each node: in Testnet/Start/Cardano.hs, append --tracer-socket-path-connect <socket> to each node's CLI args. All nodes connect to the same socket, so no per-node config divergence is needed.
- Clean up commented-out code: the PrometheusSimple workaround in Defaults.hs:306-329 becomes obsolete and can be removed.
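The argument handling in the outline above could look roughly like the following sketch. TracerSupport, tracerCommand, and withTracerSocket are hypothetical names chosen for illustration; the real option type would follow whatever shape RpcSupport has, which is not reproduced here. Only the cardano-tracer --config and --tracer-socket-path-connect arguments are taken from the outline itself.

```haskell
-- Hypothetical option type for the proposed flag; the real type may differ.
data TracerSupport = TracerEnabled | TracerDisabled
  deriving (Eq, Show)

-- Executable and arguments for the single tracer process,
-- i.e. "cardano-tracer --config <path>".
tracerCommand :: FilePath -> (FilePath, [String])
tracerCommand configPath = ("cardano-tracer", ["--config", configPath])

-- Append the shared tracer socket argument to one node's CLI args.
-- When the tracer is disabled, the args are returned unchanged.
withTracerSocket :: TracerSupport -> FilePath -> [String] -> [String]
withTracerSocket TracerDisabled _ args = args
withTracerSocket TracerEnabled socket args =
  args ++ ["--tracer-socket-path-connect", socket]
```

Every node receives the same socket path, so the function takes no per-node input beyond the existing argument list. The pair returned by tracerCommand is in a form that could be handed to a process-spawning helper such as System.Process.proc.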
What this enables
- Centralised logs: Per-node subdirectories under a single root, with rotation — one place to look at all testnet traces.
- Single Prometheus endpoint: Lists all connected nodes at the root, each with its own metrics sub-route.
- Prometheus service discovery: GET /targets for dynamic scraping configurations.
- No port collisions: Only the tracer binds network ports, not individual nodes.
- Scales to any node count: AcceptAt mode requires zero tracer config changes when adding nodes.
Files likely affected
- Testnet/Start/Types.hs: new option type
- Parsers/Cardano.hs: CLI flag parsing
- Testnet/Start/Cardano.hs: spawn tracer process, add --tracer-socket-path-connect to node args
- Testnet/Defaults.hs: tracer config generation, remove commented-out PrometheusSimple code
References
- Comment in source: cardano-testnet/src/Testnet/Defaults.hs:306
- cardano-tracer docs (Cardano Developer Portal)
- cardano-tracer/docs/cardano-tracer.md