
Monitor

  • The goal of this document is to explore monitoring for our stake pool
  • The first exploration will be a grafana dashboard fed by prometheus exporters
  • We will research additional monitoring options

Grafana dashboards

# grafana configuration
 services.grafana = {
   enable = true;
   domain = "grafana.pele";
   port = 2342;
   addr = "127.0.0.1";
 };
 
 # nginx reverse proxy
 services.nginx.virtualHosts.${config.services.grafana.domain} = {
   locations."/" = {
       proxyPass = "http://127.0.0.1:${toString config.services.grafana.port}";
       proxyWebsockets = true;
   };
 };
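For completeness (assumption on my part: nginx enablement isn't shown elsewhere in these notes), the virtualHost above only takes effect with nginx itself turned on. A minimal sketch:

```nix
# The grafana virtualHost only works once nginx is enabled;
# recommendedProxySettings adds the usual forwarded-for/host headers.
services.nginx = {
  enable = true;
  recommendedProxySettings = true;
};
```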
  • Let's init
terragrunt init
  • And apply
terragrunt apply
  • In the end I enabled the server, exporter, and scrape configs
  • Note I also added targets for my relay and block producer. The process is still manual; we can look into making this tag based, but this is just a POC:
services.prometheus = {
  enable = true;
  port = 9001;
  exporters = {
    node = {
      enable = true;
      enabledCollectors = [ "systemd" ];
      port = 9002;
    };
  };
  scrapeConfigs = [
    {
      job_name = "chrysalis";
      static_configs = [{
        targets = [
          "127.0.0.1:${toString config.services.prometheus.exporters.node.port}"
          "100.108.195.88:${toString config.services.prometheus.exporters.node.port}"
          "100.91.15.74:${toString config.services.prometheus.exporters.node.port}"
        ];
      }];
    }
  ];
};
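As a first step toward making this less manual (a sketch of my own, not applied config; the host list would still be maintained by hand until something tag based exists), the repeated port interpolation could be generated from a single list:

```nix
# Sketch: derive the scrape targets from one host list instead of
# repeating the ${toString ...} interpolation per host.
let
  nodeExporterPort = config.services.prometheus.exporters.node.port;
  hosts = [ "127.0.0.1" "100.108.195.88" "100.91.15.74" ];
in {
  services.prometheus.scrapeConfigs = [
    {
      job_name = "chrysalis";
      static_configs = [{
        targets = map (h: "${h}:${toString nodeExporterPort}") hosts;
      }];
    }
  ];
}
</imports>
```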
  • On each of the relay and block producer I added the following to configuration.nix:
services.prometheus = {
  exporters = {
    node = {
      enable = true;
      enabledCollectors = [ "systemd" ];
      port = 9002;
    };
  };
};
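One thing worth double-checking on the relay and bp (assumption: the NixOS firewall is enabled there, and 100.x addressing means tailscale): port 9002 has to be reachable from the monitor. A hedged sketch, scoped to the tailscale interface rather than opening it everywhere:

```nix
# Open the node_exporter port only on the tailscale interface;
# "tailscale0" is my assumption for the interface name.
networking.firewall.interfaces."tailscale0".allowedTCPPorts = [ 9002 ];
```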
  • And like that I have node_exporter metrics in grafana:
  • Note the dashboard I used for this is a popular example I found when searching for node_exporter on grafana.com
http://100.82.80.131:2342/d/rYdddlPWk/node-exporter-full?orgId=1
  • The default login for this POC is admin/admin
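Since admin/admin shouldn't outlive the POC, the grafana module (old-style options, matching the config above) can set the admin credentials declaratively. A sketch; the secrets path is a placeholder for however secrets end up provisioned:

```nix
# Replace the default admin/admin login; /run/secrets/... is a
# placeholder, not a path that exists in this setup.
services.grafana.security = {
  adminUser = "admin";
  adminPasswordFile = "/run/secrets/grafana-admin-password";
};
```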

Try to add cardano metrics

cd /cardano-node/
nix profile install .#cardano-tracer
  • Let's see if we can create unix sockets like they recommend.
  • The mon host is still coming up, so I'm going to try between the relay and bp
  • I think I need to create the socket files on each first:
mkfifo /tmp/forwarder.sock
  • No, deleted the above
  • This creates a socket, but I want to leave the complexity of networking aside for now
ssh -nNT -L /tmp/forwarder.sock:/tmp/forwarder.sock -o "ExitOnForwardFailure yes" root@100.93.133.110
  • I am spinning cardano-tracer up with this config:
cat /cardano-node/cardano-tracer/configuration/minimal-example.yaml 
---
networkMagic: 1
network:
  tag: AcceptAt
  contents: "/tmp/forwarder.sock"
logging:
- logRoot: "/tmp/cardano-tracer-logs"
  logMode: FileMode
  logFormat: ForMachine
hasPrometheus:
  epHost: 127.0.0.1
  epPort: 9031
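To keep the tracer running across reboots instead of a foreground shell, a NixOS service wrapper could look roughly like this. This is a sketch: the ExecStart path assumes the `nix profile install` above landed the binary in root's profile, and the config path mirrors the manual invocation below.

```nix
# Hypothetical systemd unit for cardano-tracer; binary and config
# paths are assumptions from the manual steps, not tested config.
systemd.services.cardano-tracer = {
  description = "cardano-tracer metrics/log forwarder";
  wantedBy = [ "multi-user.target" ];
  serviceConfig = {
    ExecStart = "/root/.nix-profile/bin/cardano-tracer -c /cardano-node/cardano-tracer/configuration/minimal-example.yaml";
    Restart = "on-failure";
  };
};
```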
  • Running:
cardano-tracer -c /cardano-node/cardano-tracer/configuration/minimal-example.yaml
  • This looks happy and ends with:
Listening on http://127.0.0.1:9031
  • Lets see what the socket gets.
curl http://127.0.0.1:9031                                                 
There are no connected nodes yet.
  • I restart my node, making sure it starts with
--tracer-socket-path-connect /tmp/forwarder.sock
  • But lsof and netstat -an tell me only the tracer process is binding to that socket.
# look at the last few lines of a service
journalctl -xeu <service name that you got from status above>
# keep following a service
journalctl -e -f -u <service name that you got from status above>
  • It acknowledges the existence of /tmp/forwarder.sock when I stop/start the node.
  • It does not, however, acknowledge connecting, and the curl still shows no connections.
  • I start the ssh session to attach to the socket from the bp.
  • The connection on the bp side looks fine, but the cardano node also does not connect to the tracer via the remote socket.
  • I suspect tracing is not enabled in the node itself; lots of things in the config make me think it should be:
cat /cardano-node/configuration/cardano/testnet-config.json | grep -i trace | grep true
  "TraceAcceptPolicy": true,
  "TraceChainDb": true,
  "TraceConnectionManager": true,
  "TraceDNSResolver": true,
  "TraceDNSSubscription": true,
  "TraceDiffusionInitialization": true,
  "TraceErrorPolicy": true,
  "TraceForge": true,
  "TraceInboundGovernor": true,
  "TraceIpSubscription": true,
  "TraceLedgerPeers": true,
  "TraceLocalErrorPolicy": true,
  "TraceLocalRootPeers": true,
  "TraceMempool": true,
  "TracePeerSelection": true,
  "TracePeerSelectionActions": true,
  "TracePublicRootPeers": true,
  "TraceServer": true,
  • Right? But I'm wondering if there is a master config switch for this that is not set?
  • Nothing obvious in the config, no luck in google or an LLM
  • I need a good code spelunker; I should ask Rob for some pointers on where to look
  • Good morning, goal today is to get relay node to give me some trace data.
  • The way to profile-install a nixpkg:
nix profile install nixpkgs#socat
- SAD PANDA! There was always a prometheus section in /cardano-node/configuration/cardano/testnet-config.json:
  "hasPrometheus": [
    "127.0.0.1",
    12798
  ],
- Sooo we simply add a scrape config on our monitor prometheus, inside configuration.nix. Note that hasPrometheus binds the listed address, so scraping port 12798 remotely only works if the node binds an address the monitor can reach rather than 127.0.0.1.
- TODO: Make this more modular; define this above in the exporter
 scrapeConfigs = [
   {
     job_name = "chrysalis";
     static_configs = [{
       targets = [
         "127.0.0.1:${toString config.services.prometheus.exporters.node.port}"
         "100.84.19.134:${toString config.services.prometheus.exporters.node.port}"
++       "100.84.19.134:12798"
         "100.113.176.70:${toString config.services.prometheus.exporters.node.port}"
++       "100.113.176.70:12798"
       ];
     }];
   }
 ];
- And from there it was a quick step to stand up https://github.com/sanskys/SNSKY/blob/main/SNSKY_Dashboard_v2.json, which is part of this tutorial: https://github.com/input-output-hk/cardano-node/blob/master/doc/logging-monitoring/grafana.md
- I also have the node exporter dashboard: https://grafana.com/grafana/dashboards/854-simple-prometheus-node-exporter/



Next steps
- I need to add the cardano metrics_exporter and dashboard.
- For the future I would also still like to learn how to get at the trace data; I think it is being written to the journald log, but I would like to consume this in prometheus.
- Research additional cardano metrics sources we can use.
- Research doing this in Datadog.
- Still need to stand up the prometheus gauge from https://developers.cardano.org/docs/operate-a-stake-pool/grafana-dashboard-tutorial/#5-add-data-from-cexplorer-to-the-dashboard so I can add the dashboard.
- We need actual alerting to send out from this; we can hook it into PagerDuty or another SaaS platform, we just need to decide what we are doing with it.
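A rough sketch of what the alerting wiring could look like on the monitor host. Option names assume the stock NixOS prometheus/alertmanager modules; the alert rule, group name, and the PagerDuty service key are placeholders of mine, not decided config:

```nix
# Sketch: point prometheus at a local alertmanager, add one example
# rule, and route alerts to PagerDuty. service_key is a placeholder.
services.prometheus.alertmanagers = [{
  static_configs = [{ targets = [ "127.0.0.1:9093" ]; }];
}];
services.prometheus.rules = [''
  groups:
  - name: node
    rules:
    - alert: NodeDown
      expr: up == 0
      for: 5m
''];
services.prometheus.alertmanager = {
  enable = true;
  configuration = {
    route.receiver = "pagerduty";
    receivers = [{
      name = "pagerduty";
      pagerduty_configs = [{ service_key = "REPLACE_ME"; }];
    }];
  };
};
```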