We have access to the cluster's metrics through the `kube-prometheus-stack-kube-state-metrics` Prometheus instance, which should be available in the `kube-prometheus-stack` namespace.
Federated Prometheus is already configured to scrape metrics from the kube-state-metrics service, but because it filters out metrics that are not aggregated, we need to:
- Aggregate some starting kube metrics into a `kube:` recording rule. I'm thinking we could start with `kube_pod_container_status_restarts_total`, which is what the `KubePodCrashLooping` alert is based on.
- Update Federated Prometheus' filtering to match against `kube:` recording rules.
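
As a starting point for discussion, the first step could look roughly like the rule file below. This is only a sketch: the group name, the recorded series name, the `namespace` aggregation label, and the 5-minute window are all assumptions chosen for illustration, not settled decisions.

```yaml
# Sketch of a Prometheus rule file; group and rule names are hypothetical.
groups:
  - name: kube-aggregations  # hypothetical group name
    rules:
      # Per-namespace container restart rate over a 5-minute window.
      # The aggregation label and window are assumptions, open to change.
      - record: kube:pod_container_status_restarts:rate5m  # hypothetical name
        expr: sum by (namespace) (rate(kube_pod_container_status_restarts_total[5m]))
```

For the second step, the filter would need to accept series matching a selector such as `{__name__=~"kube:.+"}`. Where that selector goes depends on how the current filtering is set up (for example a `match[]` parameter on `/federate` or a `keep` relabel rule), which still needs to be checked.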