2 changes: 1 addition & 1 deletion Makefile
@@ -25,7 +25,7 @@ lint:
vet:
go vet $(go list ./... | grep -v /vendor/)
test:
go test -v -cover ./...
go test -v -cover -race ./...
coverage:
go test -v -cover -coverprofile=coverage.out ./... &&\
go tool cover -html=coverage.out -o coverage.html
18 changes: 10 additions & 8 deletions README.md
@@ -80,7 +80,7 @@ The tool respects the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY

The `/webhook` endpoint accepts alerts from the Alertmanager's [generic webhook receiver](https://prometheus.io/docs/alerting/latest/configuration/#webhook_config).

Alertmanager-Icinga-Bridge expects a the following to be part of an alert.
Alertmanager-Icinga-Bridge expects the following to be part of an alert.

Alert fields:
* `generatorURL`: Is mapped to the Icinga service `action_url`
@@ -106,7 +106,7 @@ Alternatively, if you enable `--plugin-output-by-states` then the Alertmanager-I

This allows you to configure multiple annotations with different values that are then used with the corresponding service state to set the plugin output.

If an annotation is not found for that specific service state then Alertmanager-Icinga-Bridge will fallback on using the annotation name as configured.
If an annotation is not found for that specific service state then Alertmanager-Icinga-Bridge will fall back on using the annotation name as configured.
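
The lookup order described above can be sketched as follows. This is a minimal illustration, not the bridge's implementation: the function name `pluginOutput` and the suffix values in `stateSuffixes` are assumptions made for the example, and the real suffixes may differ.

```go
package main

import "fmt"

// stateSuffixes maps an Icinga exit code (0..3) to an annotation suffix.
// These values are assumed for illustration only.
var stateSuffixes = []string{"ok", "warning", "critical", "unknown"}

// pluginOutput returns the first annotation value found for the configured
// annotation names. With byStates enabled it first tries "<name>_<suffix>"
// for the given exit code and then falls back to the plain "<name>".
func pluginOutput(annotations map[string]string, names []string, exitCode int, byStates bool) string {
	for _, name := range names {
		if byStates && exitCode >= 0 && exitCode < len(stateSuffixes) {
			key := fmt.Sprintf("%s_%s", name, stateSuffixes[exitCode])
			if value, ok := annotations[key]; ok {
				return value
			}
		}
		if value, ok := annotations[name]; ok {
			return value
		}
	}
	return ""
}

func main() {
	annotations := map[string]string{
		"summary":          "Disk usage is high",
		"summary_critical": "Disk is almost full",
	}
	// Exit code 2 has a state-specific annotation, exit code 1 does not.
	fmt.Println(pluginOutput(annotations, []string{"summary"}, 2, true))
	fmt.Println(pluginOutput(annotations, []string{"summary"}, 1, true))
}
```

The bounds check on `exitCode` is a defensive addition in this sketch so that an unexpected exit code falls back to the plain annotation instead of panicking.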

### Example Alertmanager Configuration

@@ -137,7 +137,7 @@ receivers:

## Integration with Icinga

You need to create an Icinga host which the Alertmanager-Icinga-Bridge can use to manage service's for.
You need to create an Icinga host which the Alertmanager-Icinga-Bridge can use to manage services for.

Alertmanager-Icinga-Bridge expects that it has full control over this host.
Therefore, you should create a host for each Alertmanager-Icinga-Bridge instance which you're running.
@@ -243,9 +243,9 @@ object Service "heartbeat" {
All alert labels and annotations will be mapped to custom variables.
Keys of labels will be prefixed with `label_` and keys of annotations with `annotation_`.

If the key an annotation or label starts with `icinga_` it will also be added as custom variable without any prefix.
If the key of an annotation or label starts with `icinga_` it will also be added as a custom variable without any prefix.

Since all labels and annotations are strings, a type information can be provided.
Since all labels and annotations are strings, type information can be provided.
This is done by adding the type as part of the prefix (`icinga_<type>_`).

Currently supported types are `number` and `string`.
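
The mapping rules above can be illustrated with a short sketch. The helper `mapVars` is hypothetical, and two details are assumptions rather than documented behaviour: that a typed key such as `icinga_number_retries` ends up as `retries`, and that a value which fails to parse as a number is skipped.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// mapVars copies one set of alert key/value pairs (labels or annotations)
// into vars. prefix is "label_" or "annotation_".
func mapVars(vars map[string]any, prefix string, in map[string]string) {
	for key, value := range in {
		// Every key is always stored with its prefix.
		vars[prefix+key] = value

		if !strings.HasPrefix(key, "icinga_") {
			continue
		}
		rest := strings.TrimPrefix(key, "icinga_")

		switch {
		case strings.HasPrefix(rest, "number_"):
			// Assumption: unparsable numbers are skipped in this sketch.
			if n, err := strconv.ParseFloat(value, 64); err == nil {
				vars[strings.TrimPrefix(rest, "number_")] = n
			}
		case strings.HasPrefix(rest, "string_"):
			vars[strings.TrimPrefix(rest, "string_")] = value
		default:
			vars[rest] = value
		}
	}
}

func main() {
	vars := map[string]any{}
	mapVars(vars, "label_", map[string]string{
		"severity":              "critical",
		"icinga_number_retries": "3",
	})
	mapVars(vars, "annotation_", map[string]string{"icinga_team": "ops"})

	fmt.Println(vars["label_severity"], vars["retries"], vars["team"])
}
```

The sketch deliberately says nothing about which value wins when a label and an annotation define the same `icinga_` key.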
@@ -265,13 +265,13 @@ In case there is a label and an annotation with the `icinga_<type>` prefix, the

## Custom Host/Zone/Template

By default, the `--icinga-hostname` is used to create services and `--templates` for the service's template. This can be overridden by the following labels:
By default, the `--icinga-hostname` is used to create services and `--templates` for the service template. This can be overridden by the following labels:

| Alert | Icinga |
| ---------- | ----------- |
| Label: `icinga_use_host: MyHost` | If present, use given host for the new service. The host must exist beforehand |
| Label: `icinga_use_zone: MyZone` | If present, use given zone for the new service. The zone must exist beforehand |
| Label: `icinga_use_template: MyTemplate` | If present, use given template for the new service The template must exist beforehand |
| Label: `icinga_use_template: MyTemplate` | If present, use given template for the new service. The template must exist beforehand |

Note that this requires the Alertmanager-Icinga-Bridge user to have the necessary permissions on the host.

@@ -280,7 +280,9 @@ Note that this requires the Alertmanager-Icinga-Bridge user to have the necessar
Alertmanager-Icinga-Bridge supports creating "heartbeat services" in Icinga.
This can be used to map alerts like a `DeadMansSwitch`. In Prometheus a "watchdog" or "dead man's switch" is an alert that is always firing to ensure the alerting pipeline is working.

To treat an alert as a "heartbeat" the alert must have a label `heartbeat` with a [Golang duration](https://pkg.go.dev/time#ParseDuration) as value (e.g. `heartbeat: "1d"`).
To treat an alert as a "heartbeat", the alert must have a label `heartbeat` with a [Golang duration](https://pkg.go.dev/time#ParseDuration) as value (e.g. `heartbeat: "24h"`).

To enable garbage collection on these alerts, they can be set to "downtime" in Icinga. A heartbeat service with an active downtime will be removed by the garbage collection.
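
The rule for heartbeat services can be sketched like this. The `service` type is a simplified, hypothetical stand-in for the bridge's Icinga service type, and treating a downtime depth above zero as "has an active downtime" is an assumption of the example.

```go
package main

import "fmt"

// service is a simplified stand-in used only for this illustration.
type service struct {
	Name          string
	Vars          map[string]any
	DowntimeDepth int
}

// hasDowntime assumes that a downtime depth above zero means an active downtime.
func (s service) hasDowntime() bool { return s.DowntimeDepth > 0 }

// skipForGC reports whether the garbage collector should leave the service
// alone because it is a heartbeat service without an active downtime.
func skipForGC(s service) bool {
	_, heartbeat := s.Vars["label_heartbeat"]
	return heartbeat && !s.hasDowntime()
}

func main() {
	hb := service{Name: "watchdog", Vars: map[string]any{"label_heartbeat": "300s"}}
	fmt.Println(skipForGC(hb)) // heartbeat without downtime is kept

	hb.DowntimeDepth = 1
	fmt.Println(skipForGC(hb)) // heartbeat with downtime is eligible for removal
}
```

A service that is not skipped here would still have to pass any other checks the garbage collector applies before it is deleted.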

The Alertmanager-Icinga-Bridge will create an Icinga service check with active checks enabled and with the check interval set to the parsed duration.
We add 10% to the parsed duration to account for network latency etc., which could otherwise lead to flapping heartbeat checks.
11 changes: 9 additions & 2 deletions internal/api/listener.go
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: BSD-3-Clause

// Package api provides the HTTP handler that exposes the bridge's HTTP API
package api

import (
@@ -32,6 +33,9 @@ var (
serviceNamePattern = regexp.MustCompile(`^[-+_.:,a-zA-Z0-9 %]{1,128}$`)
)

// Maximum number of bytes to accept as JSON. 100MB should be more than enough
const maxBytesToAccept = 100 << 20

// Listener represents the daemon's API
type Listener struct {
mux http.Handler
@@ -153,9 +157,12 @@ func (l *Listener) handleHealthy(w http.ResponseWriter, _ *http.Request) {
func (l *Listener) handleIncomingAlert(w http.ResponseWriter, r *http.Request) {
l.logger.Debug("Handling incoming alert", "component", "listener")

// Cap the request body size so an oversized payload cannot exhaust memory.
body := http.MaxBytesReader(w, r.Body, maxBytesToAccept)

var payload WebhookPayload

errDecode := json.NewDecoder(r.Body).Decode(&payload)
errDecode := json.NewDecoder(body).Decode(&payload)

if errDecode != nil {
l.logger.Error("Received invalid JSON", "component", "listener", "error", errDecode.Error())
@@ -382,7 +389,7 @@ func (l *Listener) generatePluginOutput(alert Alert, exitCode int) string {
// If the PluginOutputByStates option is enabled then first look for an annotation with the state suffix
// otherwise fall back to just using the PluginOutputAnnotations value as is
if l.config.PluginOutputByStates {
// Note, I don't like PluginOutputStateSuffixes being a slide and exitCode being the index
// Note, I don't like PluginOutputStateSuffixes being a slice and exitCode being the index
if value, ok := alert.Annotations[fmt.Sprintf("%s_%s", v, pluginOutputStateSuffixes[exitCode])]; ok {
return value
}
2 changes: 1 addition & 1 deletion internal/config/cli.go
@@ -18,7 +18,7 @@ type CLI struct {
IcingaURL []string `kong:"required,env='ALERTMANAGER_ICINGA_BRIDGE_ICINGA_URL',help='Icinga API URL (can be repeated)'"`
IcingaHostname string `kong:"required,env='ALERTMANAGER_ICINGA_BRIDGE_ICINGA_HOSTNAME',help='Icinga host name to manage services for'"`
DisableKeepAlives bool `kong:"default=false,env='ALERTMANAGER_ICINGA_BRIDGE_DISABLE_KEEPALIVES',help='Disable HTTP keepalives'"`
DisplayNameAsServiceName bool `kong:"default='false',env='ALERTMANAGER_ICINGA_BRIDGE_DISPLAY_NAME_AS_SERVICE_NAME',help='Set the Icinga service display name to the generated service name'"`
DisplayNameAsServiceName bool `kong:"default=false,env='ALERTMANAGER_ICINGA_BRIDGE_DISPLAY_NAME_AS_SERVICE_NAME',help='Set the Icinga service display name to the generated service name'"`
IcingaInsecureTLS bool `kong:"default=false,env='ALERTMANAGER_ICINGA_BRIDGE_ICINGA_INSECURE_TLS',help='Skip Icinga TLS verification'"`
IcingaCAFile string `kong:"env='ALERTMANAGER_ICINGA_BRIDGE_ICINGA_CA',help='Path of a custom CA certificate to use when connecting to the Icinga API'"`
IcingaPassword string `kong:"required,env='ALERTMANAGER_ICINGA_BRIDGE_ICINGA_PASSWORD',help='Icinga API password'"`
14 changes: 7 additions & 7 deletions internal/config/config.go
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: BSD-3-Clause

// Package config provides the central configuration of the tool and the CLI options
package config

import (
@@ -30,13 +31,12 @@ type Config struct {
CustomSeverityLevels map[string]string
MergedSeverityLevels map[string]int

PluginOutputByStates bool
BearerToken string
ListenAddr string
TLSCertPath string
TLSKeyPath string
PluginOutputAnnotations []string
PluginOutputStateSuffixes []string
PluginOutputByStates bool
BearerToken string
ListenAddr string
TLSCertPath string
TLSKeyPath string
PluginOutputAnnotations []string

IcingaDisableKeepAlives bool
IcingaHostname string
6 changes: 3 additions & 3 deletions internal/gc/gc.go
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: BSD-3-Clause

// Package gc provides the garbage collector that handles cleanup at the Icinga API
package gc

import (
@@ -107,7 +108,6 @@ func (g *GarbageCollector) start(ctx context.Context) {

if errSvcRemove != nil {
g.logger.Error("Could not remove service from Icinga", "component", "gc", "service", svc.Name, "error", errSvcRemove.Error())
return
}
}

@@ -165,7 +165,7 @@ func (g *GarbageCollector) heartbeat(ctx context.Context) {
func (g *GarbageCollector) removeServiceIfRequired(ctx context.Context, service icinga2.Service) error {
_, heartbeat := service.Vars["label_heartbeat"]

if heartbeat && service.HasDowntime() {
if heartbeat && !service.HasDowntime() {
g.logger.Debug("Skipping heartbeat and not downtimed service", "component", "gc", "service", service.Name)
return nil
}
@@ -186,7 +186,7 @@ func (g *GarbageCollector) removeServiceIfRequired(ctx context.Context, service

if errDel != nil {
g.logger.Error("Could not remove service", "component", "gc", "service", svcName, "error", errDel.Error())
return fmt.Errorf("could remove service: %w", errDel)
return fmt.Errorf("could not remove service: %w", errDel)
}

g.logger.Info("Successfully removed service from Icinga", "component", "gc", "service", svcName)
45 changes: 41 additions & 4 deletions internal/gc/gc_test.go
@@ -71,7 +71,7 @@ func TestGCRemoveService_WithRemoved(t *testing.T) {
}
}

func TestGCRemoveService_WithSkippedDowntime(t *testing.T) {
func TestGCRemoveService_WithHeartbeatNoDowntime(t *testing.T) {
ts := testServerForDelete()
defer ts.Close()

@@ -89,10 +89,9 @@ func TestGCRemoveService_WithSkippedDowntime(t *testing.T) {
Name: "svc",
Vars: icinga2.Vars{
"keep_for": 20.0,
"label_heartbeat": "true",
"label_heartbeat": "300s",
},
LastStateChange: 1770000000.0,
DowntimeDepth: 1,
}

actualErr := gc.removeServiceIfRequired(context.Background(), svc)
@@ -105,7 +104,45 @@ func TestGCRemoveService_WithSkippedDowntime(t *testing.T) {
expected := "Skipping heartbeat and not downtimed service"

if !strings.Contains(actual, expected) {
t.Fatalf("expected %v, got %v", expected, actual)
t.Fatalf("expected:\n %v, got:\n %v", expected, actual)
}
}

func TestGCRemoveService_WithHeartbeatDowntime(t *testing.T) {
ts := testServerForDelete()
defer ts.Close()

var buf bytes.Buffer
logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: slog.LevelDebug}))

config := testConfig(ts.URL)

icingaClient := icinga2.NewClient(config, logger)

gc := NewGarbageCollector(config, logger, icingaClient)

svc := icinga2.Service{
HostName: "unittest",
Name: "svc",
Vars: icinga2.Vars{
"keep_for": 20.0,
"label_heartbeat": "300s",
},
LastStateChange: 1770000000.0,
DowntimeDepth: 1,
}

actualErr := gc.removeServiceIfRequired(context.Background(), svc)

if actualErr != nil {
t.Errorf("expected no error got %v", actualErr)
}

actual := buf.String()
expected := "Deleting service at Icinga API"

if !strings.Contains(actual, expected) {
t.Fatalf("expected:\n %v, got:\n %v", expected, actual)
}
}

3 changes: 2 additions & 1 deletion internal/icinga2/icinga.go
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: Apache-2.0

// Package icinga2 provides the HTTP client to talk to the Icinga API
package icinga2

import (
@@ -21,7 +22,6 @@ import (
const (
icingaActionProcessCheckResultEndpoint = "/v1/actions/process-check-result/"
icingaHostEndpoint = "/v1/objects/hosts/"
icingaHostgroupEndpoint = "/v1/objects/hostgroups/"
icingaServiceEndpoint = "/v1/objects/services/"
)

@@ -78,6 +78,7 @@ func (c *Client) Do(req *http.Request, path string) (*http.Response, error) {

c.logger.Debug(fmt.Sprintf("Calling Icinga API at %s", req.URL), "component", "icinga")

req.Header.Set("User-Agent", "alertmanager-icinga-bridge")
req.Header.Set("Accept", "application/json")
req.Header.Set("Content-Type", "application/json")

5 changes: 3 additions & 2 deletions main.go
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: BSD-3-Clause

// Package main parses the CLI flags and starts the various components
package main

import (
@@ -42,7 +43,7 @@ func buildVersion() string {

func main() {
var cli config.CLI
// Create and parse CLI flags -> move to kong
// Create and parse CLI flags
kong.Parse(&cli,
kong.Name("alertmanager-icinga-bridge"),
kong.Description(`The Alertmanager to Icinga bridge can receive alerts from the Prometheus Alertmanager's generic webhook receiver and creates Icinga Services for these alerts.`),
@@ -67,7 +68,7 @@ func main() {
// Create Icinga Client
icingaClient := icinga2.NewClient(cfg, logger)

logger.Info("Starting alertmanager-icinga-brigde", "version", version, "commit", commit, "date", date, "component", "main")
logger.Info("Starting alertmanager-icinga-bridge", "version", version, "commit", commit, "date", date, "component", "main")

// Create and start the Service Garbage Collector
garbagecol := gc.NewGarbageCollector(cfg, logger, icingaClient)