Alertmanager

Alertmanager is a component of NKE that sends out notifications for the alerts raised while monitoring your applications.

Availability

Alertmanager is available as an optional service for NKE and it can be deployed using Cockpit.

Usage

Configuring Alertmanager

Alertmanager is the component responsible for sending out notifications when Prometheus alerts fire. It supports various notification channels, such as Slack, email, HipChat, and PagerDuty. Please have a look at the official documentation for detailed information about the configuration. We also supply example configurations.

When an Alertmanager instance is created, it does not have any notification receivers configured by default. You will have to create a full Alertmanager configuration and send it to us. The best way would be to create a secret in your cluster and populate it with the desired config. We will then add the Alertmanager configuration to the instance. Note that this process is only temporary, and you will soon be able to configure the Alertmanager on your own in Cockpit.
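
As a rough illustration of the secret-based handover described above, a Kubernetes Secret carrying the configuration might look like the following sketch. The Secret name `alertmanager-config`, the namespace `monitoring`, and the key `alertmanager.yaml` are assumptions chosen for this example, not names required by NKE; the addresses and credentials are placeholders.

```yaml
# Hypothetical Secret: name, namespace, and key are placeholders, not NKE requirements.
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-config
  namespace: monitoring
type: Opaque
stringData:
  # The full Alertmanager configuration goes in as a single multi-line string.
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
    route:
      receiver: "email"
      group_by: ["alertname"]
    receivers:
      - name: "email"
        email_configs:
          - to: "monitoring-alerts-list@your-domain.ch"
            from: "Alertmanager@your-domain.ch"
            smarthost: smtp.your-domain.ch:465
            require_tls: false
            auth_username: "Alertmanager@your-domain.ch"
            auth_password: "verysecretsecret"
```

Using `stringData` keeps the manifest readable because Kubernetes base64-encodes the value on creation. Since the manifest contains credentials, it should not be committed to version control in plain text.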

Alertmanager configuration examples

1. Send all alerts via email

global:
  resolve_timeout: 5m
route:
  receiver: "email"
  group_by: ["alertname"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  routes: []
receivers:
  - name: "email"
    email_configs:
      - to: "monitoring-alerts-list@your-domain.ch"
        send_resolved: true
        # when using STARTTLS (port 587) this needs to be 'true'
        require_tls: false
        from: "Alertmanager@your-domain.ch"
        smarthost: smtp.your-domain.ch:465
        auth_username: "Alertmanager@your-domain.ch"
        auth_password: "verysecretsecret"
        headers: { Subject: "[Alert] Prometheus Alert Email" }

2. Send all critical alerts via Slack. All other severities will be sent out via email. Please make sure to add a severity label to your alerts.

global:
  resolve_timeout: 5m
route:
  # this specifies the default receiver which will be used if no route matches
  receiver: "email"
  group_by: ["alertname"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  routes:
    - receiver: "slack"
      match_re:
        severity: "[cC]ritical"
receivers:
  - name: "email"
    email_configs:
      - to: "monitoring-alerts-list@your-domain.ch"
        send_resolved: true
        # when using STARTTLS (port 587) this needs to be 'true'
        require_tls: false
        from: "Alertmanager@your-domain.ch"
        smarthost: smtp.your-domain.ch:465
        auth_username: "Alertmanager@your-domain.ch"
        auth_password: "verysecretsecret"
        headers: { Subject: "[Alert] Prometheus Alert Email" }
  - name: "slack"
    slack_configs:
      - send_resolved: true
        api_url: https://hooks.slack.com/services/s8o3m2e0r8a8n2d/8snx2X983
        channel: "#alerts"
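
Upstream Alertmanager releases from 0.22 onward deprecate `match` and `match_re` in favor of a single `matchers` list. Assuming the Alertmanager version deployed on your cluster supports it (this page does not state the version), the severity route could be sketched as follows; only the `route` section is shown, and the receivers stay unchanged.

```yaml
route:
  # default receiver, used if no route matches
  receiver: "email"
  group_by: ["alertname"]
  routes:
    - receiver: "slack"
      # regex matcher with the same intent as the match_re form
      matchers:
        - severity =~ "[cC]ritical"
```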

3. Send all alerts of the production environment via Slack. Drop all other alerts. Please make sure to define the label 'environment' in your alerts.

global:
  resolve_timeout: 5m
route:
  # this specifies the default receiver which will be used if no route matches
  receiver: "devnull"
  group_by: ["alertname"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  routes:
    - receiver: "slack"
      match:
        environment: production
receivers:
  - name: "slack"
    slack_configs:
      - send_resolved: true
        api_url: https://hooks.slack.com/services/s8o3m2e0r8a8n2d/8snx2X983
        channel: "#alerts"
  # a receiver without any configuration silently drops the alerts routed to it
  - name: devnull

4. Use templates to customize your notifications and send all alerts via Slack. Here we define some templates in a file called 'slack.tmpl'.

filename: Alertmanager.yaml

global:
  resolve_timeout: 5m
# THIS LINE IS VERY IMPORTANT AS OTHERWISE YOUR TEMPLATES WILL NOT BE LOADED
templates:
  - "/etc/alertmanager/config/*.tmpl"
route:
  receiver: "slack"
  group_by: ["alertname"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  routes: []
receivers:
  - name: "slack"
    slack_configs:
      - send_resolved: true
        api_url: https://hooks.slack.com/services/s8o3m2e0r8a8n2d/8snx2X983
        channel: "#alerts"
        pretext: "{{ .CommonAnnotations.description }}"
        text: '{{ template "slack.myorg.text" . }}'

filename: slack.tmpl

{{ define "slack.myorg.text" -}}
{{ range .Alerts -}}
*Alert:* {{ .Labels.alertname }} - `{{ .Labels.severity }}`
*Description:* {{ .Annotations.description }}
*Details:*
{{ range .Labels.SortedPairs -}}
• *{{ .Name }}:* `{{ .Value }}`
{{ end -}}
{{ template "slack.default.text" . }}
{{ end -}}
{{ end -}}
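
If the configuration is handed over in a Secret, one possible layout is to store both files as separate keys of the same Secret. The sketch below assumes that every key of the Secret is mounted as a file under /etc/alertmanager/config, which is how the Prometheus Operator handles Alertmanager configuration secrets; whether NKE does the same is an assumption, as are the Secret name and namespace. The template is shortened for brevity.

```yaml
# Hypothetical Secret layout: name and namespace are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-config
  namespace: monitoring
type: Opaque
stringData:
  # Main configuration; the templates glob must match the mounted .tmpl file.
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
    templates:
      - "/etc/alertmanager/config/*.tmpl"
    route:
      receiver: "slack"
      group_by: ["alertname"]
    receivers:
      - name: "slack"
        slack_configs:
          - send_resolved: true
            api_url: https://hooks.slack.com/services/s8o3m2e0r8a8n2d/8snx2X983
            channel: "#alerts"
            text: '{{ template "slack.myorg.text" . }}'
  # Template file stored as a second key in the same Secret.
  slack.tmpl: |
    {{ define "slack.myorg.text" -}}
    {{ range .Alerts -}}
    *Alert:* {{ .Labels.alertname }} - `{{ .Labels.severity }}`
    *Description:* {{ .Annotations.description }}
    {{ end -}}
    {{ end -}}
```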

Video Guide

Check out our video guide series for GKE Application Monitoring. Although the videos were recorded on our GKE product, the same concepts apply to NKE.