Download the binary packages: Download | Prometheus

Prerequisites

  • Prometheus

  • alertmanager

  • Alerting rules are configured

  • At least one node/service is already being monitored

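Create the alert bot

Create a group chat

Add a bot to the group

Set the security setting to "Sign" (加签), then record the Webhook URL and the signing secret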
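
For reference, here is a minimal sketch of what the signing secret is used for. As far as I understand the DingTalk custom-bot documentation, each request carries a millisecond timestamp plus an HMAC-SHA256 signature over "timestamp\nsecret", Base64-encoded and URL-encoded. prometheus-webhook-dingtalk does this for you once `secret` is configured, so the code is only illustrative; the function names `dingtalk_sign` and `signed_url` are my own.

```python
import base64
import hashlib
import hmac
import urllib.parse


def dingtalk_sign(secret: str, timestamp_ms: int) -> str:
    """Return the URL-encoded signature for one request (illustrative helper)."""
    # The string to sign is the timestamp, a newline, then the secret itself.
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Base64-encode the raw digest, then URL-encode it for use in a query string.
    return urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))


def signed_url(webhook_url: str, secret: str, timestamp_ms: int) -> str:
    """Append timestamp and sign parameters to the bot's Webhook URL."""
    sign = dingtalk_sign(secret, timestamp_ms)
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={sign}"
```

Calling `signed_url(webhook, secret, int(time.time() * 1000))` would give the URL a hand-written client posts to; the timestamp has to be close to the current time for DingTalk to accept it.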

Install dingtalk-webhook

Download: Releases · timonwong/prometheus-webhook-dingtalk (github.com)

Install

tar zxvf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local/prometheus
mv /usr/local/prometheus/prometheus-webhook-dingtalk-2.1.0.linux-amd64 /usr/local/prometheus/dingtalk

Edit the configuration files

Configure the alert message template
vim /usr/local/prometheus/dingtalk/default.tmpl
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}


{{ define "__alert_list" }}{{ range . }}
---

{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}

**Alert summary**: {{ .Annotations.summary }}

**Alert name**: {{ .Labels.alertname }}

**Severity**: {{ .Labels.severity }}

**Instance**: {{ .Labels.instance }}

**Description**: {{ index .Annotations "description" }}

**Started at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}

{{ define "__resolved_list" }}{{ range . }}
---

{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}

**Alert summary**: {{ .Annotations.summary }}

**Alert name**: {{ .Labels.alertname }}

**Severity**: {{ .Labels.severity }}

**Instance**: {{ .Labels.instance }}

**Description**: {{ index .Annotations "description" }}

**Started at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}

**Resolved at**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}


{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}

{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**==== {{ .Alerts.Firing | len }} alert(s) firing ====**

{{ template "__alert_list" .Alerts.Firing }}
---

{{ end }}

{{ if gt (len .Alerts.Resolved) 0 }}
**==== {{ .Alerts.Resolved | len }} alert(s) resolved ====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}


{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
Integrate the DingTalk bot
vim /usr/local/prometheus/dingtalk/config.yml
## Request timeout
# timeout: 5s

## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true

## Customizable templates path
templates:
  - /usr/local/prometheus/dingtalk/default.tmpl

## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
#  title: '{{ template "legacy.title" . }}'
#  text: '{{ template "legacy.content" . }}'

## Targets, previously was known as "profiles"
targets:
  webhook1:
    # Webhook URL of the bot (includes the access token)
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxxxxxx
    # Signing secret
    secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Start dingtalk-webhook with this configuration (it listens on port 8060 by default, which is the port alertmanager is pointed at)

/usr/local/prometheus/dingtalk/prometheus-webhook-dingtalk --config.file=/usr/local/prometheus/dingtalk/config.yml

Edit the alertmanager configuration file

vim /usr/local/prometheus/alertmanager/alertmanager.yml
route:
  group_by: ['dingtalk']  # alerts without a 'dingtalk' label all fall into a single group
  group_wait: 1s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'dingtalk.webhook1'
  routes:
  - receiver: "dingtalk.webhook1"
    match_re:
      alertname: ".*"
receivers:
  - name: 'dingtalk.webhook1'
    webhook_configs:
      - url: 'http://localhost:8060/dingtalk/webhook1/send'
        send_resolved: true
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
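
The inhibit rule above can be read as: a firing `critical` alert mutes any `warning` alert that has the same `alertname`, `dev` and `instance` values. The following is a simplified sketch of that matching logic, not Alertmanager's actual implementation; the function name `is_inhibited` is my own, and it assumes (as Alertmanager does) that a missing label compares equal to an empty one.

```python
def is_inhibited(target, sources, source_match, target_match, equal):
    """Return True if the `target` alert's labels are muted by any firing source alert.

    `target` and each item of `sources` are plain dicts of label name -> value.
    """
    def matches(labels, matcher):
        # Every label in the matcher must have exactly the given value.
        return all(labels.get(k, "") == v for k, v in matcher.items())

    if not matches(target, target_match):
        return False
    for source in sources:
        if not matches(source, source_match):
            continue
        # Labels listed in `equal` must agree; a missing label counts as "".
        if all(source.get(k, "") == target.get(k, "") for k in equal):
            return True
    return False
```

Note that because neither alert in a typical node_exporter setup carries a `dev` label, that entry compares empty to empty and does not prevent the rule from applying.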

Restart alertmanager

systemctl restart alertmanager

Verify

Open the alertmanager status page at http://ip:9093/#/status and confirm that the new configuration has been loaded.

Test

Pick a node or service to take down; here I stop the node_exporter service on the current node

systemctl stop node_exporter

This triggers the following alerting rule

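  - alert: 服务器宕机  # alert name, "server down"
    expr: up == 0
    for: 1s
    labels:
      severity: 严重告警  # "critical alert"
    annotations:
      summary: "{{$labels.instance}} is down, please handle it as soon as possible!"
      description: "node_exporter on {{$labels.instance}} has been stopped; current value: {{ $value }}."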

After a short wait, the alert notification arrives in the DingTalk group.
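
If stopping a real exporter is inconvenient, the notification path can probably also be exercised by pushing a hand-made alert to alertmanager's v2 API (`POST /api/v2/alerts`). The sketch below is only an illustration under assumptions: the alert name `ManualTest`, the label values and the helper names are made up, and the URL assumes alertmanager on localhost:9093. The labels and annotations are chosen to fill the fields that the message template reads.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(instance: str) -> list:
    """Build a one-element alert list in the shape the v2 API expects."""
    now = datetime.now(timezone.utc).isoformat()
    return [{
        "labels": {
            "alertname": "ManualTest",   # hypothetical name, not from any rule file
            "instance": instance,
            "severity": "warning",
        },
        "annotations": {
            "summary": f"{instance} manual test alert",
            "description": "Sent by hand to check the DingTalk notification path.",
        },
        "startsAt": now,
    }]


def post_alerts(alertmanager_url: str, alerts: list) -> int:
    """POST the alerts to alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        alertmanager_url.rstrip("/") + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


# Example call (requires a running alertmanager):
# post_alerts("http://localhost:9093", build_test_alert("test-node:9100"))
```

Since no `endsAt` is set, alertmanager should treat the alert as resolved once it stops being re-sent and its resolve timeout passes, at which point the "resolved" message is delivered as well.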