blackbox_exporter监控web 并实现钉钉报警

2023-10-17 23:59

本文主要是介绍blackbox_exporter监控web 并实现钉钉报警,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

1、介绍

Blackbox Exporter是Prometheus社区提供的官方黑盒监控解决方案,其允许用户通过:HTTP、HTTPS、DNS、TCP以及ICMP的方式对网络进行探测。
在prometheus中创建相关的alert rule,再由alert manager发送告警

2、在web服务器10.0.0.6安装blackbox_exporter

# 下载源码包并解压
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.23.0/blackbox_exporter-0.23.0.linux-amd64.tar.gz
tar zxvf  blackbox_exporter-0.23.0.linux-amd64.tar.gz  -C /usr/local/
ln -sv   blackbox_exporter-0.23.0.linux-amd64.tar.gz   blackbox_exporter
创建service文件设置开机启动
vim /usr/lib/systemd/system/black_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
Restart=on-failure[Install]
WantedBy=multi-user.targetsystemctl enable --now blackbox_exporter.service
编辑配置文件blackbox.yml
modules:http_2xx:prober: httptimeout: 5shttp:valid_http_versions:- "HTTP/1.1"- "HTTP/2"valid_status_codes: []  # Defaults to 2xxenable_http2: falsemethod: GETno_follow_redirects: false# fail_if_ssl为true时,表示如果站点启用了SSL则探针失败,反之成功; # fail_if_not_ssl刚好相反;fail_if_ssl: falsefail_if_not_ssl: false#  fail_if_body_matches_regexp, fail_if_body_not_matches_regexp, fail_if_header_matches, fail_if_header_not_matches#  可以定义一组正则表达式,用于验证HTTP返回内容是否符合或者不符合正则表达式的内容fail_if_body_matches_regexp:- "Could not connect to database"tls_config:insecure_skip_verify: falsepreferred_ip_protocol: "ip4" # defaults to "ip6"http_post_2xx:prober: httphttp:method: POSTtcp_connect:prober: tcppop3s_banner:prober: tcptcp:query_response:- expect: "^+OK"tls: truetls_config:insecure_skip_verify: falsegrpc:prober: grpcgrpc:tls: truepreferred_ip_protocol: "ip4"grpc_plain:prober: grpcgrpc:tls: falseservice: "service1"ssh_banner:prober: tcptcp:query_response:- expect: "^SSH-2.0-"- send: "SSH-2.0-blackbox-ssh-check"irc_banner:prober: tcptcp:query_response:- send: "NICK prober"- send: "USER prober prober prober :prober"- expect: "PING :([^ ]+)"send: "PONG ${1}"- expect: "^:[^ ]+ 001"icmp:prober: icmpicmp_ttl5:prober: icmptimeout: 5sicmp:ttl: 5
编辑prometheus配置文件
vim /usr/local/prometheus/prometheus.yml
#添加- job_name: 'blackbox'metrics_path: /probeparams:module: [http_2xx]  # Look for a HTTP 200 response.static_configs:- targets:- 10.0.0.6- www.google.com- www.baidu.comrelabel_configs:- source_labels: [__address__]target_label: __param_target- source_labels: [__param_target]target_label: instance- target_label: __address__replacement: "10.0.0.6:9115"  # Blackbox exporter.- target_label: regionreplacement: "remote"
重启服务

在这里插入图片描述
在这里插入图片描述

3 创建告警规则

编辑rule文件
#修改prometheus配置启用告警规则
vim /usr/local/prometheus/prometheus.yml
global:scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.# scrape_timeout is set to the global default (10s).# Alertmanager configuration
alerting:alertmanagers:- static_configs:- targets:- 10.0.0.5:9093# - alertmanager:9093# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:- rules/alert-rules-*.yml# - "first_rules.yml"# - "second_rules.yml"vim /usr/local/prometheus/rules/alert-rules-blackbox-exporter.ymlgroups:
- name: blackboxrules:# Blackbox probe failed- alert: BlackboxProbeFailedexpr: probe_success == 0for: 0mlabels:severity: criticalannotations:summary: Blackbox probe failed (instance {{ $labels.instance }})description: "Probe failed\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"# Blackbox slow probe- alert: BlackboxSlowProbeexpr: avg_over_time(probe_duration_seconds[1m]) > 1for: 1mlabels:severity: warningannotations:summary: Blackbox slow probe (instance {{ $labels.instance }})description: "Blackbox probe took more than 1s to complete\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"# Blackbox probe HTTP failure- alert: BlackboxProbeHttpFailureexpr: probe_http_status_code <= 199 OR probe_http_status_code >= 400for: 0mlabels:severity: criticalannotations:summary: Blackbox probe HTTP failure (instance {{ $labels.instance }})description: "HTTP status code is not 200-399\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"# Blackbox probe slow HTTP- alert: BlackboxProbeSlowHttpexpr: avg_over_time(probe_http_duration_seconds[1m]) > 1for: 1mlabels:severity: warningannotations:summary: Blackbox probe slow HTTP (instance {{ $labels.instance }})description: "HTTP request took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"# Blackbox probe slow ping- alert: BlackboxProbeSlowPingexpr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1for: 1mlabels:severity: warningannotations:summary: Blackbox probe slow ping (instance {{ $labels.instance }})description: "Blackbox ping took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
重启服务验证

在这里插入图片描述
在这里插入图片描述

4、安装alert manager

###二进制安装alert manager

wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar -zxvf alertmanager-0.25.0.linux-amd64.tar.gz -C /usr/local/
ln -sv alertmanager-0.25.0.linux-amd64.tar.gz   alertmanager 
###编辑service文件设置开机启动
[Unit]
Description=alertmanager
After=network.target[Service]
User=root
Type=simple
ExecStart=/usr/local/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml
Restart=on-failure[Install]
WantedBy=multi-user.targetsystemctl enable --now alertmanager.service

下载钉钉插件

wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz
tar -zxvf  prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz  -C /usr/local
ln -sv  prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz    dingtalk
钉钉设置webhook获取地址

在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

编辑prometheus-webhook-dingtalk的配置文件
vim config.example.yml
## Request timeout
# timeout: 5s## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true## Customizable templates path
templates:- ./templates/*.tmpl # 这里指向你生成的模板## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
#  title: '{{ template "legacy.title" . }}'
#  text: '{{ template "legacy.content" . }}'## Targets, previously was known as "profiles"
targets:webhook1:# 钉钉机器人的webhookurl: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx# secret for signature 加签后得到的值secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#  webhook2:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#  webhook_legacy:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    # Customize template content
#    message:
#      # Use legacy template
#      title: '{{ template "legacy.title" . }}'
#      text: '{{ template "legacy.content" . }}'
#  webhook_mention_all:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    mention:
#      all: true
#  webhook_mention_users:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    mention:
#      mobiles: ['156xxxx8827', '189xxxx8325']
编辑告警模板文件

mkdir templates
vim templates/template.tmpl

{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}**告警名称**: {{ index .Annotations "title" }} **告警级别**: {{ .Labels.severity }} **告警主机**: {{ .Labels.instance }} **告警信息**: {{ index .Annotations "description" }}**告警时间**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}**告警名称**: {{ index .Annotations "title" }}**告警级别**: {{ .Labels.severity }}**告警主机**: {{ .Labels.instance }}**告警信息**: {{ index .Annotations "description" }}**告警时间**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}**恢复时间**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**====侦测到{{ .Alerts.Firing | len  }}个故障====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}{{ if gt (len .Alerts.Resolved) 0 }}
**====恢复{{ .Alerts.Resolved | len  }}个故障====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
配置prometheus的alert manager
vim /usr/local/alertmanager/alertmanager.yml
global:resolve_timeout: 1msmtp_smarthost: 'smtp.163.com:465'smtp_from: '17358273990@163.com'smtp_auth_username: '17358273990@163.com'smtp_auth_password: 'LAJLGFXHNVWPSOAG'smtp_hello: '@163.com'smtp_require_tls: false
route:group_by: ['alertname']group_wait: 30sgroup_interval: 10srepeat_interval: 1m#  receiver: 'email'receiver: 'dingding.webhook1'
receivers:- name: 'email'email_configs:- to: '1305783815@qq.com'send_resolved: true- name: 'dingding.webhook1'webhook_configs:- url: 'http://10.0.0.5:8060/dingtalk/webhook1/send' #这里的webhook1,根据我们在钉钉告警插件配置文件>中targets中指定的值做修改send_resolved: true
inhibit_rules:- source_match:severity: 'critical'target_match:severity: 'warning'equal: ['alertname', 'dev', 'instance']
启动prometheus-webhook-dingtalk
[root@Rocky8 webhook]#./prometheus-webhook-dingtalk --config.file=config.example.yml
测试告警

#重启服务测试告警
在这里插入图片描述

5、grafana

在这里插入图片描述

这篇关于blackbox_exporter监控web 并实现钉钉报警的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/228822

相关文章

SpringBoot集成redisson实现延时队列教程

《SpringBoot集成redisson实现延时队列教程》文章介绍了使用Redisson实现延迟队列的完整步骤,包括依赖导入、Redis配置、工具类封装、业务枚举定义、执行器实现、Bean创建、消费... 目录1、先给项目导入Redisson依赖2、配置redis3、创建 RedissonConfig 配

Python的Darts库实现时间序列预测

《Python的Darts库实现时间序列预测》Darts一个集统计、机器学习与深度学习模型于一体的Python时间序列预测库,本文主要介绍了Python的Darts库实现时间序列预测,感兴趣的可以了解... 目录目录一、什么是 Darts?二、安装与基本配置安装 Darts导入基础模块三、时间序列数据结构与

Python使用FastAPI实现大文件分片上传与断点续传功能

《Python使用FastAPI实现大文件分片上传与断点续传功能》大文件直传常遇到超时、网络抖动失败、失败后只能重传的问题,分片上传+断点续传可以把大文件拆成若干小块逐个上传,并在中断后从已完成分片继... 目录一、接口设计二、服务端实现(FastAPI)2.1 运行环境2.2 目录结构建议2.3 serv

C#实现千万数据秒级导入的代码

《C#实现千万数据秒级导入的代码》在实际开发中excel导入很常见,现代社会中很容易遇到大数据处理业务,所以本文我就给大家分享一下千万数据秒级导入怎么实现,文中有详细的代码示例供大家参考,需要的朋友可... 目录前言一、数据存储二、处理逻辑优化前代码处理逻辑优化后的代码总结前言在实际开发中excel导入很

SpringBoot+RustFS 实现文件切片极速上传的实例代码

《SpringBoot+RustFS实现文件切片极速上传的实例代码》本文介绍利用SpringBoot和RustFS构建高性能文件切片上传系统,实现大文件秒传、断点续传和分片上传等功能,具有一定的参考... 目录一、为什么选择 RustFS + SpringBoot?二、环境准备与部署2.1 安装 RustF

Nginx部署HTTP/3的实现步骤

《Nginx部署HTTP/3的实现步骤》本文介绍了在Nginx中部署HTTP/3的详细步骤,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学... 目录前提条件第一步:安装必要的依赖库第二步:获取并构建 BoringSSL第三步:获取 Nginx

MyBatis Plus实现时间字段自动填充的完整方案

《MyBatisPlus实现时间字段自动填充的完整方案》在日常开发中,我们经常需要记录数据的创建时间和更新时间,传统的做法是在每次插入或更新操作时手动设置这些时间字段,这种方式不仅繁琐,还容易遗漏,... 目录前言解决目标技术栈实现步骤1. 实体类注解配置2. 创建元数据处理器3. 服务层代码优化填充机制详

Python实现Excel批量样式修改器(附完整代码)

《Python实现Excel批量样式修改器(附完整代码)》这篇文章主要为大家详细介绍了如何使用Python实现一个Excel批量样式修改器,文中的示例代码讲解详细,感兴趣的小伙伴可以跟随小编一起学习一... 目录前言功能特性核心功能界面特性系统要求安装说明使用指南基本操作流程高级功能技术实现核心技术栈关键函

Java实现字节字符转bcd编码

《Java实现字节字符转bcd编码》BCD是一种将十进制数字编码为二进制的表示方式,常用于数字显示和存储,本文将介绍如何在Java中实现字节字符转BCD码的过程,需要的小伙伴可以了解下... 目录前言BCD码是什么Java实现字节转bcd编码方法补充总结前言BCD码(Binary-Coded Decima

SpringBoot全局域名替换的实现

《SpringBoot全局域名替换的实现》本文主要介绍了SpringBoot全局域名替换的实现,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一... 目录 项目结构⚙️ 配置文件application.yml️ 配置类AppProperties.Ja