Monitoring Websites with blackbox_exporter and Sending DingTalk Alerts

2023-10-17 23:59


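1. Introduction

Blackbox Exporter is the official black-box monitoring solution from the Prometheus community. It probes endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.
Alert rules are defined in Prometheus, and Alertmanager sends the resulting alerts.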

2. Install blackbox_exporter on the web server 10.0.0.6

# Download the release tarball and unpack it
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.23.0/blackbox_exporter-0.23.0.linux-amd64.tar.gz
tar zxvf blackbox_exporter-0.23.0.linux-amd64.tar.gz -C /usr/local/
# Link the unpacked directory (not the tarball) to a version-independent path
ln -sv /usr/local/blackbox_exporter-0.23.0.linux-amd64 /usr/local/blackbox_exporter
Create the service file and enable the service at boot
vim /usr/lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target

[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target

systemctl enable --now blackbox_exporter.service
Edit the configuration file /usr/local/blackbox_exporter/blackbox.yml
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_http_versions:
        - "HTTP/1.1"
        - "HTTP/2.0"
      valid_status_codes: []  # Defaults to 2xx
      enable_http2: false
      method: GET
      no_follow_redirects: false
      # fail_if_ssl: when true, the probe fails if the site uses SSL and succeeds otherwise.
      # fail_if_not_ssl is the exact opposite.
      fail_if_ssl: false
      fail_if_not_ssl: false
      # fail_if_body_matches_regexp, fail_if_body_not_matches_regexp,
      # fail_if_header_matches, fail_if_header_not_matches:
      # each takes a list of regular expressions used to check whether the
      # HTTP response does or does not match.
      fail_if_body_matches_regexp:
        - "Could not connect to database"
      tls_config:
        insecure_skip_verify: false
      preferred_ip_protocol: "ip4" # defaults to "ip6"
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  grpc:
    prober: grpc
    grpc:
      tls: true
      preferred_ip_protocol: "ip4"
  grpc_plain:
    prober: grpc
    grpc:
      tls: false
      service: "service1"
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^SSH-2.0-"
        - send: "SSH-2.0-blackbox-ssh-check"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
  icmp_ttl5:
    prober: icmp
    timeout: 5s
    icmp:
      ttl: 5
Edit the Prometheus configuration file
vim /usr/local/prometheus/prometheus.yml
# Add the following job under scrape_configs
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]  # Look for a HTTP 200 response.
    static_configs:
      - targets:
        - 10.0.0.6
        - www.google.com
        - www.baidu.com
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: "10.0.0.6:9115"  # Blackbox exporter.
      - target_label: region
        replacement: "remote"
Restart the services.
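The effect of the relabeling is easier to see by hand: each listed target becomes the `target` query parameter, while the scrape itself goes to the exporter address. The sketch below only builds that URL; the helper name `probe_url` is made up for illustration, and the address, module, and target are the values used in this article.

```shell
#!/bin/sh
# Build the URL that Prometheus requests for one target of the 'blackbox' job.
# probe_url is a hypothetical helper, not part of blackbox_exporter.
# $1 = exporter address, $2 = module name, $3 = probe target
probe_url() {
    printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

probe_url 10.0.0.6:9115 http_2xx www.baidu.com
```

Assuming the exporter is reachable, `curl -s "$(probe_url 10.0.0.6:9115 http_2xx www.baidu.com)" | grep probe_success` should show whether that single probe succeeded, which helps separate exporter problems from Prometheus configuration problems.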


3. Create alert rules

Edit the rule file
# Update the Prometheus configuration to enable the alert rules
vim /usr/local/prometheus/prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - 10.0.0.5:9093
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - rules/alert-rules-*.yml
  # - "first_rules.yml"
  # - "second_rules.yml"

vim /usr/local/prometheus/rules/alert-rules-blackbox-exporter.yml
groups:
- name: blackbox
  rules:
    # Blackbox probe failed
    - alert: BlackboxProbeFailed
      expr: probe_success == 0
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Blackbox probe failed (instance {{ $labels.instance }})
        description: "Probe failed\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    # Blackbox slow probe
    - alert: BlackboxSlowProbe
      expr: avg_over_time(probe_duration_seconds[1m]) > 1
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: Blackbox slow probe (instance {{ $labels.instance }})
        description: "Blackbox probe took more than 1s to complete\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    # Blackbox probe HTTP failure
    - alert: BlackboxProbeHttpFailure
      expr: probe_http_status_code <= 199 OR probe_http_status_code >= 400
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Blackbox probe HTTP failure (instance {{ $labels.instance }})
        description: "HTTP status code is not 200-399\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    # Blackbox probe slow HTTP
    - alert: BlackboxProbeSlowHttp
      expr: avg_over_time(probe_http_duration_seconds[1m]) > 1
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: Blackbox probe slow HTTP (instance {{ $labels.instance }})
        description: "HTTP request took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    # Blackbox probe slow ping
    - alert: BlackboxProbeSlowPing
      expr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: Blackbox probe slow ping (instance {{ $labels.instance }})
        description: "Blackbox ping took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
Restart the services and verify.
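To make the BlackboxProbeHttpFailure threshold concrete, here is a small sketch that applies the same comparison (`<= 199` or `>= 400`) to a few sample status codes. The function name `is_http_failure` and the sample codes are illustrative only; Prometheus evaluates the real expression itself.

```shell
#!/bin/sh
# Succeed (exit status 0) when a status code falls in the range that the
# BlackboxProbeHttpFailure expression matches: <= 199 or >= 400.
# is_http_failure is a hypothetical helper used only for this illustration.
is_http_failure() {
    [ "$1" -le 199 ] || [ "$1" -ge 400 ]
}

for code in 0 200 301 404 503; do
    if is_http_failure "$code"; then
        echo "$code fires"
    else
        echo "$code ok"
    fi
done
```

Before restarting, the rule file itself can be checked with `promtool check rules /usr/local/prometheus/rules/alert-rules-blackbox-exporter.yml`, assuming `promtool` from the Prometheus release is on the path.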


4. Install Alertmanager

### Install Alertmanager from the binary release

wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar -zxvf alertmanager-0.25.0.linux-amd64.tar.gz -C /usr/local/
# Link the unpacked directory (not the tarball) to a version-independent path
ln -sv /usr/local/alertmanager-0.25.0.linux-amd64 /usr/local/alertmanager
### Create the service file (for example /usr/lib/systemd/system/alertmanager.service) and enable the service at boot
[Unit]
Description=alertmanager
After=network.target

[Service]
User=root
Type=simple
ExecStart=/usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target

systemctl enable --now alertmanager.service

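Download the DingTalk webhook plugin

wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz
tar -zxvf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local
# Link the unpacked directory (not the tarball) to a version-independent path
ln -sv /usr/local/prometheus-webhook-dingtalk-2.1.0.linux-amd64 /usr/local/dingtalk
In DingTalk, add a custom robot to the group and copy its webhook URL.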


Edit the prometheus-webhook-dingtalk configuration file
vim config.example.yml
## Request timeout
# timeout: 5s

## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true

## Customizable templates path
templates:
  - ./templates/*.tmpl # point this at the template you create

## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
#  title: '{{ template "legacy.title" . }}'
#  text: '{{ template "legacy.content" . }}'

## Targets, previously was known as "profiles"
targets:
  webhook1:
    # Webhook URL of the DingTalk robot
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    # secret for signature: the value shown after enabling signing on the robot
    secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#  webhook2:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#  webhook_legacy:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    # Customize template content
#    message:
#      # Use legacy template
#      title: '{{ template "legacy.title" . }}'
#      text: '{{ template "legacy.content" . }}'
#  webhook_mention_all:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    mention:
#      all: true
#  webhook_mention_users:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    mention:
#      mobiles: ['156xxxx8827', '189xxxx8325']
Edit the alert template file

mkdir templates
vim templates/template.tmpl

{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}

{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}

**Alert name**: {{ index .Annotations "summary" }}

**Severity**: {{ .Labels.severity }}

**Host**: {{ .Labels.instance }}

**Details**: {{ index .Annotations "description" }}

**Started at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}

{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}

**Alert name**: {{ index .Annotations "summary" }}

**Severity**: {{ .Labels.severity }}

**Host**: {{ .Labels.instance }}

**Details**: {{ index .Annotations "description" }}

**Started at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}

**Resolved at**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}

{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}

{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**==== {{ .Alerts.Firing | len }} alert(s) firing ====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}

{{ if gt (len .Alerts.Resolved) 0 }}
**==== {{ .Alerts.Resolved | len }} alert(s) resolved ====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}

{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
Configure Alertmanager
vim /usr/local/alertmanager/alertmanager.yml
global:
  resolve_timeout: 1m
  smtp_smarthost: 'smtp.163.com:465'
  smtp_from: '17358273990@163.com'
  smtp_auth_username: '17358273990@163.com'
  smtp_auth_password: 'xxxxxxxxxxxxxxxx'
  smtp_hello: '@163.com'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 10s
  repeat_interval: 1m
  #  receiver: 'email'
  receiver: 'dingding.webhook1'
receivers:
  - name: 'email'
    email_configs:
      - to: '1305783815@qq.com'
        send_resolved: true
  - name: 'dingding.webhook1'
    webhook_configs:
      # webhook1 in the URL must match the name defined under targets in the DingTalk plugin configuration
      - url: 'http://10.0.0.5:8060/dingtalk/webhook1/send'
        send_resolved: true
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
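The receiver URL and the plugin's `targets` section have to agree on the target name. As a minimal sketch, the helper below (`dingtalk_webhook_url`, a name invented here) assembles the receiver URL from the plugin address and the target name, following the pattern used in the configuration above.

```shell
#!/bin/sh
# Build the Alertmanager webhook URL for one prometheus-webhook-dingtalk target.
# dingtalk_webhook_url is a hypothetical helper for illustration.
# $1 = host:port where prometheus-webhook-dingtalk listens
# $2 = target name as written under `targets:` in its configuration
dingtalk_webhook_url() {
    printf 'http://%s/dingtalk/%s/send\n' "$1" "$2"
}

dingtalk_webhook_url 10.0.0.5:8060 webhook1
```

If a second target such as `webhook2` were enabled in the plugin configuration, its receiver URL would follow the same pattern with `webhook2` in place of `webhook1`.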
Start prometheus-webhook-dingtalk
[root@Rocky8 webhook]# ./prometheus-webhook-dingtalk --config.file=config.example.yml
Test the alert

# Restart the services and test the alert
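Instead of waiting for a real outage, a hand-made alert can be pushed to Alertmanager to exercise the DingTalk route. The sketch below only prints a JSON body for Alertmanager's `/api/v2/alerts` endpoint; the function name `test_alert_json`, the alert name `ManualTest`, and the annotation texts are assumptions made for this example.

```shell
#!/bin/sh
# Print a JSON body describing one test alert for POST /api/v2/alerts.
# test_alert_json is a hypothetical helper; the annotation texts are placeholders.
# $1 = alert name, $2 = instance label, $3 = severity label
test_alert_json() {
    printf '[{"labels":{"alertname":"%s","instance":"%s","severity":"%s"},"annotations":{"summary":"manual test alert","description":"sent by hand to test DingTalk delivery"}}]\n' "$1" "$2" "$3"
}

test_alert_json ManualTest www.baidu.com warning
```

With the Alertmanager address from this article, the body could be sent with `test_alert_json ManualTest www.baidu.com warning | curl -s -XPOST -H 'Content-Type: application/json' -d @- http://10.0.0.5:9093/api/v2/alerts`. A message should reach the DingTalk group after the configured `group_wait`.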

5. Grafana

[Screenshot: Grafana dashboard]




http://www.chinasem.cn/article/228822
