Introduction to Prometheus:
(Omitted.)
Deploying the Prometheus server:
1. Download Prometheus:

       root@dokcer:~# wget https://github.com/prometheus/prometheus/releases/download/v2.37.0/prometheus-2.37.0.linux-amd64.tar.gz

2. Extract Prometheus to /usr/local/:

       root@dokcer:~# tar -xzvf prometheus-2.37.0.linux-amd64.tar.gz -C /usr/local/

3. Create a symbolic link:

       root@dokcer:~# ln -sv /usr/local/prometheus-2.37.0.linux-amd64/ /usr/local/prometheus

4. Start Prometheus with the default configuration file. The trailing & runs the command in the background:

       root@dokcer:~# /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml &

5. If startup fails because the port is taken, check which process holds port 9090. In the output below, port 9090 is open and held by prometheus:

       root@dokcer:~# netstat -tulp | grep 9090
       tcp6       0      0 [::]:9090      [::]:*      LISTEN      27047/prometheus
       root@dokcer:~# lsof -i:9090
       COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
       prometheu 27047 root    3u  IPv6  58459      0t0  TCP dokcer:9090->192.168.0.5:60420 (ESTABLISHED)
       prometheu 27047 root    7u  IPv6  58455      0t0  TCP *:9090 (LISTEN)
       root@dokcer:~# ss -natlp | grep 9090
       LISTEN 0 128 *:9090 *:* users:(("prometheus",pid=27047,fd=7))

Prometheus is a time-series database and stores the data it collects as files on local disk. The default storage path is data/, relative to the directory Prometheus was started from (~/data here, because the server was started from the home directory). Use --storage.tsdb.path="path" to change the location of the data files.
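The manual port check in step 5 can be wrapped in a small helper for use in scripts. What follows is only a minimal sketch, assuming `ss -ntlp`-style output on standard input; `port_listening` is a hypothetical name chosen for illustration and is not part of Prometheus or iproute2.

```shell
# port_listening PORT
# Reads `ss -ntl`-style output on stdin and succeeds (exit status 0) if any
# LISTEN socket is bound to PORT. Hypothetical helper, for illustration only.
port_listening() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      # Column 4 is the local address, e.g. *:9090 or [::]:9090;
      # the port is whatever follows the last colon.
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}
```

A possible use before starting the server: `ss -ntlp | port_listening 9090 && echo "port 9090 is already in use"`.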
Viewing the Prometheus UI in a browser:
Open http://IP:9090
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F15%2Fs2RnVP4broBUiDN.png?table=block&id=208c365f-efb1-484c-9258-7b22f152a2cc&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fb4079e60-97cc-4442-93b1-312bd5edc8af%2F2022-07-15_214319.png?table=block&id=7dd093b3-4908-4bf7-8157-6a9975358e80&cache=v2)
Viewing the monitored machines:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F15%2FALwK5xiaXtoEqkd.png?table=block&id=2c9604cd-853f-4e89-856f-313ed1fd8f1c&cache=v2)
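At this point only the local machine is monitored: the only target is the Prometheus server itself. Open http://IP:9090/metrics to see the metrics it exposes.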
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F211b2e18-f8aa-4c7d-9bef-d61fa8953ee7%2F2022-07-15_214528.png?table=block&id=93408410-53a9-4d44-ad57-043904589561&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F15%2FfYp3tSKI1yaBlig.png?table=block&id=5984c446-11df-4b56-9b1a-530f17280a1f&cache=v2)
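The /metrics page is plain text: comment lines start with `#`, and every other line is a metric name, an optional label set in braces, and a value. As a rough sketch of how to pull one metric out of that text, the hypothetical helper below (`metric_samples` is an illustrative name, not a Prometheus tool) filters standard input by metric name.

```shell
# metric_samples NAME
# Reads Prometheus text-format metrics on stdin and prints only the sample
# lines whose metric name is exactly NAME. Hypothetical helper.
metric_samples() {
  awk -v name="$1" '
    /^#/    { next }                   # skip HELP/TYPE comment lines
    NF == 0 { next }                   # skip blank lines
    {
      metric = $1
      brace = index(metric, "{")       # labels, if any, start at the first "{"
      if (brace > 0) metric = substr(metric, 1, brace - 1)
      if (metric == name) print
    }
  '
}
```

Assuming curl is installed, it could be used as `curl -s http://localhost:9090/metrics | metric_samples go_goroutines`.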
Enter an expression to filter the data; the result can also be displayed as a graph.
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F15%2FgykuVlabMUWSitm.png?table=block&id=9cca7449-b4d1-4c7f-b855-44688d3cad72&cache=v2)
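Expressions typed into the web UI can also be evaluated from the command line through the Prometheus HTTP API endpoint `/api/v1/query`. The sketch below assumes curl is installed; `prom_query` is a hypothetical wrapper, and the `CURL` variable is an illustrative hook for substituting another client, not something Prometheus defines.

```shell
# prom_query BASE_URL EXPR
# Sends an instant query to the Prometheus HTTP API and prints the JSON reply.
# Hypothetical helper; set CURL to substitute another client (e.g. for dry runs).
prom_query() {
  # -G turns the url-encoded data into a query string on a GET request.
  "${CURL:-curl}" -sS -G "$1/api/v1/query" --data-urlencode "query=$2"
}
```

For example, `prom_query http://localhost:9090 'up'` asks which scrape targets are currently reachable. Letting curl do the URL encoding avoids quoting problems with expressions that contain braces or spaces.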
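Deploying the agent on the monitored machine:
A remote host that is to be monitored only needs the node_exporter component installed. Its metrics (CPU, memory, network) are then available at http://<monitored-host-IP>:9100/metrics.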
1. Download node_exporter:

       root@dokcer:~# wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz

2. Extract it to /usr/local/:

       root@dokcer:~# tar -xzvf node_exporter-1.3.1.linux-amd64.tar.gz -C /usr/local/

3. Create a symbolic link:

       root@dokcer:~# ln -sv /usr/local/node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/node_exporter
       '/usr/local/bin/node_exporter' -> '/usr/local/node_exporter-1.3.1.linux-amd64/node_exporter'

4. Run node_exporter in the background with nohup:

       root@dokcer:~# nohup node_exporter &
       [1] 27954
       root@dokcer:~# jobs -l      # list the background jobs of the current shell
       [1]+ 27954 Running                 nohup node_exporter &
       root@dokcer:~# nohup: ignoring input and appending output to 'nohup.out'

   The output of the background command is appended to nohup.out. Tools such as supervisord or screen can be used instead of nohup.

5. Check the port status. The output below shows the service started successfully:

       root@dokcer:~# netstat -tulp | grep 9100
       tcp6       0      0 [::]:9100      [::]:*      LISTEN      27954/node_exporter
       root@dokcer:~# lsof -i:9100
       COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
       node_expo 27954 root    3u  IPv6  63234      0t0  TCP *:9100 (LISTEN)
       root@dokcer:~# ss -natlp | grep 9100
       LISTEN 0 128 *:9100 *:* users:(("node_exporter",pid=27954,fd=3))

6. Add the monitored host to the Prometheus server configuration. Edit /usr/local/prometheus/prometheus.yml so that scrape_configs reads:

       scrape_configs:
         # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
         - job_name: "prometheus"
           # metrics_path defaults to '/metrics'
           # scheme defaults to 'http'.
           static_configs:
             - targets: ["localhost:9090"]
         - job_name: "agent"
           static_configs:
             - targets: ["192.168.0.50:9100"]

7. Restart the Prometheus server:

       root@dokcer:~# pkill prometheus     # stop the Prometheus server (or kill its PID)
       root@dokcer:~# lsof -i:9090         # confirm the port has been released
       root@dokcer:~# /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml &
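When several hosts run node_exporter, the job stanza from step 6 simply lists more targets. As an illustration only, the hypothetical function below (`scrape_job` is not a Prometheus tool, and 192.168.0.51 is a made-up second host) prints such a stanza for any number of targets, indented to sit under `scrape_configs:`.

```shell
# scrape_job JOB_NAME TARGET...
# Prints a static scrape job stanza for prometheus.yml. Hypothetical helper.
scrape_job() {
  job=$1
  shift
  targets=""
  for t in "$@"; do
    # Quote each target and join the list with ", ".
    targets="${targets:+$targets, }\"$t\""
  done
  printf '  - job_name: "%s"\n' "$job"
  printf '    static_configs:\n'
  printf '      - targets: [%s]\n' "$targets"
}
```

Running `scrape_job agent 192.168.0.50:9100 192.168.0.51:9100` prints a stanza whose last line is `- targets: ["192.168.0.50:9100", "192.168.0.51:9100"]`. After pasting the output into the configuration file, `/usr/local/prometheus/promtool check config /usr/local/prometheus/prometheus.yml` can be used to validate it before restarting the server.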
Open http://IP:9090 → Status → Targets to see the newly added monitored host:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FKasNSdBYZPW7uXe.png?table=block&id=be0050ee-01e0-4257-af6b-d8c03155b84d&cache=v2)
Deploying Grafana as the graphical front end for Prometheus
Grafana provides a web interface that visualizes the metrics of the machines Prometheus monitors:
1. Add the apt repository and install Grafana:

       root@dokcer:~# wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
       OK
       root@dokcer:~# echo "deb https://packages.grafana.com/enterprise/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
       root@dokcer:~# sudo apt-get update
       root@dokcer:~# sudo apt-get install grafana-enterprise
       root@dokcer:~# dpkg -l gra*      # check that the installation succeeded
       Desired=Unknown/Install/Remove/Purge/Hold
       | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
       |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
       ||/ Name                Version  Architecture  Description
       +++-===================-========-=============-===========================
       un  grafana             <none>   <none>        (no description available)
       ii  grafana-enterprise  9.0.3    amd64         Grafana

2. Start the Grafana server:

       root@dokcer:~# systemctl start grafana-server
       root@dokcer:~# lsof -i:3000
       COMMAND     PID USER    FD   TYPE DEVICE SIZE/OFF NODE NAME
       grafana-s 29257 grafana  8u  IPv6  68427      0t0  TCP *:3000 (LISTEN)
       root@dokcer:~# sudo systemctl status grafana-server       # check that the Grafana server is running
       root@dokcer:~# systemctl enable grafana-server.service    # start on boot
       Synchronizing state of grafana-server.service with SysV service script with /lib/systemd/systemd-sysv-install.
       Executing: /lib/systemd/systemd-sysv-install enable grafana-server
       Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.
       root@dokcer:~# systemctl is-enabled grafana-server        # the service is now enabled at boot
       enabled
Open http://IP:3000 in a browser to see the Grafana server interface:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FCxzgiUhkXdwOHfa.png?table=block&id=720e0522-102a-44b0-8a5c-b1193278bbb9&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FhqtWoxauVImelsK.png?table=block&id=85bf2e61-ec84-4afe-8451-bfd3eca8be99&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FbDYpKXfBk6Qjvcw.png?table=block&id=74aee1eb-70c2-450e-841a-68b153c678cf&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FSrhYvlXpOFxmHj6.png?table=block&id=6b091f42-5ed2-485f-9c5a-446b2ffb9d0f&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2F1lyFm4sugTH6kin.png?table=block&id=4eb30dde-c8ef-4353-9446-7438857afd7d&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FdHY1ZBG3C7Xerbu.png?table=block&id=54551e86-32c4-455d-8021-c553a3da2ba4&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FS94WONhlVsBwI5Z.png?table=block&id=40218c29-b737-40f5-88f8-2644ac2106ea&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FmQMBDR5tqVKb7go.png?table=block&id=0dc039ad-2b04-4c1e-a9a2-69d980d1b061&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FRaqCAV1TwulLcn5.png?table=block&id=1d050be1-2fbe-45de-81b4-cd39644d0ede&cache=v2)
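Grafana needs a Prometheus data source before it can chart anything. Besides adding one in the web UI, Grafana can load data sources from provisioning files at startup. The following is a sketch under assumptions, not a description of the setup used in this post: `datasource_yaml` is a hypothetical helper, the data source name "Prometheus" is arbitrary, and the URL is whatever address the Prometheus server listens on.

```shell
# datasource_yaml PROMETHEUS_URL
# Prints a Grafana data source provisioning file that points at a
# Prometheus server. Hypothetical helper; the name "Prometheus" is arbitrary.
datasource_yaml() {
  cat <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $1
    isDefault: true
EOF
}
```

With the deb package, provisioning files are normally read from /etc/grafana/provisioning/datasources/, so one way to apply it would be `datasource_yaml http://localhost:9090 | sudo tee /etc/grafana/provisioning/datasources/prometheus.yaml` followed by `sudo systemctl restart grafana-server`.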
The collected data can now be displayed:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FFIPHMLEfX94ASmv.png?table=block&id=87e1e7c5-c0e3-4b00-b486-f471374a4cf4&cache=v2)
You can define your own queries and choose how the collected data is displayed:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2F94KzQVYliAEkaS8.png?table=block&id=488c0300-64e5-43c9-964c-9fcb964304c8&cache=v2)
You can also import a ready-made dashboard template, which saves you from defining the views yourself:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FdIeZF16VWMUGXKA.png?table=block&id=66d1c331-4198-4714-bc5c-703172f48dc7&cache=v2)
Upload the template JSON file:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2FbU1Zawqcg75hyLR.png?table=block&id=ce560a62-6c3e-4c38-a6bc-a2e71fc7949b&cache=v2)
Result:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs2.loli.net%2F2022%2F07%2F16%2Fd3uMBtobz6jPG2E.png?table=block&id=0906d93e-4d5d-4703-8631-3dd741e2e1f5&cache=v2)
Use Ctrl +/- to adjust the browser zoom, which effectively changes the size of the displayed panels.
Reference material:
Creating a Prometheus service file:
Edit the service file, either /usr/lib/systemd/system/prometheus.service or /etc/systemd/system/prometheus.service:

    vim /usr/lib/systemd/system/prometheus.service

    [Unit]
    Description=prometheus.io

    [Service]
    Restart=on-failure
    ##ExecStart=/usr/local/mysql_exporter/mysqld_exporter --config.my-cnf=.my.cnf
    ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml

    [Install]
    WantedBy=multi-user.target

Start prometheus.service:

    systemctl start prometheus.service
node_exporter service file:

    [Unit]
    Description=node_exporter

    [Service]
    Restart=on-failure
    ##ExecStart=/usr/local/mysql_exporter/mysqld_exporter --config.my-cnf=.my.cnf
    ExecStart=/usr/local/bin/node_exporter

    [Install]
    WantedBy=multi-user.target
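The two unit files differ only in their description and command line, so they can be generated from one template. This is a minimal sketch; `unit_file` is a hypothetical helper, and the resulting unit mirrors the files above rather than any official packaging.

```shell
# unit_file DESCRIPTION COMMAND [ARGS...]
# Prints a minimal systemd service unit that runs COMMAND and restarts it
# on failure. Hypothetical helper; COMMAND should be an absolute path.
unit_file() {
  desc=$1
  shift
  cat <<EOF
[Unit]
Description=$desc

[Service]
Restart=on-failure
ExecStart=$*

[Install]
WantedBy=multi-user.target
EOF
}
```

One possible way to install the result: `unit_file node_exporter /usr/local/bin/node_exporter | sudo tee /etc/systemd/system/node_exporter.service`, then `sudo systemctl daemon-reload` so systemd picks up the new file, then `sudo systemctl enable --now node_exporter.service` to start it immediately and on every boot.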
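What Prometheus can monitor:

| Monitoring level | What is monitored | Exporter |
| --- | --- | --- |
| Network | Network protocols: HTTP, DNS, TCP, ICMP; network hardware: routers, switches, etc. | BlackBox Exporter; SNMP Exporter |
| Host | Resource usage | node exporter |
| Container | Resource usage | cAdvisor |
| Application (library) | Latency, errors, QPS, internal state, etc. | Prometheus client integrated into the code |
| Middleware | Resource usage and service status | Prometheus client integrated into the code |
| Orchestration tool | Cluster resource usage, scheduling, etc. | Kubernetes Components |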