Prometheus (Part 3): Collecting Node Metrics with Prometheus Federation

Published: 2023-09-22 15:14:18 | Author: areke

Week 12: Prometheus (Part 3)

Part 1: Collecting node metrics with Prometheus federation

1.1 Deploy Prometheus (the Prometheus server and the federation nodes are deployed the same way)

Download

mkdir -p /apps
cd /apps
wget https://github.com/prometheus/prometheus/releases/download/v2.40.7/prometheus-2.40.7.linux-amd64.tar.gz
tar -xvf prometheus-2.40.7.linux-amd64.tar.gz
ln -s /apps/prometheus-2.40.7.linux-amd64 /apps/prometheus

Create the systemd unit for Prometheus

cat >/etc/systemd/system/prometheus.service <<EOF
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/apps/prometheus/
ExecStart=/apps/prometheus/prometheus   --config.file=/apps/prometheus/prometheus.yml --web.enable-lifecycle

[Install]
WantedBy=multi-user.target
EOF

Start the service

systemctl daemon-reload
systemctl enable --now prometheus.service

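1.2 Deploy node_exporter

Note: if Prometheus node-exporter is already deployed in the k8s environment by other means, stop it or change its listen port first to avoid a port conflict.

Download the binary

Download page: https://github.com/prometheus/node_exporter/releases

mkdir -p /apps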
cd /apps
wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
tar -xvf node_exporter-1.4.0.linux-amd64.tar.gz
ln -s /apps/node_exporter-1.4.0.linux-amd64 /apps/node_exporter

Create the service

cat >/etc/systemd/system/node-exporter.service <<EOF
[Unit]
Description=Prometheus Node Exporter
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
ExecStart=/apps/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target
EOF

Start the node-exporter service

systemctl daemon-reload
systemctl enable --now node-exporter.service

Verify the status

# Check the service status
[root@k8s-node1 apps]#systemctl is-active node-exporter.service 
active

# Check the listening port
[root@k8s-node1 apps]#netstat -ntlp|grep 9100
tcp6       0      0 :::9100                 :::*                    LISTEN      3276156/node_export 

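Verify the node-exporter web page

View the node-exporter metrics, which the exporter serves at /metrics on port 9100. For a reference on these metrics, see:

https://knowledge.zhaoweiguo.com/build/html/cloudnative/prometheus/metrics/kubernetes-nodes.html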

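Common metrics

node_boot_time		Time at which the system booted, as a Unix timestamp (named node_boot_time_seconds in node_exporter 0.16 and later)
node_cpu		CPU time consumed, per mode (named node_cpu_seconds_total in node_exporter 0.16 and later)
node_disk*		Disk I/O
node_filesystem*	Filesystem usage
node_load1		1-minute system load average
node_memory*		Memory usage
node_network*		Network traffic
go_*			Go runtime metrics of node exporter
process_*		Metrics about the node exporter process itself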
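To make the metric names above concrete, here is a minimal Python sketch that parses a few lines of the text format served at /metrics. The sample text is hand-written for illustration, not captured from a real host, and the parser is deliberately simplified: it assumes no trailing timestamps and no commas or escaped quotes inside label values.

```python
def parse_samples(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        name, brace, rest = series.partition("{")
        labels = {}
        if brace:
            for pair in rest.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key] = val.strip('"')
        samples.append((name, labels, float(value)))
    return samples


# Hand-written sample (an assumption, not real output from the hosts above).
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 1.5e+10
"""

for name, labels, value in parse_samples(SAMPLE):
    print(name, labels, value)
```

In practice a client library or Prometheus itself does this parsing; the sketch only shows how a metric name, its labels, and its value sit on one line.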

1.3 Configure scraping on the federation nodes

Federation node 1 scrapes node1

vim /apps/prometheus/prometheus.yml 
......
  - job_name: "prometheus-node1"
    static_configs:
      - targets: ["10.0.0.84:9100"]

Federation node 2 scrapes node2

vim /apps/prometheus/prometheus.yml 
......
  - job_name: "prometheus-node2"
    static_configs:
      - targets: ["10.0.0.85:9100"]

Verify

1.4 Scrape the federation nodes from the server

Scrape configuration

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "prometheus-federate-1-82"
    scrape_interval: 10s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
        - '{__name__=~"node.*"}'
    static_configs:
      - targets:
        - '10.0.0.82:9090'

  - job_name: "prometheus-federate-2-83"
    scrape_interval: 10s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
        - '{__name__=~"node.*"}'
    static_configs:
      - targets:
        - '10.0.0.83:9090'

  # Prometheus in the k8s cluster
  - job_name: "prometheus-k8s-11"
    scrape_interval: 10s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
        - '{__name__=~"node.*"}'
    static_configs:
      - targets:
        - '10.0.0.11:9090'
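Each federation job above amounts to an HTTP GET against /federate with one match[] query parameter per selector. The following Python sketch builds that URL so it can be tried by hand (for example with curl) before adding the job; the helper name federate_url is hypothetical, and the target and selectors are the ones from the configuration above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def federate_url(target, matchers):
    """Build the /federate URL that a federation scrape job requests."""
    # doseq=True emits one "match[]=..." pair per selector and URL-encodes
    # the braces, quotes, and brackets.
    query = urlencode({"match[]": matchers}, doseq=True)
    return f"http://{target}/federate?{query}"


MATCHERS = [
    '{job="prometheus"}',
    '{__name__=~"job:.*"}',
    '{__name__=~"node.*"}',
]

url = federate_url("10.0.0.82:9090", MATCHERS)
print(url)

# Decoding the query string gives back the original selectors.
print(parse_qs(urlsplit(url).query)["match[]"])
```

Requesting the printed URL from a federation node should return only the series that match at least one selector, which is a quick way to check the selectors before the server starts scraping them.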

Verify the target status

Query the metric data

1.5 Grafana

1.5.1 Deploy Grafana

Install Grafana from the .deb package

Download: https://grafana.com/grafana/download

Mirror in China: https://mirrors.tuna.tsinghua.edu.cn/grafana/

Installation guide: https://grafana.com/docs/grafana/latest/setup-grafana/installation/

1.5.1.1 Download and install

wget https://mirrors.tuna.tsinghua.edu.cn/grafana/apt/pool/main/g/grafana-enterprise/grafana-enterprise_9.3.0_amd64.deb
apt update
apt-get install -y adduser libfontconfig1
dpkg -i grafana-enterprise_9.3.0_amd64.deb

1.5.1.2 Edit the Grafana configuration file

vim /etc/grafana/grafana.ini
......
# Set the protocol, listen address, and port
[server]
protocol = http

http_addr = 10.0.0.62

http_port = 3000

1.5.1.3 Start the service

systemctl enable grafana-server.service
systemctl restart grafana-server.service

Check the port

[root@grafana opt]#netstat -ntlp|grep 3000
tcp        0      0 10.0.0.62:3000          0.0.0.0:*               LISTEN      5268/grafana-server

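1.5.1.4 Verify the Grafana web UI

  1. Log in at http://10.0.0.62:3000

  2. Go to the home page

1.5.1.5 Add a data source

Select Prometheus

Set the data source name and the URL used to reach the Prometheus server.

1.5.2 Import dashboard templates

Dashboard ID 11074

Dashboard ID 8919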

Part 2: Prometheus local storage and single-node VictoriaMetrics remote storage

2.1 Prometheus local storage

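Prometheus stores time series data very efficiently. The figure commonly quoted is about 3.5 bytes per sample, with one million time series scraped every 30 seconds and retained for 60 days needing a little over 200 GB. These two numbers do not quite agree: that workload is about 172.8 billion samples, so 200 GB works out to roughly 1.2 bytes per sample, which is in line with the 1-2 bytes per sample given in the Prometheus storage documentation.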
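The sizing claim above can be checked with a short calculation. This Python sketch follows the rule of thumb from the Prometheus storage documentation (disk space = retention time x ingested samples per second x bytes per sample); the function names are hypothetical, and the 1.2 bytes/sample value is an assumption chosen to show what the 200 GB figure implies.

```python
SECONDS_PER_DAY = 86_400


def sample_count(series, interval_s, retention_days):
    """Total samples kept: one sample per series per scrape interval."""
    return series * (retention_days * SECONDS_PER_DAY // interval_s)


def disk_bytes(series, interval_s, retention_days, bytes_per_sample):
    """Disk estimate: samples kept multiplied by average bytes per sample."""
    return sample_count(series, interval_s, retention_days) * bytes_per_sample


# One million series, 30-second interval, 60 days of retention.
samples = sample_count(1_000_000, 30, 60)
print(f"samples kept: {samples:,}")

for bps in (1.2, 3.5):
    gb = disk_bytes(1_000_000, 30, 60, bps) / 1e9
    print(f"{bps} bytes/sample -> {gb:.0f} GB")
```

With these inputs the workload is 172.8 billion samples, which comes to about 605 GB at 3.5 bytes per sample and about 207 GB at 1.2 bytes per sample, so the "200 GB" figure matches the smaller per-sample size.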

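2.1.1 Local storage overview

By default, Prometheus stores the data it collects in a local TSDB, in the data directory under the Prometheus installation directory. Incoming samples are first written to the WAL (write-ahead log) and kept in memory. After 2 hours the in-memory data is persisted as a new block; newly collected samples again go to memory and are persisted as another new block 2 hours later, and so on.

2.1.2 Blocks

Each block is a directory under the data directory whose name begins with 01 (the directory name is the block's ULID).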

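2.1.3 Block behavior

Prometheus compacts and merges historical blocks and deletes expired ones, so the number of blocks shrinks as compaction proceeds. Three things happen during compaction: it runs periodically, small blocks are merged into larger ones, and expired blocks are cleaned up.

Each block consists of 4 parts: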

~# tree /apps/prometheus/data/01FQNCYZOBPFA8AQDDZM1C5PRN/
/apps/prometheus/data/01FQNCYZOBPFA8AQDDZM1C5PRN/
├── chunks
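│   └── 000001	# chunk data; each segment file holds up to 512MB, and larger data is split across several files
├── index		# index file: records the index of the stored data; time series are looked up through several tables inside it
├── meta.json	# block metadata: sample counts, the start and end time of the collected data, and compaction history
└── tombstones	# deletion markers: records of deleted data and data marked for deletion, used to exclude those samples when the block is queried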
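As an illustration of what meta.json holds, the Python sketch below reads a block's time range, sample count, and compaction level. The JSON is a hand-written example with a made-up ULID and made-up counts; the field names (ulid, minTime, maxTime, stats, compaction) follow the layout Prometheus 2.x writes, with times in milliseconds since the Unix epoch.

```python
import json
from datetime import datetime, timezone

# Hand-written example; the ULID and the counts are made up for illustration.
META_JSON = """
{
  "ulid": "01HAXXXXXXXXXXXXXXXXXXXXXX",
  "minTime": 1695340800000,
  "maxTime": 1695348000000,
  "stats": {"numSamples": 120000, "numSeries": 1000, "numChunks": 1000},
  "compaction": {"level": 1, "sources": ["01HAXXXXXXXXXXXXXXXXXXXXXX"]},
  "version": 1
}
"""


def to_utc(ms):
    """Convert a millisecond Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_block(meta):
    """Extract the time range, size, and compaction level of one block."""
    span_ms = meta["maxTime"] - meta["minTime"]
    return {
        "start": to_utc(meta["minTime"]).isoformat(),
        "end": to_utc(meta["maxTime"]).isoformat(),
        "hours": span_ms / 3_600_000,
        "samples": meta["stats"]["numSamples"],
        "compaction_level": meta["compaction"]["level"],
    }


summary = summarize_block(json.loads(META_JSON))
print(summary)
```

A freshly written block like this one spans 2 hours and has compaction level 1; blocks produced by merging cover longer ranges and report a higher level.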

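2.1.4 Local storage parameters

--config.file="prometheus.yml"		# configuration file to load
--web.listen-address="0.0.0.0:9090"	# listen address
--storage.tsdb.path="data/"			# data storage directory
--storage.tsdb.retention.size=		# maximum total size of the stored blocks; units B, KB, MB, GB, TB, PB, EB; no size limit by default
--storage.tsdb.retention.time= 		# how long to keep data; 15 days by default
--query.timeout=2m					# maximum time a query may run before it is aborted
--query.max-concurrency=20			# maximum number of queries executed concurrently
--web.read-timeout=5m				# maximum time before a request read times out and idle connections are closed
--web.max-connections=512			# maximum number of simultaneous connections
--web.enable-lifecycle				# enable shutdown and configuration reload via HTTP requests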

2.2 Single-node VictoriaMetrics remote storage

2.2.1 Download

https://github.com/VictoriaMetrics/VictoriaMetrics/releases

https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.1/victoria-metrics-linux-amd64-v1.93.1.tar.gz

wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.1/victoria-metrics-linux-amd64-v1.93.1.tar.gz
tar xvf victoria-metrics-linux-amd64-v*.*.*.tar.gz
mv victoria-metrics-prod /usr/local/bin/

2.2.2 systemd service file

cat > /etc/systemd/system/victoria-metrics-prod.service <<EOF
[Unit]
Description=For Victoria-metrics-prod Service
After=network.target

[Service]
ExecStart=/usr/local/bin/victoria-metrics-prod   -httpListenAddr=0.0.0.0:8428 -storageDataPath=/data/victoria -retentionPeriod=3

[Install]
WantedBy=multi-user.target
EOF


systemctl daemon-reload
systemctl start victoria-metrics-prod.service
systemctl status victoria-metrics-prod.service 

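Parameters

-httpListenAddr=0.0.0.0:8428	# listen address and port
-storageDataPath 	# directory where VictoriaMetrics stores all its data; defaults to victoria-metrics-data under the directory VictoriaMetrics is started from
-retentionPeriod	# how long to keep data, with older data deleted automatically; defaults to 1 month. A value without a suffix is counted in months; supported suffixes are h (hour), d (day), w (week), and y (year).

2.2.3 Open the web UI

2.2.4 Configure Prometheus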

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Single-node configuration
remote_write:
  - url: http://10.0.0.84:8428/api/v1/write

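2.2.5 Verify the data in VictoriaMetrics

Open the web UI

Query node_load1

2.2.6 Set the Grafana data source

Point the data source at the VictoriaMetrics address

2.2.7 Set up the Grafana dashboard

Import dashboard template 8919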
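The same check can be scripted. Assuming the query is sent to the Prometheus-compatible /api/v1/query endpoint, the reply is JSON in the Prometheus HTTP API format, and the Python sketch below pulls the per-instance values out of it. The response text is a hand-written example (instances and values are made up), and instant_values is a hypothetical helper name.

```python
import json

# Hand-written example of an instant-query response; values are made up.
RESPONSE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "node_load1", "instance": "10.0.0.84:9100"},
       "value": [1695348000, "0.42"]},
      {"metric": {"__name__": "node_load1", "instance": "10.0.0.85:9100"},
       "value": [1695348000, "1.07"]}
    ]
  }
}
"""


def instant_values(response):
    """Map each instance label to its value from an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample is [timestamp, "value"]; the value is sent as a string.
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in data["result"]
    }


values = instant_values(json.loads(RESPONSE))
print(values)
```

An empty result list here would mean the remote write has not delivered any node_load1 samples yet.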

Part 3: Prometheus remote storage on a VictoriaMetrics cluster

3.1 Architecture

3.2 Components

3.2.1 vminsert

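The write component. vminsert accepts incoming data and distributes it across the backend vmstorage nodes according to a consistent hash of the metric name and all of its labels. vminsert listens on port 8480 by default.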
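The idea of routing by a hash of the metric name and labels can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses SHA-256 with a plain modulo, which is not consistent hashing and is not the algorithm VictoriaMetrics actually implements, and the function names are hypothetical. The node addresses are the vmstorage write addresses used later in this article.

```python
import hashlib

NODES = ["10.0.0.86:8400", "10.0.0.87:8400", "10.0.0.88:8400"]


def series_key(name, labels):
    """Canonical key: metric name plus labels sorted by label name."""
    parts = [name] + [f'{k}="{v}"' for k, v in sorted(labels.items())]
    return ",".join(parts)


def pick_node(name, labels, nodes):
    """Pick one storage node from a hash of the series key (modulo sketch)."""
    digest = hashlib.sha256(series_key(name, labels).encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


labels = {"instance": "10.0.0.84:9100", "job": "prometheus-node1"}
print(pick_node("node_load1", labels, NODES))
```

Because the key is built from sorted labels, every sample of a given series maps to the same node, which is what lets a query for that series find all of its data in one place. A real consistent-hashing scheme additionally keeps most series on their current node when nodes are added or removed, which plain modulo does not.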

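3.2.2 vmstorage

Stores the raw data and returns the data that matches the given label filters over the given time range. Default port 8482.

3.2.3 vmselect

The query (read) component. It connects to vmstorage; default port 8481.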

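3.2.4 Other optional components

vmagent

A small but capable agent. It collects metrics from a variety of sources, such as node_exporter, and stores them in VictoriaMetrics or in any other Prometheus-compatible storage system that supports the remote write protocol. It is intended as a possible replacement for the Prometheus server.

vmalert

Takes over rule evaluation from the Prometheus server: using VictoriaMetrics as its data source, it evaluates Prometheus-compatible alerting rules to decide whether the data is abnormal and sends the resulting notifications to alertmanager.

vmgateway

A proxy gateway for reading and writing VictoriaMetrics data, providing features such as rate limiting and access control. It is currently an enterprise component.

vmctl

The VictoriaMetrics command-line tool, currently used mainly to migrate data from sources such as Prometheus and OpenTSDB into VictoriaMetrics.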

3.3 Download

Host list

vm1 10.0.0.86
vm2 10.0.0.87
vm3 10.0.0.88

https://github.com/VictoriaMetrics/VictoriaMetrics/releases

https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.1/victoria-metrics-linux-amd64-v1.93.1-cluster.tar.gz

tar xvf victoria-metrics-linux-amd64-v*.*-cluster.tar.gz 
cp vm* /usr/local/bin/

3.4 Services

3.4.1 vmstorage-prod

Handles data persistence. HTTP API and monitoring port: 8482; port for data written by vminsert: 8400; port for data read by vmselect: 8401.

cat > /etc/systemd/system/vmstorage.service <<EOF
[Unit]
Description=Vmstorage Server
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/tmp
ExecStart=/usr/local/bin/vmstorage-prod -loggerTimezone=Asia/Shanghai -storageDataPath=/data/vmstorage-data -httpListenAddr=:8482 -vminsertAddr=:8400 -vmselectAddr=:8401 

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable vmstorage.service  && systemctl start vmstorage.service 

Main parameters

-httpListenAddr string
	Address to listen for http connections (default ":8482")
-vminsertAddr string
	TCP address to accept connections from vminsert services (default ":8400")
-vmselectAddr string
	TCP address to accept connections from vmselect services(default ":8401")

3.4.2 vminsert-prod

Accepts write requests from outside the cluster; default port 8480.

cat > /etc/systemd/system/vminsert.service <<EOF
[Unit]
Description=Vminsert Server
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/tmp
ExecStart=/usr/local/bin/vminsert-prod -httpListenAddr=:8480 -storageNode=10.0.0.86:8400,10.0.0.87:8400,10.0.0.88:8400

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable vminsert.service  && systemctl start vminsert.service 

3.4.3 vmselect-prod

Accepts read requests from outside the cluster; default port 8481.

cat > /etc/systemd/system/vmselect.service <<EOF
[Unit]
Description=Vmselect Server
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/tmp
ExecStart=/usr/local/bin/vmselect-prod -httpListenAddr=:8481 -storageNode=10.0.0.86:8401,10.0.0.87:8401,10.0.0.88:8401

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable vmselect.service  && systemctl start vmselect.service 

3.5 Verify the service ports

vm1

[root@vm1 opt]#ss -ntl
State                Recv-Q                Send-Q                               Local Address:Port                               Peer Address:Port               Process         
LISTEN               0                     4096                                       0.0.0.0:8480                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8481                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8482                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8400                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8401                                    0.0.0.0:*                                  
LISTEN               0                     4096                                 127.0.0.53%lo:53                                      0.0.0.0:*                                  
LISTEN               0                     128                                        0.0.0.0:22                                      0.0.0.0:*                                  
LISTEN               0                     128                                      127.0.0.1:6010                                    0.0.0.0:*                                  
LISTEN               0                     128                                           [::]:22                                         [::]:*                                  
LISTEN               0                     128                                          [::1]:6010                                       [::]:*  

vm2

[root@vm2 opt]#ss -ntl
State                Recv-Q                Send-Q                               Local Address:Port                               Peer Address:Port               Process         
LISTEN               0                     4096                                       0.0.0.0:8400                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8401                                    0.0.0.0:*                                  
LISTEN               0                     4096                                 127.0.0.53%lo:53                                      0.0.0.0:*                                  
LISTEN               0                     128                                        0.0.0.0:22                                      0.0.0.0:*                                  
LISTEN               0                     128                                      127.0.0.1:6010                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8480                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8481                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8482                                    0.0.0.0:*                                  
LISTEN               0                     128                                           [::]:22                                         [::]:*                                  
LISTEN               0                     128                                          [::1]:6010                                       [::]:*  

vm3

[root@vm3 opt]#ss -ntl
State                Recv-Q                Send-Q                               Local Address:Port                               Peer Address:Port               Process         
LISTEN               0                     4096                                       0.0.0.0:8400                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8401                                    0.0.0.0:*                                  
LISTEN               0                     4096                                 127.0.0.53%lo:53                                      0.0.0.0:*                                  
LISTEN               0                     128                                        0.0.0.0:22                                      0.0.0.0:*                                  
LISTEN               0                     128                                      127.0.0.1:6010                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8480                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8481                                    0.0.0.0:*                                  
LISTEN               0                     4096                                       0.0.0.0:8482                                    0.0.0.0:*                                  
LISTEN               0                     128                                           [::]:22                                         [::]:*                                  
LISTEN               0                     128                                          [::1]:6010                                       [::]:*   

The metrics endpoints can also be checked in a browser:

http://10.0.0.86:8480/metrics

http://10.0.0.86:8481/metrics

http://10.0.0.86:8482/metrics

http://10.0.0.87:8480/metrics

http://10.0.0.87:8481/metrics

http://10.0.0.87:8482/metrics

http://10.0.0.88:8480/metrics

http://10.0.0.88:8481/metrics

http://10.0.0.88:8482/metrics

3.6 Configure Prometheus

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

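# Single-node configuration
#remote_write:
#  - url: http://10.0.0.84:8428/api/v1/write

# Cluster configuration
# Note: Prometheus sends every sample to each remote_write URL listed here, so with three URLs the data is written three times. To write each sample once, list a single URL (for example a VIP or load balancer in front of the vminsert nodes).
remote_write:
  - url: http://10.0.0.86:8480/insert/0/prometheus
  - url: http://10.0.0.87:8480/insert/0/prometheus
  - url: http://10.0.0.88:8480/insert/0/prometheus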

3.7 Set the Grafana data source

Set the cluster query address:

http://10.0.0.86:8481/select/0/prometheus (a VIP can be put in front of the vmselect nodes for high availability)
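The cluster write and read URLs used in this part share one shape: /insert/<accountID>/<suffix> on vminsert and /select/<accountID>/<suffix> on vmselect, with account ID 0 and the prometheus suffix here. The Python sketch below, built around a hypothetical cluster_url helper, spells that pattern out so the two addresses are easy to tell apart.

```python
def cluster_url(host, port, role, account_id=0, suffix="prometheus"):
    """Build a VictoriaMetrics cluster URL for writing or reading.

    role is "insert" (vminsert, port 8480 here) or "select"
    (vmselect, port 8481 here); account_id is the tenant, 0 in this article.
    """
    if role not in ("insert", "select"):
        raise ValueError(f"unknown role: {role!r}")
    return f"http://{host}:{port}/{role}/{account_id}/{suffix}"


# remote_write target for Prometheus (goes to vminsert).
print(cluster_url("10.0.0.86", 8480, "insert"))

# Data source URL for Grafana (goes to vmselect).
print(cluster_url("10.0.0.86", 8481, "select"))
```

Mixing the two up is an easy mistake: Grafana pointed at the insert URL, or Prometheus writing to the select URL, will simply fail, so it helps to generate both from the same helper.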

3.8 Import a Grafana dashboard

Dashboard ID 13824

3.9 Enable data replication

https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety

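By default, vminsert shards the data by hash, so each series is persisted on only one of the vmstorage nodes. Starting vminsert with -replicationFactor=N makes it write N copies of every sample to distinct vmstorage nodes, so the data stays available when a node is lost.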
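To illustrate what a replication factor means for placement, the Python sketch below picks N distinct storage nodes for one series: a hash chooses the first node, and the remaining copies go to the following nodes in list order. This placement rule and the function name replica_nodes are assumptions made for illustration, not a description of how vminsert chooses replicas internally.

```python
import hashlib

NODES = ["10.0.0.86:8400", "10.0.0.87:8400", "10.0.0.88:8400"]


def replica_nodes(series_key, nodes, replication_factor):
    """Return replication_factor distinct nodes for one series."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication factor must be between 1 and len(nodes)")
    digest = hashlib.sha256(series_key.encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Walk forward from the hashed position, wrapping around the list.
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


key = 'node_load1,instance="10.0.0.84:9100"'
print(replica_nodes(key, NODES, 1))  # no replication: a single node
print(replica_nodes(key, NODES, 2))  # two copies on two different nodes
```

With a factor of 2 on three nodes, any single node can fail and every series still has one surviving copy; the trade-off is that storage and write traffic roughly double.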