63. K8s: Deploying Prometheus and Grafana on Kubernetes

Published 2023-04-12 20:36:22 · Author: 小粉优化大师


1. Preparation

1.1. Tutorial GitHub repository

https://github.com/prometheus-operator/kube-prometheus.git

1.2. Download the prepared YAML

wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.12.0.tar.gz

1.3. Extract the project

tar xvf kube-prometheus-0.12.0.tar.gz
cd kube-prometheus-0.12.0/

2. Create the namespace and custom resource definitions

2.1. Apply the manifests

]# kubectl create -f kube-prometheus-0.12.0/manifests/setup/
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
namespace/monitoring created
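
The eight CRDs created above must reach the Established condition before the monitoring objects in the later sections can be applied. A small generator (a sketch; it only prints the `kubectl wait` commands so they can be reviewed, then piped to `sh`) covers them all:

```shell
# Emit one `kubectl wait` per kube-prometheus CRD (dry run: prints commands, does not run kubectl)
for crd in alertmanagerconfigs alertmanagers podmonitors probes \
           prometheuses prometheusrules servicemonitors thanosrulers; do
  echo "kubectl wait --for=condition=Established crd/${crd}.monitoring.coreos.com --timeout=60s"
done
```

Piping the output to `sh` blocks until every CRD is registered.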

2.2. Manifest walkthrough

2.2.1. Create the namespace

]# cat kube-prometheus-0.12.0/manifests/setup/namespace.yaml 
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring

]# kubectl get ns
NAME                   STATUS   AGE
default                Active   9d
ingress-nginx          Active   4d23h
kube-node-lease        Active   9d
kube-public            Active   9d
kube-system            Active   9d
kubernetes-dashboard   Active   4d20h
monitoring             Active   108s

2.2.2. Verify the custom resource definitions

# The remaining setup files define the CRDs
]# kubectl get customresourcedefinitions.apiextensions.k8s.io | grep coreos
alertmanagerconfigs.monitoring.coreos.com   2023-04-12T06:09:30Z
alertmanagers.monitoring.coreos.com         2023-04-12T06:09:30Z
podmonitors.monitoring.coreos.com           2023-04-12T06:09:30Z
probes.monitoring.coreos.com                2023-04-12T06:09:30Z
prometheuses.monitoring.coreos.com          2023-04-12T06:09:30Z
prometheusrules.monitoring.coreos.com       2023-04-12T06:09:31Z
servicemonitors.monitoring.coreos.com       2023-04-12T06:09:31Z
thanosrulers.monitoring.coreos.com          2023-04-12T06:09:31Z

3. Deploy the Prometheus resources

3.1. Group the Prometheus manifests

# Move the Prometheus server manifests into their own directory

mkdir prom-server
mv prometheus-*.yaml prom-server/

prom-server]# ll
-rw-rw-r-- 1 root root   483 Jan 24 18:14 prometheus-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root   430 Jan 24 18:14 prometheus-clusterRole.yaml
-rw-rw-r-- 1 root root   922 Jan 24 18:14 prometheus-networkPolicy.yaml
-rw-rw-r-- 1 root root   546 Jan 24 18:14 prometheus-podDisruptionBudget.yaml
-rw-rw-r-- 1 root root 16430 Jan 24 18:14 prometheus-prometheusRule.yaml
-rw-rw-r-- 1 root root  1238 Jan 24 18:14 prometheus-prometheus.yaml
-rw-rw-r-- 1 root root   507 Jan 24 18:14 prometheus-roleBindingConfig.yaml
-rw-rw-r-- 1 root root  1661 Jan 24 18:14 prometheus-roleBindingSpecificNamespaces.yaml
-rw-rw-r-- 1 root root   402 Jan 24 18:14 prometheus-roleConfig.yaml
-rw-rw-r-- 1 root root  2161 Jan 24 18:14 prometheus-roleSpecificNamespaces.yaml
-rw-rw-r-- 1 root root   342 Jan 24 18:14 prometheus-serviceAccount.yaml
-rw-rw-r-- 1 root root   624 Jan 24 18:14 prometheus-serviceMonitor.yaml
-rw-rw-r-- 1 root root   637 Jan 24 18:14 prometheus-service.yaml
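
The trailing hyphen in the `prometheus-*.yaml` glob is what keeps `prometheusOperator-*.yaml` and `prometheusAdapter-*.yaml` (grouped in later sections) out of this move. A scratch-directory sketch shows the matching behaviour:

```shell
# Scratch demo: `prometheus-*.yaml` matches the server manifests only
tmp=$(mktemp -d)
touch "$tmp/prometheus-service.yaml" "$tmp/prometheusOperator-deployment.yaml"
(cd "$tmp" && ls prometheus-*.yaml)   # prints only prometheus-service.yaml
rm -rf "$tmp"
```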

3.2. Configure the offline image

3.2.1. Prepare the offline image

docker pull quay.io/prometheus/prometheus:v2.41.0
docker tag quay.io/prometheus/prometheus:v2.41.0 192.168.10.33:80/k8s/prometheus/prometheus:v2.41.0
docker push 192.168.10.33:80/k8s/prometheus/prometheus:v2.41.0
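
The same pull/tag/push sequence recurs for every quay.io image in this guide. A short generator (a sketch; it prints the docker commands rather than running them, and `REG` is the local registry address used throughout, so adjust it to your own) avoids copy-paste typos:

```shell
# Print pull/tag/push commands for each upstream image (dry run; pipe to sh to execute)
REG=192.168.10.33:80/k8s
for img in quay.io/prometheus/prometheus:v2.41.0 \
           quay.io/prometheus/alertmanager:v0.25.0; do
  tgt=$REG/${img#*/}   # drop the source registry host, keep the repository path
  echo "docker pull $img"
  echo "docker tag $img $tgt"
  echo "docker push $tgt"
done
```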

3.2.2. Update the image in the manifest

prom-server]# cat prometheus-prometheus.yaml | grep image
  image: 192.168.10.33:80/k8s/prometheus/prometheus:v2.41.0

3.3. Modify the Service

3.3.1. Change the Service to NodePort for testing

vi prometheus-service.yaml 
spec:
  ports:
  - name: web
    port: 9090
    targetPort: web
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
    nodePort: 30090
  type: NodePort
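
A fixed `nodePort` such as 30090 must fall inside the API server's `--service-node-port-range` (30000-32767 by default), or the Service is rejected at apply time. A quick offline check:

```shell
# Verify the chosen nodePort sits in the default NodePort range
port=30090
if [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]; then
  echo "nodePort $port is inside the default range"
else
  echo "nodePort $port is outside 30000-32767" >&2
  exit 1
fi
```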

3.4. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom-server/
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
networkpolicy.networking.k8s.io/prometheus-k8s created
poddisruptionbudget.policy/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-prometheus-rules created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-k8s created

3.5. Check the status

]# kubectl -n monitoring get prometheus
NAME   VERSION   DESIRED   READY   RECONCILED   AVAILABLE   AGE
k8s    2.41.0    2                                          5m55s

]# kubectl -n monitoring get svc
NAME             TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
prometheus-k8s   NodePort   10.111.189.205   <none>        9090:30599/TCP,8080:30090/TCP   6m

]# kubectl -n monitoring  get endpoints
NAME             ENDPOINTS   AGE
prometheus-k8s   <none>      9m58s

The endpoints are still <none>: the Prometheus resource cannot be reconciled until the prometheus-operator, installed in the next section, is running.

4. Deploy prometheusOperator and prometheusAdapter

4.1. Install prometheusOperator

4.1.1. Group the manifests

mkdir prom_opt
mv prometheusOperator-*.yaml prom_opt/

]# cd prom_opt/ && ll
total 36
-rw-rw-r-- 1 root root  471 Jan 24 18:14 prometheusOperator-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root 1401 Jan 24 18:14 prometheusOperator-clusterRole.yaml
-rw-rw-r-- 1 root root 2631 Jan 24 18:14 prometheusOperator-deployment.yaml
-rw-rw-r-- 1 root root  694 Jan 24 18:14 prometheusOperator-networkPolicy.yaml
-rw-rw-r-- 1 root root 5819 Jan 24 18:14 prometheusOperator-prometheusRule.yaml
-rw-rw-r-- 1 root root  321 Jan 24 18:14 prometheusOperator-serviceAccount.yaml
-rw-rw-r-- 1 root root  715 Jan 24 18:14 prometheusOperator-serviceMonitor.yaml
-rw-rw-r-- 1 root root  515 Jan 24 18:14 prometheusOperator-service.yaml

4.1.2. Configure offline images

# Pull the images and push them to the local registry
docker pull quay.io/brancz/kube-rbac-proxy:v0.14.0
docker tag quay.io/brancz/kube-rbac-proxy:v0.14.0 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
docker push 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0

docker pull quay.io/prometheus-operator/prometheus-operator:v0.62.0
docker tag quay.io/prometheus-operator/prometheus-operator:v0.62.0 192.168.10.33:80/k8s/prometheus-operator/prometheus-operator:v0.62.0
docker push 192.168.10.33:80/k8s/prometheus-operator/prometheus-operator:v0.62.0

docker pull quay.io/prometheus-operator/prometheus-config-reloader:v0.62.0
docker tag quay.io/prometheus-operator/prometheus-config-reloader:v0.62.0 192.168.10.33:80/k8s/prometheus-operator/prometheus-config-reloader:v0.62.0
docker push 192.168.10.33:80/k8s/prometheus-operator/prometheus-config-reloader:v0.62.0

# Update the deployment manifest
sed -i 's#quay.io#192.168.10.33:80/k8s#g' prometheusOperator-deployment.yaml 
]# cat prometheusOperator-deployment.yaml | grep -E 'image|reloader'
        - --prometheus-config-reloader=192.168.10.33:80/k8s/prometheus-operator/prometheus-config-reloader:v0.62.0
        image: 192.168.10.33:80/k8s/prometheus-operator/prometheus-operator:v0.62.0
        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
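
The blanket `sed 's#quay.io#…#g'` deliberately rewrites more than the `image:` fields: the `--prometheus-config-reloader` argument also carries an image reference, which is why the grep above checks both shapes. Replaying the substitution on those two line shapes:

```shell
# Apply the same substitution to both line shapes from the deployment manifest
printf '%s\n' \
  '        - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.62.0' \
  '        image: quay.io/prometheus-operator/prometheus-operator:v0.62.0' \
  | sed 's#quay.io#192.168.10.33:80/k8s#g'
```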

4.1.3. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_opt/
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
networkpolicy.networking.k8s.io/prometheus-operator created
prometheusrule.monitoring.coreos.com/prometheus-operator-rules created
service/prometheus-operator created
serviceaccount/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus-operator created

4.1.4. Check the status

]# kubectl -n monitoring get pods -o wide
NAME                                 READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
prometheus-k8s-0                     2/2     Running   0          12m   10.244.3.94    node1   <none>           <none>
prometheus-k8s-1                     2/2     Running   0          12m   10.244.4.126   node2   <none>           <none>
prometheus-operator-ffcc9958-hffd6   2/2     Running   0          12m   10.244.3.93    node1   <none>           <none>

[root@master1 prom_opt]# kubectl -n monitoring get prometheus
NAME   VERSION   DESIRED   READY   RECONCILED   AVAILABLE   AGE
k8s    2.41.0    2         2       True         True        34m

4.2. Install prometheusAdapter

4.2.1. Group the manifests

mkdir prom_adapter && mv prometheusAdapter-*.yaml prom_adapter/ && cd prom_adapter/

]# ll
-rw-rw-r-- 1 root root  483 Jan 24 18:14 prometheusAdapter-apiService.yaml
-rw-rw-r-- 1 root root  601 Jan 24 18:14 prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
-rw-rw-r-- 1 root root  519 Jan 24 18:14 prometheusAdapter-clusterRoleBindingDelegator.yaml
-rw-rw-r-- 1 root root  496 Jan 24 18:14 prometheusAdapter-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root  403 Jan 24 18:14 prometheusAdapter-clusterRoleServerResources.yaml
-rw-rw-r-- 1 root root  434 Jan 24 18:14 prometheusAdapter-clusterRole.yaml
-rw-rw-r-- 1 root root 2205 Jan 24 18:14 prometheusAdapter-configMap.yaml
-rw-rw-r-- 1 root root 3179 Jan 24 18:14 prometheusAdapter-deployment.yaml
-rw-rw-r-- 1 root root  565 Jan 24 18:14 prometheusAdapter-networkPolicy.yaml
-rw-rw-r-- 1 root root  502 Jan 24 18:14 prometheusAdapter-podDisruptionBudget.yaml
-rw-rw-r-- 1 root root  516 Jan 24 18:14 prometheusAdapter-roleBindingAuthReader.yaml
-rw-rw-r-- 1 root root  324 Jan 24 18:14 prometheusAdapter-serviceAccount.yaml
-rw-rw-r-- 1 root root  907 Jan 24 18:14 prometheusAdapter-serviceMonitor.yaml
-rw-rw-r-- 1 root root  502 Jan 24 18:14 prometheusAdapter-service.yaml

4.2.2. Configure the offline image

docker pull registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.10.0
docker tag registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.10.0 192.168.10.33:80/k8s/prometheus-adapter/prometheus-adapter:v0.10.0
docker push 192.168.10.33:80/k8s/prometheus-adapter/prometheus-adapter:v0.10.0

sed -i 's#registry.k8s.io#192.168.10.33:80/k8s#g' prometheusAdapter-deployment.yaml 
]# cat prometheusAdapter-deployment.yaml | grep image
        image: 192.168.10.33:80/k8s/prometheus-adapter/prometheus-adapter:v0.10.0

4.2.3. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_adapter/
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader configured
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
networkpolicy.networking.k8s.io/prometheus-adapter created
poddisruptionbudget.policy/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
servicemonitor.monitoring.coreos.com/prometheus-adapter created

4.2.4. Check the status

]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          48s   10.244.3.95    node1   <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          48s   10.244.4.127   node2   <none>           <none>
prometheus-k8s-0                      2/2     Running   0          25m   10.244.3.94    node1   <none>           <none>
prometheus-k8s-1                      2/2     Running   0          25m   10.244.4.126   node2   <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          25m   10.244.3.93    node1   <none>           <none>

5. Deploy kubernetesControlPlane and kubeStateMetrics

5.1. Install kubernetesControlPlane

5.1.1. Group the manifests

mkdir prom_control_plane && mv kubernetesControlPlane-*.yaml prom_control_plane/ && cd prom_control_plane/
[root@master1 prom_control_plane]# ll
total 104
-rw-rw-r-- 1 root root 71670 Jan 24 18:14 kubernetesControlPlane-prometheusRule.yaml
-rw-rw-r-- 1 root root  6997 Jan 24 18:14 kubernetesControlPlane-serviceMonitorApiserver.yaml
-rw-rw-r-- 1 root root   591 Jan 24 18:14 kubernetesControlPlane-serviceMonitorCoreDNS.yaml
-rw-rw-r-- 1 root root  6516 Jan 24 18:14 kubernetesControlPlane-serviceMonitorKubeControllerManager.yaml
-rw-rw-r-- 1 root root  7714 Jan 24 18:14 kubernetesControlPlane-serviceMonitorKubelet.yaml
-rw-rw-r-- 1 root root   577 Jan 24 18:14 kubernetesControlPlane-serviceMonitorKubeScheduler.yaml

5.1.2. Apply the manifests

These manifests reference no container images to download, so they can be applied directly.

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_control_plane/
prometheusrule.monitoring.coreos.com/kubernetes-monitoring-rules created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created

5.1.3. Check the status

]# kubectl get prometheusrules.monitoring.coreos.com  -n monitoring 
NAME                              AGE
kubernetes-monitoring-rules       63s
prometheus-k8s-prometheus-rules   53m
prometheus-operator-rules         35m

5.2. Install kubeStateMetrics

5.2.1. Group the manifests

]# mkdir prom_kube_state_metric
]# mv kubeStateMetrics-*.yaml prom_kube_state_metric/
]# cd prom_kube_state_metric/

]# ll
-rw-rw-r-- 1 root root  464 Jan 24 18:14 kubeStateMetrics-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root 1903 Jan 24 18:14 kubeStateMetrics-clusterRole.yaml
-rw-rw-r-- 1 root root 3428 Jan 24 18:14 kubeStateMetrics-deployment.yaml
-rw-rw-r-- 1 root root  723 Jan 24 18:14 kubeStateMetrics-networkPolicy.yaml
-rw-rw-r-- 1 root root 3152 Jan 24 18:14 kubeStateMetrics-prometheusRule.yaml
-rw-rw-r-- 1 root root  316 Jan 24 18:14 kubeStateMetrics-serviceAccount.yaml
-rw-rw-r-- 1 root root 1167 Jan 24 18:14 kubeStateMetrics-serviceMonitor.yaml
-rw-rw-r-- 1 root root  580 Jan 24 18:14 kubeStateMetrics-service.yaml

5.2.2. Configure offline images

# Original images
prom_kube_state_metric]# grep 'image' *
kubeStateMetrics-deployment.yaml:        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0
kubeStateMetrics-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.14.0
kubeStateMetrics-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.14.0

# Prepare the offline image
docker pull registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0
docker tag registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0 192.168.10.33:80/k8s/kube-state-metrics/kube-state-metrics:v2.7.0
docker push 192.168.10.33:80/k8s/kube-state-metrics/kube-state-metrics:v2.7.0

# Point the manifest at the local registry
sed -i 's#registry.k8s.io#192.168.10.33:80/k8s#g' kubeStateMetrics-deployment.yaml
sed -i 's#quay.io#192.168.10.33:80/k8s#g' kubeStateMetrics-deployment.yaml

# Images after the change
prom_kube_state_metric]# grep 'image' *
kubeStateMetrics-deployment.yaml:        image: 192.168.10.33:80/k8s/kube-state-metrics/kube-state-metrics:v2.7.0
kubeStateMetrics-deployment.yaml:        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
kubeStateMetrics-deployment.yaml:        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0

5.2.3. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_kube_state_metric/
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
networkpolicy.networking.k8s.io/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kube-state-metrics-rules created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created

5.2.4. Check the status

]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
kube-state-metrics-c7c57885f-t9ct7    3/3     Running   0          30s     10.244.3.96     node1     <none>           <none>
node-exporter-6q6d9                   2/2     Running   0          164m    192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          164m    192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          163m    192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          164m    192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          3h7m    10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          3h7m    10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          3h32m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          3h32m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          3h32m   10.244.3.93     node1     <none>           <none>

6. Deploy nodeExporter, blackboxExporter and alertmanager

6.1. Install nodeExporter

6.1.1. Group the manifests

mkdir prom_nodeExporter && mv nodeExporter-*.yaml prom_nodeExporter/ && cd prom_nodeExporter/
]# ll
-rw-rw-r-- 1 root root   468 Jan 24 18:14 nodeExporter-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root   485 Jan 24 18:14 nodeExporter-clusterRole.yaml
-rw-rw-r-- 1 root root  3640 Jan 24 18:14 nodeExporter-daemonset.yaml
-rw-rw-r-- 1 root root   671 Jan 24 18:14 nodeExporter-networkPolicy.yaml
-rw-rw-r-- 1 root root 15004 Jan 24 18:14 nodeExporter-prometheusRule.yaml
-rw-rw-r-- 1 root root   306 Jan 24 18:14 nodeExporter-serviceAccount.yaml
-rw-rw-r-- 1 root root   850 Jan 24 18:14 nodeExporter-serviceMonitor.yaml
-rw-rw-r-- 1 root root   492 Jan 24 18:14 nodeExporter-service.yaml

6.1.2. Prepare the offline image

docker pull quay.io/prometheus/node-exporter:v1.5.0
docker tag quay.io/prometheus/node-exporter:v1.5.0 192.168.10.33:80/k8s/prometheus/node-exporter:v1.5.0
docker push 192.168.10.33:80/k8s/prometheus/node-exporter:v1.5.0

sed -i 's#quay.io#192.168.10.33:80/k8s#g' nodeExporter-daemonset.yaml 
]# cat nodeExporter-daemonset.yaml | grep image
        image: 192.168.10.33:80/k8s/prometheus/node-exporter:v1.5.0
        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0 # already pulled and pushed earlier; just reference it

6.1.3. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_nodeExporter/
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
networkpolicy.networking.k8s.io/node-exporter created
prometheusrule.monitoring.coreos.com/node-exporter-rules created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created

6.1.4. Check the status

]# kubectl -n monitoring  get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE      NOMINATED NODE   READINESS GATES
node-exporter-6q6d9                   2/2     Running   0          30s   192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          30s   192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          29s   192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          30s   192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          24m   10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          24m   10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          48m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          48m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          48m   10.244.3.93     node1     <none>           <none>

6.2. Install blackboxExporter

6.2.1. Group the manifests

mkdir prom_blackbox && mv blackboxExporter-*.yaml prom_blackbox && cd prom_blackbox

prom_blackbox]# ll
total 32
-rw-rw-r-- 1 root root  485 Jan 24 18:14 blackboxExporter-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root  287 Jan 24 18:14 blackboxExporter-clusterRole.yaml
-rw-rw-r-- 1 root root 1392 Jan 24 18:14 blackboxExporter-configuration.yaml
-rw-rw-r-- 1 root root 3545 Jan 24 18:14 blackboxExporter-deployment.yaml
-rw-rw-r-- 1 root root  722 Jan 24 18:14 blackboxExporter-networkPolicy.yaml
-rw-rw-r-- 1 root root  315 Jan 24 18:14 blackboxExporter-serviceAccount.yaml
-rw-rw-r-- 1 root root  680 Jan 24 18:14 blackboxExporter-serviceMonitor.yaml
-rw-rw-r-- 1 root root  540 Jan 24 18:14 blackboxExporter-service.yaml

6.2.2. Prepare the offline images

# Original image references
prom_blackbox]# grep 'image' *
blackboxExporter-deployment.yaml:        image: quay.io/prometheus/blackbox-exporter:v0.23.0
blackboxExporter-deployment.yaml:        image: jimmidyson/configmap-reload:v0.5.0
blackboxExporter-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.14.0


# Pull the images and push them to the local registry
docker pull quay.io/prometheus/blackbox-exporter:v0.23.0
docker tag quay.io/prometheus/blackbox-exporter:v0.23.0 192.168.10.33:80/k8s/prometheus/blackbox-exporter:v0.23.0
docker push 192.168.10.33:80/k8s/prometheus/blackbox-exporter:v0.23.0

docker pull jimmidyson/configmap-reload:v0.5.0
docker tag jimmidyson/configmap-reload:v0.5.0 192.168.10.33:80/k8s/configmap-reload:v0.5.0
docker push 192.168.10.33:80/k8s/configmap-reload:v0.5.0

# Update the manifest
sed -i 's#jimmidyson#192.168.10.33:80/k8s#g' blackboxExporter-deployment.yaml
sed -i 's#quay.io#192.168.10.33:80/k8s#g' blackboxExporter-deployment.yaml

# After the change
prom_blackbox]# grep 'image' *
blackboxExporter-deployment.yaml:        image: 192.168.10.33:80/k8s/prometheus/blackbox-exporter:v0.23.0
blackboxExporter-deployment.yaml:        image: 192.168.10.33:80/k8s/configmap-reload:v0.5.0
blackboxExporter-deployment.yaml:        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0

6.2.3. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_blackbox/
clusterrole.rbac.authorization.k8s.io/blackbox-exporter created
clusterrolebinding.rbac.authorization.k8s.io/blackbox-exporter created
configmap/blackbox-exporter-configuration created
deployment.apps/blackbox-exporter created
networkpolicy.networking.k8s.io/blackbox-exporter created
service/blackbox-exporter created
serviceaccount/blackbox-exporter created
servicemonitor.monitoring.coreos.com/blackbox-exporter created

6.2.4. Check the status

]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
blackbox-exporter-84bb6f6bd9-49rxv    3/3     Running   0          45s     10.244.4.128    node2     <none>           <none>
kube-state-metrics-c7c57885f-t9ct7    3/3     Running   0          13m     10.244.3.96     node1     <none>           <none>
node-exporter-6q6d9                   2/2     Running   0          177m    192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          177m    192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          177m    192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          177m    192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          3h20m   10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          3h20m   10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          3h45m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          3h45m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          3h45m   10.244.3.93     node1     <none>           <none>

6.3. Install alertmanager

6.3.1. Group the manifests

mkdir prom_alertmanager && mv alertmanager-*.yaml prom_alertmanager/ && cd prom_alertmanager/
prom_alertmanager]# ll
-rw-rw-r-- 1 root root  928 Jan 24 18:14 alertmanager-alertmanager.yaml
-rw-rw-r-- 1 root root  977 Jan 24 18:14 alertmanager-networkPolicy.yaml
-rw-rw-r-- 1 root root  561 Jan 24 18:14 alertmanager-podDisruptionBudget.yaml
-rw-rw-r-- 1 root root 7072 Jan 24 18:14 alertmanager-prometheusRule.yaml
-rw-rw-r-- 1 root root 1443 Jan 24 18:14 alertmanager-secret.yaml
-rw-rw-r-- 1 root root  351 Jan 24 18:14 alertmanager-serviceAccount.yaml
-rw-rw-r-- 1 root root  637 Jan 24 18:14 alertmanager-serviceMonitor.yaml
-rw-rw-r-- 1 root root  650 Jan 24 18:14 alertmanager-service.yaml

6.3.2. Prepare the offline image

# Original image reference
prom_alertmanager]# grep 'image' *
alertmanager-alertmanager.yaml:  image: quay.io/prometheus/alertmanager:v0.25.0

# Pull the image and push it to the local registry
docker pull quay.io/prometheus/alertmanager:v0.25.0
docker tag quay.io/prometheus/alertmanager:v0.25.0 192.168.10.33:80/k8s/prometheus/alertmanager:v0.25.0
docker push 192.168.10.33:80/k8s/prometheus/alertmanager:v0.25.0

# Point the manifest at the local registry
sed -i 's#quay.io#192.168.10.33:80/k8s#g' alertmanager-alertmanager.yaml

# Image reference after the change
prom_alertmanager]# grep 'image' alertmanager-alertmanager.yaml 
  image: 192.168.10.33:80/k8s/prometheus/alertmanager:v0.25.0

6.3.3. Change the Service to NodePort

prom_alertmanager]# cat alertmanager-service.yaml 
spec:
  ports:
  - name: web
    port: 9093
    targetPort: web
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
    nodePort: 30093
  type: NodePort

6.3.4. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_alertmanager/
alertmanager.monitoring.coreos.com/main created
networkpolicy.networking.k8s.io/alertmanager-main created
poddisruptionbudget.policy/alertmanager-main created
prometheusrule.monitoring.coreos.com/alertmanager-main-rules created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager-main created

6.3.5. Check the status

]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
alertmanager-main-0                   2/2     Running   0          26s     10.244.3.97     node1     <none>           <none>
alertmanager-main-1                   2/2     Running   0          26s     10.244.4.129    node2     <none>           <none>
alertmanager-main-2                   2/2     Running   0          26s     10.244.4.130    node2     <none>           <none>
blackbox-exporter-84bb6f6bd9-49rxv    3/3     Running   0          32m     10.244.4.128    node2     <none>           <none>
kube-state-metrics-c7c57885f-t9ct7    3/3     Running   0          45m     10.244.3.96     node1     <none>           <none>
node-exporter-6q6d9                   2/2     Running   0          3h28m   192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          3h28m   192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          3h28m   192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          3h28m   192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          3h52m   10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          3h52m   10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          4h17m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          4h17m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          4h17m   10.244.3.93     node1     <none>           <none>

7. Deploy Grafana

7.1. Group the manifests

7.1.1. Apply kubePrometheus-prometheusRule.yaml

# One manifest has not been applied yet; apply it now
]# kubectl apply -f kubePrometheus-prometheusRule.yaml 
prometheusrule.monitoring.coreos.com/kube-prometheus-rules created

7.1.2. Group the manifests

mkdir prom_grafana && mv grafana-*.yaml prom_grafana && cd prom_grafana
[root@master1 prom_grafana]# ll
-rw-rw-r-- 1 root root     344 Jan 24 18:14 grafana-config.yaml
-rw-rw-r-- 1 root root     680 Jan 24 18:14 grafana-dashboardDatasources.yaml
-rw-rw-r-- 1 root root 1549788 Jan 24 18:14 grafana-dashboardDefinitions.yaml
-rw-rw-r-- 1 root root     658 Jan 24 18:14 grafana-dashboardSources.yaml
-rw-rw-r-- 1 root root    9290 Jan 24 18:14 grafana-deployment.yaml
-rw-rw-r-- 1 root root     651 Jan 24 18:14 grafana-networkPolicy.yaml
-rw-rw-r-- 1 root root    1427 Jan 24 18:14 grafana-prometheusRule.yaml
-rw-rw-r-- 1 root root     293 Jan 24 18:14 grafana-serviceAccount.yaml
-rw-rw-r-- 1 root root     398 Jan 24 18:14 grafana-serviceMonitor.yaml
-rw-rw-r-- 1 root root     452 Jan 24 18:14 grafana-service.yaml

7.2. Modify the Service

7.2.1. Change the Service to NodePort

prom_grafana]# cat grafana-service.yaml 
spec:
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30030
  type: NodePort
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
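
YAML is sensitive to the space after a colon: `type:NodePort` (no space) parses as a plain scalar rather than the `type` field, and `kubectl apply` rejects the manifest. A crude grep (shown here on inline sample input; it can false-positive on URLs inside values) flags such lines before applying:

```shell
# Flag mapping lines where the colon is not followed by a space (or end of line)
printf 'spec:\n  type:NodePort\n  selector:\n' \
  | grep -nE '^[[:space:]]*[A-Za-z]+:[^ ]'   # prints 2:  type:NodePort
```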

7.3. Configure the offline image

# Original image
prom_grafana]# grep 'image: ' *
grafana-deployment.yaml:        image: grafana/grafana:9.3.2


# Pull the image and push it to the local registry
docker pull grafana/grafana:9.3.2
docker tag grafana/grafana:9.3.2 192.168.10.33:80/k8s/grafana:9.3.2
docker push 192.168.10.33:80/k8s/grafana:9.3.2

# Point the manifest at the local registry
sed -i 's#grafana/grafana:9.3.2#192.168.10.33:80/k8s/grafana:9.3.2#g' grafana-deployment.yaml

# Image after the change
prom_grafana]# grep 'image: ' *
grafana-deployment.yaml:        image: 192.168.10.33:80/k8s/grafana:9.3.2

7.4. Apply the manifests

]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_grafana/
secret/grafana-config configured
secret/grafana-datasources configured
configmap/grafana-dashboard-alertmanager-overview created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-grafana-overview created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes-darwin created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana configured
networkpolicy.networking.k8s.io/grafana created
prometheusrule.monitoring.coreos.com/grafana-rules created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created

7.5. Check the status

]# kubectl -n monitoring get pods -o wide 
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
alertmanager-main-0                   2/2     Running   0          3m33s   10.244.3.7      node1     <none>           <none>
alertmanager-main-1                   2/2     Running   0          3m32s   10.244.4.8      node2     <none>           <none>
alertmanager-main-2                   2/2     Running   0          3m31s   10.244.4.9      node2     <none>           <none>
blackbox-exporter-84bb6f6bd9-2tr2q    3/3     Running   0          2m25s   10.244.3.9      node1     <none>           <none>
grafana-7bdbdbcb4b-5d996              1/1     Running   0          13m     10.244.3.5      node1     <none>           <none>
kube-state-metrics-c7c57885f-scxdh    3/3     Running   0          118s    10.244.3.10     node1     <none>           <none>
node-exporter-27bgj                   2/2     Running   0          53s     192.168.10.27   master2   <none>           <none>
node-exporter-cnzhw                   2/2     Running   0          53s     192.168.10.30   node2     <none>           <none>
node-exporter-knqgv                   2/2     Running   0          53s     192.168.10.29   node1     <none>           <none>
node-exporter-qwbb6                   2/2     Running   0          53s     192.168.10.26   master1   <none>           <none>
prometheus-adapter-67d7695cb7-7wf9j   1/1     Running   0          2m41s   10.244.4.10     node2     <none>           <none>
prometheus-adapter-67d7695cb7-vbdkr   1/1     Running   0          2m41s   10.244.3.8      node1     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          18s     10.244.3.12     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          18s     10.244.4.11     node2     <none>           <none>
prometheus-operator-ffcc9958-2dbgn    2/2     Running   0          101s    10.244.3.11     node1     <none>           <none>

8. Remove the network policies

8.1. Why

To avoid debugging problems caused by NetworkPolicy restrictions, delete the policies for now. If they are needed later, study NetworkPolicy and add them back.

8.2. Delete the network policies in bulk

for i in `kubectl get networkpolicies -n monitoring | grep -v NAME | awk -F " " '{ print $1 }'`; do
  kubectl -n monitoring delete networkpolicies $i
done
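
The name extraction in the loop can be checked offline against a sample `kubectl get networkpolicies` table (sample rows below, not live output). Note that `kubectl -n monitoring delete networkpolicies --all` is an equivalent one-liner:

```shell
# Replay the grep/awk pipeline on sample output: header dropped, first column kept
printf '%s\n' \
  'NAME                POD-SELECTOR   AGE' \
  'alertmanager-main   <none>         1h' \
  'prometheus-k8s      <none>         1h' \
  | grep -v NAME | awk -F " " '{ print $1 }'
```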