Using Prometheus in k8s

Published 2023-11-02 11:33:13, author: 丁丁丁99

Usage

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus
# First deploy the kube-prometheus CRDs and create the monitoring namespace
kubectl apply -f manifests/setup/
# A plain apply at this step may fail with the following error:
# The CustomResourceDefinition "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

# If that happens, delete the resources and recreate them with create
kubectl delete -f manifests/setup/
kubectl create -f manifests/setup/

# Finally, deploy Prometheus and Grafana
kubectl apply -f manifests/
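
As an alternative to the delete-and-create workaround, the "Too long" error can usually be avoided with server-side apply: the size limit is hit by the kubectl.kubernetes.io/last-applied-configuration annotation that client-side apply writes, and server-side apply does not write it. The following is a minimal sketch, assuming it is run from the root of the kube-prometheus checkout. The function name install_kube_prometheus and the KUBECTL override variable are illustrative and not part of the project.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster. This variable is an assumption
# made for this sketch.
KUBECTL="${KUBECTL:-kubectl}"

install_kube_prometheus() {
  # Server-side apply does not store the last-applied-configuration
  # annotation, so the large CRDs stay under the annotation size limit.
  $KUBECTL apply --server-side -f manifests/setup &&
  # Wait until the CRDs are registered before creating objects that use them.
  $KUBECTL wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring &&
  # Deploy Prometheus, Grafana and the remaining components.
  $KUBECTL apply -f manifests/
}
```

Calling install_kube_prometheus runs the three steps in order and stops at the first one that fails.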

How it works

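To simplify managing Prometheus monitoring in Kubernetes, Prometheus Operator (a Kubernetes Operator) provides the ServiceMonitor custom resource. A ServiceMonitor declares how Prometheus should automatically discover Services and scrape their metrics, so scrape targets are configured declaratively instead of by editing the Prometheus configuration by hand.
A ServiceMonitor selects Services by label, and each Service in turn selects the Pods behind it, so the discovery chain is ServiceMonitor --> Service --> Pod.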

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-ovn-controller
  namespace: monitoring
spec:
  endpoints:
    - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      interval: 15s
      port: metrics
  namespaceSelector:
    matchNames:
      - kube-system
  selector:
    matchLabels:
      app: kube-ovn-controller

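ServiceMonitor fields:

  • interval: how often metrics are scraped
  • port: the name of the Service port that exposes the metrics endpoint
  • namespaceSelector: the namespace(s) in which the target Services live
  • selector: the labels that the target Services must carry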
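
When a ServiceMonitor does not produce any targets, it can help to check each link of the chain by hand. The sketch below lists the Services that a given label selector matches and the Endpoints behind them. The function name check_chain and the KUBECTL override variable are hypothetical helpers introduced for illustration only.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands. This variable is an assumption made for this sketch.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: check_chain <namespace> <label-selector>
check_chain() {
  ns="$1"
  selector="$2"
  # Services matched by the ServiceMonitor's selector. The PORT(S) and
  # SELECTOR columns show the exposed ports and the Pod selector.
  $KUBECTL get services -n "$ns" -l "$selector" -o wide
  # Endpoints objects carry the labels of their Service, so the same
  # selector lists the Pod addresses that would be scraped.
  $KUBECTL get endpoints -n "$ns" -l "$selector"
}
```

For the ServiceMonitor above, the call would be check_chain kube-system app=kube-ovn-controller. An empty Service list means the selector or namespaceSelector does not match, and an empty Endpoints list means the Service has no ready Pods behind it.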

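Viewing the collected data

Open the Prometheus web UI (it can be exposed through a NodePort) and click Status --> Targets to see the state of every Pod target being scraped.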
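
If exposing a NodePort is not convenient, a port-forward is another way to reach the UI. The sketch below assumes the default kube-prometheus Service name prometheus-k8s in the monitoring namespace and local port 9090. The function names and the KUBECTL and PROM_URL variables are illustrative.

```shell
# Both variables are assumptions made for this sketch and can be overridden.
KUBECTL="${KUBECTL:-kubectl}"
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Forward local port 9090 to the Prometheus Service (blocks until interrupted).
forward_prometheus() {
  $KUBECTL -n monitoring port-forward svc/prometheus-k8s 9090
}

# Build the URL of the HTTP API endpoint behind the Status --> Targets page.
targets_url() {
  printf '%s/api/v1/targets?state=active\n' "$PROM_URL"
}

# Fetch the active targets as JSON (requires curl and a running port-forward).
list_targets() {
  curl -s "$(targets_url)"
}
```

Run forward_prometheus in one terminal, then either open http://localhost:9090 in a browser or call list_targets in a second terminal.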

Pitfalls

1. Metrics are no longer collected after a Pod is moved to a custom namespace

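Background:
The Pod originally ran in the kube-system namespace and its metrics were collected normally. After it was moved to a custom namespace, the metrics stopped showing up. The prometheus-k8s Pod logs showed errors that pointed to missing permissions in the custom namespace.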

Solution:
Create a Role and a RoleBinding in the custom namespace, and bind the Role to the ServiceAccount monitoring/prometheus-k8s.
The following configuration can be used as a reference:

  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
      app.kubernetes.io/version: 2.47.2
    name: prometheus-k8s
    namespace: dwc # change this to your custom namespace
  rules:
  - apiGroups:
    - ""
    resources:
    - services
    - endpoints
    - pods
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - extensions
    resources:
    - ingresses
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - networking.k8s.io
    resources:
    - ingresses
    verbs:
    - get
    - list
    - watch
---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
      app.kubernetes.io/version: 2.47.2
    name: prometheus-k8s
    namespace: dwc # change this to your custom namespace
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: prometheus-k8s
  subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: monitoring # note: this stays monitoring, where the ServiceAccount lives
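
After applying the Role and RoleBinding (for example with kubectl apply -f on a file containing the manifest above), the permissions can be checked without waiting for the next scrape. The sketch below impersonates the Prometheus ServiceAccount with kubectl auth can-i. It assumes the current user is allowed to impersonate ServiceAccounts (cluster-admin normally is). The function name check_prometheus_rbac and the KUBECTL variable are hypothetical.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands. This variable is an assumption made for this sketch.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: check_prometheus_rbac <namespace>
check_prometheus_rbac() {
  ns="$1"
  sa="system:serviceaccount:monitoring:prometheus-k8s"
  for resource in services endpoints pods; do
    for verb in get list watch; do
      # Label each answer, then let kubectl print "yes" or "no".
      printf '%s %s: ' "$verb" "$resource"
      $KUBECTL auth can-i "$verb" "$resource" --as="$sa" -n "$ns"
    done
  done
}
```

Running check_prometheus_rbac dwc should answer yes for all nine verb and resource combinations once the RoleBinding is in place. Any no indicates that the Role, the RoleBinding, or the ServiceAccount namespace does not match.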