Zabbix 6: Kubernetes monitoring metrics explained

Published 2023-11-30 10:55:46 | Author: 潇潇暮鱼鱼

1. Deployment metrics

1.1 Deployment replica mismatch: problem (alert) expression

min(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})>0
and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}])>=0
and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}])>=0

Explanation:

1) min(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})>0

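kube.deployment.replicas_mismatched is the number of mismatched replicas for the Deployment. {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD} is a macro defined in the template and set to #5, meaning the last 5 collected values. The update interval configured on the master item Kubernetes: Get state metrics is 1m, which overrides the server's default interval, so 5 collection cycles span 5 minutes.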

The detection window can be tuned per object by adding a context to {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD}. For example, {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:regex:"deployment:.*:.*"} = #5 sets the window to 5 minutes for all Deployments, and {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:default:nginx"} = #3 sets it to 3 minutes for the Deployment named nginx in the default namespace.

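So the first clause reads: over the last 5 minutes, the minimum replica mismatch count is greater than 0. In other words, every value collected in that window showed a mismatch.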
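The context lookup described above can be sketched in Python. This is only an illustration, not Zabbix code: it assumes that an exact context match is tried first, then regex contexts, then the macro without context, and that regex contexts are matched unanchored. The function name `resolve_eval_period`, the three tables, and the sample contexts `deployment:kube-system:coredns` and `statefulset:default:db` are hypothetical; the values #3 and #5 come from the examples in the text.

```python
import re

# Hypothetical macro tables mirroring the examples in the text.
EXACT = {"deployment:default:nginx": "#3"}   # {$MACRO:"deployment:default:nginx"}
REGEX = {r"deployment:.*:.*": "#5"}          # {$MACRO:regex:"deployment:.*:.*"}
BASE = "#5"                                  # {$MACRO} without context (template value)


def resolve_eval_period(context):
    """Return (value, rule) for a context such as 'deployment:default:nginx'.

    Assumed lookup order: exact context, then regex contexts, then base macro.
    """
    if context in EXACT:
        return EXACT[context], "exact"
    for pattern, value in REGEX.items():
        # Assumption: the regex is matched unanchored against the context.
        if re.search(pattern, context):
            return value, "regex"
    return BASE, "base"


for ctx in ("deployment:default:nginx",
            "deployment:kube-system:coredns",
            "statefulset:default:db"):
    print(ctx, resolve_eval_period(ctx))
```

With these tables, the nginx Deployment in default resolves to #3 through the exact rule, any other Deployment resolves to #5 through the regex rule, and contexts that match neither fall back to the base value.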

2) last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}])>=0

kube.deployment.replicas_desired is the desired number of replicas for the Deployment; it must be greater than or equal to 0.

3) last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}])>=0

kube.deployment.replicas_available is the number of available replicas for the Deployment; it must be greater than or equal to 0.
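The three clauses above can be combined into a small Python sketch to make the logic concrete. It is a simplified model, not how Zabbix evaluates triggers internally: the function name `replicas_mismatch_problem` and its parameters are hypothetical, the history is assumed to hold one value per minute (oldest first), and the case where fewer than n values exist is not modelled.

```python
def replicas_mismatch_problem(mismatched, desired, available, n=5):
    """Model of the problem expression.

    mismatched: collected replicas_mismatched values, oldest first
    desired, available: latest values of the two last() items
    n: number of values in the window, as in the macro value "#5"
    """
    window = mismatched[-n:]  # the last n collected values
    return min(window) > 0 and desired >= 0 and available >= 0


# Mismatch present in every one of the last 5 values: the trigger fires.
print(replicas_mismatch_problem([0, 1, 1, 1, 1, 1], desired=3, available=2))  # True

# One value in the window was 0, so the minimum is 0: no problem yet.
print(replicas_mismatch_problem([1, 1, 0, 1, 1], desired=3, available=3))  # False
```

Using min() over the window means a single healthy sample resets the countdown, which keeps short rollouts from raising an alert.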

1.2 Deployment replica mismatch: recovery expression

max(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})=0

and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}])>=0

and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}])>=0

Explanation:

1) max(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})=0

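Over the last 5 minutes, the maximum replica mismatch count of the Deployment is 0. That is, no value collected in that window showed a mismatch.

2) kube.deployment.replicas_desired is the desired number of replicas for the Deployment; it must be greater than or equal to 0.

3) kube.deployment.replicas_available is the number of available replicas for the Deployment; it must be greater than or equal to 0.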
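To see how the problem and recovery expressions work together, the following Python sketch replays a series of replicas_mismatched values and tracks the trigger state. It is a simplified model under stated assumptions: the function name `simulate_trigger` is hypothetical, the desired and available clauses are assumed to hold and are left out, a window of 3 values is used instead of 5 to keep the example short, and a problem is assumed to close only when the recovery expression is true and the problem expression is false.

```python
def simulate_trigger(mismatched_series, n=5):
    """Return the trigger state after each collected value."""
    state = "OK"
    history = []
    states = []
    for value in mismatched_series:
        history.append(value)
        window = history[-n:]           # the last n collected values
        problem = min(window) > 0       # problem expression (first clause)
        recovery = max(window) == 0     # recovery expression (first clause)
        if state == "OK" and problem:
            state = "PROBLEM"
        elif state == "PROBLEM" and recovery and not problem:
            state = "OK"
        states.append(state)
    return states


series = [0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
for value, state in zip(series, simulate_trigger(series, n=3)):
    print(value, state)
```

In this run the trigger opens only after three consecutive mismatched values and closes only after three consecutive zeros. The isolated 0 in the middle does not close it, which is the point of using max() in the recovery expression: it prevents flapping while the Deployment is still unstable.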