Installing metrics-server on Kubernetes

Published: 2023-04-07 03:34:50, author: lzjasd

Kubernetes version

[root@master v60]# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:39:03Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:32:22Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}

 


kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.0/components.yaml
This is the command from the official documentation.

wget -O metrics-server-v0.6.0.yaml https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.0/components.yaml


Edit the kube-apiserver static pod manifest and add the following flags to its command list:

vim /etc/kubernetes/manifests/kube-apiserver.yaml
  - command:
    - kube-apiserver
    - --requestheader-allowed-names="" # front-proxy-client
    - --enable-aggregator-routing=true
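
Before rebooting, it can help to confirm that both flags really ended up in the manifest. The following is a minimal sketch; check_apiserver_flags is a hypothetical helper name, and it only looks for the two flag lines shown above using a fixed-string grep.

```shell
#!/bin/sh
# Hypothetical helper: report whether the two aggregation-layer flags
# from these notes are present in the kube-apiserver manifest.
check_apiserver_flags() {
    # $1 = manifest path (defaults to the kubeadm location used above)
    manifest="${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}"
    for flag in --requestheader-allowed-names= --enable-aggregator-routing=true; do
        # -F: match the text literally; "--" stops grep option parsing
        if grep -qF -- "- $flag" "$manifest"; then
            echo "present: $flag"
        else
            echo "missing: $flag"
        fi
    done
}

# Usage on the master node:
#   check_apiserver_flags
```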

reboot   # restart all machines

metrics-server.tar metrics-server-v0.6.0.yaml


YAML arguments
Set the following container args in the manifest:
- --cert-dir=/tmp
- --kubelet-insecure-tls
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
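
The manifest can also be patched from the command line instead of by hand. This is only a sketch: add_insecure_tls is a hypothetical helper, and it assumes the downloaded manifest already contains a "- --cert-dir=/tmp" line, after which the extra flag is inserted with the same indentation.

```shell
#!/bin/sh
# Hypothetical helper: print the manifest with --kubelet-insecure-tls added
# right after the --cert-dir=/tmp argument (assumed to exist in the file).
add_insecure_tls() {
    # $1 = path to the downloaded manifest; the patched manifest goes to stdout
    if grep -qF -- '--kubelet-insecure-tls' "$1"; then
        cat "$1"    # flag already present, output the file unchanged
        return
    fi
    awk '
        { print }
        /^[ \t]*- --cert-dir=\/tmp[ \t]*$/ {
            indent = $0
            sub(/-.*$/, "", indent)    # keep only the leading whitespace
            print indent "- --kubelet-insecure-tls"
        }
    ' "$1"
}

# Usage:
#   add_insecure_tls metrics-server-v0.6.0.yaml > patched.yaml
#   kubectl apply -f patched.yaml
```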

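Requirement: the kubelet serving certificates must be signed by the cluster certificate authority (alternatively, certificate validation can be disabled by passing --kubelet-insecure-tls to Metrics Server, which is insecure).
With --kubelet-insecure-tls set, Metrics Server does not verify the CA of the serving certificates presented by the kubelets.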

Image: k8simage/metrics-server:v0.6.0

docker pull k8simage/metrics-server:v0.6.0
docker save -o metrics-server.tar k8simage/metrics-server:v0.6.0
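
The saved tarball still has to reach the other nodes. A rough sketch follows, under several assumptions not stated in these notes: root SSH access to each node, Docker as the image store on the nodes, and /tmp as a scratch directory. distribute_image is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: copy an image tarball to each node and load it there.
# Assumes root SSH access and that the nodes load images with docker.
distribute_image() {
    tarball="$1"; shift
    for node in "$@"; do
        scp "$tarball" "root@${node}:/tmp/" || return 1
        ssh "root@${node}" "docker load -i /tmp/$(basename "$tarball")" || return 1
    done
}

# Usage (node names as they appear in this cluster):
#   distribute_image metrics-server.tar node1 node3
```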

 

[root@master v60]# kubectl top pods -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)
coredns-5bbd96d687-lnsgj          3m           15Mi
coredns-5bbd96d687-rqrq5          4m           55Mi
etcd-master                       58m          149Mi
kube-apiserver-master             103m         500Mi
kube-controller-manager-master    58m          160Mi
kube-proxy-8n6vw                  1m           61Mi
kube-proxy-p7758                  1m           61Mi
kube-proxy-xwwpn                  1m           58Mi
kube-scheduler-master             7m           68Mi
metrics-server-848499c696-jlv6z   8m           67Mi
[root@master v60]# kubectl top nodes
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
master   710m         35%    2025Mi          52%
node1    231m         11%    880Mi           22%
node3    285m         14%    947Mi           24%
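
Once kubectl top works, its output is easy to post-process. As a small sketch, the hypothetical helper nodes_over_cpu below reads kubectl top nodes output on stdin and prints the nodes whose CPU% is above a given threshold.

```shell
#!/bin/sh
# Hypothetical helper: list nodes whose CPU% exceeds a threshold.
nodes_over_cpu() {
    # $1 = CPU% threshold; reads `kubectl top nodes` output on stdin
    awk -v limit="$1" '
        NR > 1 {                        # skip the header row
            pct = $3
            sub(/%$/, "", pct)          # "35%" -> "35"
            if (pct + 0 > limit + 0) print $1
        }
    '
}

# Usage:
#   kubectl top nodes | nodes_over_cpu 30
```

With the node figures shown above and a threshold of 30, only master would be listed.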

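Concepts
cAdvisor is built into the kubelet, so every cluster has it by default; its job is collecting per-container data. metrics-server is the replacement for the deprecated Heapster; its job is collecting cluster monitoring data by aggregating the metrics from all nodes.
In prometheus-stack, the component that exposes the metrics endpoint is kube-state-metrics. Here is a brief summary of how metrics-server and kube-state-metrics differ: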

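metrics-server (or Heapster before it) collects runtime metrics of containers, such as CPU and memory usage. This also explains why kubectl top reports an error when metrics-server is not installed. Its other core role is supplying the metrics that components such as the HPA base their decisions on.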
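
These resource metrics are served through the metrics.k8s.io/v1beta1 API group, which is what kubectl top reads. A minimal sketch for querying it directly follows; metrics_api_path is a hypothetical helper that only builds the request path, and the actual call is kubectl get --raw.

```shell
#!/bin/sh
# Hypothetical helper: build the Metrics API path for nodes or pods.
metrics_api_path() {
    # $1 = "nodes" or "pods"; $2 = optional namespace (pods only)
    if [ "$1" = "pods" ] && [ -n "${2:-}" ]; then
        echo "/apis/metrics.k8s.io/v1beta1/namespaces/$2/pods"
    else
        echo "/apis/metrics.k8s.io/v1beta1/$1"
    fi
}

# Usage (returns JSON):
#   kubectl get --raw "$(metrics_api_path nodes)"
#   kubectl get --raw "$(metrics_api_path pods kube-system)"
```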

kube-state-metrics focuses on the current state of Kubernetes resources such as Deployments and DaemonSets. The categories of metrics it provides include:

CronJob Metrics
DaemonSet Metrics
Deployment Metrics
Job Metrics
LimitRange Metrics
Node Metrics
PersistentVolume Metrics
PersistentVolumeClaim Metrics
Pod Metrics
Pod Disruption Budget Metrics
ReplicaSet Metrics
ReplicationController Metrics
ResourceQuota Metrics
Service Metrics
StatefulSet Metrics
Namespace Metrics
Horizontal Pod Autoscaler Metrics
Endpoint Metrics
Secret Metrics
ConfigMap Metrics

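kube-state-metrics was not folded into metrics-server because the two have fundamentally different concerns. metrics-server only fetches and formats existing resource-usage data and serves it through the Metrics API; it keeps that data in memory and does not write it to a storage backend. (Sending metrics to a backend such as InfluxDB or a cloud vendor, and displaying them in Grafana with InfluxDB as the data source, is what Heapster used to do.)
kube-state-metrics takes an in-memory snapshot of the cluster's state and derives new metrics from it. It exposes these metrics over HTTP but does not store them, so Prometheus is needed to scrape and aggregate them before they are displayed in Grafana.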

From another angle, kube-state-metrics could itself be a data source for metrics-server, although that is not how it is done today.

References

https://kuboard.cn/learning/

Kubernetes tutorial

https://kubesphere.io/zh/docs/v3.3

metrics-server

https://github.com/kubernetes-sigs/metrics-server
https://huangzhongde.cn/istio/Chapter4.html

https://blog.frognew.com/2023/01/kubeadm-install-kubernetes-1.26.html#22-%E4%BD%BF%E7%94%A8kubeadm-init%E5%88%9D%E5%A7%8B%E5%8C%96%E9%9B%86%E7%BE%A4

 

Metrics Server   Metrics API group/version   Supported Kubernetes version
0.6.x            metrics.k8s.io/v1beta1      1.19+
0.5.x            metrics.k8s.io/v1beta1      *1.8+
0.4.x            metrics.k8s.io/v1beta1      *1.8+
0.3.x            metrics.k8s.io/v1beta1      1.8-1.21
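
The 0.6.x row of the table can be turned into a quick pre-install check. This is a sketch only: supports_metrics_server_06 is a hypothetical helper that takes a GitVersion string such as the v1.26.2 shown by kubectl version above and tests for minor version 19 or newer.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Kubernetes version is 1.19+,
# the range listed for metrics-server 0.6.x in the table above.
supports_metrics_server_06() {
    # $1 = Kubernetes GitVersion such as v1.26.2
    minor=$(echo "$1" | sed -n 's/^v1\.\([0-9][0-9]*\)\..*/\1/p')
    [ -n "$minor" ] && [ "$minor" -ge 19 ]
}

# Usage:
#   supports_metrics_server_06 v1.26.2 && echo "0.6.x is supported"
```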

 

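I ran into a number of problems during this installation:

1: After editing kube-apiserver.yaml, I had to reboot the machine before the change took effect. (Normally the kubelet watches the static pod manifest directory and recreates the kube-apiserver pod on its own.)

2: After creating the metrics-server pod from the YAML, I hit the errors below. They only went away a day later, after I rebooted the machines again.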

Error: unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 10.96.0.1:443: i/o timeout

Warning Unhealthy 21s (x3 over 51s) kubelet Liveness probe failed: Get "https://192.168.166.133:4443/livez": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 1s (x6 over 81s) kubelet Liveness probe failed: Get "https://192.168.166.133:4443/livez": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 1s (x3 over 61s) kubelet Readiness probe failed: Get "https://192.168.166.133:4443/readyz": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

panic: unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 10.96.0.1:443: i/o timeout
/etc/kubernetes/pki/front-proxy-ca.crt

---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 44s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "f0777054e41807afa9dc2072c4dfac31c73b7eeb996391f1da76c826dd205166" network for pod "metrics-server-6985ffc44b-tjrqb": networkPlugin cni failed to set up pod "metrics-server-6985ffc44b-tjrqb_kube-system" network: plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized, failed to clean up sandbox container "f0777054e41807afa9dc2072c4dfac31c73b7eeb996391f1da76c826dd205166" network for pod "metrics-server-6985ffc44b-tjrqb": networkPlugin cni failed to teardown pod "metrics-server-6985ffc44b-tjrqb_kube-system" network: plugin type="calico" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized]
Normal SandboxChanged 3s (x4 over 44s) kubelet Pod sandbox changed, it will be killed and re-created.
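
When errors like these show up, a few standard kubectl queries narrow things down before resorting to a reboot. The sketch below assumes the upstream manifest's names: an APIService called v1beta1.metrics.k8s.io, and a Deployment metrics-server in kube-system whose pods carry the label k8s-app=metrics-server. diagnose_metrics_server is a hypothetical wrapper name.

```shell
#!/bin/sh
# Hypothetical wrapper around standard kubectl commands for a first look
# at a failing metrics-server. Resource names and labels are assumed to
# match the upstream components.yaml.
diagnose_metrics_server() {
    # Is the aggregated API registered and reported as Available?
    kubectl get apiservice v1beta1.metrics.k8s.io
    # Which node is the pod on, and is it Ready?
    kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide
    # Events such as probe failures or FailedCreatePodSandBox
    kubectl -n kube-system describe pods -l k8s-app=metrics-server
    # Recent container logs (panics, i/o timeouts)
    kubectl -n kube-system logs deployment/metrics-server --tail=50
}

# Usage:
#   diagnose_metrics_server
```

The i/o timeout against 10.96.0.1:443 and the calico "Unauthorized" sandbox error both point at pod networking rather than at metrics-server itself, so the events and the node placement are usually the first things worth reading.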