Base environment
Hostname | Spec | Role | OS version | IP | Installed components |
---|---|---|---|---|---|
220903master | 4 cores, 4 GB | master | openEuler 22.09 | 10.0.0.3 | apiserver, controller-manager, scheduler, kubelet, etcd, kube-proxy, container runtime, calico |
220904node1 | 4 cores, 4 GB | worker | openEuler 22.09 | 10.0.0.4 | kube-proxy, calico, coredns, container runtime, kubelet |
System initialization
Configure local name resolution
Run on every host in the cluster.
$ cat >> /etc/hosts << EOF
10.0.0.3 220903master
10.0.0.4 220904node1
EOF
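Because `cat >> /etc/hosts` appends unconditionally, rerunning this step duplicates the entries. A minimal sketch of an idempotent variant follows; the helper name `add_host_entry` and the scratch file are illustrative assumptions, not part of the original procedure.

```shell
# add_host_entry FILE IP NAME (hypothetical helper)
# Appends "IP NAME" to FILE only if that exact line is not already present.
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  grep -qxF "$ip $name" "$file" 2>/dev/null || echo "$ip $name" >> "$file"
}

# Demonstrate on a scratch file; point it at /etc/hosts on a real node.
hosts_file="$(mktemp)"
add_host_entry "$hosts_file" 10.0.0.3 220903master
add_host_entry "$hosts_file" 10.0.0.4 220904node1
add_host_entry "$hosts_file" 10.0.0.3 220903master   # repeated call adds nothing
cat "$hosts_file"
```

The exact-line match (`grep -x -F`) keeps the check simple; it would not catch an existing entry for the same host written with different spacing.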
Set up SSH trust between hosts
Run on every host in the cluster.
ssh-keygen
ssh-copy-id 220903master
ssh-copy-id 220904node1
Disable the firewall and SELinux
Run on every host in the cluster.
systemctl stop firewalld && systemctl disable firewalld
setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
Disable swap
Run on every host in the cluster.
sed -ri 's/^([^#].*swap.*)$/#\1/' /etc/fstab && grep swap /etc/fstab && swapoff -a && free -h
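The `sed` expression above comments out every uncommented `/etc/fstab` line that mentions swap. To see what it does before touching the real file, it can be tried on a scratch copy; the sample fstab contents below are made up for illustration.

```shell
# Scratch file standing in for /etc/fstab (contents are illustrative only).
fstab="$(mktemp)"
cat > "$fstab" <<'EOF'
/dev/mapper/openeuler-root /    ext4 defaults 1 1
/dev/mapper/openeuler-swap none swap defaults 0 0
#/dev/sdb1 none swap defaults 0 0
EOF

# Same expression as above: prefix '#' to every uncommented line mentioning swap.
sed -ri 's/^([^#].*swap.*)$/#\1/' "$fstab"
cat "$fstab"

# Count swap entries that are still active; 0 means none will be enabled at boot.
active_swap="$(grep -c '^[^#].*swap' "$fstab" || true)"
echo "active swap entries: $active_swap"
```

Lines that are already commented are left alone, so the command is safe to run more than once.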
Set kernel parameters
Run on every host in the cluster.
cat >> /etc/sysctl.conf <<EOF
vm.swappiness = 0
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
# Load the br_netfilter module (the net.bridge.* keys only exist once it is loaded)
modprobe br_netfilter
# Apply the settings
sysctl -p
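`modprobe br_netfilter` only loads the module for the current boot, so after a reboot the `net.bridge.*` settings would no longer apply. A minimal sketch of persisting both pieces follows, assuming a systemd-based host where `/etc/modules-load.d/` and `/etc/sysctl.d/` are read at boot; the file name `k8s.conf` and the helper `write_k8s_kernel_config` are hypothetical.

```shell
# write_k8s_kernel_config ROOT (hypothetical helper)
# Writes the module list and sysctl settings under ROOT; pass "" to target the real /etc.
write_k8s_kernel_config() {
  local root="$1"
  mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"
  # Modules listed here are loaded at boot by systemd-modules-load.
  echo "br_netfilter" > "$root/etc/modules-load.d/k8s.conf"
  # The same four settings as above, kept in a dedicated drop-in file.
  cat > "$root/etc/sysctl.d/k8s.conf" <<'EOF'
vm.swappiness = 0
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
}

# Dry run into a scratch directory so nothing on the host changes.
scratch="$(mktemp -d)"
write_k8s_kernel_config "$scratch"
cat "$scratch/etc/sysctl.d/k8s.conf"
```

On a real node this would be run as root with an empty ROOT, followed by `modprobe br_netfilter` and `sysctl --system` to apply the files immediately.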
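Install Docker
Run on every host in the cluster.
# Add the yum repository
yum-config-manager --add-repo https://mirrors.ustc.edu.cn/docker-ce/linux/centos/docker-ce.repo
# This repo file targets CentOS. Used as-is it fails, because $releasever expands to 22.09 on openEuler,
# so replace every $releasever with 7 before using it
sed -i 's/\$releasever/7/g' /etc/yum.repos.d/docker-ce.repo
yum makecache
# Install Docker
yum install docker-ce -y
# Enable Docker at boot
systemctl enable docker
# Start Docker
systemctl start docker
# Configure registry mirrors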
cat <<EOF > /etc/docker/daemon.json
{
"registry-mirrors": [
"https://docker.mirrors.ustc.edu.cn",
"https://hub-mirror.c.163.com",
"https://reg-mirror.qiniu.com",
"https://registry.docker-cn.com"
],
"exec-opts": ["native.cgroupdriver=systemd"],
"data-root": "/data/docker",
"log-driver": "json-file",
"log-opts": {
"max-size": "200m",
"max-file": "5"
}
}
EOF
# Restart Docker
systemctl restart docker
Install the latest kubeadm, kubelet, and kubectl
Run on every host in the cluster.
Configure the package repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
# Refresh the yum metadata
yum makecache
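Install kubelet, kubeadm, and kubectl
# This installs the newest packages in the repository. To match the kubernetesVersion used later in this guide,
# the packages can be pinned instead, e.g. kubelet-1.27.4 kubeadm-1.27.4 kubectl-1.27.4
yum install -y kubelet kubeadm kubectl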
Enable kubelet at boot
Do not start kubelet yet: the cluster has not been configured, so for now only enable it at boot.
systemctl enable kubelet
cri-dockerd
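Run on every host in the cluster.
Kubernetes removed dockershim in v1.24, and Docker Engine does not implement the CRI specification by default, so the two can no longer be integrated directly. To bridge the gap, Mirantis and Docker jointly created the cri-dockerd project, a shim that gives Docker Engine a CRI-compliant interface so that Kubernetes can drive Docker through CRI.
Project page: https://github.com/Mirantis/cri-dockerd
The cri-dockerd project provides RPM packages.
Install cri-dockerd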
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.4/cri-dockerd-0.3.4-3.el7.x86_64.rpm
yum install cri-dockerd-0.3.4-3.el7.x86_64.rpm -y
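Configure cri-dockerd
From within mainland China, cri-dockerd cannot pull the pause image it needs from k8s.gcr.io, so Pod sandboxes fail to start. cri-dockerd therefore has to be pointed at a domestic mirror.
Edit the service file to set the mirror: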
$ sed -ri 's@^(.*fd://).*$@\1 --pod-infra-container-image registry.aliyuncs.com/google_containers/pause@' /usr/lib/systemd/system/cri-docker.service
# Reload systemd, then restart and enable the service
$ systemctl daemon-reload && systemctl restart cri-docker && systemctl enable cri-docker
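The substitution above rewrites everything after `fd://` on the `ExecStart` line. A quick way to see its effect, and to confirm it is safe to rerun, is to apply it to a scratch file; the sample `ExecStart` line below is an assumption about what the RPM ships, so compare it with the real unit file.

```shell
# Scratch file holding an ExecStart line like the one in cri-docker.service (assumed content).
unit="$(mktemp)"
echo 'ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://' > "$unit"

# Same substitution as above, kept in a variable so it can be applied twice.
expr='s@^(.*fd://).*$@\1 --pod-infra-container-image registry.aliyuncs.com/google_containers/pause@'
sed -ri "$expr" "$unit"
first_pass="$(cat "$unit")"

# A second run replaces the text after fd:// with the same text, so the line is unchanged.
sed -ri "$expr" "$unit"
second_pass="$(cat "$unit")"

echo "$second_pass"
```

On a real node, `systemctl cat cri-docker | grep pod-infra-container-image` shows whether the running unit picked up the flag.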
Initialize on the master node
Run only on the master host.
Generate the initial configuration file
kubeadm config print init-defaults > kubeadm.yaml
Edit the configuration file
vim kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # Change to this master's IP address
  advertiseAddress: 10.0.0.3
  bindPort: 6443
nodeRegistration:
  # Change to the cri-dockerd socket
  criSocket: unix:///run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  # Change to this master's hostname
  name: 220903master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
# Change to the version you are installing
kubernetesVersion: 1.27.4
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  # Add the Pod IP range
  podSubnet: 10.244.0.0/16
scheduler: {}
# Append the following two documents at the end
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
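Setting `mode: ipvs` makes kube-proxy depend on the IPVS kernel modules, which this guide does not load explicitly. The sketch below checks a `/proc/modules`-style listing for them. The module list (`ip_vs`, `ip_vs_rr`, `ip_vs_wrr`, `ip_vs_sh`, `nf_conntrack`) is my understanding of what kube-proxy expects on recent kernels and should be checked against the kube-proxy documentation; the helper name `missing_ipvs_modules` and the sample listing are hypothetical.

```shell
# missing_ipvs_modules FILE (hypothetical helper)
# Prints each expected module that has no entry in a /proc/modules-style listing.
missing_ipvs_modules() {
  local listing="$1" mod
  for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
    # Each /proc/modules line starts with the module name followed by a space.
    grep -q "^$mod " "$listing" || echo "$mod"
  done
}

# Made-up listing in which only ip_vs and nf_conntrack are loaded.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
ip_vs 180224 0 - Live 0x0000000000000000
nf_conntrack 176128 1 ip_vs, Live 0x0000000000000000
EOF
missing_ipvs_modules "$sample"
```

On a real node the call would be `missing_ipvs_modules /proc/modules`, with `modprobe` run for anything it prints. Modules compiled into the kernel do not appear in `/proc/modules`, so a reported name is a hint rather than proof that something is missing.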
Initialize the cluster
kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification
Output similar to the following indicates that initialization succeeded:
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 7.502122 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node 220903master as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node 220903master as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.0.3:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:0deaa9ceed7266c28c5f5241ed9efea77c798055ebcc7a27dc03f6c97323c8a0
As the output instructs, create the configuration directory and copy the configuration file:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Join the worker node
Run only on the worker host.
kubeadm join 10.0.0.3:6443 \
--token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:0deaa9ceed7266c28c5f5241ed9efea77c798055ebcc7a27dc03f6c97323c8a0 \
--cri-socket unix:///run/cri-dockerd.sock
Note: you must add --cri-socket unix:///run/cri-dockerd.sock to specify the container runtime.
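The bootstrap token in `kubeadm.yaml` has a `ttl` of 24h, so a join command copied from the init output stops working after that. `kubeadm token create --print-join-command` on the master prints a fresh one, but as far as I know its output does not include the CRI socket. A small sketch of appending it follows; the helper name `with_cri_socket` is hypothetical, and the sample input reuses the command printed by `kubeadm init` above.

```shell
# with_cri_socket JOIN_COMMAND [SOCKET] (hypothetical helper)
# Appends --cri-socket to a kubeadm join command unless the flag is already present.
with_cri_socket() {
  local cmd="$1" sock="${2:-unix:///run/cri-dockerd.sock}"
  case "$cmd" in
    *--cri-socket*) echo "$cmd" ;;
    *) echo "$cmd --cri-socket $sock" ;;
  esac
}

# On the master, a fresh command would come from:
#   kubeadm token create --print-join-command
# Here the command from the init output stands in for it.
join_cmd='kubeadm join 10.0.0.3:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:0deaa9ceed7266c28c5f5241ed9efea77c798055ebcc7a27dc03f6c97323c8a0'
with_cri_socket "$join_cmd"
```

The printed line is what would be run as root on the worker.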
When the join finishes, check the nodes from the master:
kubectl get node
NAME STATUS ROLES AGE VERSION
220903master NotReady control-plane 9m26s v1.27.4
220904node1 NotReady <none> 5m41s v1.27.4
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
etcd-0 Healthy
controller-manager Healthy ok
scheduler Healthy ok
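Install the Calico Pod network
As shown above, the worker node has joined but both nodes are NotReady, so a Pod network must be installed.
Run on one master node only.
Download the manifest
The manifest is available online at: https://docs.projectcalico.org/manifests/calico.yaml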
wget https://docs.projectcalico.org/manifests/calico.yaml
Edit the manifest
Search for DaemonSet
and locate the containers section shown below.
If the host has multiple NICs, specify which one Calico should use:
- name: CLUSTER_TYPE
  value: "k8s,bgp"
# Auto-detect the BGP IP address.
- name: IP
  value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
  value: "Always"
# Specify the NIC here by adding the following two lines
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens33"
# Enable or Disable VXLAN on the default IP pool.
- name: CALICO_IPV4POOL_VXLAN
  value: "Never"
Change the CIDR to the Pod subnet chosen when the cluster was initialized.
It corresponds to either of:
- --pod-network-cidr=10.244.0.0/16
- podSubnet: 10.244.0.0/16
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
#
# These two lines are commented out by default: remove the leading # and change 192.168.0.0/16 to 10.244.0.0/16
- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
  value: "true"
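The CIDR change described above can also be scripted instead of edited by hand. The sketch below applies it to a scratch file; the commented-out lines and their indentation are an assumption about how the downloaded `calico.yaml` looks, so check them against your copy before running the same `sed` on the real manifest.

```shell
# Scratch file mimicking the commented-out block in calico.yaml (layout assumed).
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
            # - name: CALICO_IPV4POOL_CIDR
            #   value: "192.168.0.0/16"
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
EOF

pod_cidr="10.244.0.0/16"   # must match podSubnet in kubeadm.yaml

# Uncomment the variable and replace the default pool with the Pod subnet.
# Dropping "# " keeps the YAML indentation aligned with the neighbouring entries.
sed -i \
  -e 's|# - name: CALICO_IPV4POOL_CIDR|- name: CALICO_IPV4POOL_CIDR|' \
  -e "s|#   value: \"192.168.0.0/16\"|  value: \"$pod_cidr\"|" \
  "$manifest"
cat "$manifest"
```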
Install Calico
kubectl apply -f calico.yaml
Watch the status of each Pod with:
watch kubectl get pods --all-namespaces -o wide
Check node status
Once all Pods are running normally, check the node status:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
220903master Ready control-plane 96m v1.27.4
220904node1 Ready <none> 93m v1.27.4
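When scripting this check, it helps to reduce the `kubectl get nodes` table to the names of nodes that are not yet ready. A minimal sketch follows; the helper name `not_ready_nodes` is hypothetical, and the sample input is the output shown earlier, before Calico was installed.

```shell
# not_ready_nodes (hypothetical helper)
# Reads `kubectl get nodes` output on stdin and prints every node whose STATUS
# column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Sample input taken from the pre-Calico output above.
sample='NAME STATUS ROLES AGE VERSION
220903master NotReady control-plane 9m26s v1.27.4
220904node1 NotReady <none> 5m41s v1.27.4'
printf '%s\n' "$sample" | not_ready_nodes

# On the master: kubectl get nodes | not_ready_nodes
```

An empty result means every node reports Ready. As an alternative that blocks until then, `kubectl wait --for=condition=Ready nodes --all --timeout=300s` can be used.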
Test cluster networking
kubectl run busybox --image busybox:1.28 --restart=Never --rm -it busybox -- sh
/ # nslookup kubernetes.default.svc.cluster.local
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default.svc.cluster.local
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
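10.96.0.10 is the ClusterIP of CoreDNS, which shows that CoreDNS is working.
Internal Service names are resolved through CoreDNS.
Note:
Use busybox 1.28 specifically rather than the latest image: with the latest version, nslookup fails to resolve the DNS name and IP.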