1. Business deployment overview
We build algorithms; each algorithm is packaged as an image, e.g. image1 (intrusion detection algorithm), image2 (seatbelt detection algorithm).
The k8s flow: ingress-nginx (deployed with hostNetwork: true, which skips one layer of Service forwarding) is reachable directly via its domain and port ——> the client assembles the access URL from the host + path published by the Ingress plus the service's own API path ——> ingress-nginx forwards to the backend Service according to its matching rules ——> Deployment ——> Pods (PVC ——> PV).
In short: the client calls a fixed API format, and ingress-nginx proxies the request to a different backend Service depending on the match result, providing the compute capability.
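As a rough sketch of the client side of this flow (the host, path prefix, and API path below are hypothetical placeholders, not values from the real deployment), the URL assembly works like this:

```shell
# Hypothetical example values -- substitute the host/path your Ingress publishes
# and the API path your algorithm image defines.
INGRESS_HOST="ai.example.com"   # host from the Ingress rule
INGRESS_PREFIX="/intrusion"     # path prefix matched by ingress-nginx
API_PATH="/v1/detect"           # interface exposed by the algorithm service

# Because ingress-nginx runs with hostNetwork: true, the request hits the
# controller on the node's own port directly; there is no extra Service hop.
URL="http://${INGRESS_HOST}${INGRESS_PREFIX}${API_PATH}"
echo "$URL"   # prints http://ai.example.com/intrusion/v1/detect
```

A client would then send its payload to this URL; ingress-nginx matches the host and path prefix and proxies the request to the corresponding algorithm Service.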
2. Completely uninstall k8s
# First delete all nodes from the cluster (this removes the pods running on them)
kubectl delete node --all
# Stop all k8s services with a loop (kubectl is a client binary, not a service, so it is not listed)
for service in kube-apiserver kube-controller-manager kubelet etcd kube-proxy kube-scheduler;
do
systemctl stop $service
done
# Reset the node with kubeadm
kubeadm reset -f
# Remove the k8s packages
yum -y remove kube*
# Unload the ipip tunnel kernel module and confirm with lsmod that it is gone
modprobe -r ipip
lsmod
# Then manually delete the config files, the flannel network config, and the flannel interfaces:
rm -rf /etc/cni
rm -rf /root/.kube
# Delete the cni network interfaces
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
# Delete leftover config files
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/multi-user.target.wants/kubelet.service
rm -rf /var/lib/kubelet
rm -rf /usr/libexec/kubernetes/kubelet-plugins
rm -rf /usr/bin/kube*
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
# Clean and rebuild the yum cache
yum clean all
yum makecache
3. Preparation for installing k8s
3.1 Server allocation
k8master: 172.16.4.58 # the master runs in a VM and hosts no business workloads
k8node1: 172.16.3.199 # physical machine (GPU 2060), runs the AI business workloads
OS: CentOS 7.8
k8s version: v1.23.0
docker version: 19.03.8
ingress version:
3.2 Configure hostname resolution (all nodes)
[root@k8master ~]# cat /etc/hosts
#append
172.16.4.58 k8master
172.16.3.199 k8node1
3.3 Set the hostname (all nodes)
#run on the k8master node
hostnamectl set-hostname k8master
#run on the k8node1 node
hostnamectl set-hostname k8node1
3.4 Install a time server (all nodes)
yum -y install bash-completion chrony iotop sysstat
## write the chrony config and start the time service
cat > /etc/chrony.conf <<EOF
server ntp.aliyun.com iburst
stratumweight 0
driftfile /var/lib/chrony/drift
rtcsync
makestep 10 3
bindcmdaddress 127.0.0.1
bindcmdaddress ::1
keyfile /etc/chrony.keys
commandkey 1
generatecommandkey
logchange 0.5
logdir /var/log/chrony
EOF
systemctl enable chronyd
systemctl start chronyd
3.5 Disable the SELinux and firewalld services (all nodes)
#stop firewalld
systemctl stop firewalld
systemctl disable firewalld
#disable selinux
sed -i 's/enforcing/disabled/' /etc/selinux/config # takes effect after reboot
3.6 Disable the swap partition (all nodes)
#temporary (until reboot)
swapoff -a
#permanent: comment out the swap line in fstab
sed -i 's/.*swap.*/#&/' /etc/fstab
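The sed expression can be sanity-checked on a sample line before touching the real /etc/fstab (the device path below is made up for illustration):

```shell
# '&' in the replacement re-inserts the entire matched line after the '#',
# so any line mentioning swap is commented out rather than deleted.
line="/dev/mapper/centos-swap swap swap defaults 0 0"
commented=$(printf '%s\n' "$line" | sed 's/.*swap.*/#&/')
echo "$commented"   # prints #/dev/mapper/centos-swap swap swap defaults 0 0
```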
3.7 Enable bridge filtering and IP forwarding (all nodes)
cat > /etc/sysctl.d/kubernetes.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# then run this to apply the settings
sysctl --system
3.8 Install docker (all nodes)
#install dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
#add the docker repo
yum-config-manager --add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
#install docker ce (the version our workloads use; choose your own version as needed)
yum install -y containerd.io-1.2.13 docker-ce-19.03.8 docker-ce-cli-19.03.8
#create the docker config directory
mkdir /etc/docker
#write the docker daemon file
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"graph": "/data/docker_storage",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
],
"insecure-registries" : ["172.168.4.99:7090","152.199.254.168:7090"],
"registry-mirrors": ["https://g427vmjy.mirror.aliyuncs.com"],
"live-restore": true
}
EOF
#open docker's API listen port
sed -i 's/^ExecStart.*/#&/' /lib/systemd/system/docker.service
sed -i '15i ExecStart=/usr/bin/dockerd -H tcp://localhost:2375 -H unix:///var/run/docker.sock -H fd:// --containerd=/run/containerd/containerd.sock' /lib/systemd/system/docker.service
#start docker and enable it on boot
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
{
// Note: daemon.json must contain this line to pin the cgroup driver to systemd (so it matches kubelet); the rest can be configured to suit your workload
"exec-opts": ["native.cgroupdriver=systemd"],
}
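The two sed edits to docker.service above can be rehearsed on a throwaway file first. This sketch uses a minimal stand-in unit file (the real docker.service is longer, which is why the original inserts at a fixed line number 15):

```shell
# Minimal stand-in for /lib/systemd/system/docker.service.
cat > /tmp/docker.service.sample <<'EOF'
[Service]
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
EOF

# Same idea as above: comment out the old ExecStart, then add the new one
# right after it (appending after the match instead of at line 15, since this
# sample file is shorter than the real unit file).
sed -i 's/^ExecStart.*/#&/' /tmp/docker.service.sample
sed -i '/^#ExecStart/a ExecStart=/usr/bin/dockerd -H tcp://localhost:2375 -H unix:///var/run/docker.sock -H fd:// --containerd=/run/containerd/containerd.sock' /tmp/docker.service.sample

grep -c '^ExecStart' /tmp/docker.service.sample   # prints 1: exactly one active ExecStart line
```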
3.9 Switch the kubernetes package repo to a domestic mirror (all nodes)
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
3.10 Install pinned versions of kubeadm, kubelet and kubectl (1.23.0 is used here)
yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0
# enable kubelet on boot
systemctl enable kubelet
3.11 Change kubelet's container data directory (optional; skip if not needed) (all nodes)
#create the directory
mkdir /data/kubelet
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# add --root-dir=/data/kubelet/ to point kubelet at your own directory:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --root-dir=/data/kubelet/ --kubeconfig=/etc/kubernetes/kubelet.conf"
#apply the change
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet
4. Deploy the kubernetes cluster
4.1 Override the k8s image registry (the initialization commands below run on the master node only)
(1) First override kubeadm's image registry: the default registry is outside China and unreachable, so it must be replaced with a domestic mirror. List the images the cluster needs during setup:
[root@k8master ~]# kubeadm config images list
I0418 18:26:04.047449 19242 version.go:255] remote version is much newer: v1.27.1; falling back to: stable-1.23
k8s.gcr.io/kube-apiserver:v1.23.17
k8s.gcr.io/kube-controller-manager:v1.23.17
k8s.gcr.io/kube-scheduler:v1.23.17
k8s.gcr.io/kube-proxy:v1.23.17
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6
(2) Switch the list to the Aliyun registry
[root@k8master ~]# kubeadm config images list --image-repository registry.aliyuncs.com/google_containers
I0418 18:28:18.740057 20021 version.go:255] remote version is much newer: v1.27.1; falling back to: stable-1.23
registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.17
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.17
registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.17
registry.aliyuncs.com/google_containers/kube-proxy:v1.23.17
registry.aliyuncs.com/google_containers/pause:3.6
registry.aliyuncs.com/google_containers/etcd:3.5.1-0
registry.aliyuncs.com/google_containers/coredns:v1.8.6
(3) Then pull the images manually, so initialization goes faster (alternatively, pull them directly through docker: once a domestic registry mirror is configured, docker pulls them quickly)
[root@k8master ~]# kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
I0418 18:28:31.795554 20088 version.go:255] remote version is much newer: v1.27.1; falling back to: stable-1.23
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.6
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.1-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.8.6
(4) Initialize kubernetes (run the init command on the master node only)
# Initialize Kubernetes, specifying the network CIDRs and the image registry (worker nodes can be appended later with the join command)
kubeadm init \
  --apiserver-advertise-address=172.16.4.58 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --ignore-preflight-errors=all
# --apiserver-advertise-address  cluster advertise address (the master machine's IP)
# --image-repository  the default registry k8s.gcr.io is unreachable from China, so the Aliyun mirror registry is specified here
# --kubernetes-version  the K8s version; must match what was installed above
# --service-cidr  the cluster's internal virtual network (the unified access entry for Pods via Services); the value above can be kept as-is
# --pod-network-cidr  the Pod network; must match the CNI component yaml deployed below; the value above can be kept as-is
# After it finishes, run the commands it prints by hand (in particular, copy down the join command for adding nodes)
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.4.58:6443 --token nnzdrq.1mqngk9jnh88nkyn \
--discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
(5) Run the printed commands
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
vim /root/.bash_profile
#append at the end
# kubeconfig for the superuser
export KUBECONFIG=/etc/kubernetes/admin.conf
# set an alias
alias k=kubectl
# enable kubectl command completion
source <(kubectl completion bash)
#apply
source /root/.bash_profile
(6) Copy and keep this command (the join command printed after a successful k8s init). Worker nodes run it later to join the master, but only after calico or Flannel has been configured. Note that the bootstrap token expires after 24 hours by default; a fresh join command can be printed on the master with `kubeadm token create --print-join-command`.
kubeadm join 172.16.4.58:6443 --token nnzdrq.1mqngk9jnh88nkyn \
--discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
4.2 Deploy the k8s network with the calico plugin (run on k8master only)
(1) Download the calico.yaml file
wget https://docs.projectcalico.org/v3.20/manifests/calico.yaml --no-check-certificate
(2) Modify calico.yaml
vim /data/calico.yaml
#additions and changes to calico.yaml, explained below
- name: IP_AUTODETECTION_METHOD
value: interface=ens192
Purpose: tells the Calico plugin to auto-detect container IP addresses on the ens192 network interface. Calico will look for a usable IP address on that interface and use it for container network communication (adjust the interface name to match your hosts).
- name: CALICO_IPV4POOL_CIDR
value: "10.244.0.0/16"
Purpose: Calico will allocate container addresses from the range 10.244.0.0 to 10.244.255.255 (a CIDR /16 means a 16-bit network prefix). This value must match the one passed to:
kubeadm init \
--apiserver-advertise-address=172.16.4.58 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all
i.e. it must stay consistent with its --pod-network-cidr=10.244.0.0/16.
Change:
apiVersion: policy/v1beta1
to:
apiVersion: policy/v1
Reason: policy/v1 is the stable version of this API (it covers the PodDisruptionBudget used in the manifest); policy/v1beta1 is deprecated and removed in newer Kubernetes releases, so v1 is the recommended choice on recent clusters.
(3) Apply the calico network
kubectl apply -f calico.yaml
(4) Check that calico is running
[root@k8master data]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5b9cd88b65-vg4gq 1/1 Running 3 (17h ago) 21h
calico-node-86l8c 1/1 Running 2 (17h ago) 21h
calico-node-lg2mg 1/1 Running 0 21h
coredns-6d8c4cb4d-wm8d2 1/1 Running 0 23h
coredns-6d8c4cb4d-xxdmm 1/1 Running 0 23h
(5) Download the calicoctl tool
#github address
https://github.com/projectcalico/calicoctl/releases/tag/v3.20.6
mv calicoctl-linux-amd64 calicoctl
chmod +x calicoctl
mv calicoctl /usr/bin/
#run `calicoctl node status`; seeing "up" means the mesh is established
[root@k8master data]# calicoctl node status
Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 172.16.3.199 | node-to-node mesh | up    | 12:15:08 | Established |
+--------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.
At this point the k8s master node is fully deployed!
5. Join the k8node1 worker node to the cluster (run the following on k8node1, the worker)
5.1 Install the nvidia driver (choose the driver that matches your own hardware and workload)
# Disable the system's Nouveau driver
sed -i "s/blacklist nvidiafb/#&/" /usr/lib/modprobe.d/dist-blacklist.conf
cat >> /usr/lib/modprobe.d/dist-blacklist.conf <<EOF
blacklist nouveau
options nouveau modeset=0
EOF
# Back up the system initramfs image and rebuild it
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)
reboot
# After the reboot, verify Nouveau is disabled (the command should produce no output)
lsmod | grep nouveau
# Install the driver (complete --kernel-source-path by hand; do not copy it verbatim)
sh /data/nvidia-drive/NVIDIA-Linux-x86_64-440.82.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.102.1.el7.x86_64/ -k $(uname -r)
#Note: check the kernel source directory manually for --kernel-source-path
# The remaining steps are interactive and omitted here
5.2 Install nvidia-docker2 so k8s can use the GPU
# Add the nvidia-docker repo and install nvidia-container-toolkit && nvidia-container-runtime
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
yum install -y nvidia-container-toolkit nvidia-container-runtime
# Install nvidia-docker2, which reloads the Docker daemon configuration.
# Installation overwrites /etc/docker/daemon.json, so back it up first and
# merge your settings into the newly generated file.
yum install -y nvidia-docker2
5.3 Merge the daemon.json files
[root@k8node1 ~]# cat /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"],
"graph": "/data/docker_storage",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
],
"insecure-registries" : ["172.168.4.90:8090","152.188.254.169:8090"],
"registry-mirrors": ["https://g427vmjy.mirror.aliyuncs.com"],
"live-restore": true,
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
}
5.4 Check that the GPU can be called; output like the following indicates success
[root@k8node1 aibox-ai-server]# nvidia-smi
Thu Nov 9 14:17:29 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2060 Off | 00000000:B3:00.0 Off | N/A |
| 0% 38C P8 3W / 160W | 0MiB / 5934MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
5.5 Restart the docker and kubelet services
systemctl restart docker
systemctl restart kubelet
5.6 Join k8node1 to the k8master master node (run on the k8node1 worker)
kubeadm join 172.16.4.58:6443 --token nnzdrq.1mqngk9jnh88nkyn --discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
5.7 Check that the join succeeded (run on the k8master node)
#give k8node1 a role label (run on k8master)
kubectl label nodes k8node1 node-role.kubernetes.io/work=work
[root@k8master data]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8master Ready control-plane,master 23h v1.23.0 172.16.4.58 <none> CentOS Linux 7 (Core) 3.10.0-1127.el7.x86_64 docker://19.3.8
k8node1 Ready work 23h v1.23.0 172.16.3.199 <none> CentOS Linux 7 (Core) 3.10.0-1160.102.1.el7.x86_64 docker://19.3.8
As shown above, k8node1 has joined the cluster.
5.8 Remove a worker node (run on the master)
# kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
# <node name> is the node name shown by `kubectl get nodes`
# Example: removing a worker called node3
[root@node1 home]# kubectl drain node3 --delete-local-data --force --ignore-daemonsets
[root@node1 home]# kubectl delete node node3
[root@node3 home]# # reset k8s on the removed node itself
[root@node3 home]# kubeadm reset
6. Deploy a k8s dashboard (KubePI is used here)
KubePI is a simple and efficient graphical management tool for k8s clusters; it makes day-to-day cluster management and log-based troubleshooting quick.
6.1 Deploy KubePI (any node will do; it is deployed on the master here):
[root@k8master ~]# docker pull kubeoperator/kubepi-server
[root@k8master ~]# # run the container
[root@k8master ~]# docker run --privileged -itd --restart=unless-stopped --name kube_dashboard -v /home/docker-mount/kubepi/:/var/lib/kubepi/ -p 8000:80 kubeoperator/kubepi-server
Address: http://172.16.4.58:8000  default username: admin  default password: kubepi
6.2 Fill in the cluster name, keep the default authentication mode, and enter the apiserver address and token
6.3 Get the IP address and login token
[root@k8master ~]# # create a service account on the k8s master and fetch its token
[root@k8master ~]# kubectl create sa k8admin --namespace kube-system
serviceaccount/k8admin created
[root@k8master ~]# kubectl create clusterrolebinding k8admin --clusterrole=cluster-admin --serviceaccount=kube-system:k8admin
clusterrolebinding.rbac.authorization.k8s.io/k8admin created
[root@k8master ~]# # fetch the token of the newly created k8admin user
[root@k8master ~]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep k8admin | awk '{print $1}') | grep token: | awk '{print $2}'
eyJhbGciOiJSUzI1NiIsImtpZCI6IkhVeUtyc1BpU1JvRnVacXVqVk1PTFRkaUlIZm1KQTV6Wk9WSExSRllmd0kifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcGktdXNlci10b2tlbi10cjVsMiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcGktdXNlciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjJiYzlhZDRjLWVjZTItNDE2Mi04MDc1LTA2NTI0NDg0MzExZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcGktdXNlciJ9.QxkR1jBboqTYiVUUVO4yGhfWmlLDA5wHLo_ZnjAuSLZQDyVevCgBluL6l7y7UryRdId6FmBZ-L0QitvOuTsurcjGL2QHxPE_yZsNW7s9K7eikxJ8q-Q_yOvnADtAueH_tcMGRGW9Zyec2TlmcGTZCNaNUme84TfMlWqX7oP3GGJGMbMGN7H4fPXh-Qqrdp-0MJ3tP-dk3koZUEu3amrq8ExSmjIAjso_otrgFWbdSOMkCXKsqb9yuZzaw7u5Cy18bH_HW6RbNCRT5jGs5aOwzuMAd0HQ5iNm-5OISI4Da6jGdjipLXejcC1H-xWgLlJBx0RQWu41yoPNF57cG1NubQ
[root@k8master ~]# # fetch the apiserver address on the master
[root@k8master ~]# cat ~/.kube/config | grep server: | awk '{print $2}'
https://172.16.4.58:6443
6.4 After confirming, the dashboard is visible
7. Install the metrics cluster-monitoring plugin
https://zhuanlan.zhihu.com/p/572406293
Reference document for the overall k8s deployment approach:
https://zhuanlan.zhihu.com/p/627310856?utm_id=0
At this point all the k8s-related content has been deployed!
8. Deploy the business services
8.1 Contents and flow
Create the PV -> create the PVC -> create the aiserver service -> create the Service -> create the Ingress proxy
8.2 The yaml files for each step
(1) pv.yaml
[root@k8master new]# cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-aimodel
  labels:
    pv: aimodel
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/aibox-common/aimodel
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-common
  labels:
    pv: common
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/aibox-common/common
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-ai-logs
  labels:
    pv: ai-logs
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/aibox-common/ai-server/logs
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-ai-dmi
  labels:
    pv: ai-dmi
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /sys/firmware/dmi/
---
(2) pvc.yaml
[root@k8master new]# cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-aimodel
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-common
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-ai-logs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-ai-dmi
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
---
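Note that these claims bind to the PVs only by matching capacity and access mode. Since the PVs already carry labels (pv: aimodel, pv: common, and so on), a selector can optionally be added to pin each claim to its intended volume; a sketch for the aimodel claim:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-aimodel
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  selector:            # optional: bind only to PVs labeled pv: aimodel
    matchLabels:
      pv: aimodel
```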
(3) ai.yaml + service.yaml (the ai service is deployed to the k8node1 node, the business node that has the GPU)
[root@k8master new]# cat ai.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: ai-svc
  labels:
    app: ai
spec:
  type: NodePort
  ports:
    - port: 28865
      targetPort: 28865
      nodePort: 31000
  selector:
    app: ai
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai
  #namespace: kube-fjyd
spec:
  replicas: 5
  selector:
    matchLabels:
      app: ai
  template:
    metadata:
      labels:
        app: ai
    spec:
      nodeName: k8node1  # pin to k8node1: the business node with the GPU
      containers:
        - name: ai
          image: 172.168.4.60:8090/rz4.5.0.0/aiserver:v4.5.3.0019_v4.5.15.16  # private business image, not publicly pullable
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 28865
          volumeMounts:
            - name: logs
              mountPath: /home/nvidia/aibox/logs
            - name: aimodel
              mountPath: /home/nvidia/aibox/aimodel
            - name: common
              mountPath: /home/nvidia/aibox/common
            - name: dmi
              mountPath: /mnt/sys/firmware/dmi
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
          resources:
            limits:
              nvidia.com/gpu: 1  # request one GPU
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: pvc-ai-logs
        - name: aimodel
          persistentVolumeClaim:
            claimName: pvc-aimodel
        - name: common
          persistentVolumeClaim:
            claimName: pvc-common
        - name: dmi
          persistentVolumeClaim:
            claimName: pvc-ai-dmi
        - name: localtime
          hostPath:
            path: /etc/localtime
            type: ""
      restartPolicy: Always
---
9. Deploy ingress
9.1 What is Ingress
Ingress is an API object that manages external access to the services in a cluster, typically over HTTP. Ingress can also provide load balancing.
Ingress exposes HTTP and HTTPS routes from outside the cluster to Services inside it. Traffic routing is controlled by the rules defined on the Ingress resource.
Below is a simple Ingress example that sends all traffic to a single Service:
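A minimal Ingress of that shape might look like the following (the host is a placeholder; ai-svc and port 28865 are the Service defined in section 8.2; ingressClassName must match the nginx IngressClass created by deploy.yaml):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ai-ingress
spec:
  ingressClassName: nginx       # IngressClass installed by deploy.yaml
  rules:
    - host: ai.example.com      # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ai-svc    # Service from section 8.2
                port:
                  number: 28865
```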
9.2 Deploy the Ingress-nginx controller
deploy.yaml pitfalls:
The Ingress-nginx site https://kubernetes.github.io/ingress-nginx/ points to a deploy.yaml file.
Newer versions of deploy.yaml differ somewhat and require pulling these two images:
k8s.gcr.io/ingress-nginx/controller:v1.1.2
k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.1.1
They usually cannot be pulled directly, so replace them yourself; equivalent images can be found on docker hub, for example these two:
[root@k8master ingress]# docker images | egrep "longjianghu|liangjw"
longjianghu/ingress-nginx-controller   v1.1.2   7e5c1cecb086   20 months ago   286MB    #k8s.gcr.io/ingress-nginx/controller:v1.1.2
liangjw/kube-webhook-certgen           v1.1.1   c41e9fcadf5a   2 years ago     47.7MB   #k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.1.1
9.3 Summary of pitfalls:
(1) Newer versions introduce IngressClass, which must be specified when writing an Ingress
(2) The images in deploy.yaml cannot be pulled; replace them with the two images above
(3) Deploying ingress-nginx-controller with hostNetwork: true saves one forwarding hop compared with NodePort, but the controller only schedules onto nodes matching the selector (nodeSelector: app: ingress), so label the chosen node first, e.g. kubectl label node k8node1 app=ingress
9.4 My modified deploy.yaml, for reference
[root@k8master ingress]# cat deploy.yaml #long file, folded
apiVersion: v1 kind: Namespace metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx name: ingress-nginx --- apiVersion: v1 automountServiceAccountToken: true kind: ServiceAccount metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx namespace: ingress-nginx --- apiVersion: v1 kind: ServiceAccount metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx namespace: ingress-nginx rules: - apiGroups: - "" resources: - namespaces verbs: - get - apiGroups: - "" resources: - configmaps - pods - secrets - endpoints verbs: - get - list - watch - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io resources: - ingressclasses verbs: - get - list - watch - apiGroups: - "" resourceNames: - 
ingress-controller-leader resources: - configmaps verbs: - get - update - apiGroups: - "" resources: - configmaps verbs: - create - apiGroups: - "" resources: - events verbs: - create - patch --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission namespace: ingress-nginx rules: - apiGroups: - "" resources: - secrets verbs: - get - create --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - endpoints - nodes - pods - secrets - namespaces verbs: - list - watch - apiGroups: - "" resources: - nodes verbs: - get - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses verbs: - get - list - watch - apiGroups: - "" resources: - events verbs: - create - patch - apiGroups: - networking.k8s.io resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io resources: - ingressclasses verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx 
app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission rules: - apiGroups: - admissionregistration.k8s.io resources: - validatingwebhookconfigurations verbs: - get - update --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole 
name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: v1 data: allow-snippet-annotations: "true" kind: ConfigMap metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-controller namespace: ingress-nginx --- apiVersion: v1 kind: Service metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-controller namespace: ingress-nginx spec: externalTrafficPolicy: Local ipFamilies: - IPv4 ipFamilyPolicy: SingleStack ports: - appProtocol: http name: http port: 80 protocol: TCP targetPort: http - appProtocol: https name: https port: 443 protocol: TCP targetPort: https selector: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx type: LoadBalancer --- apiVersion: v1 
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-controller-admission
  namespace: ingress-nginx
spec:
  ports:
  - appProtocol: https
    name: https-webhook
    port: 443
    targetPort: webhook
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  minReadySeconds: 0
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: controller
      app.kubernetes.io/instance: ingress-nginx
      app.kubernetes.io/name: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/name: ingress-nginx
    spec:
      hostNetwork: true      # run ingress-nginx-controller in hostNetwork mode
      nodeSelector:          # schedule only onto linux nodes labeled app=ingress
        app: ingress
        kubernetes.io/os: linux
      containers:
      - args:
        - /nginx-ingress-controller
        - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
        - --election-id=ingress-controller-leader
        - --controller-class=k8s.io/ingress-nginx
        - --ingress-class=nginx
        - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
        - --validating-webhook=:8443
        - --validating-webhook-certificate=/usr/local/certificates/cert
        - --validating-webhook-key=/usr/local/certificates/key
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: LD_PRELOAD
          value: /usr/local/lib/libmimalloc.so
        image: longjianghu/ingress-nginx-controller:v1.1.2   # mirrored image address
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - /wait-shutdown
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: controller
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
        - containerPort: 8443
          name: webhook
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 90Mi
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          runAsUser: 101
        volumeMounts:
        - mountPath: /usr/local/certificates/
          name: webhook-cert
          readOnly: true
      dnsPolicy: ClusterFirst
      serviceAccountName: ingress-nginx
      terminationGracePeriodSeconds: 300
      volumes:
      - name: webhook-cert
        secret:
          secretName: ingress-nginx-admission
---
apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    helm.sh/hook: pre-install,pre-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    app.kubernetes.io/component: admission-webhook
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-admission-create
  namespace: ingress-nginx
spec:
  template:
    metadata:
      labels:
        app.kubernetes.io/component: admission-webhook
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/part-of: ingress-nginx
        app.kubernetes.io/version: 1.1.2
        helm.sh/chart: ingress-nginx-4.0.18
      name: ingress-nginx-admission-create
    spec:
      containers:
      - args:
        - create
        - --host=ingress-nginx-controller-admission,ingress-nginx-controller-admission.$(POD_NAMESPACE).svc
        - --namespace=$(POD_NAMESPACE)
        - --secret-name=ingress-nginx-admission
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        image: liangjw/kube-webhook-certgen:v1.1.1   # mirrored image address
        imagePullPolicy: IfNotPresent
        name: create
        securityContext:
          allowPrivilegeEscalation: false
      nodeSelector:
        kubernetes.io/os: linux
      restartPolicy: OnFailure
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 2000
      serviceAccountName: ingress-nginx-admission
---
apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    helm.sh/hook: post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    app.kubernetes.io/component: admission-webhook
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-admission-patch
  namespace: ingress-nginx
spec:
  template:
    metadata:
      labels:
        app.kubernetes.io/component: admission-webhook
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/part-of: ingress-nginx
        app.kubernetes.io/version: 1.1.2
        helm.sh/chart: ingress-nginx-4.0.18
      name: ingress-nginx-admission-patch
    spec:
      containers:
      - args:
        - patch
        - --webhook-name=ingress-nginx-admission
        - --namespace=$(POD_NAMESPACE)
        - --patch-mutating=false
        - --secret-name=ingress-nginx-admission
        - --patch-failure-policy=Fail
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        image: liangjw/kube-webhook-certgen:v1.1.1   # mirrored image address
        imagePullPolicy: IfNotPresent
        name: patch
        securityContext:
          allowPrivilegeEscalation: false
      nodeSelector:
        kubernetes.io/os: linux
      restartPolicy: OnFailure
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 2000
      serviceAccountName: ingress-nginx-admission
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: nginx
spec:
  controller: k8s.io/ingress-nginx
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/component: admission-webhook
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-admission
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: ingress-nginx-controller-admission
      namespace: ingress-nginx
      path: /networking/v1/ingresses
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: validate.nginx.ingress.kubernetes.io
  rules:
  - apiGroups:
    - networking.k8s.io
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - ingresses
  sideEffects: None
[root@k8master ingress]# kubectl get all -n ingress-nginx
NAME                                            READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-fqsl7        0/1     Completed   0          122m
pod/ingress-nginx-admission-patch-nmbrd         0/1     Completed   0          122m
pod/ingress-nginx-controller-6b68d8cbbf-9xj8t   1/1     Running     0          122m

NAME                                         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-controller             LoadBalancer   10.109.255.117   <pending>     80:30297/TCP,443:31879/TCP   122m
service/ingress-nginx-controller-admission   ClusterIP      10.99.13.106     <none>        443/TCP                      122m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           122m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-6b68d8cbbf   1         1         1       122m

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           8s         122m
job.batch/ingress-nginx-admission-patch    1/1           24s        122m
9.5 Deploying ingress-nginx
(1) Preparation
Label the k8node1 node with app=ingress. The ingress-nginx-controller above runs in hostNetwork mode (it binds the pod's real ports directly on the node) combined with a nodeSelector, so it is only scheduled onto nodes carrying that label.
[root@k8master ingress]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8master Ready control-plane,master 47h v1.23.0 172.16.4.58 <none> CentOS Linux 7 (Core) 3.10.0-1127.el7.x86_64 docker://19.3.8
k8node1 Ready work 47h v1.23.0 172.16.3.199 <none> CentOS Linux 7 (Core) 3.10.0-1160.102.1.el7.x86_64 docker://19.3.8
[root@k8master ingress]# kubectl label node k8node1 app=ingress
node/k8node1 labeled
[root@k8master ingress]# kubectl get node --show-labels
NAME       STATUS   ROLES                  AGE   VERSION   LABELS
k8master   Ready    control-plane,master   45h   v1.23.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8master,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
k8node1    Ready    work                   44h   v1.23.0   app=ingress,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8node1,kubernetes.io/os=linux,node-role.kubernetes.io/work=work
(2) Deploy deploy.yaml with kubectl apply
kubectl apply -f deploy.yaml   # deploy via kubectl apply; the images must be available beforehand, otherwise the rollout fails
[root@k8master ingress]# kubectl apply -f deploy.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
serviceaccount/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
configmap/ingress-nginx-controller created
service/ingress-nginx-controller created
service/ingress-nginx-controller-admission created
deployment.apps/ingress-nginx-controller created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
ingressclass.networking.k8s.io/nginx created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
(3) Check the status
kubectl get all -n ingress-nginx   # inspect everything deployed in the ingress-nginx namespace
[root@k8master ingress]# kubectl get all -n ingress-nginx
NAME                                            READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-fqsl7        0/1     Completed   0          147m
pod/ingress-nginx-admission-patch-nmbrd         0/1     Completed   0          147m
pod/ingress-nginx-controller-6b68d8cbbf-9xj8t   1/1     Running     0          147m

NAME                                         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-controller             LoadBalancer   10.109.255.117   <pending>     80:30297/TCP,443:31879/TCP   147m
service/ingress-nginx-controller-admission   ClusterIP      10.99.13.106     <none>        443/TCP                      147m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           147m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-6b68d8cbbf   1         1         1       147m

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           8s         147m
job.batch/ingress-nginx-admission-patch   1/1           24s         147m
(4) Check the ingress-nginx-controller logs
kubectl logs -f ingress-nginx-controller-6b68d8cbbf-9xj8t -n ingress-nginx
(5) Test access (k8node1 carries the app=ingress label)
Access k8node1's IP directly: the ingress-nginx-controller listens on port 80 by default, and because of the nodeSelector (deploy only onto nodes labeled app=ingress) the controller runs on k8node1, the labeled node.
(6) Deploy a Tomcat to test ingress-nginx
Deploy a Tomcat instance to verify that the ingress-nginx proxying works.
1. Write deploy-tomcat.yaml
- a Deployment that runs tomcat:8.0-alpine
- a Service that exposes the Tomcat pods
- an Ingress resource specifying that all requests / for the domain tomcat.demo.com are forwarded to the tomcat-demo Service
- an IngressClass, a resource introduced in newer versions that is referenced when defining an Ingress; useful when the cluster runs multiple ingress controllers
[root@k8master ingress]# cat deploy-tomcat.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-demo
spec:
  selector:
    matchLabels:
      app: tomcat-demo
  replicas: 1
  template:
    metadata:
      labels:
        app: tomcat-demo
    spec:
      containers:
      - name: tomcat-demo
        image: tomcat:8.0-alpine
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: tomcat-demo
spec:
  selector:
    app: tomcat-demo
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tomcat-demo
spec:
  defaultBackend:
    service:
      name: default-http-backend
      port:
        number: 80
  ingressClassName: nginx
  rules:
  - host: tomcat.demo.com
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: tomcat-demo
            port:
              number: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: default-http-backend
  labels:
    app: default-http-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: default-http-backend
  template:
    metadata:
      labels:
        app: default-http-backend
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: default-http-backend
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/defaultbackend:1.4
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 5
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
---
apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
  labels:
    app: default-http-backend
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: default-http-backend
2. deploy-tomcat.yaml explained
- Deployment tomcat-demo: deploys the Tomcat application. It selects pods labeled app: tomcat-demo, runs one replica, and its pod template uses the tomcat:8.0-alpine image with container port 8080 exposed.
- Service tomcat-demo: exposes the Tomcat application. It selects pods labeled app: tomcat-demo and publishes port 80, forwarding traffic to pod port 8080.
- Ingress tomcat-demo: defines the routing rule. When the request host is tomcat.demo.com, traffic for path "/" is forwarded to the tomcat-demo Service on port 80; requests matching no rule fall through to the default backend default-http-backend. ingressClassName: nginx binds the resource to the nginx ingress controller.
- Deployment default-http-backend: deploys the default backend with one replica, selecting pods labeled app: default-http-backend. Its pod template uses the defaultbackend:1.4 image, exposes container port 8080, and configures a livenessProbe on /healthz to confirm the service stays healthy.
- Service default-http-backend: exposes the default backend, publishing port 80 and forwarding to pod port 8080, selecting pods labeled app: default-http-backend.
Together these manifests form a complete deployment of the Tomcat application and the default backend, plus the Service and Ingress resources that route traffic by host name and path.
3. Apply deploy-tomcat.yaml
[root@k8master ingress]# kubectl apply -f deploy-tomcat.yaml
deployment.apps/tomcat-demo unchanged
service/tomcat-demo unchanged
ingress.networking.k8s.io/tomcat-demo created
deployment.apps/default-http-backend created
service/default-http-backend created
4. Test access
Edit the Windows hosts file:
172.16.3.199 tomcat.demo.com
172.16.3.199 api.demo.com
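With the hosts entries in place, the routing can be exercised with curl. This is a sketch to run from a machine that can reach 172.16.3.199; it assumes the cluster and the tomcat-demo Ingress above are up.

```shell
# The host name resolves (via the hosts entry) to k8node1, where the
# hostNetwork controller listens on port 80.
curl -i http://tomcat.demo.com/

# Equivalent test from a machine without the hosts entry: point curl at
# the node IP and supply the Host header the Ingress rule matches on.
curl -i -H 'Host: tomcat.demo.com' http://172.16.3.199/
```

A request whose host matches no rule should be answered by default-http-backend with a 404.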
10. Reference deployment document
https://blog.51cto.com/u_16213624/7693786
11. Our own business requirements
11.1 Requirement description
We build algorithms; each algorithm is packaged as an image, e.g. image1 (intrusion-detection algorithm), image2 (seat-belt-detection algorithm).
Mapped onto the k8s flow: ingress-nginx runs with hostNetwork: true (saving one layer of Service forwarding), so its host name and port are reachable directly → the client builds the request URL from the Ingress-published host + path + the service's own interface path → ingress-nginx forwards the request to the matching backend Service → Deployment → Pods (PVC → PV).
End result: clients call a fixed interface format, and the matched route proxies each request to the corresponding backend Service, which provides the algorithm capability.
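The URL assembly described above can be sketched in shell. The host comes from the ai-ingress configuration in the next section; the /A prefix is one of its routing rules, while /infer is an illustrative interface path, not a fixed contract value:

```shell
#!/bin/sh
# Client-side URL assembly: Ingress host + routing path prefix + the
# service's own interface path, concatenated into the request URL.
host="funuo.ai.com"   # host published by the Ingress
prefix="/A"           # per-algorithm routing prefix matched by ingress-nginx
api="/infer"          # interface path defined by the algorithm service (assumed)
url="http://${host}${prefix}${api}"
echo "$url"           # -> http://funuo.ai.com/A/infer
```

The client never needs to know which Service backs the prefix; swapping the backend only requires editing the Ingress rule.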
11.2 Deploying ai-ingress
[root@k8master ingress]# cat ai-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ai-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /ai/test/alg/$1
spec:
  defaultBackend:
    service:
      name: default-http-backend
      port:
        number: 80
  ingressClassName: nginx
  rules:
  - host: funuo.ai.com
    http:
      paths:
      - pathType: Prefix
        path: "/A/(.*)"
        backend:
          service:
            name: ai-svc
            port:
              number: 28865
      - pathType: Prefix
        path: "/B/(.*)"
        backend:
          service:
            name: ai1-svc
            port:
              number: 28865
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: default-http-backend
  labels:
    app: default-http-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: default-http-backend
  template:
    metadata:
      labels:
        app: default-http-backend
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: default-http-backend
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/defaultbackend:1.4
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 5
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
---
apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
  labels:
    app: default-http-backend
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: default-http-backend
11.3 Explanation of the Ingress rules
In this Ingress configuration, path: "/B/(.*)" combined with the annotation nginx.ingress.kubernetes.io/rewrite-target: /ai/test/alg/$1 uses a regular-expression capture group to grab part of the request path and substitute it into the rewritten target.
Specifically:
- path: "/B/(.*)": a path rule using a regular expression. The path must begin with /B/, and (.*) captures the remaining character sequence.
- nginx.ingress.kubernetes.io/rewrite-target: /ai/test/alg/$1: an annotation telling the ingress controller to rewrite the path before sending the request to the backend service; $1 is the value of the first capture group (the (.*) part of the regex).
If the request path is /B/foo/bar, (.*) captures foo/bar, $1 is replaced by that value, and the final rewritten path is /ai/test/alg/foo/bar. Likewise, if the request path is /B/infer/ai/test/alg/infer, (.*) captures infer/ai/test/alg/infer and the rewritten path becomes /ai/test/alg/infer/ai/test/alg/infer.
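The capture-and-rewrite behaviour can be checked locally with sed as a rough stand-in for the controller's regex location (the controller itself uses NGINX's PCRE engine, so this is only an approximation):

```shell
#!/bin/sh
# Approximate the rule: path "/B/(.*)" with rewrite-target /ai/test/alg/$1.
# sed's \1 plays the role of the Ingress annotation's $1 capture group.
rewrite() {
  printf '%s\n' "$1" | sed -E 's|^/B/(.*)$|/ai/test/alg/\1|'
}

rewrite "/B/foo/bar"   # -> /ai/test/alg/foo/bar
rewrite "/B/infer"     # -> /ai/test/alg/infer
```

Paths that do not start with /B/ pass through unchanged, mirroring the fact that they would match a different Ingress rule (or the default backend).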
11.4 Checking Ingress status
[root@k8master ingress]# kubectl get ingress
NAME          CLASS   HOSTS             ADDRESS   PORTS   AGE
ai-ingress    nginx   funuo.ai.com                80      3h34m
tomcat-demo   nginx   tomcat.demo.com             80      3h51m
11.5 Results (the corresponding backend services must already be running — that is the business side's responsibility; for other deployment approaches see the reference document above)