Docker Installation
Switching the package mirror
CentOS 7 host: switch the yum repositories to the Aliyun open-source mirror.
# Back up the original repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/bk.CentOS-Base.repo
# Download the mirror repo file
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
# Update system packages
yum update
yum upgrade
Installing the engine
CentOS 7 host. Reference: 菜鸟教程 (runoob) – CentOS Docker installation.
sudo yum install -y yum-utils \
  device-mapper-persistent-data \
  lvm2
sudo yum-config-manager \
  --add-repo \
  https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/centos/docker-ce.repo
yum list docker-ce --showduplicates | sort -r
VERSION_STRING=19.03.15; sudo yum install -y docker-ce-${VERSION_STRING} docker-ce-cli-${VERSION_STRING} containerd.io
systemctl start docker && systemctl enable docker
Daemon configuration
# data-root is Docker's base path; images and containers consume this space, change it as needed
# Alternative registry mirrors:
# https://hub-mirror.c.163.com
# https://reg-mirror.qiniu.com
# Note: "$_" expands to the last argument of the previous command, i.e. /mnt/data_20/docker-root
mkdir -p /etc/docker && mkdir -p /mnt/data_20/docker-root && cat > /etc/docker/daemon.json <<-EOF
{
  "registry-mirrors": [
    "http://docker.mirrors.ustc.edu.cn",
    "http://registry.docker-cn.com",
    "http://hub-mirror.c.163.com"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  },
  "data-root": "$_"
}
EOF
# Older Docker versions use "graph" instead of "data-root"
# Restart Docker to apply
systemctl daemon-reload && systemctl restart docker
# Verify: "Docker Root Dir:" should show the configured path, "Cgroup Driver:" should be systemd
docker info | grep -E "Docker Root Dir:|Cgroup Driver:"
Pre-installation preparation
Set the hostname
# Set the hostname
hostnamectl set-hostname <HOSTNAME>
# Add every host of the cluster to /etc/hosts
# echo -ne "192.168.11.251 node1\n192.168.11.252 node2\n192.168.11.253 node3\n" >> /etc/hosts
# echo -ne "192.168.58.12 n1\n192.168.58.14 n2\n" >> /etc/hosts
echo <IP> <HOSTNAME> >> /etc/hosts
Disable the firewall and swap
CentOS host. Choose one of the two approaches below: disable the firewall entirely, or keep it enabled and add the required port rules.
# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld

# If the firewall stays enabled, NAT masquerading must be allowed,
# otherwise DNS resolution fails after k8s starts
firewall-cmd --permanent --add-masquerade
# Open port rules; 6443 and 10250 used by k8s are mandatory
firewall-cmd --permanent --zone=public --add-port=6443/tcp
firewall-cmd --permanent --zone=public --add-port=10250/tcp
# Remove a port rule
# firewall-cmd --permanent --zone=public --remove-port=6443/tcp
# Open a port range
firewall-cmd --permanent --zone=public --add-port=31080-31090/tcp
# Allow the virtual bridge interface
firewall-cmd --permanent --zone=public --add-interface=cni0
# Apply the rules
firewall-cmd --reload
# List all rules
firewall-cmd --list-all
Disable swap:
# Disable swap
swapoff -a
# Permanently: comment out the swap line in /etc/fstab, e.g. "/mnt/swap swap swap defaults 0 0"
# On a VM:
# sed -i 's%^/dev/mapper/centos-swap.%# swap %g' /etc/fstab
sed -i 's/\(^[^#|.]*swap.*swap.*\)/#\1/g' /etc/fstab

# Shorten the GRUB boot wait (optional)
# vim /boot/grub2/grub.cfg and change the first "timeout" value; 0 skips the wait entirely
# sed -i 's/^ set timeout=.*/ set timeout=0/' /boot/grub2/grub.cfg
# the command below sets it to 3 seconds
sed -i 's/\(^\s*set timeout=\).*/\13/' /boot/grub2/grub.cfg
Disable SELinux
CentOS host. Permanent method – requires a server reboot.
Set SELINUX=disabled in /etc/selinux/config, then reboot the server:
# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sed -i 's/\(^SELINUX=\).*/\1disabled/g' /etc/selinux/config
# setenforce 0 disables SELinux immediately for the current boot
setenforce 0
Enable iptables support for bridged traffic
On Debian 10, checking with sysctl -a shows the iptables bridge support is already enabled by default.
CentOS host
# Configure the kernel parameters (VM setup)
echo -ne "net.bridge.bridge-nf-call-iptables = 1\nnet.bridge.bridge-nf-call-ip6tables = 1\n" >> /etc/sysctl.conf
# Apply the kernel configuration
sysctl -p
# Show the parameters: 1 means enabled, 0 means disabled
sysctl -a | egrep "bridge-nf-call-iptables|bridge-nf-call-ip6tables|ip_forward"
# Alternative way to check
cat /proc/sys/net/bridge/bridge-nf-call-iptables
cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
cat /proc/sys/net/ipv4/ip_forward
Configure a domestic (China) Kubernetes repository
Skip this if a VPN is available. Reference: Aliyun open-source Kubernetes mirror.
CentOS host
# Configure the domestic repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
sudo yum update
If the check fails with "signature could not be verified for kubernetes", the signature verification can be skipped:
# Configure the domestic repository without GPG checks
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
sudo yum update
Install the tooling
Install a specific version of the tools, e.g. 1.21.14.
CentOS host
# List the available versions
yum list kubelet --showduplicates | sort -r
# Install a specific version
# KUBE_VERSION=1.21.11; yum install -y kubelet-${KUBE_VERSION} kubeadm-${KUBE_VERSION} kubectl-${KUBE_VERSION}
KUBE_VERSION=1.21.14; yum install -y kubelet-${KUBE_VERSION} kubeadm-${KUBE_VERSION} kubectl-${KUBE_VERSION}
# Enable and start kubelet
systemctl enable kubelet && systemctl start kubelet
# kubelet fails to start at this point because the k8s configuration does not exist yet; this is expected
systemctl status kubelet
Install K8s
Pull the images
Pull online from the domestic registry:
# List all images required for the current version
kubeadm config images list --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# Pull all required images
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# DNS test image
docker pull busybox:1.28.4
# Network plugin
docker pull quay.io/coreos/flannel:v0.14.0
# Ingress controller
docker pull quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.29.0
# Cluster management UI
docker pull swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3
If the images have already been downloaded, load them instead:
ls k8s-images-1.21.9/*.image | while read file; do docker image load -i $file; done
ls flannel-images/*.image | while read file; do docker image load -i $file; done
[Offline installation] Batch image handling — optional
# Batch-save the images
docker image ls | grep "registry.cn-hangzhou.aliyuncs.com" | awk '{print $1 ":" $2}' | while read image; do file=${image//\//_}.image; file=${file//:/_}; docker image save $image -o $file; done
# Batch-load the images
ls *.image | while read file; do docker image load -i $file; done
Initialize K8s
Worker nodes do not need to be initialized.
# Initialize using the specified image repository
# The pod network must be chosen and specified at this step
# flannel requires the --pod-network-cidr parameter
# --pod-network-cidr             pod network address range
# --apiserver-advertise-address  address the cluster API server binds to
# --service-dns-domain           top-level DNS domain for services
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=172.20.1.25 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=ice | tee ./kubeadm-init.log
# kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.8.101 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=viid
# "Your Kubernetes control-plane has initialized successfully!" indicates the init succeeded
# kubelet is now active, but it keeps reporting errors because the network under /etc/cni/net.d is not configured yet
systemctl status kubelet
If a warning about switching the cgroup driver to systemd appears, see the FAQ section below on making Docker use the systemd cgroup driver.
Install command completion: see the K8s configuration chapter later in this document.
Start the cluster
# Configure kubectl access for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# For the root user, this is sufficient:
export KUBECONFIG=/etc/kubernetes/admin.conf

# Allow the master to schedule workloads; on a single-node cluster the master must be schedulable
kubectl get nodes
kubectl taint node localhost.localdomain node-role.kubernetes.io/master-             # use the master as a regular node
kubectl taint node localhost.localdomain node-role.kubernetes.io/master=:NoSchedule  # restore the master to master-only
Proxy mode selection
By default kube-proxy is implemented with iptables; as the number of services and pods grows, the cost of iptables' sequential rule traversal becomes noticeable. Alternatives to consider:
IPVS: the other mode supported by kube-proxy, with better performance than iptables; recommended (see the sketch after this list). Reference article.
Overall maturity is low; not recommended on its own. Reference article.
A way to replace kube-proxy entirely. Reference article.
Allows packet-level observability. Reference article.
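For reference, a minimal sketch of switching kube-proxy to IPVS mode on a kubeadm cluster (assumes the ipvs kernel modules from the FAQ section are loaded; the configmap name and label are the kubeadm defaults):
# set mode: "ipvs" in the kube-proxy configmap
kubectl -n kube-system edit configmap kube-proxy
# recreate the kube-proxy pods so the new mode takes effect
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
# confirm which proxier is in use
kubectl -n kube-system logs -l k8s-app=kube-proxy | grep -i proxier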
Network configuration
After the installation above succeeds, checking the pods in kube-system shows that the network-related pods are stuck in Pending. This is because no network plugin has been installed yet. There are many network plugins (pick any one of the following); choose the one that fits your needs. Reference: Kubernetes Guide.
flannel
See the official site.
On the official site, the Documentation folder contains the descriptor files kube-flannel.yaml, kube-flannel-aliyun.yaml, kube-flannel-old.yaml, etc.
kube-flannel.yaml文件内容:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 --- apiVersion: policy/v1beta1 kind: PodSecurityPolicy metadata: name: psp.flannel.unprivileged annotations: seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default spec: privileged: false volumes: - configMap - secret - emptyDir - hostPath allowedHostPaths: - pathPrefix: "/etc/cni/net.d" - pathPrefix: "/etc/kube-flannel" - pathPrefix: "/run/flannel" readOnlyRootFilesystem: false runAsUser: rule: RunAsAny supplementalGroups: rule: RunAsAny fsGroup: rule: RunAsAny allowPrivilegeEscalation: false defaultAllowPrivilegeEscalation: false allowedCapabilities: ['NET_ADMIN' , 'NET_RAW' ] defaultAddCapabilities: [] requiredDropCapabilities: [] hostPID: false hostIPC: false hostNetwork: true hostPorts: - min: 0 max: 65535 seLinux: rule: 'RunAsAny' --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: flannel rules: - apiGroups: ['extensions' ] resources: ['podsecuritypolicies' ] verbs: ['use' ] resourceNames: ['psp.flannel.unprivileged' ] - apiGroups: - "" resources: - pods verbs: - get - apiGroups: - "" resources: - nodes verbs: - list - watch - apiGroups: - "" resources: - nodes/status verbs: - patch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: flannel roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: flannel subjects: - kind: ServiceAccount name: flannel namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: flannel namespace: kube-system --- kind: ConfigMap apiVersion: v1 metadata: name: kube-flannel-cfg namespace: kube-system labels: tier: node app: flannel data: cni-conf.json: | { "name": "cbr0", "cniVersion": "0.3.1", "plugins": [ { "type": "flannel", "delegate": { "hairpinMode": true, "isDefaultGateway": true } }, { "type": "portmap", "capabilities": { "portMappings": true } } ] } net-conf.json: | { "Network": "10.244.0.0/16", "Backend": { "Type": "host-gw" } } --- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-flannel-ds namespace: kube-system labels: tier: node app: flannel spec: selector: matchLabels: app: flannel template: metadata: labels: tier: node app: flannel spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/os operator: In values: - linux hostNetwork: true priorityClassName: system-node-critical tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: quay.io/coreos/flannel:v0.14.0 command: - 
cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: quay.io/coreos/flannel:v0.14.0 command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: "100m" memory: "50Mi" limits: cpu: "100m" memory: "50Mi" securityContext: privileged: false capabilities: add: ["NET_ADMIN" , "NET_RAW" ] env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run/flannel - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run/flannel - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg
The Network value in net-conf.json (10.244.0.0/16) must be changed to match the pod network CIDR passed to kubeadm init.
Backend.Type defaults to vxlan; it has been changed to host-gw here, since the benchmark document found host-gw to perform better.
Create the network:
kubectl apply -f kube-flannel.yaml
After a short wait the virtual bridge cni0 is created. If coredns stays in Pending, update the flannel manifest to the latest version (0.19.2).
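A quick way to confirm the CNI configuration and the bridge are in place (paths and the 10.244 CIDR are the flannel defaults used above):
ls /etc/cni/net.d/
ip addr show cni0
ip route | grep 10.244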
Verify that the flannel pod in the kube-system namespace is running normally:
kubectl -n kube-system get all
# NAME                             READY   STATUS    RESTARTS   AGE
# pod/coredns-6f6b8cc4f6-95bng     1/1     Running   0          24m
# pod/coredns-6f6b8cc4f6-hkdn8     1/1     Running   0          24m
# pod/etcd-n1                      1/1     Running   1          24m
# pod/kube-apiserver-n1            1/1     Running   1          24m
# pod/kube-controller-manager-n1   1/1     Running   1          24m
# pod/kube-flannel-ds-vz49s        1/1     Running   0          74s
# pod/kube-proxy-bbjzq             1/1     Running   1          24m
# pod/kube-scheduler-n1            1/1     Running   1          24m
#
# service/kube-dns   ClusterIP   10.96.0.10   <none>   53/UDP,53/TCP,9153/TCP   24m
#
# daemonset.apps/kube-flannel-ds   1   1   1   1   1   <none>                   74s
# daemonset.apps/kube-proxy        1   1   1   1   1   kubernetes.io/os=linux   24m
#
# deployment.apps/coredns   2/2   2   2   24m
#
# replicaset.apps/coredns-6f6b8cc4f6   2   2   2   24m
cni
mkdir -p /etc/cni/net.d
cat >/etc/cni/net.d/10-mynet.conf <<-EOF
{
    "cniVersion": "0.3.0",
    "name": "mynet",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.244.0.0/16",
        "routes": [
            {"dst": "0.0.0.0/0"}
        ]
    }
}
EOF
cat >/etc/cni/net.d/99-loopback.conf <<-EOF
{
    "cniVersion": "0.3.0",
    "type": "loopback"
}
EOF
After a short wait the virtual bridge cni0 is created.
Check that the installation succeeded
# Check that the system pods started correctly
kubectl get pods -n kube-system
# Expected output similar to:
# NAME                           READY   STATUS    RESTARTS   AGE
# coredns-6f6b8cc4f6-285bg       1/1     Running   0          7m38s
# coredns-6f6b8cc4f6-znlf7       1/1     Running   0          7m38s
# etcd-n175                      1/1     Running   0          7m46s
# kube-apiserver-n175            1/1     Running   0          7m46s
# kube-controller-manager-n175   1/1     Running   0          7m46s
# kube-flannel-ds-8p8xx          1/1     Running   0          95s
# kube-proxy-5zvr2               1/1     Running   0          7m38s
# kube-scheduler-n175            1/1     Running   0          7m46s
DNS resolution test
# Start a test container; add -n <namespace> to choose the namespace it is created in
kubectl run -it --rm --image=busybox:1.28.4 --restart=Never sh
# Resolve service names; format: <Service>.<Namespace>.svc.cluster.local
nslookup kube-dns.kube-system
nslookup kube-dns.kube-system.svc.cluster.ice
nslookup ver-svc
nslookup ver-svc.ver-dev
nslookup ver-svc.ver-dev.svc
nslookup ver-svc.ver-dev.svc.cluster
nslookup ver-svc.ver-dev.svc.cluster.ice
# A healthy cluster returns something like:
# Server:    10.96.0.10
# Address 1: 10.96.0.10 kube-dns.kube-system.svc.ice
#
# Address 1: 10.96.0.10 kube-dns.kube-system.svc.ice
Name resolution always tries the name as given first and then progressively widens the search scope if it cannot be resolved. Capturing the DNS traffic with tcpdump -i cni0 -w result.cap shows that resolving a short name takes 10+ ms, while a fully qualified name takes about 1 ms, so fully qualified names are recommended.
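The extra cost of short names comes from the search-domain expansion configured in each pod's /etc/resolv.conf; it can be inspected like this (the pod name dns-check is arbitrary, and the output shown is only an example for the "ice" service domain used above):
kubectl run -it --rm --image=busybox:1.28.4 --restart=Never dns-check -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.ice svc.ice ice
# options ndots:5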
Joining worker nodes to the cluster
Follow the steps above up to and including "Install the tooling"; a worker only needs the pause, kube-proxy and (if used) network plugin images loaded. Then run the join command echoed during the master's deployment to join the cluster.
# Load the node images
docker image load -i k8s-images-1.21.8/registry.cn-hangzhou.aliyuncs.com_google_containers_pause_3.4.1.image
docker image load -i k8s-images-1.21.8/registry.cn-hangzhou.aliyuncs.com_google_containers_kube-proxy_v1.21.8.image
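If the join command echoed by kubeadm init has been lost, it can be regenerated on the master; the token and hash below are placeholders:
# on the master
kubeadm token create --print-join-command
# run the printed command on the worker, e.g.
# kubeadm join 172.20.1.25:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
# back on the master, confirm the node has registered
kubectl get nodes -o wide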
Installing an Ingress controller
ingress-nginx
Official GitHub repository. Note that the image hosted on k8s.gcr.io used upstream may not be downloadable and needs to be replaced.
nginx branch. Reference article.
Taking the nginx-0.29.0 branch as an example, go to the manifest directory deploy/static.
configmap.yaml 提供configmap可以在线更新nginx的配置namespace.yaml 创建一个独立的命名空间 ingress-nginxrbac.yaml 创建对应的role rolebinding 用于rbacwith-rbac.yaml 有应用rbac的nginx-ingress-controller组件mandatory.yaml 以上所有文件的集合1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 apiVersion: v1 kind: Namespace metadata: name: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- kind: ConfigMap apiVersion: v1 metadata: name: nginx-configuration namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- kind: ConfigMap apiVersion: v1 metadata: name: tcp-services namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- kind: ConfigMap apiVersion: v1 metadata: name: udp-services namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- apiVersion: v1 kind: ServiceAccount metadata: name: nginx-ingress-serviceaccount namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: ClusterRole metadata: name: nginx-ingress-clusterrole labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - endpoints - nodes - pods - secrets verbs: - list - watch - apiGroups: - "" resources: - nodes verbs: - get - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - "" resources: - events verbs: - create - patch - apiGroups: - "extensions" - "networking.k8s.io" resources: - ingresses verbs: - get - list - watch - apiGroups: - "extensions" - "networking.k8s.io" resources: - ingresses/status verbs: - update --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: Role metadata: name: nginx-ingress-role namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - pods - secrets - namespaces verbs: - get - apiGroups: - "" resources: - configmaps resourceNames: - "ingress-controller-leader-nginx" verbs: - get - update - apiGroups: - "" resources: - configmaps verbs: - create - apiGroups: - "" resources: - endpoints verbs: - get --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: RoleBinding metadata: name: nginx-ingress-role-nisa-binding namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx 
app.kubernetes.io/part-of: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: nginx-ingress-role subjects: - kind: ServiceAccount name: nginx-ingress-serviceaccount namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: ClusterRoleBinding metadata: name: nginx-ingress-clusterrole-nisa-binding labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: nginx-ingress-clusterrole subjects: - kind: ServiceAccount name: nginx-ingress-serviceaccount namespace: ingress-nginx --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-ingress-controller namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx spec: replicas: 1 selector: matchLabels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx template: metadata: labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx annotations: prometheus.io/port: "10254" prometheus.io/scrape: "true" spec: hostNetwork: true terminationGracePeriodSeconds: 300 serviceAccountName: nginx-ingress-serviceaccount nodeSelector: kubernetes.io/os: linux containers: - name: nginx-ingress-controller image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.29.0 args: - /nginx-ingress-controller - --configmap=$(POD_NAMESPACE)/nginx-configuration - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services - --udp-services-configmap=$(POD_NAMESPACE)/udp-services - --publish-service=$(POD_NAMESPACE)/ingress-nginx - --annotations-prefix=nginx.ingress.kubernetes.io - --http-port=31080 - --https-port=31443 securityContext: allowPrivilegeEscalation: true capabilities: drop: - ALL add: - NET_BIND_SERVICE runAsUser: 101 env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace ports: - name: http containerPort: 31080 protocol: TCP hostPort: 31080 - name: https containerPort: 31443 protocol: TCP hostPort: 31443 livenessProbe: failureThreshold: 3 httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 10 readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 10254 scheme: HTTP periodSeconds: 10 successThreshold: 1 timeoutSeconds: 10 lifecycle: preStop: exec: command: - /wait-shutdown --- apiVersion: v1 kind: LimitRange metadata: name: ingress-nginx namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx spec: limits: - min: memory: 90Mi cpu: 100m type: Container
The Service file deploy/baremetal/service-nodeport.yaml exposes the ingress ports on the node:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 31080
      targetPort: 31080
      protocol: TCP
    - name: https
      port: 31443
      targetPort: 31443
      protocol: TCP
  selector:
    app: ingress-nginx
By default the nginx-ingress-controller pod is scheduled onto an arbitrary node. To pin it to a specific node, first label the node that should run the controller, then restrict the controller to that node.
This step is optional.
Label the node:
# Label the chosen node; the label key/value can be anything
kubectl label node localhost.localdomain nodeFeature=nginx
# Show the existing node labels
kubectl get nodes --show-labels
Then add nodeFeature: nginx under the nodeSelector field in mandatory.yaml (or patch the live Deployment as sketched below).
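As an alternative to editing the file, the running Deployment can be patched directly; a sketch using the label created above:
kubectl -n ingress-nginx patch deployment nginx-ingress-controller \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"nodeFeature":"nginx"}}}}}'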
Test application
ingress-nginx测试程序httpd-dep.yaml:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 --- apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: httpd labels: name: httpd spec: rules: - http: paths: - pathType: Prefix path: "/" backend: service: name: httpd port: number: 8000 --- apiVersion: v1 kind: Service metadata: name: httpd spec: selector: app: httpd ports: - port: 8000 protocol: TCP targetPort: 80 type: ClusterIP --- apiVersion: apps/v1 kind: Deployment metadata: name: httpd spec: replicas: 4 selector: matchLabels: app: httpd template: metadata: labels: app: httpd spec: containers: - name: httpd image: httpd:2.4.52 resources: limits: memory: "128Mi" cpu: "500m" ports: - containerPort: 80
If no nodeSelector was specified, the ingress controller pod is started on a random node; check with:
kubectl -n ingress-nginx get pod -o wide
# NAME                                       READY   STATUS    RESTARTS   AGE    IP               NODE   NOMINATED NODE   READINESS GATES
# nginx-ingress-controller-fd46d8644-s772j   1/1     Running   0          103m   192.168.31.192   n2     <none>           <none>
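A quick end-to-end check of the test application (the httpd Ingress above has no host rule, so any Host header matches; <NODE_IP> is the node running the controller and 31080 is its hostPort):
kubectl apply -f httpd-dep.yaml
curl -i http://<NODE_IP>:31080/
# expect HTTP/1.1 200 OK and the Apache default page "It works!"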
Deploying a Redis cluster on K8s
[Important] This is not well suited to a persistent Redis cluster, because the Redis cluster is built on pod IPs.
After a restart the original cluster cannot be recovered.
This approach deploys Redis with a StatefulSet. Reference article.
Environment
master  192.168.1.100
node1   192.168.1.101
node2   192.168.1.102
node3   192.168.1.103
Create the storage volumes
Install the NFS packages
# On any one node (k8s-master is used here)
yum -y install nfs-utils rpcbind
Create the shared storage
# Create the shared directories
mkdir -p /home/data/redis/pv{1,2,3,4,5,6}
# Configure the exports
vi /etc/exports
# Add the following entries
/home/data/redis/pv1 192.168.11.0/24(rw,sync,no_root_squash)
/home/data/redis/pv2 192.168.11.0/24(rw,sync,no_root_squash)
/home/data/redis/pv3 192.168.11.0/24(rw,sync,no_root_squash)
/home/data/redis/pv4 192.168.11.0/24(rw,sync,no_root_squash)
/home/data/redis/pv5 192.168.11.0/24(rw,sync,no_root_squash)
/home/data/redis/pv6 192.168.11.0/24(rw,sync,no_root_squash)
# Restart the services
systemctl restart rpcbind
systemctl restart nfs
systemctl enable nfs
# Verify NFS from the other nodes
yum -y install nfs-utils
showmount -e 172.20.1.25
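Before creating the PVs it is worth confirming that a worker can actually mount one of the shares; a sketch using the server address from the showmount example above:
mkdir -p /mnt/nfs-test
mount -t nfs 172.20.1.25:/home/data/redis/pv1 /mnt/nfs-test
touch /mnt/nfs-test/write-test && rm -f /mnt/nfs-test/write-test
umount /mnt/nfs-test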
创建PV pv.yaml:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv1 spec: capacity: storage: 2Gi accessModes: - ReadWriteMany volumeMode: Filesystem persistentVolumeReclaimPolicy: Recycle storageClassName: "redis" nfs: server: 192.168.1.10 path: "/usr/local/k8s/redis/pv1" --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-vp2 spec: capacity: storage: 2Gi accessModes: - ReadWriteMany volumeMode: Filesystem persistentVolumeReclaimPolicy: Recycle storageClassName: "redis" nfs: server: 192.168.1.10 path: "/usr/local/k8s/redis/pv2" --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv3 spec: capacity: storage: 2Gi accessModes: - ReadWriteMany volumeMode: Filesystem persistentVolumeReclaimPolicy: Recycle storageClassName: "redis" nfs: server: 192.168.1.10 path: "/usr/local/k8s/redis/pv3" --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-vp4 spec: capacity: storage: 2Gi accessModes: - ReadWriteMany volumeMode: Filesystem persistentVolumeReclaimPolicy: Recycle storageClassName: "redis" nfs: server: 192.168.1.10 path: "/usr/local/k8s/redis/pv4" --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv5 spec: capacity: storage: 2Gi accessModes: - ReadWriteMany volumeMode: Filesystem persistentVolumeReclaimPolicy: Recycle storageClassName: "redis" nfs: server: 192.168.1.10 path: "/usr/local/k8s/redis/pv5" --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-vp6 spec: capacity: storage: 2Gi accessModes: - ReadWriteMany volumeMode: Filesystem persistentVolumeReclaimPolicy: Recycle storageClassName: "redis" nfs: server: 192.168.1.10 path: "/usr/local/k8s/redis/pv6"
Create the configmap
vim redis.conf
port 6379
protected-mode no
cluster-enabled yes
cluster-config-file nodes-${PORT}.conf
cluster-node-timeout 5000
appendonly yes
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
Create it:
kubectl create configmap redis-conf --from-file=redis.conf
Create the headless service
vim headless-service.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  labels:
    app: redis
spec:
  ports:
    - name: redis-port
      port: 6379
  clusterIP: None
  selector:
    app: redis
创建redis集群节点 通过StatefulSet创建6个redis的pod ,实现3主3从的redis集群。vim redis.yaml1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 --- apiVersion: apps/v1 kind: StatefulSet metadata: name: redis-app spec: serviceName: "redis-service" replicas: 6 selector: matchLabels: app: redis appCluster: redis-cluster template: metadata: labels: app: redis appCluster: redis-cluster spec: containers: - name: redis image: "redis:3.2.8" command: - "redis-server" args: - "/etc/redis/redis.conf" - "-protected-mode" - "no" resources: requests: cpu: "100m" memory: "100Mi" ports: - name: redis containerPort: 6379 protocol: "TCP" - name: cluster containerPort: 16379 protocol: "TCP" volumeMounts: - name: "redis-conf" mountPath: "/etc/redis" - name: "redis-data" mountPath: "/var/lib/redis" volumes: - name: "redis-conf" configMap: name: "redis-conf" items: - key: "redis.conf" path: "redis.conf" volumeClaimTemplates: - metadata: name: redis-data spec: accessModes: [ "ReadWriteMany" ] storageClassName: "redispv" resources: requests: storage: 2Gi
Initialize the Redis cluster
Get the pod information:
kubectl get pods -o wide
NAME          READY   STATUS    RESTARTS   AGE     IP             NODE   NOMINATED NODE   READINESS GATES
redis-app-0   1/1     Running   0          2m22s   10.244.0.248   n175   <none>           <none>
redis-app-1   1/1     Running   0          97s     10.244.0.252   n175   <none>           <none>
redis-app-2   1/1     Running   0          101s    10.244.0.251   n175   <none>           <none>
redis-app-3   1/1     Running   0          117s    10.244.0.250   n175   <none>           <none>
redis-app-4   1/1     Running   0          2m1s    10.244.0.249   n175   <none>           <none>
redis-app-5   1/1     Running   0          2m31s   10.244.0.247   n175   <none>           <none>
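When building the redis-cli --cluster create command below, the pod IPs can also be collected with jsonpath instead of copying them by hand (a sketch; the app=redis label comes from the StatefulSet):
kubectl get pods -l app=redis -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}'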
进入任意节点执行命令初始化集群1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 redis-cli --cluster create \ 10.244.0.248:6379 \ 10.244.0.252:6379 \ 10.244.0.251:6379 \ 10.244.0.250:6379 \ 10.244.0.249:6379 \ 10.244.0.247:6379 \ --cluster-replicas 1 # 无法正常建立集群 redis-cli --cluster create \ redis-app-0.redis-service.default.svc.ice:6379 \ redis-app-1.redis-service.default.svc.ice:6379 \ redis-app-2.redis-service.default.svc.ice:6379 \ redis-app-3.redis-service.default.svc.ice:6379 \ redis-app-4.redis-service.default.svc.ice:6379 \ redis-app-5.redis-service.default.svc.ice:6379 \ --cluster-replicas 1 # 回显信息 > >> Performing hash slots allocation on 6 nodes... Master[0] -> Slots 0 - 5460 Master[1] -> Slots 5461 - 10922 Master[2] -> Slots 10923 - 16383 Adding replica 10.244.0.249:6379 to 10.244.0.248:6379 Adding replica 10.244.0.247:6379 to 10.244.0.252:6379 Adding replica 10.244.0.250:6379 to 10.244.0.251:6379 M: 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379 slots:[0-5460] (5461 slots) master M: f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379 slots:[5461-10922] (5462 slots) master M: dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379 slots:[10923-16383] (5461 slots) master S: b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379 replicates dc0a1908830468f2070883e1c026fd5b1b2ff526 S: 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379 replicates 8921255530faaa44fb793a7248d93c179211b7d9 S: dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379 replicates f32bcfa4dcacef662096d5ccdef4d741588aa2cb Can I set the above configuration? (type 'yes' to accept): yes > >> Nodes configuration updated > >> Assign a different config epoch to each node > >> Sending CLUSTER MEET messages to join the cluster Waiting for the cluster to join .. > >> Performing Cluster Check (using node 10.244.0.248:6379) M: 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379 slots:[0-5460] (5461 slots) master 1 additional replica(s) S: 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379 slots: (0 slots) slave replicates 8921255530faaa44fb793a7248d93c179211b7d9 M: f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379 slots:[5461-10922] (5462 slots) master 1 additional replica(s) S: b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379 slots: (0 slots) slave replicates dc0a1908830468f2070883e1c026fd5b1b2ff526 S: dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379 slots: (0 slots) slave replicates f32bcfa4dcacef662096d5ccdef4d741588aa2cb M: dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379 slots:[10923-16383] (5461 slots) master 1 additional replica(s) [OK] All nodes agree about slots configuration. > >> Check for open slots... > >> Check slots coverage... [OK] All 16384 slots covered.
验证状态1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 redis-cli -c # 操作回显信息 127.0.0.1:6379> CLUSTER INFO cluster_state:ok cluster_slots_assigned:16384 cluster_slots_ok:16384 cluster_slots_pfail:0 cluster_slots_fail:0 cluster_known_nodes:6 cluster_size:3 cluster_current_epoch:6 cluster_my_epoch:1 cluster_stats_messages_ping_sent:315 cluster_stats_messages_pong_sent:335 cluster_stats_messages_sent:650 cluster_stats_messages_ping_received:330 cluster_stats_messages_pong_received:315 cluster_stats_messages_meet_received:5 cluster_stats_messages_received:650 127.0.0.1:6379> CLUSTER NODES 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379@16379 slave 8921255530faaa44fb793a7248d93c179211b7d9 0 1658228657078 1 connected f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379@16379 master - 0 1658228657580 2 connected 5461-10922 b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379@16379 slave dc0a1908830468f2070883e1c026fd5b1b2ff526 0 1658228658082 3 connected dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379@16379 slave f32bcfa4dcacef662096d5ccdef4d741588aa2cb 0 1658228657000 2 connected 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379@16379 myself,master - 0 1658228657000 1 connected 0-5460 dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379@16379 master - 0 1658228657078 3 connected 10923-16383
Create a Service for access
The Headless Service created earlier backs the StatefulSet, but it has no ClusterIP and therefore cannot be used for general access. An additional Service is needed, dedicated to access and load balancing for the Redis cluster; it could also be exposed through an Ingress for access from outside the cluster. Here only an internal-access Service is created.
redis-access-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-access-service
  labels:
    app: redis
spec:
  ports:
    - name: redis-port
      protocol: "TCP"
      port: 6379
      targetPort: 6379
  selector:
    app: redis
    appCluster: redis-cluster
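Apply it and check the allocated ClusterIP; the final line is an example of connecting through the Service from inside the cluster:
kubectl apply -f redis-access-service.yaml
kubectl get svc redis-access-service
# redis-cli -c -h <CLUSTER-IP> -p 6379 cluster info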
Deploying a Kafka cluster on K8s
Deploying a MySQL cluster on K8s
Installing a Helm chart offline
Reference article.
CI/CD engines (optional)
Alternatives to Jenkins:
BuildMaster Drone.io GoCD
Argo
Visual management tools (optional)
kuboard
Online installation:
# The image swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3 can also be used for a faster download.
# Do not use 127.0.0.1 or localhost as the internal IP
sudo docker run -d \
  --restart=unless-stopped \
  --name=kuboard \
  -p 30080:80/tcp \
  -p 31089:10081/tcp \
  -e KUBOARD_ENDPOINT="http://192.168.11.178:30080" \
  -e KUBOARD_AGENT_SERVER_TCP_PORT="31089" \
  -v /opt/kuboard-data:/data \
  swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3.4.1.0
docker-compose.yml approach:
version: "3"
networks:
  kuboard-net:
    external: false
    driver: bridge
    # ipam:
    #   config:
    #     - subnet: 172.90.161.0/24
services:
  kuboard:
    image: swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3.4.1.0
    container_name: kuboard
    restart: unless-stopped # always
    environment:
      # default user/passwd: admin/Kuboard123
      # changed to: Zdxf@2021
      KUBOARD_ENDPOINT: "http://172.20.1.25:30080"
      KUBOARD_AGENT_SERVER_TCP_PORT: "31089"
      KUBERNETES_CLUSTER_DOMAIN: "ice"
      KUBOARD_ICP_DESCRIPTION: "ICP备案号"
      KUBOARD_DISABLE_AUDIT: true
    ports:
      - 30080:80/tcp
      - 31089:10081/tcp
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./data:/data
    networks:
      - kuboard-net
    logging:
      # driver: none
      driver: json-file
      options:
        max-size: "200k"
        max-file: "1"
    # Resource limits under "deploy" require starting docker-compose with --compatibility, otherwise it errors
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '1'
          memory: 200M
Open http://your-host-ip:30080 (the port mapped above) in a browser to reach the Kuboard v3.x UI. Login:
Username: admin
Password: Kuboard123
Browser compatibility:
Use Chrome / Firefox / Safari or a similar browser.
IE and IE-based browsers are not supported.
资源监控 metrics-server.yaml内容:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 --- apiVersion: v1 kind: Service metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: ports: - name: https port: 443 protocol: TCP targetPort: 443 selector: k8s-app: metrics-server --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: 'true' rbac.authorization.k8s.io/aggregate-to-edit: 'true' rbac.authorization.k8s.io/aggregate-to-view: 'true' name: 'system:aggregated-metrics-reader' namespace: kube-system rules: - apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server name: 'system:metrics-server' namespace: kube-system rules: - apiGroups: - '' resources: - pods - nodes - nodes/stats - namespaces - configmaps verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: 'metrics-server:system:auth-delegator' namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: 'system:auth-delegator' subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: 'system:metrics-server' namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: 'system:metrics-server' subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-reader subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system --- apiVersion: apiregistration.k8s.io/v1 kind: APIService metadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.io namespace: kube-system spec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 100 --- apiVersion: apps/v1 kind: Deployment metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: replicas: 1 selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 1 template: metadata: labels: k8s-app: 
metrics-server spec: affinity: nodeAffinity: preferredDuringSchedulingIgnoredDuringExecution: - preference: matchExpressions: - key: node-role.kubernetes.io/master operator: Exists weight: 100 podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchLabels: k8s-app: metrics-server namespaces: - kube-system topologyKey: kubernetes.io/hostname containers: - args: - '--cert-dir=/tmp' - '--secure-port=443' - '--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname' - '--kubelet-use-node-status-port' - '--kubelet-insecure-tls=true' - '--authorization-always-allow-paths=/livez,/readyz' - '--metric-resolution=15s' image: >- swr.cn-east-2.myhuaweicloud.com/kuboard-dependency/metrics-server:v0.5.0 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS initialDelaySeconds: 20 periodSeconds: 10 resources: requests: cpu: 100m memory: 200Mi securityContext: readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server tolerations: - effect: '' key: node-role.kubernetes.io/master operator: Exists volumes: - emptyDir: {} name: tmp-dir --- apiVersion: policy/v1beta1 kind: PodDisruptionBudget metadata: name: metrics-server namespace: kube-system spec: minAvailable: 1 selector: matchLabels: k8s-app: metrics-server
k8sLens
Official website.
K8s configuration
Deleting resources
If the resources were created from DEPLOY.yaml, they can be removed with:
# Delete the resources described in the yaml file
kubectl delete -f DEPLOY.yaml
Delete all resources in a namespace (not yet verified):
# 1. List the resources present in the namespace
kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n ingress-nginx
# resource types found: ingress, deployment, service
# 2. Clean up the resources in the ingress-nginx namespace
kubectl get ingress -n ingress-nginx | grep clife | awk '{print $1}' | xargs kubectl delete ingress -n ingress-nginx
kubectl get service -n ingress-nginx | grep clife | awk '{print $1}' | xargs kubectl delete service -n ingress-nginx
kubectl get deployment -n ingress-nginx | grep clife | awk '{print $1}' | xargs kubectl delete deployment -n ingress-nginx
# 3. Delete the ingress-nginx namespace itself
kubectl delete ns ingress-nginx
# 4. Confirm the namespace is gone
kubectl get ns ingress-nginx
Command completion
bash
bash requires the bash-completion package:
yum install bash-completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc
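Optionally, a short alias k can be completed as well (this relies on the __start_kubectl function provided by the kubectl completion script sourced above):
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc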
zsh
Run:
echo "source <(kubectl completion zsh)" >> ~/.zshrc
source ~/.zshrc
kubectl access for a regular user
Run:
# Assuming the user is named USER
mkdir -p ~/.kube
sudo cp -i /etc/kubernetes/admin.conf ~/.kube
sudo chown USER:USER ~/.kube/admin.conf
# Configure the environment variable
# For zsh, add it to the .zshrc file
export KUBECONFIG=~/.kube/admin.conf
Cluster certificates
The cluster certificates are stored under /etc/kubernetes/pki/.
# Check which certificates have expired
kubeadm certs check-expiration
# Renew all certificates manually
kubeadm certs renew all
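After renewing, the control-plane static pods must be restarted so they pick up the new certificates, and admin.conf is regenerated. A blunt but workable sketch on a Docker-based kubeadm node (adjust to your container runtime):
docker ps | grep -E 'k8s_kube-apiserver|k8s_kube-controller-manager|k8s_kube-scheduler|k8s_etcd' | awk '{print $1}' | xargs docker restart
# refresh the local kubeconfig copy
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config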
Performance testing
See the test document for tool usage.
Basic network tools: curl, iperf, Locust, kubemark.
Business-level performance testing: JMeter, LoadRunner.
Apache JMeter is a load-testing tool.
LoadRunner is a load-testing tool for predicting system behaviour and performance.
# Timing metrics (in seconds):
# time_connect:        time to establish the TCP connection to the server
# time_starttransfer:  time until the first byte of the response arrives after the request was sent
# time_total:          total time for the request
curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}' "http://sample-webapp:8000/"
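To collect more than one sample, the same request can simply be looped (the URL is the example service above):
for i in $(seq 1 10); do
  curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}\n' "http://sample-webapp:8000/"
done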
Installing iperf on CentOS:
yum install epel-release
yum update
yum install iperf
Usage:
# Start a TCP server
iperf -s
# Run the client test
iperf -c <SERVER_IP>

# Start a UDP server
iperf -s -u
# Run the client test; UDP bandwidth may be capped by default, use -b to raise the limit
iperf -c <SERVER_IP> -u

# For a bidirectional test, just add -d on the client
Containerized build and release
Build and release
Reference article. Dockerfile:
FROM golang:buster as build
WORKDIR /go/src/greeter-server
RUN curl -o main.go https://github.com/grpc/grpc-go/blob/91e0aeb192456225adf27966d04ada4cf8599915/examples/features/reflection/server/main.go && \
    go mod init greeter-server && \
    go mod tidy && \
    go build -o /greeter-server main.go

FROM gcr.io/distroless/base-debian10
COPY --from=build /greeter-server /
EXPOSE 50051
CMD ["/greeter-server"]
gRPC example
For an example of gRPC support through the ingress, see the reference article.
Uninstalling K8s
Remove the cluster
# Tear down the cluster
kubeadm reset -f
# Clean up the leftover directories
rm -rf ~/.kube/ /etc/kubernetes/ /etc/cni /opt/cni /var/lib/etcd
Remove the packages
CentOS host
# Remove the packages
yum autoremove -y kubelet kubeadm kubectl && rm -rf /usr/bin/kube*
Debian/Ubuntu host
# Remove the packages
apt-get remove kube*
rm -rf /usr/bin/kube*
Remove the related images
# The images were pulled from registry.cn-hangzhou.aliyuncs.com; remove them
docker image ls | grep -v grep | grep -v REPOSITORY | grep registry.cn-hangzhou.aliyuncs.com | awk '{print $3}' | xargs docker image rm
# Or remove all images
docker image ls | grep -v REPOSITORY | awk '{print $3}' | xargs docker image rm
FAQ
bridge-nf-call-iptables error
Installation fails with: [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
Reference article.
# Fix
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
flannel cannot reach the kubernetes service at 10.96.0.1
The address 10.96.0.1 points to the kubernetes service in the cluster's default namespace. The underlying network was broken (tcpdump captured no packets to this IP; consider switching tools or adding parameters). The VM was using a virtual host-only network, and the suspicion is that this network configuration dropped the packets; switching the VM to ordinary bridged networking solved the problem.
See the reference material first.
Prerequisites for enabling IPVS in kube-proxy (all nodes):
cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules
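Load the modules and confirm they are present (on kernels 4.19+ the nf_conntrack_ipv4 module is replaced by nf_conntrack):
bash /etc/sysconfig/modules/ipvs.modules
lsmod | grep -e ip_vs -e nf_conntrack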
Make Docker use the systemd cgroup driver
# Add the following to /etc/docker/daemon.json:
# "exec-opts": ["native.cgroupdriver=systemd"]
# Otherwise docker info reports: Cgroup Driver: cgroupfs
Alternatively, initialize from a kubeadm config file:
kubeadm config print init-defaults > kubeadm-config.yaml
# Edit kubeadm-config.yaml:
#   advertiseAddress          -> this host's IP
#   kubernetesVersion         -> the installed k8s version
#   dnsDomain                 -> the top-level domain (optional)
#   podSubnet: 10.244.0.0/16  -> the pod subnet (optional, add it)
#   serviceSubnet             -> the service subnet (optional)
# The image repository name also needs to be changed
kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.log

# Simplified initialization
kubeadm init --apiserver-advertise-address=192.168.208.3 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=imsv2
kubectl -n kube-system get all
# Key error from the flannel log
E1228 14:03:55.799748       1 main.go:234] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-rbzdn': Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-rbzdn": dial tcp 10.96.0.1:443: connect: network is unreachable
Network performance tuning
Reference article.
Old CentOS 7 kernel causes some DNS lookups to fail
Reference article.