Installing Kubernetes on CentOS 7 (China Mainland Network Environment)

Installing Docker

Switching Package Mirrors

CentOS 7 host

Switch the yum repositories to the Aliyun open-source mirror site

# Back up the original repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/bk.CentOS-Base.repo

# Download the mirror repo file
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo

# Update system packages
yum update
yum upgrade

Installing the Docker Engine

CentOS 7 host

Reference: 菜鸟教程 (Runoob), Installing Docker on CentOS

# Install Docker's basic dependencies
sudo yum install -y yum-utils \
    device-mapper-persistent-data \
    lvm2
sudo yum-config-manager \
    --add-repo \
    https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/centos/docker-ce.repo

# Install a specific Docker version, here 19.03.15
yum list docker-ce --showduplicates | sort -r
VERSION_STRING=19.03.15; sudo yum install -y docker-ce-${VERSION_STRING} docker-ce-cli-${VERSION_STRING} containerd.io

# Start the service and enable it at boot
systemctl start docker && systemctl enable docker

Docker Configuration

# /mnt/data_20/docker-root is Docker's data root; images and containers consume this space, change as needed
# other mirror options:
# https://hub-mirror.c.163.com
# https://reg-mirror.qiniu.com
mkdir -p /etc/docker && mkdir -p /mnt/data_20/docker-root && cat > /etc/docker/daemon.json <<-EOF
{
    "registry-mirrors": [
        "http://docker.mirrors.ustc.edu.cn",
        "http://registry.docker-cn.com",
        "http://hub-mirror.c.163.com"
    ],
    "exec-opts": ["native.cgroupdriver=systemd"],
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "100m",
        "max-file": "3"
    },
    "data-root": "/mnt/data_20/docker-root"
}
EOF
# older Docker versions call this option "graph" instead of "data-root"
# restart Docker for the changes to take effect
systemctl daemon-reload && systemctl restart docker
# verify: "Docker Root Dir:" should show the configured path and
# "Cgroup Driver:" should show systemd
docker info | grep -E "Docker Root Dir:|Cgroup Driver:"
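As a quick sanity check that the mirror configuration is in effect, pulling a small public image should now succeed (the image tag here is just an example):

# pull a small test image through the configured mirrors
docker pull busybox:1.28.4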

Pre-installation Preparation

Set the Hostname

# Set the hostname
hostnamectl set-hostname <HOSTNAME>

# Add every host's name to /etc/hosts, i.e. edit /etc/hosts
# echo -ne "192.168.11.251 node1\n192.168.11.252 node2\n192.168.11.253 node3\n" >> /etc/hosts
# echo -ne "n1 192.168.58.12\nn2 192.168.58.14\n" >> /etc/hosts
echo "<IP> <HOSTNAME>" >> /etc/hosts

Disable the Firewall and Swap

CentOS host

Pick one of the two approaches below; if you skip disabling the firewall, you must add port rules instead:

# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld

# If the firewall stays on, NAT masquerading must be enabled,
# otherwise DNS resolution fails even though k8s starts normally
firewall-cmd --permanent --add-masquerade
# Open port rules
# Ports 6443 and 10250 used by k8s must be open
firewall-cmd --permanent --zone=public --add-port=6443/tcp
firewall-cmd --permanent --zone=public --add-port=10250/tcp
# Remove a port rule
# firewall-cmd --permanent --zone=public --remove-port=6443/tcp
# Open a range of ports
firewall-cmd --permanent --zone=public --add-port=31080-31090/tcp
# Allow the virtual bridge interface
firewall-cmd --permanent --zone=public --add-interface=cni0
# Apply the rules
firewall-cmd --reload
# List all rules
firewall-cmd --list-all

Disable swap:

# Disable swap
# (comment out the swap line in /etc/fstab)
swapoff -a
# Comment out the swap line, e.g. "/mnt/swap swap swap defaults 0 0"
# For a VM:
# sed -i 's%^/dev/mapper/centos-swap.%# swap %g' /etc/fstab
sed -i 's/\(^[^#|.]*swap.*swap.*\)/#\1/g' /etc/fstab

# Shorten the GRUB boot-menu wait
# vim /boot/grub2/grub.cfg and lower the first timeout value; 0 skips the wait entirely
# sed -i 's/^ set timeout=.*/ set timeout=0/' /boot/grub2/grub.cfg
sed -i 's/\(^\s*set timeout=\).*/\13/' /boot/grub2/grub.cfg

Disable SELinux

CentOS host

Permanent method (requires a reboot)

Set SELINUX=disabled in /etc/selinux/config, then reboot the server. Commands:

#sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sed -i 's/\(^SELINUX=\).*/\1disabled/g' /etc/selinux/config
# switch to permissive mode immediately, without a reboot
setenforce 0

Enable iptables Bridge Support

On Debian 10, checking with sysctl -a shows iptables bridge support is already enabled by default.

CentOS host

# Configure the kernel parameters (VM setup)
echo -ne "net.bridge.bridge-nf-call-iptables = 1\nnet.bridge.bridge-nf-call-ip6tables = 1\n" >> /etc/sysctl.conf
# Apply the kernel configuration
sysctl -p

# Show the kernel parameters; 1 means enabled, 0 means disabled
sysctl -a | egrep "bridge-nf-call-iptables|bridge-nf-call-ip6tables|ip_forward"
# Another way to check
cat /proc/sys/net/bridge/bridge-nf-call-iptables
cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
cat /proc/sys/net/ipv4/ip_forward
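Note that the net.bridge.* keys only exist once the br_netfilter kernel module is loaded. If sysctl -p complains about an unknown key, a sketch like the following loads the module first (the modules-load.d path follows the usual systemd convention):

# load the bridge netfilter module so the net.bridge.* sysctl keys exist
modprobe br_netfilter
# load it automatically on boot
echo "br_netfilter" > /etc/modules-load.d/br_netfilter.conf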

Configure a Domestic Kubernetes Repo

If you have a VPN this can be skipped. Reference: 阿里开源-Kubernetes镜像 (Alibaba open-source Kubernetes mirror)

CentOS host

# Configure the domestic repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
sudo yum update

If signature verification fails with signature could not be verified for kubernetes, you can force-skip the check:

# Configure the domestic repo (signature checks disabled)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
sudo yum update

Install the Kubernetes Tools

Install a specific version of the tools, e.g. 1.21.14

CentOS host

# Install specific versions of the tools
yum list kubelet --showduplicates | sort -r
# KUBE_VERSION=1.21.11; yum install -y kubelet-${KUBE_VERSION} kubeadm-${KUBE_VERSION} kubectl-${KUBE_VERSION}
KUBE_VERSION=1.21.14; yum install -y kubelet-${KUBE_VERSION} kubeadm-${KUBE_VERSION} kubectl-${KUBE_VERSION}
# Start kubelet
systemctl enable kubelet && systemctl start kubelet
# kubelet fails to start at this point because the k8s config files do not exist yet
systemctl status kubelet

Install Kubernetes

Pull the Images

Pull online from a domestic mirror

# List all images needed for this version
kubeadm config images list --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers

# Pull all images needed for this version
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers

# DNS test image:        docker pull busybox:1.28.4
# Network plugin:        docker pull quay.io/coreos/flannel:v0.14.0
# Ingress controller:    docker pull quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.29.0
# Cluster management UI: docker pull swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3

If the images were downloaded beforehand, load them:

ls k8s-images-1.21.9/*.image | while read file; do docker image load -i $file; done
ls flannel-images/*.image | while read file; do docker image load -i $file; done

[Offline install] Batch image handling (optional)

# Save images in bulk
docker image ls | grep "registry.cn-hangzhou.aliyuncs.com" | awk '{print $1 ":" $2}' | while read image; do file=${image//\//_}.image; file=${file//:/_}; docker image save $image -o $file; done

# Load images in bulk
ls *.image | while read file; do docker image load -i $file; done

Initialize Kubernetes

Worker nodes do not need to be initialized.

# Initialize with the mirror repository specified
# A specific network plugin must be chosen at this step
# The flannel network requires the --pod-network-cidr parameter
# --pod-network-cidr            pod network address range
# --apiserver-advertise-address address the cluster API server binds to
# --service-dns-domain          top-level DNS domain for services
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=172.20.1.25 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=ice | tee ./kubeadm-init.log
#kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.8.101 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=viid

# "Your Kubernetes control-plane has initialized successfully!" means the init went fine
# The service is healthy, but errors are logged because the network (/etc/cni/net.d) is not configured yet
systemctl status kubelet

If a warning appears about changing the cgroup driver to systemd, see the handling reference (FAQ below).

Install Command Completion

See the K8s Configuration section below.

Start Using the Cluster

# Set up cluster access
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# As root, run instead
export KUBECONFIG=/etc/kubernetes/admin.conf

# Allow the master to schedule Pods; mandatory on a single-node cluster
kubectl get nodes
kubectl taint node localhost.localdomain node-role.kubernetes.io/master-            # use the master as a worker too
kubectl taint node localhost.localdomain node-role.kubernetes.io/master=:NoSchedule # restore master-only scheduling

Choosing a kube-proxy Mode

By default kube-proxy is implemented with iptables; as the number of Services and Pods grows, the cost of iptables' sequential rule traversal becomes apparent. The available modes are:

  • iptables: the default mode.
  • IPVS: the other mode kube-proxy supports, with higher performance than iptables; recommended. See the reference article.
  • eBPF: overall maturity is still low, not recommended on its own. See the reference article.

For how to replace kube-proxy entirely, see the reference article.

Packets can also be observed for analysis; see the reference article.

Network Configuration

After the installation above succeeds, checking the Pods under kube-system will show the network-related Pods stuck in Pending, because no network plugin has been installed yet. There are many network plugins (pick any one of the following); choose the one you need. Reference: Kubernetes指南 (Kubernetes Guide)

flannel

See the official site.

On the official site, in the Documentation folder, see the kube-flannel.yaml, kube-flannel-aliyun.yaml and kube-flannel-old.yaml manifests.

Contents of kube-flannel.yaml:

---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.14.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.14.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg

Network: 10.244.0.0/16 in net-conf.json must be changed to match the pod network passed to kubeadm init.

Backend.Type defaults to vxlan; it has been changed to host-gw here because test documents show the host-gw mode performs better.

Create the network:

kubectl apply -f kube-flannel.yaml

After a short wait the cni0 virtual bridge is created. If coredns stays in Pending, switch the flannel manifest to the latest version (0.19.2).

Check that the flannel Pods in the kube-system namespace are running normally:

kubectl -n kube-system get all

# NAME READY STATUS RESTARTS AGE
# pod/coredns-6f6b8cc4f6-95bng 1/1 Running 0 24m
# pod/coredns-6f6b8cc4f6-hkdn8 1/1 Running 0 24m
# pod/etcd-n1 1/1 Running 1 24m
# pod/kube-apiserver-n1 1/1 Running 1 24m
# pod/kube-controller-manager-n1 1/1 Running 1 24m
# pod/kube-flannel-ds-vz49s 1/1 Running 0 74s
# pod/kube-proxy-bbjzq 1/1 Running 1 24m
# pod/kube-scheduler-n1 1/1 Running 1 24m
#
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 24m
#
# NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
# daemonset.apps/kube-flannel-ds 1 1 1 1 1 <none> 74s
# daemonset.apps/kube-proxy 1 1 1 1 1 kubernetes.io/os=linux 24m
#
# NAME READY UP-TO-DATE AVAILABLE AGE
# deployment.apps/coredns 2/2 2 2 24m
#
# NAME DESIRED CURRENT READY AGE
# replicaset.apps/coredns-6f6b8cc4f6 2 2 2 24m

cni

mkdir -p /etc/cni/net.d
cat >/etc/cni/net.d/10-mynet.conf <<-EOF
{
    "cniVersion": "0.3.0",
    "name": "mynet",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.244.0.0/16",
        "routes": [
            {"dst": "0.0.0.0/0"}
        ]
    }
}
EOF
cat >/etc/cni/net.d/99-loopback.conf <<-EOF
{
    "cniVersion": "0.3.0",
    "type": "loopback"
}
EOF

After a short wait the cni0 virtual bridge is created.

Verify the installation

# Check that the system Pods started normally
kubectl get pods -n kube-system
# Expect something like:
#NAME READY STATUS RESTARTS AGE
#coredns-6f6b8cc4f6-285bg 1/1 Running 0 7m38s
#coredns-6f6b8cc4f6-znlf7 1/1 Running 0 7m38s
#etcd-n175 1/1 Running 0 7m46s
#kube-apiserver-n175 1/1 Running 0 7m46s
#kube-controller-manager-n175 1/1 Running 0 7m46s
#kube-flannel-ds-8p8xx 1/1 Running 0 95s
#kube-proxy-5zvr2 1/1 Running 0 7m38s
#kube-scheduler-n175 1/1 Running 0 7m46s

Testing DNS Resolution

# Start a test container; -n <ns> selects the namespace it is created in
kubectl run -it --rm --image=busybox:1.28.4 --restart=Never sh

# Resolve service names; format: <Service>.<Namespace>.svc.cluster.local
nslookup kube-dns.kube-system
nslookup kube-dns.kube-system.svc.cluster.ice
nslookup ver-svc
nslookup ver-svc.ver-dev
nslookup ver-svc.ver-dev.svc
nslookup ver-svc.ver-dev.svc.cluster
nslookup ver-svc.ver-dev.svc.cluster.ice
# On success, expect something like:
#Server: 10.96.0.10
#Address 1: 10.96.0.10 kube-dns.kube-system.svc.ice
#
#Name: kube-dns.kube-system
#Address 1: 10.96.0.10 kube-dns.kube-system.svc.ice

Resolution always tries the name as given first, then progressively widens the search domains if it cannot be resolved. Capturing the DNS flow of a short-name lookup with tcpdump -i cni0 -w result.cap shows that a short name takes 10+ ms to resolve while a fully-qualified name takes about 1 ms, so fully-qualified names are recommended.

Joining Worker Nodes to the Cluster

Follow the steps up through "Install the Kubernetes Tools"; a worker only needs the pause, kube-proxy and (if used) network-plugin images loaded. Then run the join command that was echoed when the master was deployed (if it was lost, see the sketch after the code block).

# Load the node images
docker image load -i k8s-images-1.21.8/registry.cn-hangzhou.aliyuncs.com_google_containers_pause_3.4.1.image
docker image load -i k8s-images-1.21.8/registry.cn-hangzhou.aliyuncs.com_google_containers_kube-proxy_v1.21.8.image
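If the join command echoed by kubeadm init has been lost, it can be regenerated on the master; the token and hash in the printed command are specific to your cluster:

# run on the master: print a fresh join command for workers
kubeadm token create --print-join-command
# then run the printed `kubeadm join <MASTER_IP>:6443 --token ... --discovery-token-ca-cert-hash sha256:...` on the worker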

Installing an Ingress Controller

ingress-nginx

Official GitHub repo. Note that the official manifests use images from k8s.gcr.io, which may not be downloadable; replace them as needed.

The nginx branch

Reference article

Taking the nginx-0.29.0 branch as an example, go to the manifest directory deploy/static:

  • configmap.yaml: a ConfigMap for updating the nginx configuration online
  • namespace.yaml: creates a dedicated namespace, ingress-nginx
  • rbac.yaml: creates the Role and RoleBinding used for RBAC
  • with-rbac.yaml: the nginx-ingress-controller component with RBAC applied
  • mandatory.yaml: all of the above files combined
apiVersion: v1
kind: Namespace
metadata:
  name: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

---

kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: tcp-services
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: udp-services
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-ingress-serviceaccount
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: nginx-ingress-clusterrole
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - endpoints
      - nodes
      - pods
      - secrets
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - "extensions"
      - "networking.k8s.io"
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
      - "networking.k8s.io"
    resources:
      - ingresses/status
    verbs:
      - update

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: nginx-ingress-role
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - pods
      - secrets
      - namespaces
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - configmaps
    resourceNames:
      # Defaults to "<election-id>-<ingress-class>"
      # Here: "<ingress-controller-leader>-<nginx>"
      # This has to be adapted if you change either parameter
      # when launching the nginx-ingress-controller.
      - "ingress-controller-leader-nginx"
    verbs:
      - get
      - update
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - create
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - get

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: nginx-ingress-role-nisa-binding
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nginx-ingress-role
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: nginx-ingress-clusterrole-nisa-binding
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nginx-ingress-clusterrole
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/part-of: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/part-of: ingress-nginx
      annotations:
        prometheus.io/port: "10254"
        prometheus.io/scrape: "true"
    spec:
      hostNetwork: true # use the host network
      # wait up to five minutes for the drain of connections
      terminationGracePeriodSeconds: 300
      serviceAccountName: nginx-ingress-serviceaccount
      nodeSelector:
        kubernetes.io/os: linux
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.29.0
          args:
            - /nginx-ingress-controller
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx
            - --annotations-prefix=nginx.ingress.kubernetes.io
            # extra arguments that choose the exposed ports
            - --http-port=31080
            - --https-port=31443
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            # www-data -> 101
            runAsUser: 101
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - name: http
              containerPort: 31080
              protocol: TCP
              hostPort: 31080 # host port mapping, optional
            - name: https
              containerPort: 31443
              protocol: TCP
              hostPort: 31443 # host port mapping, optional
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 10
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 10
          lifecycle:
            preStop:
              exec:
                command:
                  - /wait-shutdown

---

apiVersion: v1
kind: LimitRange
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  limits:
    - min:
        memory: 90Mi
        cpu: 100m
      type: Container

The Service manifest, deploy/baremetal/service-nodeport.yaml, exposes the ingress's internal port on the node:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: ClusterIP
  #type: NodePort
  ports:
    - name: http
      port: 31080
      #nodePort: 31080
      targetPort: 31080
      protocol: TCP
    - name: https
      port: 31443
      targetPort: 31443
      protocol: TCP
  # the selector must match the controller Pod labels
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  #externalTrafficPolicy: Cluster
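Assuming the two manifests above are saved as mandatory.yaml and service-nodeport.yaml, applying them looks like this:

kubectl apply -f mandatory.yaml
kubectl apply -f service-nodeport.yaml
# watch the controller Pod come up
kubectl -n ingress-nginx get pod -w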

By default the nginx-ingress-controller Pod is scheduled onto an arbitrary node. To run it on specific nodes, first label the target nodes, then pin nginx-ingress-controller to them.

This step is optional.

Label the node

# Label the chosen node; the label can be anything
kubectl label node localhost.localdomain nodeFeature=nginx

# Show existing node labels
kubectl get nodes --show-labels

Then add nodeFeature: nginx under the nodeSelector field in mandatory.yaml, as sketched below.
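For example, the Deployment's nodeSelector would then read (a sketch; the label key/value must match the label applied above):

      nodeSelector:
        kubernetes.io/os: linux
        nodeFeature: nginx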

Verification

Test manifest httpd-dep.yaml for ingress-nginx:

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: httpd
  labels:
    name: httpd
spec:
  rules:
    - http:
        paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: httpd
                port:
                  number: 8000

---
apiVersion: v1
kind: Service
metadata:
  name: httpd
spec:
  selector:
    app: httpd
  ports:
    - port: 8000
      protocol: TCP
      targetPort: 80
  type: ClusterIP

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
spec:
  replicas: 4
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
        - name: httpd
          image: httpd:2.4.52
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 80
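A minimal way to exercise it, assuming the manifest is saved as httpd-dep.yaml and the controller listens on 31080 as configured above (replace <NODE_IP> with the node running the controller):

kubectl apply -f httpd-dep.yaml
# expect Apache's default "It works!" page
curl http://<NODE_IP>:31080/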

If no node was pinned, the ingress Pod starts on a random node; check with:

kubectl -n ingress-nginx get pod -o wide

#NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
#nginx-ingress-controller-fd46d8644-s772j 1/1 Running 0 103m 192.168.31.192 n2 <none> <none>

Deploying a Redis Cluster on K8s

[Important] This setup is not suitable for a Redis cluster with persistence, because the cluster topology is built on Pod IPs.

After a restart the original cluster cannot be recovered.

This approach deploys Redis with a StatefulSet. Reference article.

Environment

No.   Node     IP
1     master   192.168.1.100
2     node1    192.168.1.101
3     node2    192.168.1.102
4     node3    192.168.1.103

Create the Storage Volumes

  1. Install the NFS packages
# on any one node (here the k8s master)
yum -y install nfs-utils rpcbind
  2. Create the shared storage
# Create the shared directories
mkdir -p /home/data/redis/pv{1,2,3,4,5,6}

# Configure the exported paths
vi /etc/exports
# add the following (the network must match your node subnet):
/home/data/redis/pv1 192.168.1.0/24(rw,sync,no_root_squash)
/home/data/redis/pv2 192.168.1.0/24(rw,sync,no_root_squash)
/home/data/redis/pv3 192.168.1.0/24(rw,sync,no_root_squash)
/home/data/redis/pv4 192.168.1.0/24(rw,sync,no_root_squash)
/home/data/redis/pv5 192.168.1.0/24(rw,sync,no_root_squash)
/home/data/redis/pv6 192.168.1.0/24(rw,sync,no_root_squash)

# Restart the services
systemctl restart rpcbind
systemctl restart nfs
systemctl enable nfs

# Verify NFS from the other nodes (use the NFS server's address)
yum -y install nfs-utils
showmount -e 192.168.1.100

Create the PVs

pv.yaml:

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv1
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    # server/path must match the NFS server and exports created above
    server: 192.168.1.100
    path: "/home/data/redis/pv1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv2
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.100
    path: "/home/data/redis/pv2"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv3
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.100
    path: "/home/data/redis/pv3"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv4
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.100
    path: "/home/data/redis/pv4"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv5
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.100
    path: "/home/data/redis/pv5"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv6
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.100
    path: "/home/data/redis/pv6"

Create the ConfigMap

vim redis.conf

# appendonly yes
# cluster-enabled yes
# cluster-config-file /var/lib/redis/nodes.conf
# cluster-node-timeout 5000
# dir /var/lib/redis
# port 6379
# Redis port
port 6379
# # connection password
# requirepass hzzhcs@2020
# masterauth hzzhcs@2020
# disable protected mode
protected-mode no
# enable cluster mode
cluster-enabled yes
# cluster node configuration file
# (note: redis-server does not expand ${PORT} itself; this only works if a startup script templates the file)
cluster-config-file nodes-${PORT}.conf
# timeout
cluster-node-timeout 5000
# # cluster node IP; in host mode, the host machine's IP
# # cluster-announce-ip 192.168.195.10
# # cluster-announce-ip 192.168.28.170
# cluster-announce-ip 192.168.11.215
# # node ports 6379 - 6381
# # cluster bus ports 16379 - 16381
# cluster-announce-port ${PORT}
# cluster-announce-bus-port ${CPORT}
# enable appendonly persistence
appendonly yes
# fsync once per second
appendfsync everysec
# whether to skip fsync while the AOF file is being rewritten
no-appendfsync-on-rewrite no
# rewrite again when the AOF grows 100% past its size at the last rewrite
auto-aof-rewrite-percentage 100
# minimum AOF size before a rewrite, default 64mb
auto-aof-rewrite-min-size 64mb

Create it:

kubectl create configmap redis-conf --from-file=redis.conf
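Optionally confirm the ConfigMap content:

kubectl describe configmap redis-conf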

Create the Headless Service

vim headless-service.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  labels:
    app: redis
spec:
  ports:
    - name: redis-port
      port: 6379
  clusterIP: None
  selector:
    app: redis
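Create it and confirm that CLUSTER-IP shows None, which is what makes it headless:

kubectl apply -f headless-service.yaml
kubectl get svc redis-service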

Create the Redis Cluster Nodes

Create six Redis Pods with a StatefulSet to form a cluster of three masters and three replicas. vim redis.yaml

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-app
spec:
  serviceName: "redis-service"
  replicas: 6
  selector:
    matchLabels:
      app: redis
      appCluster: redis-cluster
  template:
    metadata:
      labels:
        app: redis
        appCluster: redis-cluster
    spec:
      containers:
        - name: redis
          image: "redis:3.2.8"
          command:
            - "redis-server"
          args:
            - "/etc/redis/redis.conf"
            # command-line options must start with --
            - "--protected-mode"
            - "no"
          resources:
            requests:
              cpu: "100m"
              memory: "100Mi"
          ports:
            - name: redis
              containerPort: 6379
              protocol: "TCP"
            - name: cluster
              containerPort: 16379
              protocol: "TCP"
          volumeMounts:
            - name: "redis-conf"
              mountPath: "/etc/redis"
            - name: "redis-data"
              mountPath: "/var/lib/redis"
      volumes:
        - name: "redis-conf"
          configMap:
            name: "redis-conf"
            items:
              - key: "redis.conf"
                path: "redis.conf"
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: [ "ReadWriteMany" ]
        # must match the storageClassName of the PVs above
        storageClassName: "redis"
        resources:
          requests:
            storage: 2Gi
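Apply it and watch the six Pods come up in order (redis-app-0 through redis-app-5):

kubectl apply -f redis.yaml
kubectl get pods -l app=redis -w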

Initialize the Redis Cluster

Get the Pod information

kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-app-0 1/1 Running 0 2m22s 10.244.0.248 n175 <none> <none>
redis-app-1 1/1 Running 0 97s 10.244.0.252 n175 <none> <none>
redis-app-2 1/1 Running 0 101s 10.244.0.251 n175 <none> <none>
redis-app-3 1/1 Running 0 117s 10.244.0.250 n175 <none> <none>
redis-app-4 1/1 Running 0 2m1s 10.244.0.249 n175 <none> <none>
redis-app-5 1/1 Running 0 2m31s 10.244.0.247 n175 <none> <none>

Exec into any node and run the command to initialize the cluster:

redis-cli --cluster create \
10.244.0.248:6379 \
10.244.0.252:6379 \
10.244.0.251:6379 \
10.244.0.250:6379 \
10.244.0.249:6379 \
10.244.0.247:6379 \
--cluster-replicas 1

# Creating the cluster via the headless-service names does NOT work:
redis-cli --cluster create \
redis-app-0.redis-service.default.svc.ice:6379 \
redis-app-1.redis-service.default.svc.ice:6379 \
redis-app-2.redis-service.default.svc.ice:6379 \
redis-app-3.redis-service.default.svc.ice:6379 \
redis-app-4.redis-service.default.svc.ice:6379 \
redis-app-5.redis-service.default.svc.ice:6379 \
--cluster-replicas 1

# Output:
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.244.0.249:6379 to 10.244.0.248:6379
Adding replica 10.244.0.247:6379 to 10.244.0.252:6379
Adding replica 10.244.0.250:6379 to 10.244.0.251:6379
M: 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379
slots:[0-5460] (5461 slots) master
M: f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379
slots:[5461-10922] (5462 slots) master
M: dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379
slots:[10923-16383] (5461 slots) master
S: b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379
replicates dc0a1908830468f2070883e1c026fd5b1b2ff526
S: 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379
replicates 8921255530faaa44fb793a7248d93c179211b7d9
S: dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379
replicates f32bcfa4dcacef662096d5ccdef4d741588aa2cb
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
..
>>> Performing Cluster Check (using node 10.244.0.248:6379)
M: 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379
slots: (0 slots) slave
replicates 8921255530faaa44fb793a7248d93c179211b7d9
M: f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379
slots: (0 slots) slave
replicates dc0a1908830468f2070883e1c026fd5b1b2ff526
S: dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379
slots: (0 slots) slave
replicates f32bcfa4dcacef662096d5ccdef4d741588aa2cb
M: dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Verify the Status

redis-cli -c

# Output:
127.0.0.1:6379> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:315
cluster_stats_messages_pong_sent:335
cluster_stats_messages_sent:650
cluster_stats_messages_ping_received:330
cluster_stats_messages_pong_received:315
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:650
127.0.0.1:6379> CLUSTER NODES
4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379@16379 slave 8921255530faaa44fb793a7248d93c179211b7d9 0 1658228657078 1 connected
f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379@16379 master - 0 1658228657580 2 connected 5461-10922
b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379@16379 slave dc0a1908830468f2070883e1c026fd5b1b2ff526 0 1658228658082 3 connected
dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379@16379 slave f32bcfa4dcacef662096d5ccdef4d741588aa2cb 0 1658228657000 2 connected
8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379@16379 myself,master - 0 1658228657000 1 connected 0-5460
dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379@16379 master - 0 1658228657078 3 connected 10923-16383

Create a Service for Access

Earlier we created the headless Service backing the StatefulSet, but it has no cluster IP and therefore cannot be used for external access. So we also create a Service dedicated to access and load balancing for the Redis cluster; it could also be exposed through an Ingress for access from outside the cluster. Here only the Service for in-cluster access is created.

redis-access-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: redis-access-service
  labels:
    app: redis
spec:
  ports:
    - name: redis-port
      protocol: "TCP"
      port: 6379
      targetPort: 6379
  selector:
    app: redis
    appCluster: redis-cluster
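Apply it and note the cluster IP assigned for in-cluster access:

kubectl apply -f redis-access-service.yaml
kubectl get svc redis-access-service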

Deploying a Kafka Cluster on K8s

Deploying a MySQL Cluster on K8s

Installing Helm Charts Offline

Reference article
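The usual offline pattern is to fetch the chart archive on a connected machine with helm pull, copy it across, and install from the local file. A sketch, with the repo, chart name and version as placeholders:

# on a machine with internet access
helm repo add bitnami https://charts.bitnami.com/bitnami
helm pull bitnami/nginx --version 13.2.0
# copy nginx-13.2.0.tgz to the offline host, then:
helm install my-nginx ./nginx-13.2.0.tgz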

CI/CD Engines (Optional)

Jenkins alternatives

BuildMaster Drone.io GoCD

Argo

Visual Management Tools (Optional)

kuboard

Online installation

# The image swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3 can also be used for a faster download.
# Do not use 127.0.0.1 or localhost as the internal IP \
# Kuboard does not need to be on the same subnet as K8S; the Kuboard Agent can even reach the Kuboard Server through a proxy \
sudo docker run -d \
  --restart=unless-stopped \
  --name=kuboard \
  -p 30080:80/tcp \
  -p 31089:10081/tcp \
  -e KUBOARD_ENDPOINT="http://192.168.11.178:30080" \
  -e KUBOARD_AGENT_SERVER_TCP_PORT="31089" \
  -v /opt/kuboard-data:/data \
  swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3.4.1.0

Via docker-compose.yml:

version: "3"

networks:
kuboard-net:
external: false
driver: bridge
# ipam:
# config:
# - subnet: 172.90.161.0/24

services:
kuboard:
image: swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3.4.1.0
container_name: kuboard
restart: unless-stopped # always
environment:
# user/passwd:admin/Kuboard123
# modify: Zdxf@2021
KUBOARD_ENDPOINT: "http://172.20.1.25:30080"
KUBOARD_AGENT_SERVER_TCP_PORT: "31089"
KUBERNETES_CLUSTER_DOMAIN: "ice"
KUBOARD_ICP_DESCRIPTION: "ICP备案号"
KUBOARD_DISABLE_AUDIT: true
ports:
- 30080:80/tcp
- 31089:10081/tcp
volumes:
- /etc/localtime:/etc/localtime:ro
- ./data:/data
networks:
- kuboard-net

logging:
#driver: none
driver: json-file
options:
max-size: "200k"
max-file: "1"

# 使用deploy限制资源,启动时需要增加--compatibility参数,防止报错
deploy:
resources:
limits:
cpus: '1'
memory: 1G
reservations:
cpus: '1'
memory: 200M

Open http://your-host-ip:30080 in a browser to reach the Kuboard v3.x UI (30080 is the host port mapped above). Login:

  • Username: admin
  • Password: Kuboard123

Browser compatibility

Use Chrome / Firefox / Safari or a similar browser.

IE and IE-based browsers are not supported.

Resource Monitoring

Contents of metrics-server.yaml:

---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: 443
  selector:
    k8s-app: metrics-server

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: 'true'
    rbac.authorization.k8s.io/aggregate-to-edit: 'true'
    rbac.authorization.k8s.io/aggregate-to-view: 'true'
  name: 'system:aggregated-metrics-reader'
  namespace: kube-system
rules:
  - apiGroups:
      - metrics.k8s.io
    resources:
      - pods
      - nodes
    verbs:
      - get
      - list
      - watch

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: 'system:metrics-server'
  namespace: kube-system
rules:
  - apiGroups:
      - ''
    resources:
      - pods
      - nodes
      - nodes/stats
      - namespaces
      - configmaps
    verbs:
      - get
      - list
      - watch

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: 'metrics-server:system:auth-delegator'
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'system:auth-delegator'
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: 'system:metrics-server'
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'system:metrics-server'
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system

---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
  namespace: kube-system
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
              weight: 100
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  k8s-app: metrics-server
              namespaces:
                - kube-system
              topologyKey: kubernetes.io/hostname
      containers:
        - args:
            - '--cert-dir=/tmp'
            - '--secure-port=443'
            - '--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname'
            - '--kubelet-use-node-status-port'
            - '--kubelet-insecure-tls=true'
            - '--authorization-always-allow-paths=/livez,/readyz'
            - '--metric-resolution=15s'
          image: >-
            swr.cn-east-2.myhuaweicloud.com/kuboard-dependency/metrics-server:v0.5.0
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /livez
              port: https
              scheme: HTTPS
            periodSeconds: 10
          name: metrics-server
          ports:
            - containerPort: 443
              name: https
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /readyz
              port: https
              scheme: HTTPS
            initialDelaySeconds: 20
            periodSeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 200Mi
          securityContext:
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - mountPath: /tmp
              name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      tolerations:
        - effect: ''
          key: node-role.kubernetes.io/master
          operator: Exists
      volumes:
        - emptyDir: {}
          name: tmp-dir

---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: metrics-server
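Apply the manifest; after a minute or so the metrics API starts serving and kubectl top works:

kubectl apply -f metrics-server.yaml
kubectl top nodes
kubectl top pods -A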

k8sLens

Official site

K8s Configuration

Deleting Resources

If a resource was created from DEPLOY.yaml, it can be deleted with:

# Delete the resources described in the yaml file
kubectl delete -f DEPLOY.yaml

Delete all resources in a namespace (not yet verified):

# 1. List the resources present in the namespace
kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n ingress-nginx
# Resource kinds found:
#   ingress
#   deployment
#   service

# 2. Clean up the resources in the ingress-nginx namespace
kubectl get ingress -n ingress-nginx |grep clife |awk '{print $1}'|xargs kubectl delete ingress -n ingress-nginx
kubectl get service -n ingress-nginx |grep clife |awk '{print $1}'|xargs kubectl delete service -n ingress-nginx
kubectl get deployment -n ingress-nginx |grep clife |awk '{print $1}'|xargs kubectl delete deployment -n ingress-nginx

# 3. Delete the ingress-nginx namespace
kubectl delete ns ingress-nginx

# 4. Confirm the namespace is gone
kubectl get ns ingress-nginx

Command Completion

bash

bash needs the bash-completion package:

yum install bash-completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc

zsh

Run:

echo "source <(kubectl completion zsh)" >> ~/.zshrc
source ~/.zshrc

kubectl for Non-root Users

Run:

# Suppose the username is USER
mkdir -p ~/.kube
sudo cp -i /etc/kubernetes/admin.conf ~/.kube
sudo chown USER:USER ~/.kube/admin.conf
# Configure the environment variable
# (for zsh, add it to .zshrc)
export KUBECONFIG=~/.kube/admin.conf

Cluster Certificates

Cluster certificates are stored under /etc/kubernetes/pki/.

# Check which certificates have expired
kubeadm certs check-expiration

# Renew all certificates manually
kubeadm certs renew all

Performance Testing

For tool usage, see the test documentation.

Basic network tools: curl, iperf, Locust, kubemark

Business-level load tools: JMeter, LoadRunner

Apache JMeter is a load-testing tool.

LoadRunner is a load-testing tool that predicts system behavior and performance.

# Timing metrics
# Unit: seconds
# time_connect:       time to establish the TCP connection to the server
# time_starttransfer: time until the web server returns the first byte of data after the request
# time_total:         time to complete the request
curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}' "http://sample-webapp:8000/"

Installing iperf on CentOS:

yum install epel-release
yum update
yum install iperf

Usage

# Start the TCP server
iperf -s
# Run the client test
iperf -c <SERVER_IP>

# Start the UDP server
iperf -s -u
# Run the client test; UDP bandwidth may be capped by defaults, raise it with -b
iperf -c <SERVER_IP> -u

# For a bidirectional test just add -d on the client

Containerized Build and Release

Build and release

Reference article. Dockerfile:

FROM golang:buster as build
WORKDIR /go/src/greeter-server
RUN curl -o main.go https://raw.githubusercontent.com/grpc/grpc-go/91e0aeb192456225adf27966d04ada4cf8599915/examples/features/reflection/server/main.go && \
go mod init greeter-server && \
go mod tidy && \
go build -o /greeter-server main.go

FROM gcr.io/distroless/base-debian10
COPY --from=build /greeter-server /
EXPOSE 50051
CMD ["/greeter-server"]

gRPC example

For gRPC support in ingress, see the reference article.

Uninstalling K8s

Tear down the cluster

# Reset the cluster
kubeadm reset -f
# Remove the cluster directories
rm -rf ~/.kube/ /etc/kubernetes/ /etc/cni /opt/cni /var/lib/etcd

Remove the packages

CentOS host

# Remove the packages
yum autoremove -y kubelet kubeadm kubectl && rm -rf /usr/bin/kube*

Debian/Ubuntu host

# Remove the packages
apt-get remove kube*
rm -rf /usr/bin/kube*

Remove the images

# The images were pulled from registry.cn-hangzhou.aliyuncs.com; delete them
docker image ls | grep -v REPOSITORY | grep registry.cn-hangzhou.aliyuncs.com | awk '{print $3}' | xargs docker image rm

# Delete ALL images
docker image ls | grep -v REPOSITORY | awk '{print $3}' | xargs docker image rm

FAQ

bridge-nf-call-iptables error

During installation the error [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1 appears.

Reference article

# Fix
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables

flannel cannot reach the kubernetes Service at 10.96.0.1

10.96.0.1 is the address of the kubernetes Service in the cluster's default namespace. The underlying network could not reach it (tcpdump captured no packets to this IP; consider a different tool or extra capture options). The VM was using a virtual host network, and the suspicion was a network configuration issue dropping these packets; switching the VM to plain bridged networking resolved the problem.

See the reference material first.

Prerequisites for enabling IPVS in kube-proxy (all nodes):

cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules
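Run the script once and confirm the modules are loaded (note: on kernels 4.19+ nf_conntrack_ipv4 was renamed to nf_conntrack; CentOS 7's stock 3.10 kernel still uses the old name):

bash /etc/sysconfig/modules/ipvs.modules
lsmod | grep -e ip_vs -e nf_conntrack_ipv4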

Making Docker use the systemd cgroup driver

# Add to /etc/docker/daemon.json:
#   "exec-opts": ["native.cgroupdriver=systemd"]
# then restart Docker; before the change `docker info` reports: Cgroup Driver: cgroupfs


# Generate a default init configuration
kubeadm config print init-defaults > kubeadm-config.yaml

# Edit kubeadm-config.yaml:
# - set advertiseAddress to this host's IP
# - set kubernetesVersion to the k8s version
# - set dnsDomain to the top-level domain (optional)
# - add podSubnet: 10.244.0.0/16 for the pod network (optional)
# - change serviceSubnet (optional)

# The image repository also needs to be changed
kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.log
# Simplified initialization
kubeadm init --apiserver-advertise-address=192.168.208.3 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=imsv2

kubectl -n kube-system get all

# Key error in the flannel logs
E1228 14:03:55.799748 1 main.go:234] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-rbzdn': Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-rbzdn": dial tcp 10.96.0.1:443: connect: network is unreachable

Network Performance Tuning

Reference article

Old CentOS 7 kernels cause some DNS lookups to fail

Reference article