Installing Kubernetes on CentOS 7 (mainland-China network environment)
Installing Docker
Switching the package mirror
CentOS 7 host
Switch the yum repositories to the Aliyun open-source mirror site:
# Back up the original repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/bk.CentOS-Base.repo
# Download the mirror repo file
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
# Update system packages
yum update
yum upgrade
Installing Docker Engine
CentOS 7 host
Reference: Runoob tutorial - Installing Docker on CentOS
# Install the basic dependencies
sudo yum install -y yum-utils \
  device-mapper-persistent-data \
  lvm2
sudo yum-config-manager \
  --add-repo \
  https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/centos/docker-ce.repo
# Install a specific Docker version, e.g. 19.03.15
yum list docker-ce --showduplicates | sort -r
VERSION_STRING=19.03.15; sudo yum install -y docker-ce-${VERSION_STRING} docker-ce-cli-${VERSION_STRING} containerd.io
# Start and enable the service
systemctl start docker && systemctl enable docker
Docker configuration
(collapsed code block: /home/imsdata/docker is the Docker base path; images and containers consume this space, change it as needed)
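The collapsed block above presumably relocates Docker's data root. A sketch of a matching /etc/docker/daemon.json (the path is the one mentioned above; the systemd cgroup-driver line anticipates the FAQ at the end of this guide; written to /tmp here so it can be inspected safely):

```shell
# Sketch: build a daemon.json that moves Docker's data root.
# On a real host, place this at /etc/docker/daemon.json and then run
# `systemctl restart docker`; "data-root" is the dockerd key for the base path.
cat > /tmp/daemon.json << 'EOF'
{
  "data-root": "/home/imsdata/docker",
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
cat /tmp/daemon.json
```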
Pre-installation preparation
Set the hostname
(collapsed code block: set the hostname)
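The collapsed block sets the hostname. A sketch, where the name k8s-master is a placeholder and the hosts-file edit is rehearsed on a copy:

```shell
# Placeholder node name; substitute your own.
NEW_NAME=k8s-master
# On the real host (as root): hostnamectl set-hostname "$NEW_NAME"
# Add a local mapping so the name resolves; rehearsed against a copy of /etc/hosts:
cp /etc/hosts /tmp/hosts.new
grep -q " $NEW_NAME$" /tmp/hosts.new || echo "127.0.1.1 $NEW_NAME" >> /tmp/hosts.new
tail -n 1 /tmp/hosts.new
```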
Disable the firewall and swap
CentOS host
Pick one of the two options below; if you skip disabling the firewall, you must add port rules instead:
# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld
# If the firewall stays on, enable NAT masquerading,
# otherwise DNS resolution fails even though k8s starts normally
firewall-cmd --permanent --add-masquerade
# Open the ports k8s requires: 6443 and 10250
firewall-cmd --permanent --zone=public --add-port=6443/tcp
firewall-cmd --permanent --zone=public --add-port=10250/tcp
# Remove a port rule
firewall-cmd --permanent --zone=public --remove-port=6443/tcp
# Open a range of ports
firewall-cmd --permanent --zone=public --add-port=31080-31090/tcp
# Allow the cni0 virtual interface
firewall-cmd --permanent --zone=public --add-interface=cni0
# Apply the rules
firewall-cmd --reload
# List all rules
firewall-cmd --list-all
Disable swap:
# Turn swap off immediately
swapoff -a
# Permanently: comment out the swap line in /etc/fstab,
# e.g. the line "/mnt/swap swap swap defaults 0 0"
# On a VM, either sed works:
sed -i 's%^/dev/mapper/centos-swap.%# swap %g' /etc/fstab
sed -i 's/\(^[^#|.]*swap.*swap.*\)/#\1/g' /etc/fstab
# Shorten the GRUB boot wait: edit /boot/grub2/grub.cfg and change the first
# timeout value (the first sed sets it to 0, the second to 3)
sed -i 's/^ set timeout=.*/ set timeout=0/' /boot/grub2/grub.cfg
sed -i 's/\(^\s*set timeout=\).*/\13/' /boot/grub2/grub.cfg
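The fstab sed above can be rehearsed against a throwaway copy before touching the real /etc/fstab; a sketch with a made-up fstab:

```shell
# Throwaway fstab with one swap entry.
cat > /tmp/fstab.test << 'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
EOF
# Same expression as above, pointed at the copy:
sed -i 's/\(^[^#|.]*swap.*swap.*\)/#\1/g' /tmp/fstab.test
grep swap /tmp/fstab.test   # the swap line is now commented out
```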
Disable SELinux
CentOS host
Permanent method (requires a reboot):
# Set SELINUX=disabled in /etc/selinux/config (either sed), then reboot
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sed -i 's/\(^SELINUX=\).*/\1disabled/g' /etc/selinux/config
# Temporary method, effective until reboot:
setenforce 0
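Likewise, the SELinux sed can be checked against a copy of the config-file format first:

```shell
# Minimal copy of the /etc/selinux/config format.
cat > /tmp/selinux.test << 'EOF'
SELINUX=enforcing
SELINUXTYPE=targeted
EOF
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /tmp/selinux.test
# Only the SELINUX= line changes; SELINUXTYPE= is left alone.
grep '^SELINUX=' /tmp/selinux.test
```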
Enable iptables bridge support
On Debian 10, sysctl -a shows this is already enabled by default.
CentOS host
(collapsed code block: configure the kernel parameters, VM setup)
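The collapsed block presumably sets the bridge and forwarding sysctls that kubeadm's preflight checks expect; a common sketch (written to /tmp here; on a real host the target is /etc/sysctl.d/k8s.conf, followed by modprobe br_netfilter and sysctl --system):

```shell
# Persist the bridge/forwarding kernel parameters k8s relies on.
cat > /tmp/k8s-sysctl.conf << 'EOF'
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
cat /tmp/k8s-sysctl.conf
```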
Configure a China mirror repo
Skip this if you have a VPN. Reference: Aliyun open source - Kubernetes mirror
CentOS host
(collapsed code block: configure the China mirror repo)
If you hit the error "signature could not be verified for kubernetes", skip signature verification (gpgcheck=0 below):
# Configure the China mirror repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
sudo yum update
Install the cluster tooling
Install a specific version of the tools, e.g. 1.21.14
CentOS host
(collapsed code block: install a specific version of the tools)
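The collapsed block installs the pinned tools. A sketch of the usual yum pinning (the --disableexcludes flag assumes the repo above was named kubernetes):

```shell
# Build the pinned package list for kubelet/kubeadm/kubectl.
VERSION=1.21.14
PKGS="kubelet-$VERSION kubeadm-$VERSION kubectl-$VERSION"
# On the real host:
#   yum install -y $PKGS --disableexcludes=kubernetes
#   systemctl enable --now kubelet
echo "$PKGS" | tee /tmp/k8s-pkgs.txt
```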
Installing K8s
Pull the images
Pull online from China mirrors:
# List all images required for this version
kubeadm config images list --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# Pull all required images
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# DNS test image
docker pull busybox:1.28.4
# Network plugin
docker pull quay.io/coreos/flannel:v0.14.0
# Ingress controller
docker pull quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.29.0
# Cluster management UI
docker pull swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3
If the images were downloaded beforehand, load them:
ls k8s-images-1.21.9/*.image | while read file; do docker image load -i $file; done
ls flannel-images/*.image | while read file; do docker image load -i $file; done
[Offline install] Batch image handling (optional)
(collapsed code block: batch-save images)
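The collapsed block batch-saves images to .image files. A sketch of the naming convention, matching the filenames loaded later (e.g. registry.cn-hangzhou.aliyuncs.com_google_containers_pause_3.4.1.image); the loop is commented out since it needs a running Docker:

```shell
# Map an image reference to the flat .image filename used in this guide.
save_name() { echo "$1" | tr '/:' '__'; }
# On a host with Docker:
#   mkdir -p k8s-images
#   for img in $(docker image ls --format '{{.Repository}}:{{.Tag}}'); do
#     docker save -o "k8s-images/$(save_name "$img").image" "$img"
#   done
save_name "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.4.1" | tee /tmp/save-name.txt
```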
Initialize K8s
Worker nodes do not need initialization.
# Initialize with a specified image repository; network choices must be made at this step:
# --pod-network-cidr              pod network address range (required for flannel)
# --apiserver-advertise-address   address the cluster API server binds to
# --service-dns-domain            top-level DNS domain for services
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=172.20.1.25 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=ice | tee ./kubeadm-init.log
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.8.101 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=viid
The message "Your Kubernetes control-plane has initialized successfully!" indicates the init succeeded.
The service runs, but until a network plugin populates /etc/cni/net.d the kubelet keeps logging errors in the background:
systemctl status kubelet
If there is a warning about switching the cgroup driver to systemd, see the FAQ for the fix.
Install command completion
See the K8s configuration chapter below.
Start the cluster
(collapsed code block: start the cluster)
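The collapsed "start the cluster" block is most likely the kubeconfig setup that kubeadm prints after a successful init; rehearsed below against a stand-in admin.conf under /tmp:

```shell
# Real post-init steps (printed by kubeadm itself):
#   mkdir -p $HOME/.kube
#   sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
#   sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Rehearsal with a stand-in file:
mkdir -p /tmp/kube-demo/.kube
echo "apiVersion: v1" > /tmp/kube-demo/admin.conf
cp /tmp/kube-demo/admin.conf /tmp/kube-demo/.kube/config
ls /tmp/kube-demo/.kube
```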
Choosing a kube-proxy mode
kube-proxy defaults to the iptables implementation; as services and pods multiply, the cost of iptables' sequential rule traversal shows. The options:
- iptables: the default.
- IPVS: another kube-proxy mode, faster than iptables and recommended. See the reference article.
- eBPF: overall maturity is still low, not recommended on its own; see the reference articles on replacing kube-proxy and on packet observability.
Network plugin
After the installation above, querying the pods in kube-system shows the network-related pods stuck in Pending, because no network plugin is installed yet. Many plugins exist; pick one of the following. Reference: Kubernetes guide
flannel
See the official site: in its Documentation folder, consult the manifests kube-flannel.yaml, kube-flannel-aliyun.yaml, kube-flannel-old.yaml, etc.
Contents of kube-flannel.yaml:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: psp.flannel.unprivileged
annotations:
seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
privileged: false
volumes:
- configMap
- secret
- emptyDir
- hostPath
allowedHostPaths:
- pathPrefix: "/etc/cni/net.d"
- pathPrefix: "/etc/kube-flannel"
- pathPrefix: "/run/flannel"
readOnlyRootFilesystem: false
# Users and groups
runAsUser:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
fsGroup:
rule: RunAsAny
# Privilege Escalation
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
# Capabilities
allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
defaultAddCapabilities: []
requiredDropCapabilities: []
# Host namespaces
hostPID: false
hostIPC: false
hostNetwork: true
hostPorts:
- min: 0
max: 65535
# SELinux
seLinux:
# SELinux is unused in CaaSP
rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: flannel
rules:
- apiGroups: ['extensions']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "host-gw"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.14.0
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.14.0
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
The Network value (10.244.0.0/16) in net-conf.json must match the --pod-network-cidr passed to kubeadm init.
Backend.Type defaults to vxlan; it is set to host-gw here, which benchmarked faster in testing.
Create the network:
kubectl apply -f kube-flannel.yaml
After a short wait the cni0 virtual bridge appears. If coredns stays Pending, switch the flannel manifest to a newer release (e.g. 0.19.2).
Confirm the flannel pods in the kube-system namespace are running:
kubectl -n kube-system get all
NAME READY STATUS RESTARTS AGE
pod/coredns-6f6b8cc4f6-95bng 1/1 Running 0 24m
pod/coredns-6f6b8cc4f6-hkdn8 1/1 Running 0 24m
pod/etcd-n1 1/1 Running 1 24m
pod/kube-apiserver-n1 1/1 Running 1 24m
pod/kube-controller-manager-n1 1/1 Running 1 24m
pod/kube-flannel-ds-vz49s 1/1 Running 0 74s
pod/kube-proxy-bbjzq 1/1 Running 1 24m
pod/kube-scheduler-n1 1/1 Running 1 24m
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 24m
# NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-flannel-ds 1 1 1 1 1 <none> 74s
daemonset.apps/kube-proxy 1 1 1 1 1 kubernetes.io/os=linux 24m
# NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 2/2 2 2 24m
# NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-6f6b8cc4f6 2 2 2 24m
cni
(collapsed code block: mkdir -p /etc/cni/net.d)
After a short wait the cni0 virtual bridge is created.
Verify the installation
(collapsed code block: check that the system pods started normally)
Testing DNS resolution
(collapsed code block: start a container; -n xxx selects the target namespace)
A lookup first tries the name as given, then widens through the search domains until something connects. Capturing the DNS flow with tcpdump -i cni0 -w result.cap shows short-name resolution taking 10+ ms versus about 1 ms for a fully qualified name, so fully qualified names are recommended.
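Since the init example above used --service-dns-domain=ice, a fully qualified in-cluster service name has the shape service.namespace.svc.domain; a small helper to assemble it (the names are illustrative):

```shell
# Assemble a fully qualified in-cluster service name.
fqdn() { echo "$1.$2.svc.$3"; }   # service, namespace, cluster domain
fqdn kube-dns kube-system ice | tee /tmp/fqdn.txt
```

Inside the busybox test pod, an nslookup on this full name should then resolve without walking the search-domain list.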
Joining worker nodes
Follow the steps up through installing the tooling; a worker only needs the pause, kube-proxy, and (if used) network-plugin images loaded. Then run the join command echoed during control-plane initialization.
# Load the node images
docker image load -i k8s-images-1.21.8/registry.cn-hangzhou.aliyuncs.com_google_containers_pause_3.4.1.image
docker image load -i k8s-images-1.21.8/registry.cn-hangzhou.aliyuncs.com_google_containers_kube-proxy_v1.21.8.image
Installing an Ingress controller
ingress-nginx
Official GitHub repo; note the upstream k8s.gcr.io images may be unreachable and need replacing.
nginx branch
Reference article
Taking the nginx-0.29.0 branch as an example, the manifests live under deploy/static:
- configmap.yaml: a ConfigMap for updating the nginx configuration online
- namespace.yaml: creates the dedicated ingress-nginx namespace
- rbac.yaml: creates the Role and RoleBinding for RBAC
- with-rbac.yaml: the nginx-ingress-controller deployment with RBAC applied
- mandatory.yaml: all of the above files combined
(collapsed code block: apiVersion: v1 …)
The Service manifest deploy/baremetal/service-nodeport.yaml exposes the ingress ports on the node:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: ClusterIP
  #type: NodePort
  ports:
    - name: http
      port: 31080
      #nodePort: 31080
      targetPort: 31080
      protocol: TCP
    - name: https
      port: 31443
      targetPort: 31443
      protocol: TCP
  selector:
    app: ingress-nginx
  #externalTrafficPolicy: Cluster
By default the nginx-ingress-controller pod lands on an arbitrary node. To run it on specific nodes, label those nodes first and select on the label.
This step is optional.
# Label the chosen node (any label key/value works)
kubectl label node localhost.localdomain nodeFeature=nginx
# Show existing node labels
kubectl get nodes --show-labels
Then add nodeFeature: nginx under the nodeSelector field in mandatory.yaml.
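The nodeSelector addition can be expressed as a small fragment merged into the Deployment's pod template in mandatory.yaml (label key nodeFeature as above; written to /tmp here for inspection):

```shell
# Fragment to merge under the Deployment's pod template spec.
cat > /tmp/nodeselector-patch.yaml << 'EOF'
spec:
  template:
    spec:
      nodeSelector:
        nodeFeature: nginx
EOF
# Merge this into mandatory.yaml's Deployment before kubectl apply -f mandatory.yaml.
cat /tmp/nodeselector-patch.yaml
```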
Verification app
ingress-nginx test manifest httpd-dep.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: httpd
  labels:
    name: httpd
spec:
  rules:
    - http:
        paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: httpd
                port:
                  number: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: httpd
spec:
  selector:
    app: httpd
  ports:
    - port: 8000
      protocol: TCP
      targetPort: 80
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
spec:
  replicas: 4
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
        - name: httpd
          image: httpd:2.4.52
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 80
If no nodeSelector was set, the ingress pod starts on a random node; check with:
kubectl -n ingress-nginx get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-ingress-controller-fd46d8644-s772j 1/1 Running 0 103m 192.168.31.192 n2 <none> <none>
Deploying a Redis cluster on K8s
[Important] Not suitable for a persistent Redis cluster: Redis cluster membership is keyed by pod IP, so the original cluster cannot be restored after a restart.
This deployment uses a StatefulSet. Reference article
Environment
| # | Node | IP |
|---|---|---|
| 1 | master | 192.168.1.100 |
| 2 | node1 | 192.168.1.101 |
| 3 | node2 | 192.168.1.102 |
| 4 | node3 | 192.168.1.103 |
Create the storage volumes
- Install the NFS packages
(collapsed code block: pick any node, here k8s-master)
- Create the shared storage
(collapsed code block: create the shared directories)
Create the PVs
pv.yaml:
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv1
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.10
    path: "/usr/local/k8s/redis/pv1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv2
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.10
    path: "/usr/local/k8s/redis/pv2"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv3
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.10
    path: "/usr/local/k8s/redis/pv3"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv4
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.10
    path: "/usr/local/k8s/redis/pv4"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv5
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.10
    path: "/usr/local/k8s/redis/pv5"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv6
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "redis"
  nfs:
    server: 192.168.1.10
    path: "/usr/local/k8s/redis/pv6"
Create the ConfigMap
vim redis.conf
# appendonly yes
# cluster-enabled yes
# cluster-config-file /var/lib/redis/nodes.conf
# cluster-node-timeout 5000
# dir /var/lib/redis
# port 6379
# redis port
port 6379
# # connection password
# requirepass hzzhcs@2020
# masterauth hzzhcs@2020
# disable protected mode
protected-mode no
# enable cluster mode
cluster-enabled yes
# cluster node config file
cluster-config-file nodes-${PORT}.conf
# node timeout
cluster-node-timeout 5000
# # cluster node IP: the host IP in host-network mode
# # cluster-announce-ip 192.168.195.10
# # cluster-announce-ip 192.168.28.170
# cluster-announce-ip 192.168.11.215
# # node ports 6379 - 6381
# # cluster bus ports 16379 - 16381
# cluster-announce-port ${PORT}
# cluster-announce-bus-port ${CPORT}
# enable appendonly (AOF) persistence
appendonly yes
# fsync once per second
appendfsync everysec
# whether to fsync while the AOF is being rewritten
no-appendfsync-on-rewrite no
# rewrite again when the AOF grows 100% past the last rewrite size
auto-aof-rewrite-percentage 100
# minimum AOF size before rewriting, default 64mb
auto-aof-rewrite-min-size 64mb
Create it:
kubectl create configmap redis-conf --from-file=redis.conf
Create the headless Service
vim headless-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  labels:
    app: redis
spec:
  ports:
    - name: redis-port
      port: 6379
  clusterIP: None
  selector:
    app: redis
Create the Redis cluster pods
Create six Redis pods with a StatefulSet to form a three-master, three-replica cluster. vim redis.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-app
spec:
  serviceName: "redis-service"
  replicas: 6
  selector:
    matchLabels:
      app: redis
      appCluster: redis-cluster
  template:
    metadata:
      labels:
        app: redis
        appCluster: redis-cluster
    spec:
      containers:
        - name: redis
          image: "redis:3.2.8"
          command:
            - "redis-server"
          args:
            - "/etc/redis/redis.conf"
            - "--protected-mode"
            - "no"
          resources:
            requests:
              cpu: "100m"
              memory: "100Mi"
          ports:
            - name: redis
              containerPort: 6379
              protocol: "TCP"
            - name: cluster
              containerPort: 16379
              protocol: "TCP"
          volumeMounts:
            - name: "redis-conf"
              mountPath: "/etc/redis"
            - name: "redis-data"
              mountPath: "/var/lib/redis"
      volumes:
        - name: "redis-conf"
          configMap:
            name: "redis-conf"
            items:
              - key: "redis.conf"
                path: "redis.conf"
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: [ "ReadWriteMany" ]
        storageClassName: "redis"
        resources:
          requests:
            storage: 2Gi
Initialize the Redis cluster
Get the pod IPs:
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-app-0 1/1 Running 0 2m22s 10.244.0.248 n175 <none> <none>
redis-app-1 1/1 Running 0 97s 10.244.0.252 n175 <none> <none>
redis-app-2 1/1 Running 0 101s 10.244.0.251 n175 <none> <none>
redis-app-3 1/1 Running 0 117s 10.244.0.250 n175 <none> <none>
redis-app-4 1/1 Running 0 2m1s 10.244.0.249 n175 <none> <none>
redis-app-5 1/1 Running 0 2m31s 10.244.0.247 n175 <none> <none>
Exec into any node and run the cluster-create command:
redis-cli --cluster create \
10.244.0.248:6379 \
10.244.0.252:6379 \
10.244.0.251:6379 \
10.244.0.250:6379 \
10.244.0.249:6379 \
10.244.0.247:6379 \
--cluster-replicas 1
The hostname-based variant below failed to build the cluster:
redis-cli --cluster create \
redis-app-0.redis-service.default.svc.ice:6379 \
redis-app-1.redis-service.default.svc.ice:6379 \
redis-app-2.redis-service.default.svc.ice:6379 \
redis-app-3.redis-service.default.svc.ice:6379 \
redis-app-4.redis-service.default.svc.ice:6379 \
redis-app-5.redis-service.default.svc.ice:6379 \
--cluster-replicas 1
Output of the IP-based command:
>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.244.0.249:6379 to 10.244.0.248:6379
Adding replica 10.244.0.247:6379 to 10.244.0.252:6379
Adding replica 10.244.0.250:6379 to 10.244.0.251:6379
M: 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379
slots:[0-5460] (5461 slots) master
M: f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379
slots:[5461-10922] (5462 slots) master
M: dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379
slots:[10923-16383] (5461 slots) master
S: b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379
replicates dc0a1908830468f2070883e1c026fd5b1b2ff526
S: 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379
replicates 8921255530faaa44fb793a7248d93c179211b7d9
S: dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379
replicates f32bcfa4dcacef662096d5ccdef4d741588aa2cb
Can I set the above configuration? (type 'yes' to accept): yes
>> Nodes configuration updated
>> Assign a different config epoch to each node
>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
..
>> Performing Cluster Check (using node 10.244.0.248:6379)
M: 8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379
slots: (0 slots) slave
replicates 8921255530faaa44fb793a7248d93c179211b7d9
M: f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379
slots: (0 slots) slave
replicates dc0a1908830468f2070883e1c026fd5b1b2ff526
S: dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379
slots: (0 slots) slave
replicates f32bcfa4dcacef662096d5ccdef4d741588aa2cb
M: dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>> Check for open slots...
>> Check slots coverage...
[OK] All 16384 slots covered.
Verify the cluster state:
redis-cli -c
Session transcript:
127.0.0.1:6379> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:315
cluster_stats_messages_pong_sent:335
cluster_stats_messages_sent:650
cluster_stats_messages_ping_received:330
cluster_stats_messages_pong_received:315
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:650
127.0.0.1:6379> CLUSTER NODES
4ac41eded080c70132ab4de211fcdfa653874469 10.244.0.249:6379@16379 slave 8921255530faaa44fb793a7248d93c179211b7d9 0 1658228657078 1 connected
f32bcfa4dcacef662096d5ccdef4d741588aa2cb 10.244.0.252:6379@16379 master - 0 1658228657580 2 connected 5461-10922
b35dcfba64b30050d3f71dc347acd3ce222a99e5 10.244.0.250:6379@16379 slave dc0a1908830468f2070883e1c026fd5b1b2ff526 0 1658228658082 3 connected
dc2ddbb2ef518a23583dab53bdecccc0e017146d 10.244.0.247:6379@16379 slave f32bcfa4dcacef662096d5ccdef4d741588aa2cb 0 1658228657000 2 connected
8921255530faaa44fb793a7248d93c179211b7d9 10.244.0.248:6379@16379 myself,master - 0 1658228657000 1 connected 0-5460
dc0a1908830468f2070883e1c026fd5b1b2ff526 10.244.0.251:6379@16379 master - 0 1658228657078 3 connected 10923-16383
Create a Service for access
The headless Service created earlier for the StatefulSet has no cluster IP, so it cannot serve outside callers. Create a separate Service to provide access and load balancing for the Redis cluster; it could also be exposed through an Ingress for access from outside the cluster. Here only an internal Service is created.
redis-access-service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: redis-access-service
  labels:
    app: redis
spec:
  ports:
    - name: redis-port
      protocol: "TCP"
      port: 6379
      targetPort: 6379
  selector:
    app: redis
    appCluster: redis-cluster
Deploying a Kafka cluster on K8s
Deploying a MySQL cluster on K8s
Installing Helm charts offline
Reference article
CI/CD engines (optional)
BuildMaster, Drone.io, GoCD
Argo
Visual management tools (optional)
kuboard
Online install:
# The image swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3 also works and downloads faster.
# Do not use 127.0.0.1 or localhost as the internal IP.
# Kuboard need not share a subnet with K8s; the Kuboard Agent can even reach the Kuboard Server through a proxy.
sudo docker run -d \
--restart=unless-stopped \
--name=kuboard \
-p 30080:80/tcp \
-p 31089:10081/tcp \
-e KUBOARD_ENDPOINT="http://192.168.11.178:30080" \
-e KUBOARD_AGENT_SERVER_TCP_PORT="31089" \
-v /opt/kuboard-data:/data \
swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3.4.1.0
Via docker-compose.yml:
version: "3"
networks:
  kuboard-net:
    external: false
    driver: bridge
    # ipam:
    #   config:
    #     - subnet: 172.90.161.0/24
services:
  kuboard:
    image: swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3.4.1.0
    container_name: kuboard
    restart: unless-stopped # always
    environment:
      # user/passwd: admin/Kuboard123
      # modified to: Zdxf@2021
      KUBOARD_ENDPOINT: "http://172.20.1.25:30080"
      KUBOARD_AGENT_SERVER_TCP_PORT: "31089"
      KUBERNETES_CLUSTER_DOMAIN: "ice"
      KUBOARD_ICP_DESCRIPTION: "ICP filing number"
      KUBOARD_DISABLE_AUDIT: true
    ports:
      - 30080:80/tcp
      - 31089:10081/tcp
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./data:/data
    networks:
      - kuboard-net
    logging:
      #driver: none
      driver: json-file
      options:
        max-size: "200k"
        max-file: "1"
    # resource limits via deploy require starting compose with --compatibility to avoid errors
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '1'
          memory: 200M
Open http://your-host-ip:30080 (the host port mapped above) in a browser to reach the Kuboard v3.x UI. Login:
- username: admin
- password: Kuboard123
Browser compatibility: use Chrome / Firefox / Safari; IE and IE-based browsers are not supported.
Resource monitoring
Contents of metrics-server.yaml:
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
ports:
- name: https
port: 443
protocol: TCP
targetPort: 443
selector:
k8s-app: metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
rbac.authorization.k8s.io/aggregate-to-admin: 'true'
rbac.authorization.k8s.io/aggregate-to-edit: 'true'
rbac.authorization.k8s.io/aggregate-to-view: 'true'
name: 'system:aggregated-metrics-reader'
namespace: kube-system
rules:
- apiGroups:
- metrics.k8s.io
resources:
- pods
- nodes
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
name: 'system:metrics-server'
namespace: kube-system
rules:
- apiGroups:
- ''
resources:
- pods
- nodes
- nodes/stats
- namespaces
- configmaps
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: 'metrics-server:system:auth-delegator'
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: 'system:auth-delegator'
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: 'system:metrics-server'
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: 'system:metrics-server'
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
labels:
k8s-app: metrics-server
name: v1beta1.metrics.k8s.io
namespace: kube-system
spec:
group: metrics.k8s.io
groupPriorityMinimum: 100
insecureSkipTLSVerify: true
service:
name: metrics-server
namespace: kube-system
version: v1beta1
versionPriority: 100
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
k8s-app: metrics-server
strategy:
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: metrics-server
spec:
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- preference:
matchExpressions:
- key: node-role.kubernetes.io/master
operator: Exists
weight: 100
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
k8s-app: metrics-server
namespaces:
- kube-system
topologyKey: kubernetes.io/hostname
containers:
- args:
- '--cert-dir=/tmp'
- '--secure-port=443'
- '--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname'
- '--kubelet-use-node-status-port'
- '--kubelet-insecure-tls=true'
- '--authorization-always-allow-paths=/livez,/readyz'
- '--metric-resolution=15s'
image: >-
swr.cn-east-2.myhuaweicloud.com/kuboard-dependency/metrics-server:v0.5.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /livez
port: https
scheme: HTTPS
periodSeconds: 10
name: metrics-server
ports:
- containerPort: 443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /readyz
port: https
scheme: HTTPS
initialDelaySeconds: 20
periodSeconds: 10
resources:
requests:
cpu: 100m
memory: 200Mi
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- mountPath: /tmp
name: tmp-dir
nodeSelector:
kubernetes.io/os: linux
priorityClassName: system-cluster-critical
serviceAccountName: metrics-server
tolerations:
- effect: ''
key: node-role.kubernetes.io/master
operator: Exists
volumes:
- emptyDir: {}
name: tmp-dir
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: metrics-server
namespace: kube-system
spec:
minAvailable: 1
selector:
matchLabels:
k8s-app: metrics-server
k8sLens
K8s configuration
Deleting resources
If a resource was created from DEPLOY.yaml, use:
# Delete the resources described in the yaml file
kubectl delete -f DEPLOY.yaml
Delete all resources in a namespace (not yet verified):
# 1. List the resources in the namespace
kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n ingress-nginx
# Suppose the resource types found are: ingress, deployment, service
# 2. Clean up the resources in the ingress-nginx namespace
kubectl get ingress -n ingress-nginx |grep clife |awk '{print $1}'|xargs kubectl delete ingress -n ingress-nginx
kubectl get service -n ingress-nginx |grep clife |awk '{print $1}'|xargs kubectl delete service -n ingress-nginx
kubectl get deployment -n ingress-nginx |grep clife |awk '{print $1}'|xargs kubectl delete deployment -n ingress-nginx
# 3. Delete the namespace itself
kubectl delete ns ingress-nginx
# 4. Confirm the namespace is gone
kubectl get ns ingress-nginx
Command completion
bash
bash needs bash-completion installed:
yum install bash-completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc
zsh
Run:
echo "source <(kubectl completion zsh)" >> ~/.zshrc
source ~/.zshrc
kubectl for a regular user
Run (for a user named USER):
mkdir -p ~/.kube
sudo cp -i /etc/kubernetes/admin.conf ~/.kube
sudo chown USER:USER ~/.kube/admin.conf
# Set the environment variable; for zsh, add to ~/.zshrc:
export KUBECONFIG=~/.kube/admin.conf
Cluster certificates
Certificates are stored under /etc/kubernetes/pki/
# Check which certificates are close to expiry
kubeadm certs check-expiration
# Manually renew all certificates
kubeadm certs renew all
Performance testing
See the testing reference for tool usage.
Basic network tools: curl, iperf, Locust, kubemark
Business load testing: JMeter, LoadRunner
Apache JMeter is a stress-testing tool; LoadRunner is a load-testing tool for predicting system behavior and performance.
# Timing metrics (in seconds):
# time_connect: time to establish the TCP connection to the server
# time_starttransfer: time from sending the request until the first byte of the response
# time_total: total time to complete the request
curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}' "http://sample-webapp:8000/"
Installing iperf on CentOS:
yum install epel-release
yum update
yum install iperf
Usage:
# Start a TCP server
iperf -s
# Run a client test
iperf -c <SERVER_IP>
# Start a UDP server
iperf -s -u
# Run a UDP client test; UDP bandwidth is capped by default, raise the cap with -b
iperf -c <SERVER_IP> -u
# For a bidirectional test, add -d on the client
Containerized build and release
Build and release
Reference article. Dockerfile:
FROM golang:buster as build
WORKDIR /go/src/greeter-server
RUN curl -o main.go https://raw.githubusercontent.com/grpc/grpc-go/91e0aeb192456225adf27966d04ada4cf8599915/examples/features/reflection/server/main.go && \
go mod init greeter-server && \
go mod tidy && \
go build -o /greeter-server main.go
FROM gcr.io/distroless/base-debian10
COPY --from=build /greeter-server /
EXPOSE 50051
CMD ["/greeter-server"]
grpc example
For gRPC support in ingress, see the reference article.
Uninstalling K8s
Remove the cluster
(collapsed code block: tear down the cluster)
Remove the packages
CentOS host
(collapsed code block: remove the packages)
Debian/Ubuntu host
(collapsed code block: remove the packages)
Clean up the images
(collapsed code block: delete the images pulled from registry.cn-hangzhou.aliyuncs.com)
FAQ
bridge-nf-call-iptables error
Install-time error: [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
Fix (see the reference article):
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
flannel cannot reach the kubernetes service at 10.96.0.1
10.96.0.1 points to the kubernetes Service in the default namespace. The underlying network was dropping the packets (tcpdump captured nothing toward this IP; consider other tools or extra capture options). The VM was on a virtual host network, which was the suspected cause; switching the VM to plain bridged networking resolved the issue.
See the reference material first.
Prerequisites for enabling IPVS in kube-proxy (all nodes):
cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules
Make docker use the systemd cgroup driver (previously: Cgroup Driver: cgroupfs)
Add to /etc/docker/daemon.json:
"exec-opts": ["native.cgroupdriver=systemd"]
Initializing with a kubeadm config file:
kubeadm config print init-defaults > kubeadm-config.yaml
# Edit kubeadm-config.yaml:
#   set advertiseAddress to this host's IP
#   set kubernetesVersion to the k8s version
#   set dnsDomain to the top-level domain (optional)
#   add podSubnet: 10.244.0.0/16 for the pod subnet (optional)
#   set serviceSubnet for the service subnet (optional)
#   adjust the image names as needed
kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.log
# Simplified init:
kubeadm init --apiserver-advertise-address=192.168.208.3 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --service-dns-domain=imsv2
kubectl -n kube-system get all
# Key error in the flannel pod logs:
E1228 14:03:55.799748 1 main.go:234] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-rbzdn': Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-rbzdn": dial tcp 10.96.0.1:443: connect: network is unreachable
Network performance tuning
Reference article
Some domain lookups fail due to the old CentOS 7 kernel
Reference article