## Machine Preparation
Three machines: one as master, two as slaves.
Specs: 16 GB RAM, 16 cores each
Docker version: 18.09.6
Kubernetes version: 1.15.0
ansible is a powerful ops-automation tool based on SSH; it makes running commands on all nodes at once very convenient.
## 1. Environment Preparation (run on all three nodes)
```
1. Set up passwordless SSH
# exchange keys between all three machines:
ssh-copy-id -i ~/.ssh/id_rsa.pub root@<hostname> -p 22022

2. Install ansible and configure the ansible inventory (default: /etc/ansible/hosts)
yum install -y ansible
[k8s-master]
# the SSH port defaults to 22; omit ansible_ssh_port if unchanged
172.18.0.171 ansible_ssh_user=root ansible_ssh_port=22022
[k8s-slave]
172.18.0.172 ansible_ssh_user=root ansible_ssh_port=22022
172.18.0.173 ansible_ssh_user=root ansible_ssh_port=22022
ansible usage examples:
# run a command on all nodes:
ansible all -m shell -a "systemctl start docker"
# run a command on the master only:
ansible k8s-master -m shell -a "systemctl start docker"

3. Disable the firewall and SELinux
systemctl stop firewalld && systemctl disable firewalld
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config && setenforce 0

4. Disable the swap partition
swapoff -a                                    # temporary
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab # permanent

5. Set the hostnames (one command per machine)
hostnamectl set-hostname master
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2

6. Add the entries to /etc/hosts
172.18.0.171 master
172.18.0.172 slave1
172.18.0.173 slave2

7. Kernel tuning: pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

8. Synchronize the clocks on all three machines
yum install -y ntpdate
ntpdate time.windows.com
```
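With the inventory in place, the preparation steps above do not have to be typed on each machine; they can be pushed to all three nodes from one shell via ansible ad-hoc commands. A minimal sketch (the same commands as above, just wrapped in ansible's shell module):
```
# verify ansible can reach every host in the inventory
ansible all -m ping
# push the prep commands to all nodes in one shot
ansible all -m shell -a "systemctl stop firewalld && systemctl disable firewalld"
ansible all -m shell -a "setenforce 0"
ansible all -m shell -a "swapoff -a"
ansible all -m shell -a "sysctl --system"
```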
## 2. Install Docker (run on all three nodes)
```
yum install -y yum-utils device-mapper-persistent-data lvm2
# add the Docker repo
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# list the available Docker versions
yum list docker-ce --showduplicates | sort -r
# install Docker (pinned to a specific version)
yum install -y docker-ce-18.09.6 docker-ce-cli-18.09.6 containerd.io
# ...or install the latest version
yum install -y docker-ce docker-ce-cli containerd.io
# start Docker and enable it at boot
systemctl start docker
systemctl enable docker
# install command completion
ansible all -m shell -a "yum -y install bash-completion"
ansible all -m shell -a "source /etc/profile.d/bash_completion.sh"
```
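To confirm Docker came up cleanly everywhere, a quick ansible check (a sketch; both are standard commands):
```
# print the installed Docker version on all three nodes
ansible all -m shell -a "docker --version"
# confirm the service is actually running
ansible all -m shell -a "systemctl is-active docker"
```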
## 3. Install the Kubernetes Components (run on all three nodes)
```
1. Request an Aliyun Docker registry mirror (accelerator) at:
https://cr.console.aliyun.com/cn-hangzhou/instances/mirrors
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://wv8lwzcp.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
2. Add the Kubernetes yum repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# refresh the yum cache
yum clean all && yum -y makecache
3. Install kubeadm, kubelet, and kubectl
yum install -y kubelet-1.15.0 kubeadm-1.15.0 kubectl-1.15.0
systemctl enable kubelet
```
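Before initializing the cluster, it is worth checking that every node ended up with the same tool versions (a quick sanity check using standard version flags):
```
ansible all -m shell -a "kubeadm version -o short"
ansible all -m shell -a "kubelet --version"
ansible all -m shell -a "kubectl version --client --short"
```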
## 4. Configure the Master Node (master only)
```
kubeadm init --apiserver-advertise-address=172.18.0.171 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.15.0 --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16
Pitfall: if this command times out, the swap partition is probably still enabled. Run the script below to clean up, then initialize again:
#!/bin/bash
# wipe any state left over from the failed init attempt
rm -rf /etc/kubernetes/*
rm -rf ~/.kube/*
rm -rf /var/lib/etcd/*
# kill any processes still holding the control-plane ports
lsof -i :6443|grep -v "PID"|awk '{print "kill -9",$2}'|sh
lsof -i :10251|grep -v "PID"|awk '{print "kill -9",$2}'|sh
lsof -i :10252|grep -v "PID"|awk '{print "kill -9",$2}'|sh
lsof -i :10250|grep -v "PID"|awk '{print "kill -9",$2}'|sh
lsof -i :2379|grep -v "PID"|awk '{print "kill -9",$2}'|sh
lsof -i :2380|grep -v "PID"|awk '{print "kill -9",$2}'|sh
# disable swap, reset kubeadm, restart kubelet, and flush iptables
swapoff -a && kubeadm reset && systemctl daemon-reload && systemctl restart kubelet && iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
Re-run the init. On success you will see output like the following:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.18.0.171:6443 --token mi2ip9.629f41c46tvh79g1 \
--discovery-token-ca-cert-hash sha256:649afe0a5c0f9599a0b4a6e4baa6aac3e3e6007adf98d215f495182d31d2dfac
Run the commands above as instructed (I keep them in a small script):
[root@master ~]# cat kube_preinstall.sh
#!/bin/bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Also save the kubeadm join command printed above; it is what joins the worker nodes to the master.
```
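Once the kubeconfig is in place, kubectl can talk to the new control plane. A quick health check (standard kubectl; in v1.15 `kubectl get componentstatuses` still reports control-plane component health):
```
# scheduler, controller-manager and etcd should all report Healthy
kubectl get componentstatuses
# the master will show NotReady until a pod network is installed in step 6
kubectl get nodes
```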
## 5. Join the Worker Nodes to the Master (run on both slaves)
```
kubeadm join 172.18.0.171:6443 --token mi2ip9.629f41c46tvh79g1 \
--discovery-token-ca-cert-hash sha256:649afe0a5c0f9599a0b4a6e4baa6aac3e3e6007adf98d215f495182d31d2dfac
```
**Pitfall:** if you lose the token above, list the existing tokens (or mint a new one with `kubeadm token create`) and recompute the CA certificate hash:
```
[root@master ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
zxjr2d.ecnowzegec34w8vj <invalid> 2021-02-04T13:37:09+08:00 authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
[root@master ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
d7d8d27c50c1ef63cd56e8894e154d6e2861693b8f554460df4eb6fc14ce84aa
```
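Alternatively, kubeadm can mint a fresh token and print a complete, ready-to-paste join command in one step, which avoids the manual openssl dance:
```
kubeadm token create --print-join-command
```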
## 6. Install the Network Plugin (flannel)
```
wget https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
```
**Pitfall:** https://raw.githubusercontent.com may be unreachable; this is a DNS resolution failure. Go to https://www.ipaddress.com/, look up the domain raw.githubusercontent.com, and add the resolved IP address to the host's hosts file:
```
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.18.0.171 master
172.18.0.172 slave1
172.18.0.173 slave2
199.232.96.133 raw.githubusercontent.com
```
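If the other nodes also need to reach raw.githubusercontent.com, ansible's lineinfile module can distribute the entry in one command (a sketch; the resolved IP may change over time, so verify it first):
```
# append the entry on all nodes unless it is already present
ansible all -m lineinfile -a "path=/etc/hosts line='199.232.96.133 raw.githubusercontent.com'"
```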
Once the download succeeds, edit the image references in the manifest:
```
By default the images may fail to pull. If you are sure you can reach the quay.io registry, no change is needed; otherwise switch the images to a mirror as follows:
169 serviceAccountName: flannel
170 initContainers:
171 - name: install-cni
172 image: easzlab/flannel:v0.11.0-amd64
173 command:
174 - cp
175 args:
176 - -f
177 - /etc/kube-flannel/cni-conf.json
178 - /etc/cni/net.d/10-flannel.conflist
179 volumeMounts:
180 - name: cni
181 mountPath: /etc/cni/net.d
182 - name: flannel-cfg
183 mountPath: /etc/kube-flannel/
184 containers:
185 - name: kube-flannel
186 image: easzlab/flannel:v0.11.0-amd64
187 command:
188 - /opt/bin/flanneld
After the edits, deploy flannel:
kubectl apply -f kube-flannel.yml
Check whether it is running:
ps -ef|grep flannel
```
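A more direct check than grepping for the process is to ask Kubernetes whether the flannel DaemonSet pods are up on every node (the `app: flannel` label comes from the manifest downloaded above):
```
# expect one Running flannel pod per node
kubectl get pods -n kube-system -l app=flannel -o wide
```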
Check the cluster status; only when everything looks like the following are all nodes ready:
```
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 25h v1.15.0
slave1 Ready <none> 25h v1.15.0
slave2 Ready <none> 25h v1.15.0
[root@master ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-bccdc95cf-cpc96 1/1 Running 0 25h
coredns-bccdc95cf-d5fs2 1/1 Running 0 25h
etcd-master 1/1 Running 0 25h
kube-apiserver-master 1/1 Running 0 25h
kube-controller-manager-master 1/1 Running 0 25h
kube-flannel-ds-amd64-25ztw 1/1 Running 0 25h
kube-flannel-ds-amd64-cqmx8 1/1 Running 0 25h
kube-flannel-ds-amd64-f6mxw 1/1 Running 0 25h
kube-proxy-mz2rb 1/1 Running 0 25h
kube-proxy-nd9zp 1/1 Running 0 25h
kube-proxy-s4xfh 1/1 Running 0 25h
kube-scheduler-master 1/1 Running 0 25h
kubernetes-dashboard-79ddd5-nchbb 1/1 Running 0 21h
```
## 7. Test the Cluster
```
Create a deployment, expose it on a NodePort, and verify it is reachable:
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
[root@master ~]# kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-554b9c67f9-gsgsm 1/1 Running 0 25h
pod/redis-686d55dddd-lhhl8 1/1 Running 0 25h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.1.0.1 <none> 443/TCP 25h
service/nginx NodePort 10.1.6.189 <none> 80:30551/TCP 25h
service/redis NodePort 10.1.228.85 <none> 2379:30642/TCP 25h
```
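The NodePort (30551 in the output above) is open on every node's IP, so the nginx service can be verified straight from the shell; substitute whatever port your cluster actually assigned:
```
# any of the three node IPs works, since kube-proxy listens on all nodes
curl -s http://172.18.0.171:30551 | head -n 5
```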
## 8. Configure kubernetes-dashboard
```
wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
Edit the yaml file:
[root@k8s-master ~]# vim kubernetes-dashboard.yaml
Changes:
109 spec:
110 containers:
111 - name: kubernetes-dashboard
112 image: easzlab/kubernetes-dashboard-amd64:v1.10.1 # change this line
......
157 spec:
158 type: NodePort # add this line
159 ports:
160 - port: 443
161 targetPort: 8443
162 nodePort: 30001 # add this line
163 selector:
164 k8s-app: kubernetes-dashboard
The dashboard is only reachable on NodePorts above 30000 (the default NodePort range).
[root@k8s-master ~]# kubectl apply -f kubernetes-dashboard.yaml
Open the page at https://172.18.0.171:30001
Access may fail because the certificate shipped in the original yaml is problematic.
In that case we need to generate a certificate by hand.
Go to the directory: cd /etc/kubernetes/pki/
1. Generate a private key
[root@master pki]# (umask 077; openssl genrsa -out dashboard.key 2048)
2. Create a certificate signing request (CSR)
openssl req -new -key dashboard.key -out dashboard.csr -subj "/O=zkxy/CN=kubernetes-dashboard"
3. Sign the certificate with the cluster CA
openssl x509 -req -in dashboard.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out dashboard.crt -days 5000
4. Hand the new certificate to the cluster
# completely remove the existing dashboard pod:
sudo kubectl -n kube-system delete $(sudo kubectl -n kube-system get pod -o name | grep dashboard)
# delete the placeholder secret created by the original yaml, otherwise the create below fails with AlreadyExists:
kubectl -n kube-system delete secret kubernetes-dashboard-certs
# recreate the secret from our certificate so k8s picks it up:
kubectl create secret generic kubernetes-dashboard-certs --from-file=dashboard.crt=./dashboard.crt --from-file=dashboard.key=./dashboard.key -n kube-system
5. Comment out the Secret section in kubernetes-dashboard.yaml
#apiVersion: v1
#kind: Secret
#metadata:
# labels:
# k8s-app: kubernetes-dashboard
# name: kubernetes-dashboard-certs
# namespace: kube-system
#type: Opaque
6. Re-apply the yaml
kubectl create -f kubernetes-dashboard.yaml
7. Log in again and the Kubernetes dashboard appears
At this point, create a cluster-admin service account and request a token for accessing the cluster:
kubectl create serviceaccount zkxy-admin -n kube-system
## bind it to the built-in cluster-admin role
kubectl create clusterrolebinding zkxy-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:zkxy-admin
## find the token secret in the namespace
[root@master ~]# kubectl get secret -n kube-system
zkxy-admin-token-4dpbz kubernetes.io/service-account-token 3 22h
## use this secret to fetch the token
[root@master ~]# kubectl describe secret zkxy-admin-token-4dpbz -n kube-system
Name: zkxy-admin-token-4dpbz
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: zkxy-admin
kubernetes.io/service-account.uid: 3a169baf-55f9-4cc4-abb5-950962b2315c
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1025 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJ6a3h5LWFkbWluLXRva2VuLTRkcGJ6Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6InpreHktYWRtaW4iLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIzYTE2OWJhZi01NWY5LTRjYzQtYWJiNS05NTA5NjJiMjMxNWMiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06emt4eS1hZG1pbiJ9.am-UOQPoSEYWWp-CKLu5k9q3Ysh5GksRQBG9zOqNsJ2O_5zWUChdKrPPTlSTGJfz1ZiHtYRuKeRloSKem65IbuSHSfKfI_bKqTioqpzfDQSBMh2Hz4gvmiyJw3sk2g2DRCynjFjjSWB0QDgVemMn7vEPdcnPD0AwFxW0pwSPJI--hkdSbCTfm5ZXtHsvDt4avQGP1BAVw1IWeke9XsRouHurJU9I19-14LXzUWmY7nBceMCf7pWiho68gyea3kIar0JmCMtRJHAWOyWOxojocsfIb2iDsq9eK6SqhgJjXCrDMABUMErjZ-ACIA94e3q1gbwFPBGIhEXrDFUPK1z-dQ
Paste the token above into the login page and you can access the cluster.
```
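Digging the token out of the describe output by hand gets tedious. A one-liner sketch that prints just the decoded token (it assumes the secret name starts with zkxy-admin-token, as shown above):
```
kubectl -n kube-system get secret $(kubectl -n kube-system get secret | awk '/zkxy-admin-token/{print $1}') -o jsonpath='{.data.token}' | base64 -d
```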
## 9. Summary
That completes a one-master, two-slave Kubernetes cluster. I hit plenty of pitfalls along the way, but when an error shows up, just trace it to the root cause and fix it. The next post will walk through migrating an existing Spring Cloud microservice architecture onto this container platform.