#### These are my notes from studying [部署落地+业务迁移 玩转k8s进阶与企业级实践技能](https://coding.imooc.com/learn/list/335.html "部署落地+业务迁移 玩转k8s进阶与企业级实践技能") (deployment, workload migration, and advanced enterprise k8s practice). If you need the attachments, get them from the Baidu Netdisk link on the home page.

Base environment

| OS | IP address | Node role | CPU | Memory | Hostname |
| :------------: | :------------: | :------------: | :------------: | :------------: | :------------: |
| centos-7.7 | 192.168.88.101 | Master | 2 | 2G | docker-2-12-101 |
| centos-7.7 | 192.168.88.102 | Master | 2 | 2G | docker-2-12-102 |
| centos-7.7 | 192.168.88.103 | Node | 2 | 2G | docker-2-12-103 |
| centos-7.7 | 192.168.88.104 | Node | 2 | 2G | docker-2-12-104 |

MasterVIP: 192.168.88.188 (APIServer)

Software environment

```
kubernetes 1.14.10
etcd 3.3.10
coredns 1.3.1
calico 3.1.3
docker 18.09 (validated version); I actually used 19.03.5
```

Install dependencies

```
yum update -y
yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp
```

Kernel parameter tuning

```
cat > /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
EOF
# The net.bridge.* keys only exist once the br_netfilter module is loaded
modprobe br_netfilter
sysctl -p /etc/sysctl.d/kubernetes.conf
```

Disable services

```
# Stop and disable the firewall
systemctl stop firewalld && systemctl disable firewalld
# Reset iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT
# Disable swap (for performance; the check can also be skipped with an init parameter)
swapoff -a
sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab
# Disable SELinux (setenforce only lasts until the next reboot)
setenforce 0
# Stop dnsmasq (otherwise Docker containers may fail to resolve domain names)
service dnsmasq stop && systemctl disable dnsmasq
```

Set up hostname resolution

```
cat >> /etc/hosts << EOF
192.168.88.101 main-101 c7-docker-101
192.168.88.102 main-102 c7-docker-102
192.168.88.103 node-103 c7-docker-103
192.168.88.104 node-104 c7-docker-104
EOF
```

Change Docker's cgroup driver to systemd. First check that method one has not already configured /etc/docker/daemon.json.

```
cat /etc/docker/daemon.json
{
  .....
  "exec-opts": ["native.cgroupdriver=systemd"]
  .....
}
```

Install the tools (all nodes)

```
# Configure the Aliyun yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Install
yum install -y --nogpgcheck kubelet-1.14.10 kubeadm-1.14.10 kubectl-1.14.10
# Enable and start
systemctl enable kubelet && systemctl start kubelet
```

#### Deploy the Keepalived cluster (any two Masters)

Use the keepalived configuration files in /opt/kubeadm-k8s1.14/configs and the script files in /opt/kubeadm-k8s1.14/scritps.

```
yum install -y keepalived
```

#### Initialize Master-1

Edit the Kubernetes version and the VIP in kubeadm-config.yaml, then upload the file to /root.

```
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.10
controlPlaneEndpoint: "192.168.88.188:6443"
networking:
  # This CIDR is a Calico default. Substitute or remove for your CNI provider.
  podSubnet: "172.22.0.0/16"
imageRepository: registry.aliyuncs.com/google_containers
```

Initialize

```
cd ~
kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs
# The config file pulls the container images from Aliyun, which is fast
# From 1.16 on the flag changes: experimental-upload-certs is replaced by upload-certs
```

Copy the config (run on the master)

```
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
```

##### Initialization succeeded; record the printed commands, which the other nodes use to join the cluster

The endpoint, token, and hashes below come from one particular run. Use the commands printed by your own `kubeadm init`.

On Master-2, run the command that joins it as a master node

```
kubeadm join 192.168.88.233:6443 --token xzp2kb.habisql3vkgyx02d \
    --discovery-token-ca-cert-hash sha256:4526f6e8f08a5c5564e5488c5b939753ee26b7fd0c8ca81423af2d4a58c718a6 \
    --experimental-control-plane --certificate-key 5d1af50558253c92b5d9df07a14144ffb37edeb2f0afef5af4dcc3fc022846b3
```

Join the Node (worker) nodes

```
kubeadm join 192.168.88.233:6443 --token xzp2kb.habisql3vkgyx02d \
    --discovery-token-ca-cert-hash sha256:4526f6e8f08a5c5564e5488c5b939753ee26b7fd0c8ca81423af2d4a58c718a6
```

#### Initialize the Calico network

```
# Create the directory (run on the node where kubectl is configured)
mkdir -p /etc/kubernetes/addons
# Upload the calico manifests to the node where kubectl is configured (one node is enough)
cd /opt/kubernetes-ha-kubeadm/
scp target/addons/calico* 192.168.88.101:/etc/kubernetes/addons/
# Deploy calico
kubectl apply -f /etc/kubernetes/addons/calico-rbac-kdd.yaml
kubectl apply -f /etc/kubernetes/addons/calico.yaml
# Check the status
kubectl get pods -n kube-system
# No Node has joined yet, so some pods may fail at this point
```

Some time after the Master and Worker nodes have joined, the cluster looks like this

```
# kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
kube-system   calico-node-6dxfm                         2/2     Running   0          19m
kube-system   calico-node-d2fq6                         2/2     Running   1          19m
kube-system   calico-node-kfj78                         2/2     Running   0          16m
kube-system   calico-node-l6vgh                         2/2     Running   2          20m
kube-system   calico-node-rqkrr                         2/2     Running   0          19m
kube-system   calico-typha-666749994b-lnkbt             1/1     Running   0          20m
kube-system   coredns-8567978547-g7pkr                  1/1     Running   4          40m
kube-system   coredns-8567978547-xzjkb                  1/1     Running   4          40m
kube-system   etcd-docker-2-12-101                      1/1     Running   0          40m
kube-system   etcd-m2                                   1/1     Running   0          19m
kube-system   etcd-m3                                   1/1     Running   0          16m
kube-system   kube-apiserver-docker-2-12-101            1/1     Running   0          40m
kube-system   kube-apiserver-m2                         1/1     Running   0          19m
kube-system   kube-apiserver-m3                         1/1     Running   0          16m
kube-system   kube-controller-manager-docker-2-12-101   1/1     Running   1          40m
kube-system   kube-controller-manager-m2                1/1     Running   0          19m
kube-system   kube-controller-manager-m3                1/1     Running   0          16m
kube-system   kube-proxy-2bx2q                          1/1     Running   0          16m
kube-system   kube-proxy-8vwqn                          1/1     Running   0          19m
kube-system   kube-proxy-cv7vg                          1/1     Running   0          19m
kube-system   kube-proxy-mmh7f                          1/1     Running   0          40m
kube-system   kube-proxy-pzk2r                          1/1     Running   0          19m
kube-system   kube-scheduler-docker-2-12-101            1/1     Running   1          39m
kube-system   kube-scheduler-m2                         1/1     Running   0          19m
kube-system   kube-scheduler-m3                         1/1     Running   0          16m
```

Check the cluster health (on a Master)

```
curl -k https://localhost:6443/healthz
```

#### Cluster availability test

Define the nginx DaemonSet

```
cat > nginx-ds.yml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: nginx-ds
  labels:
    app: nginx-ds
spec:
  type: NodePort
  selector:
    app: nginx-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nginx-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        app: nginx-ds
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
EOF
```

Create the DaemonSet

```
kubectl create -f nginx-ds.yml
```

#### Check IP connectivity

```
# Check Pod IP connectivity across the Nodes
kubectl get pods -o wide
# On every node, ping each pod IP
ping <pod-ip>
# Check that the service is reachable
kubectl get svc
# On every node, access the service
curl <service-ip>:<port>
# On every node, check that the node port works
curl <node-ip>:<port>
```

#### Check DNS availability

```
# Define an nginx pod
cat > pod-nginx.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
EOF
# Create the pod
kubectl create -f pod-nginx.yaml
# Open a shell in the pod to inspect DNS
kubectl exec nginx -i -t -- /bin/bash
# Inside the pod: show the DNS configuration
cat /etc/resolv.conf
# Inside the pod: check that the service name resolves correctly
ping nginx-ds
```

#### Deploy the dashboard

```
# Upload the dashboard manifest
scp target/addons/dashboard-all.yaml 192.168.88.101:/etc/kubernetes/addons/
# Create the service
kubectl apply -f /etc/kubernetes/addons/dashboard-all.yaml
# Check that the service is running
kubectl get deployment kubernetes-dashboard -n kube-system
kubectl --namespace kube-system get pods -o wide
kubectl get services kubernetes-dashboard -n kube-system
netstat -ntlp|grep 30005
```

Access the Dashboard. If the first deployment reports the error below, delete the pods and let them be created again; after that it worked.

```
xxx namespaces is forbidden xxx
```

```
https://192.168.88.101:30005
```

Get the token

```
# Create the service account
kubectl create sa dashboard-admin -n kube-system
# Create the cluster role binding
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
# Look up the name of the dashboard-admin secret
ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
# Print the token stored in the secret
kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}'
```
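The two token commands above can be wrapped into small reusable functions. What follows is only a sketch under assumptions: the function names `secret_name_for`, `extract_token`, and `print_dashboard_token` are hypothetical, and it assumes the service account's secret is named `<serviceaccount>-token-<suffix>` and that `kubectl describe secret` prints the token on a line starting with `token:`, as it does on this 1.14 cluster.

```shell
# Hypothetical helpers, not part of kubectl or of the course material.

# Reads `kubectl get secrets` output on stdin and prints the first secret
# whose name starts with "<serviceaccount>-token-".
secret_name_for() {
  awk -v sa="$1" 'index($1, sa "-token-") == 1 { print $1; exit }'
}

# Reads `kubectl describe secret` output on stdin and prints the token value.
extract_token() {
  awk '$1 == "token:" { print $2 }'
}

# Combines both steps; needs a working kubectl context on the current node.
print_dashboard_token() {
  dash_ns="kube-system"      # namespace used in the steps above
  dash_sa="dashboard-admin"  # service account created above
  dash_secret=$(kubectl get secrets -n "$dash_ns" | secret_name_for "$dash_sa")
  if [ -z "$dash_secret" ]; then
    echo "no token secret found for $dash_sa in $dash_ns" >&2
    return 1
  fi
  kubectl describe secret -n "$dash_ns" "$dash_secret" | extract_token
}
```

Calling `print_dashboard_token` on the node where kubectl is configured should print the token to paste into the Dashboard login page. Matching on the `-token-` prefix rather than a plain `grep dashboard-admin` avoids picking up an unrelated secret that merely contains the same string.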