Reference: https://cooting.cn/archives/166.html
1. Identify the failed node
First, remove the failed node from the HA load balancer (ha) backends
# List the nodes and find the failed one
kubectl get node
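On a larger cluster it can help to filter the listing down to nodes that are not Ready. The following is only a sketch: not_ready_nodes is a hypothetical helper name, and it assumes the default kubectl get node table layout (NAME in the first column, STATUS in the second) and that awk is available on the machine where kubectl runs.

```shell
# not_ready_nodes (hypothetical helper): read `kubectl get node` output on stdin
# and print the names of nodes whose STATUS does not start with "Ready".
# Assumes the default table layout: NAME STATUS ROLES AGE VERSION.
not_ready_nodes() {
  # NR > 1 skips the header row; "Ready,SchedulingDisabled" still counts as Ready.
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}
```

Possible usage: kubectl get node | not_ready_nodes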
# Delete the failed master node
kubectl delete node d0-master001
Open a shell in the etcd pod on another master node that is still healthy
kubectl exec -it etcd-master001 -n kube-system -- sh
Define an alias
alias etcdctlold='etcdctl --endpoints=https://192.168.31.110:2379,https://192.168.31.113:2379,https://192.168.31.114:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
Check the cluster status
etcdctlold endpoint status
List the cluster members
etcdctlold member list
Remove the obsolete member by the ID shown in the member list output (here we remove the member for https://192.168.31.113:2379)
etcdctlold member remove 75abeddc78aef692
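The ID passed to member remove is the first column of the member list output. As a rough sketch of looking it up by client URL instead of reading it by eye: member_id_for_url is a hypothetical helper name, and it assumes the default ("simple") output format, where fields are separated by ", " and the client URLs are the fifth field, and that awk exists in the shell you are using (some etcd images may not ship it).

```shell
# member_id_for_url (hypothetical helper): read `etcdctl member list` output
# (default "simple" format) on stdin and print the ID of the member whose
# client URL equals $1.
# Assumed line layout: ID, status, name, peerURLs, clientURLs[, isLearner]
member_id_for_url() {
  awk -F', ' -v url="$1" '{
    # A member may advertise several client URLs, joined by "," without a space.
    n = split($5, urls, ",")
    for (i = 1; i <= n; i++) if (urls[i] == url) { print $1; break }
  }'
}
```

Possible usage: etcdctlold member list | member_id_for_url https://192.168.31.113:2379
The printed ID is what goes into member remove; compare it with the member list output before removing anything.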
------------------------------------------------------ At this point the master node has been removed. To confirm that nothing is wrong, check the cluster again
Redefine the alias without the removed endpoint
alias etcdctlnew='etcdctl --endpoints=https://192.168.31.110:2379,https://192.168.31.114:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
Check the cluster status
etcdctlnew endpoint status
List the cluster members
etcdctlnew member list
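The final check can also be scripted rather than read by eye. This is a minimal sketch only: member_absent is a hypothetical helper name, and it assumes the default member list output, where each line starts with the member ID followed by a comma, and that grep is available in the shell.

```shell
# member_absent (hypothetical helper): read `etcdctl member list` output on stdin
# and succeed (exit status 0) only if no line starts with the member ID in $1.
member_absent() {
  ! grep -q "^$1,"
}
```

Possible usage: etcdctlnew member list | member_absent 75abeddc78aef692 && echo "member removed"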