Reference: https://cooting.cn/archives/166.html

1. Find the failed node

First, take the failed node out of rotation on the HA load balancer.
# List the nodes
kubectl get node
# Delete the failed master node
kubectl delete node d0-master001

Log in to the etcd pod on another master node that is still available

kubectl exec -it etcd-master001 -n kube-system -- sh

Define an alias
alias etcdctlold='etcdctl --endpoints=https://192.168.31.110:2379,https://192.168.31.113:2379,https://192.168.31.114:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
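As an alternative to the alias, etcdctl (v3 API) can also read its global flags from `ETCDCTL_`-prefixed environment variables. The following is a minimal sketch that assumes the same endpoints and certificate paths as the alias above. It is meant to be used instead of the alias, not together with it, because etcdctl typically reports a conflict when the same setting is given both as a flag and as an environment variable.

```shell
# Sketch: configure etcdctl through environment variables instead of an alias.
# Endpoints and certificate paths are copied from the alias above.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://192.168.31.110:2379,https://192.168.31.113:2379,https://192.168.31.114:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# Plain etcdctl calls then pick up these settings, for example:
#   etcdctl endpoint status -w table
#   etcdctl member list -w table
```

After the failed member has been removed, only `ETCDCTL_ENDPOINTS` needs to be re-exported with the remaining endpoints.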

Check the cluster status
etcdctlold endpoint status

List the cluster members
etcdctlold member list

Remove the obsolete member (in this case, the member that corresponds to https://192.168.31.113:2379)
etcdctlold member remove 75abeddc78aef692
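The member ID passed to `member remove` comes from the `member list` output. Rather than copying it by hand, the lookup can be sketched as below. This assumes the default comma-separated output of `etcdctl member list` (ID, status, name, peer URLs, client URLs, and on newer versions an isLearner flag); `member_id_for_url` is a hypothetical helper name, not part of etcdctl.

```shell
# Print the ID of the member whose client URL matches $1.
# Reads `etcdctl member list` output (default format) on stdin:
#   ID, status, name, peerURLs, clientURLs[, isLearner]
member_id_for_url() {
  awk -F', ' -v url="$1" '
    {
      # A member may advertise several client URLs, joined by commas.
      n = split($5, urls, ",")
      for (i = 1; i <= n; i++) {
        if (urls[i] == url) { print $1; exit }
      }
    }'
}

# Intended usage inside the etcd container (not executed here):
#   id=$(etcdctlold member list | member_id_for_url https://192.168.31.113:2379)
#   [ -n "$id" ] && etcdctlold member remove "$id"
```

The helper prints nothing when no member matches, so the guarded `member remove` call is skipped instead of being run with an empty ID.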

------------------------------------------------------ At this point the master node has been removed. To confirm that nothing went wrong, check the cluster again.

Redefine the alias, this time without the removed endpoint
alias etcdctlnew='etcdctl --endpoints=https://192.168.31.110:2379,https://192.168.31.114:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
Check the cluster status
etcdctlnew endpoint status

List the cluster members
etcdctlnew member list
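The final check can also be scripted instead of read by eye. A minimal sketch, assuming the removed member is identified by its IP address in the peer and client URLs; `member_absent` is a hypothetical helper name.

```shell
# Succeeds (exit 0) only if no member in the list on stdin still references
# the given IP address in its peer or client URLs.
member_absent() {
  ! grep -qF "//$1:"
}

# Intended usage inside the etcd container (not executed here):
#   etcdctlnew member list | member_absent 192.168.31.113 && echo "member removed"
```

Matching on `//IP:` covers both the peer port and the client port, and avoids a false match on a longer address that merely starts with the same digits.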
