1. Check certificate validity periods

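The kubelet certificates on the worker nodes of the East China 1 production environment start expiring on August 14 at the earliest. A worker node whose certificate has expired can no longer reach the cluster's master nodes, so the services deployed on that node will fail!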

Master nodes

Certificates under the /etc/kubernetes/pki/ directory

1) Three CA certificates (the cluster CA, the front-proxy CA, and the etcd CA):

[root@d1-k8s-master-002 ~]# for certFile in $(ll /etc/kubernetes/pki/{*ca.crt,etcd/*ca.crt}|awk '{print $NF}')
do
    echo "$certFile:"
    openssl x509 -noout -text -in $certFile|grep -A2 Validity
done
/etc/kubernetes/pki/ca.crt:
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:55 2028 GMT
/etc/kubernetes/pki/front-proxy-ca.crt:
        Validity
            Not Before: Aug 13 03:25:56 2018 GMT
            Not After : Aug 10 03:25:56 2028 GMT

  • In this environment etcd is not deployed through the kubelet and does not use HTTPS, so there is no etcd CA certificate.

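2) apiserver-related certificates (the apiserver serving certificate, the client certificate the apiserver uses to access etcd, and the client certificate the apiserver uses to access the kubelet)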

[root@d1-k8s-master-002 ~]# for certFile in $(ll /etc/kubernetes/pki/apiserver*.crt|awk '{print $NF}')
do
    echo "$certFile:"
    openssl x509 -noout -text -in $certFile|grep -A2 Validity
done
/etc/kubernetes/pki/apiserver.crt:
        Validity
            Not Before: Aug 13 03:54:45 2018 GMT
            Not After : Aug 10 03:54:45 2028 GMT
/etc/kubernetes/pki/apiserver-kubelet-client.crt:
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:56 2028 GMT

  • In this environment etcd does not use HTTPS, so the apiserver needs no client certificate to access etcd.

3) etcd-related certificates (the etcd server certificate server.crt, the etcd health-check certificate healthcheck-client.crt, and the peer certificate peer.crt used between etcd cluster members)

[root@d1-k8s-master-002 ~]# for certFile in $(ll /etc/kubernetes/pki/etcd/*.crt|awk '{print $NF}')
do
    echo "$certFile:"
    openssl x509 -noout -text -in $certFile|grep -A2 Validity
done

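  • In this environment etcd is deployed separately with Docker, outside the cluster, and does not use HTTPS, so the etcd-related certificates can be ignored.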

4) front-proxy-related certificates (the front-proxy client certificate front-proxy-client.crt)

[root@d1-k8s-master-002 ~]# for certFile in $(ll /etc/kubernetes/pki/front-proxy-client.crt|awk '{print $NF}')
do
    echo "$certFile:"
    openssl x509 -noout -text -in $certFile|grep -A2 Validity
done
/etc/kubernetes/pki/front-proxy-client.crt:
        Validity
            Not Before: Aug 13 03:25:56 2018 GMT
            Not After : Aug 10 03:25:57 2028 GMT

Certificates embedded in the kubeconfig files

There are four kubeconfig files in total:

[root@d1-k8s-master-002 ~]# ll /etc/kubernetes/*.conf
-rw------- 1 root root 5453 Aug 13  2018 /etc/kubernetes/admin.conf
-rw------- 1 root root 5489 Aug 13  2018 /etc/kubernetes/controller-manager.conf
-rw------- 1 root root 5529 Aug 13  2018 /etc/kubernetes/kubelet.conf
-rw------- 1 root root 5437 Aug 13  2018 /etc/kubernetes/scheduler.conf
[root@d1-k8s-master-002 ~]# for confFile in $(ll /etc/kubernetes/*.conf|awk '{print $NF}')
do
    echo "$confFile:"
    openssl x509 -noout -text -in <(cat $confFile|grep client-certificate-data|awk '{print $2}'|base64 -d)|grep -A2 Validity
    openssl verify -CAfile /etc/kubernetes/pki/ca.crt <(cat $confFile|grep client-certificate-data|awk '{print $2}'|base64 -d)
done
/etc/kubernetes/admin.conf:
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:57 2028 GMT
/dev/fd/63: OK
/etc/kubernetes/controller-manager.conf:
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:58 2028 GMT
/dev/fd/63: OK
/etc/kubernetes/kubelet.conf:
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:58 2028 GMT
/dev/fd/63: OK
/etc/kubernetes/scheduler.conf:
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:58 2028 GMT
/dev/fd/63: OK

As the output shows, every certificate on the master nodes is valid for 10 years, so none of them needs to be renewed!
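The kubeconfig check above repeats the same grep/awk/base64 pipeline for every file. As a minimal sketch, that extraction could be wrapped in a helper. The function name `extract_client_cert` is hypothetical and not part of the original procedure, and it assumes the certificate is embedded inline under `client-certificate-data`, as in the files listed above.

```shell
#!/usr/bin/env bash
# Print the client certificate embedded in a kubeconfig file.
# Assumption: the certificate is stored inline as
#   client-certificate-data: <base64>
# (a kubeconfig that references an external file is not handled).
extract_client_cert() {
    local conf_file="$1"
    awk '$1 == "client-certificate-data:" {print $2}' "$conf_file" | base64 -d
}
```

It could then be used as `openssl x509 -noout -dates -in <(extract_client_cert /etc/kubernetes/admin.conf)`, which prints only the notBefore and notAfter lines.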

Worker nodes

On a worker node, only the certificate embedded in the kubelet's kubeconfig needs to be checked:

[root@d1-dui-166 ~]# openssl x509 -noout -text -in <(cat /etc/kubernetes/kubelet.conf|grep client-certificate-data|awk '{print $2}'|base64 -d)|grep -A2 Validity
        Validity
            Not Before: Jan 24 03:50:00 2019 GMT
            Not After : Jan 24 03:50:00 2020 GMT
[root@d1-dui-166 ~]# openssl verify -CAfile /etc/kubernetes/pki/ca.crt <(cat /etc/kubernetes/kubelet.conf|grep client-certificate-data|awk '{print $2}'|base64 -d)
/dev/fd/63: OK

  • The kubelet certificates on worker nodes are valid for 1 year; the earliest one expires on August 14!
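To turn the Validity dates printed above into a number, the gap between two timestamps can be computed with `date`. The following is a rough sketch only: `days_between` is a hypothetical helper, and it assumes GNU `date` (the `-d` option), which parses the timestamp format that openssl prints.

```shell
#!/usr/bin/env bash
# Whole days from timestamp $1 to timestamp $2.
# Assumption: GNU date, which accepts openssl-style timestamps
# such as "Jan 24 03:50:00 2020 GMT".
days_between() {
    local start_epoch end_epoch
    start_epoch=$(date -u -d "$1" +%s) || return 1
    end_epoch=$(date -u -d "$2" +%s) || return 1
    echo $(( (end_epoch - start_epoch) / 86400 ))
}

# Validity window of the d1-dui-166 certificate shown above.
days_between "Jan 24 03:50:00 2019 GMT" "Jan 24 03:50:00 2020 GMT"   # prints 365
```

Passing `now` as the first argument and the output of `openssl x509 -noout -enddate ... | cut -d= -f2` as the second gives the days remaining on a node.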

2. Certificate replacement procedure

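The cluster runs Kubernetes v1.6. This version supports neither automatic certificate rotation nor manual certificate renewal with kubeadm, so the plan is to generate new certificates with the steps below and use them to replace the expiring ones.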

Install the required tools

Install the certificate generation tool cfssl:

[root@d1-k8s-master-002 ~]# mkdir kubernetes && cd kubernetes/
[root@d1-k8s-master-002 kubernetes]# curl -L https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -o cfssl
[root@d1-k8s-master-002 kubernetes]# curl -L https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -o cfssljson
[root@d1-k8s-master-002 kubernetes]# curl -L https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 -o cfssl-certinfo
[root@d1-k8s-master-002 kubernetes]# chmod +x cfssl* && mv cfssl* /usr/local/bin/

Install the automation tool Ansible:

[root@d1-k8s-master-002 ~]# yum install ansible -y

Generate the ca-config.json configuration file

[root@d1-k8s-master-002 kubernetes]# mkdir -p {cert,conf} && cat > cert/ca-config.json <<EOF
{
    "signing": {
        "default": {
            "expiry": "87600h"
        },
        "profiles": {
            "kubernetes": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
EOF
[root@d1-k8s-master-002 kubernetes]# cp /etc/kubernetes/pki/ca.{crt,key} cert/

  • The validity period is 10 years (87600h).
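The `expiry` value in ca-config.json is written in hours. As a small illustration of the arithmetic (the helper name `expiry_days` is hypothetical):

```shell
#!/usr/bin/env bash
# Convert a cfssl-style expiry such as "87600h" into whole days.
expiry_days() {
    local hours="${1%h}"    # strip the trailing "h"
    echo $(( hours / 24 ))
}

expiry_days 87600h    # prints 3650
```

3650 days is 10 × 365 days, so leap days are not counted. This is why the certificate generated later on this page on Jul 22, 2019 expires on Jul 19, 2029 rather than on Jul 22, 2029.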

Generate the Ansible inventory and a kubelet kubeconfig file for each node

# d4-beta
#MASTER_NAME="d4-beta-dui-001"
#SSH_PRIVATE_KEY="/root/.ssh/aispeech-shanghai.pem"
#API_SERVER="https://10.1.229.147:6443"
 
# d1-prod
MASTER_NAME="d1-k8s-master"
SSH_PRIVATE_KEY="/root/kubernetes/aispeech-dui-d1.pem"
API_SERVER="https://10.12.183.206:6443"
 
> ansible_hosts
for NODE_NAME in $(kubectl get nodes|grep -vE "NAME|$MASTER_NAME"|sort -nrk3|awk '{print $1}')
do
 
NODE_IP=$(kubectl describe nodes $NODE_NAME|grep "public-ip"|awk -F"=" '{print $2}')
 
# Append this node to the Ansible inventory
echo "$NODE_NAME \
  ansible_host=$NODE_IP \
  ansible_port=5837 \
  ansible_user=root \
  ansible_ssh_private_key_file=$SSH_PRIVATE_KEY \
  ansible_ssh_common_args='-o StrictHostKeyChecking=no'" >> ansible_hosts
 
# Create the certificate signing request
cat > cert/kubelet-${NODE_NAME}-csr.json <<EOF
{
    "CN": "system:node:${NODE_NAME}",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "hosts": [
    "127.0.0.1",
    "${NODE_NAME}",
    "${NODE_IP}"
    ],
    "names": [
      {
          "C": "CN",
          "ST": "Jiangsu",
          "L": "Suzhou",
          "O": "system:nodes",
          "OU": "aispeech"
      }
    ]
}
EOF
 
# Generate the certificate and private key:
cfssl gencert \
  -ca=cert/ca.crt \
  -ca-key=cert/ca.key \
  -config=cert/ca-config.json \
  -profile=kubernetes \
  cert/kubelet-${NODE_NAME}-csr.json | cfssljson -bare cert/kubelet-${NODE_NAME}
 
# Create the kubeconfig file:
kubectl config set-cluster kubernetes \
  --certificate-authority=cert/ca.crt \
  --embed-certs=true \
  --server=${API_SERVER} \
  --kubeconfig=conf/kubelet-${NODE_NAME}.kubeconfig
 
kubectl config set-credentials system:node:${NODE_NAME} \
  --client-certificate=cert/kubelet-${NODE_NAME}.pem \
  --client-key=cert/kubelet-${NODE_NAME}-key.pem \
  --embed-certs=true \
  --kubeconfig=conf/kubelet-${NODE_NAME}.kubeconfig
 
kubectl config set-context system:node:${NODE_NAME}@kubernetes \
  --cluster=kubernetes \
  --user=system:node:${NODE_NAME} \
  --kubeconfig=conf/kubelet-${NODE_NAME}.kubeconfig
 
kubectl config use-context system:node:${NODE_NAME}@kubernetes \
  --kubeconfig=conf/kubelet-${NODE_NAME}.kubeconfig
 
done
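The loop above reads NODE_IP from the `public-ip` line of `kubectl describe nodes`. If a node has no such line, NODE_IP is empty, and the inventory entry and CSR would be written with a blank address. One possible guard is sketched below; `is_ipv4` is a hypothetical helper that is not part of the original script.

```shell
#!/usr/bin/env bash
# Return 0 if $1 looks like a dotted-quad IPv4 address, 1 otherwise.
is_ipv4() {
    local ip="$1" old_ifs octet
    case "$ip" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, stray chars, empty octet
    esac
    old_ifs=$IFS; IFS=.
    set -- $ip                                 # split on dots into $1..$n
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
}
```

Inside the loop, right after NODE_IP is assigned, it could be used as `is_ipv4 "$NODE_IP" || { echo "skip $NODE_NAME: no usable IP" >&2; continue; }`.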

Check the generated kubelet kubeconfig files

Excluding the 3 master nodes, the kubelets on 134 worker nodes in total need to be updated:

[root@d1-k8s-master-002 kubernetes]# ll conf/|grep -v total|wc -l
134
[root@d1-k8s-master-002 kubernetes]# kubectl get nodes|grep -v NAME|wc -l
137
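Matching counts do not show which node is missing a file if the numbers ever differ. As a sketch under the file naming used above (conf/kubelet-<node>.kubeconfig), a hypothetical helper `missing_kubeconfigs` could list the nodes without a non-empty kubeconfig:

```shell
#!/usr/bin/env bash
# Read node names (one per line) on stdin and print each one that has
# no non-empty <conf_dir>/kubelet-<node>.kubeconfig file.
missing_kubeconfigs() {
    local conf_dir="$1" node
    while IFS= read -r node; do
        [ -n "$node" ] || continue
        [ -s "$conf_dir/kubelet-$node.kubeconfig" ] || echo "$node"
    done
}
```

For example: `kubectl get nodes | grep -vE "NAME|$MASTER_NAME" | awk '{print $1}' | missing_kubeconfigs conf`. Empty output means every worker node has a file.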

View the kubeconfig configuration:

[root@d1-k8s-master-002 kubernetes]# kubectl config view --kubeconfig=conf/kubelet-d1-dui-166.kubeconfig
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: REDACTED
    server: https://10.12.183.206:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: system:node:d1-dui-166
  name: system:node:d1-dui-166@kubernetes
current-context: system:node:d1-dui-166@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:d1-dui-166
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

View the certificates in the kubeconfig:

1) CA certificate details

[root@d1-k8s-master-002 kubernetes]# openssl x509 -noout -text -in <(cat conf/kubelet-d1-dui-166.kubeconfig|grep certificate-authority-data|awk '{print $2}'|base64 -d)|head
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 0 (0x0)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=kubernetes
        Validity
            Not Before: Aug 13 03:25:55 2018 GMT
            Not After : Aug 10 03:25:55 2028 GMT
        Subject: CN=kubernetes

2) Details of the client certificate the kubelet uses to access the apiserver

[root@d1-k8s-master-002 kubernetes]# openssl x509 -noout -text -in <(cat conf/kubelet-d1-dui-166.kubeconfig|grep client-certificate-data|awk '{print $2}'|base64 -d)|head -n11
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            7c:98:2e:b9:e4:0f:09:db:a1:83:3f:91:ca:1a:ee:45:e1:20:96:93
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=kubernetes
        Validity
            Not Before: Jul 22 09:36:00 2019 GMT
            Not After : Jul 19 09:36:00 2029 GMT
        Subject: C=CN, ST=Jiangsu, L=Suzhou, O=system:nodes, OU=aispeech, CN=system:node:d1-dui-166

Update the kubelet certificates

The final step is to back up and replace the kubelet.conf file and then restart the kubelet service. Proceed with caution!

[root@d1-k8s-master-002 kubernetes]# cat > kubelet_cert_renew.sh <<"EOF"
RENEW_NODE_NAMES=""
cd /root/kubernetes
for NODE_NAME in ${RENEW_NODE_NAMES}
do
  ansible -i ansible_hosts ${NODE_NAME} -m copy -a "src=conf/kubelet-${NODE_NAME}.kubeconfig dest=/etc/kubernetes/kubelet.conf backup=yes"
  ansible -i ansible_hosts ${NODE_NAME} -m systemd -a "name=kubelet state=restarted enabled=yes"
done
EOF

  • RENEW_NODE_NAMES is the list of hostnames of the nodes whose kubelet certificates should be renewed. For example, RENEW_NODE_NAMES="d1-dui-066 $(seq -f 'd1-dui-%03g' 100 110) d1-dui-166" selects the 13 nodes d1-dui-066, d1-dui-100, d1-dui-101 ... d1-dui-110, and d1-dui-166;
  • The script can be scheduled to run from crontab.
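Because the script restarts the kubelet on every listed node, it may be worth printing the expanded list before running it. A minimal sketch, using the example value from the note above and assuming GNU `seq`:

```shell
#!/usr/bin/env bash
RENEW_NODE_NAMES="d1-dui-066 $(seq -f 'd1-dui-%03g' 100 110) d1-dui-166"

# Left unquoted on purpose: word splitting yields one node name per argument.
printf '%s\n' $RENEW_NODE_NAMES
set -- $RENEW_NODE_NAMES
echo "total: $#"
```

This prints one hostname per line followed by the total, so a typo in the range is visible before any node is touched.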

Check the kubelet certificate

[d1-dui-166 ~]# openssl x509 -noout -text -in <(cat /etc/kubernetes/kubelet.conf|grep client-certificate-data|awk '{print $2}'|base64 -d)|head -n11
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            7c:98:2e:b9:e4:0f:09:db:a1:83:3f:91:ca:1a:ee:45:e1:20:96:93
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=kubernetes
        Validity
            Not Before: Jul 22 09:36:00 2019 GMT
            Not After : Jul 19 09:36:00 2029 GMT
        Subject: C=CN, ST=Jiangsu, L=Suzhou, O=system:nodes, OU=aispeech, CN=system:node:d1-dui-166
[d1-dui-166 ~]# systemctl status kubelet -l


3. Operational impact

Restarting the kubelet has no effect on services that are running normally.

If a service fails while the kubelet is restarting, the failure is not handled until the kubelet has finished restarting.

If the kubelet fails to restart, the node becomes unusable.
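Since a failed kubelet restart takes the node out of service, a check after each batch can catch it early. The sketch below filters `kubectl get nodes` output; `not_ready_nodes` is a hypothetical helper, and it assumes STATUS is the second column and that a status starting with "Ready" (including "Ready,SchedulingDisabled") counts as healthy.

```shell
#!/usr/bin/env bash
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose STATUS column does not start with "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 !~ /^Ready/ {print $1}'
}
```

Run it from a master node as `kubectl get nodes | not_ready_nodes`. Empty output means no node is reported as NotReady; allow the kubelet some time to re-register before treating a listed node as failed.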
