Worker Node Failure
-
Check the status of the nodes in the cluster. Is each one `Ready` or `NotReady`?
`kubectl get nodes`
-
If a node is `NotReady`, check its `LastHeartbeatTime` to estimate when the node may have crashed.
`kubectl describe node worker-1`
-
Check for possible CPU, memory, or disk space exhaustion on the node using `top` and `df -h`.
-
Check the status and the logs of the `kubelet` for possible issues.
`service kubelet status`
`sudo journalctl -u kubelet`
-
Check the `kubelet` certificates: make sure they are not expired, belong to the right group (`system:nodes`), and are issued by the right CA.
`openssl x509 -in /var/lib/kubelet/worker-1.crt -text`
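
The first and last checks above can be partly scripted. The following is a minimal sketch, not part of the original notes: the function names `not_ready_nodes` and `cert_still_valid` are hypothetical helpers, and the sketch assumes the default table output of `kubectl get nodes` (node name in the first column, status in the second) and a PEM-encoded certificate.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the names of nodes whose STATUS does not start
# with "Ready". Reads `kubectl get nodes` table output on stdin, skipping the
# header row. A cordoned but healthy node ("Ready,SchedulingDisabled") is not
# reported; "NotReady" and "NotReady,SchedulingDisabled" are.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Hypothetical helper: succeed if the certificate at "$1" will still be valid
# "$2" seconds from now (default 0, i.e. "not expired right now").
# Relies on the `-checkend` option of `openssl x509`.
cert_still_valid() {
  openssl x509 -in "$1" -noout -checkend "${2:-0}" >/dev/null
}

# Example usage on a live cluster (assumed paths, not executed here):
#   kubectl get nodes | not_ready_nodes
#   cert_still_valid /var/lib/kubelet/worker-1.crt 604800 \
#     || echo "kubelet certificate expires within 7 days"
```

The expiry check only covers the validity dates. The group and the issuing CA still have to be confirmed by reading the `Subject` and `Issuer` fields in the `openssl x509 -text` output.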