How to resolve 'node unavailable, kubelet stopped posting node status' when using Rancher
Problem
When using Rancher, sometimes a worker node may stop working, and you may encounter a warning like this:
Unavailable: kubelet stopped posting node status
Environment
- Docker: Server Version: 19.03.13
- Rancher 2.x
Debug
You can debug the node status by running this command:
kubectl describe nodes
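Then check the kubelet logs on the node. If kubelet runs as a systemd service:
journalctl -u kubelet
On RKE-provisioned nodes, kubelet typically runs as a Docker container named kubelet, so check the container logs instead:
docker logs kubelet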
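With many nodes, the output of kubectl describe nodes is long, so it can help to list only the nodes that are not Ready first. The following is a minimal sketch, not part of the original procedure: the helper name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes --no-headers (node name in column 1, status in column 2).

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the name of every node whose STATUS does not start with "Ready".
# A cordoned but healthy node ("Ready,SchedulingDisabled") is not reported.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl get nodes --no-headers | not_ready_nodes
```

Each name it prints can then be passed to kubectl describe node to inspect that node's conditions and events.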
Solution
Solution #1: Restart the Docker/kubelet service
You can try restarting the Docker and kubelet services on the affected node:
On CentOS:
service docker restart
On Ubuntu:
systemctl restart docker
systemctl restart kubelet
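After a restart, the node may take a short while to report Ready again. As a rough sketch, you could block until it does with kubectl wait. The function name wait_for_node_ready, the node name worker-1, and the 120s default timeout are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: wait until the given node reports the Ready condition.
# $1 = node name, $2 = timeout (optional, defaults to 120s).
# Returns non-zero if the node is still not Ready when the timeout expires.
wait_for_node_ready() {
  node="$1"
  timeout="${2:-120s}"
  kubectl wait --for=condition=Ready "node/${node}" --timeout="${timeout}"
}

# Usage (run from a machine with kubectl access to the cluster):
#   wait_for_node_ready worker-1 300s
```

If the command times out, the restart alone was probably not enough and one of the following solutions may apply.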
Solution #2: Reboot the node
If you have root privileges and the server can safely be rebooted, run:
reboot
Solution #3: Recreate the cluster
You can follow this guide to recreate the cluster.
Solution #4: Remove and then re-add the node
- First, remove the node from the cluster.
- Second, add the node to the cluster again or perform an etcd snapshot restore by following this guide.
Solution #5: Disable swap memory on the node
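You can follow this guide or simply run the following command, which disables swap until the next reboot (to make the change permanent, also comment out the swap entry in /etc/fstab):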
swapoff -a
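To confirm that swap is really off and to keep it off after a reboot, something like the following sketch may help. Both function names are hypothetical. It assumes the standard /proc/swaps layout (one header line, then one line per active swap device) and the standard fstab layout, where the third field is the filesystem type.

```shell
#!/bin/sh
# Hypothetical helper: print "on" if any swap device is active, else "off".
# $1 = file in /proc/swaps format (defaults to /proc/swaps).
swap_state() {
  awk 'NR > 1 && NF > 0 { n++ }
       END { if (n > 0) print "on"; else print "off" }' "${1:-/proc/swaps}"
}

# Hypothetical helper: read fstab-format text on stdin and write it to stdout
# with every active swap entry commented out. Other lines pass through as-is.
comment_out_swap() {
  awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }'
}

# Usage (back up /etc/fstab before replacing it):
#   swap_state
#   comment_out_swap < /etc/fstab > /tmp/fstab.new
```

Writing the result to a separate file first lets you review the change before copying it over /etc/fstab.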
Solution #6: Re-enable IP forwarding for Docker
Dockerd enables IP forwarding (sysctl net.ipv4.ip_forward) when it starts. However, if you run service network restart, it will disable IP forwarding while stopping networking. You need to re-enable it.
You can verify the ip_forward status by running:
docker info 2>&1 | grep WARNING
If you see this:
WARNING: IPv4 forwarding is disabled
Then you should re-enable IP forwarding temporarily:
sudo sysctl -w net.ipv4.ip_forward=1
Or permanently (persist the setting, then reload it):
echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
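Appending with tee -a adds a duplicate line each time the command is run. A slightly more careful variant, sketched below under the assumption that the setting lives in /etc/sysctl.conf, checks the kernel's current value first and appends the line only when it is missing. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if IPv4 forwarding is currently enabled.
# $1 = file holding the flag (defaults to /proc/sys/net/ipv4/ip_forward).
ip_forward_enabled() {
  [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}

# Hypothetical helper: append "net.ipv4.ip_forward = 1" to a sysctl config
# file only if an equivalent line is not already present.
# $1 = config file (defaults to /etc/sysctl.conf).
ensure_ip_forward_line() {
  conf="${1:-/etc/sysctl.conf}"
  if ! grep -Eq '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf" 2>/dev/null; then
    echo "net.ipv4.ip_forward = 1" >> "$conf"
  fi
}

# Usage (as root):
#   ip_forward_enabled || { ensure_ip_forward_line && sysctl -p; }
```

Because the append is idempotent, the snippet can be re-run safely, for example from a provisioning script.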
Summary
This post outlines six ways to resolve the “node unavailable, kubelet stopped posting node status” error in Rancher: restarting the Docker and kubelet services, rebooting the node, recreating the cluster, removing and re-adding the node, disabling swap, and re-enabling IP forwarding. These steps should help restore node functionality and keep your Rancher-managed Kubernetes cluster running smoothly.
Final Words + More Resources
I wrote this article to help others facing the same problem, and I hope it has done that. If you still have any questions, don’t hesitate to ask me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 Rancher Cluster Provisioning Guide
- 👨💻 Restoring etcd in Rancher
- 👨💻 Disable Swap Partition in CentOS/Ubuntu
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!