kubernetes: Pods that fail health checks always restarting on the same minion instead of others?
Over the weekend the skydns container in the kube-dns pod died. I'm not sure of the exact reason because I couldn't find much detail in the logs, but watching the etcd and skydns logs suggested the root issue could've been etcd. One theory I have is that the /mnt/ephemeral/kubernetes filesystem was full (it's only 3.75GB and holds a few large emptyDir volumes). kube-dns was showing 3/4 ready.
This took down all of my application pods across 4 minions. I had to manually delete the kube-dns pod, and when it launched on another minion it was fine and everything came back online.
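In case it's useful context, the way I checked which container was stuck (and how often it had restarted) was by inspecting container statuses. A minimal sketch of the same check with the official Python client, assuming a working kubeconfig and the usual k8s-app=kube-dns label on the pod:

```python
# Sketch: list the kube-dns pod's container statuses to see which container
# is not ready and how many times it has restarted.
# Assumes the official client ("pip install kubernetes") and a kubeconfig.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod("kube-system", label_selector="k8s-app=kube-dns")
for pod in pods.items:
    print(pod.metadata.name, "on node", pod.spec.node_name)
    for cs in pod.status.container_statuses or []:
        print(f"  {cs.name}: ready={cs.ready} restarts={cs.restart_count}")
```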
By the same token, I had 1 minion that would never consider any of my pods "ready", even though the other 3 minions did. I never found out why and my logs weren't helpful, so I had to manually terminate that minion (EC2 instance) and let auto-scaling launch a new one (which happened to work fine).
For both of these cases, if k8s had automatically moved the constantly failing pods to other minions, I think the cluster would've healed itself. Is the fact that failing pods always try to restart on the same minion intentional, or is a change in the works?
I’m sorry I don’t have logs to show. I’m not sure how to retrieve them from 2 days ago after so many pods have been restarted.
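As far as I know, only the log of the immediately previous container instance is kept, so logs from two days and many restarts ago are likely gone. For future incidents, something like this pulls the last crashed container's log (a sketch with the Python client; the pod name is a hypothetical placeholder):

```python
# Sketch: fetch the log of the previously terminated container instance.
# Only the most recent prior instance is retained, so this can't reach back
# through many restarts. The pod name below is illustrative only.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

log = v1.read_namespaced_pod_log(
    name="kube-dns-v9-xxxxx",   # hypothetical pod name
    namespace="kube-system",
    container="skydns",
    previous=True,              # log of the last terminated instance
)
print(log)
```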
About this issue
- State: closed
- Created 9 years ago
- Comments: 16 (10 by maintainers)
Hi all, in addition to this there can be other problems. In a virtualized environment, for example, the sizes we detect might not be what we can actually work with; we might in fact be running on swap, so services might not react in time and should therefore be moved to other hosts. I suggest adding a way to define a number of healthcheck-related restarts after which the pod should be rescheduled.
@lavalamp certainly, it's critical to be able to detect issues with nodes. Is this something on the roadmap that will be built into kubernetes/kubelets? In the meantime, I need some way to detect this internally and either handle it automatically and/or send alerts. What are some ways you'd advise doing this?
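One stopgap I've been considering is an external watchdog that watches restart counts and deletes pods that keep crash-looping, so their replication controller recreates them (possibly on a different node). This is just a rough sketch with the official Python client; the threshold, the polling interval, and the delete-on-high-restart-count policy are my own assumptions, not an official mechanism:

```python
# Sketch of an external watchdog: delete pods whose containers have restarted
# more than RESTART_THRESHOLD times, so the replication controller recreates
# them (possibly on another node). Threshold and interval are illustrative.
import time
from kubernetes import client, config

RESTART_THRESHOLD = 10  # hypothetical cutoff

config.load_kube_config()
v1 = client.CoreV1Api()

while True:
    for pod in v1.list_pod_for_all_namespaces().items:
        restarts = max(
            (cs.restart_count for cs in pod.status.container_statuses or []),
            default=0,
        )
        if restarts > RESTART_THRESHOLD:
            print(f"deleting {pod.metadata.namespace}/{pod.metadata.name} "
                  f"({restarts} restarts on {pod.spec.node_name})")
            v1.delete_namespaced_pod(pod.metadata.name, pod.metadata.namespace)
    time.sleep(60)
```

Deleting (rather than letting the kubelet restart in place) is what forces the scheduler to make a fresh placement decision, which is effectively what the manual delete of the kube-dns pod did above.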
This issue bit me again over the weekend. I have a simple 3-node cluster in AWS that was provisioned with cluster/kube-up, running only 3 non-kube-system pods. Everything was healthy on Friday, and without any changes over the weekend, I checked it again a few days later and all pods on that particular node were failing [1]; had they restarted on another node, everything would've been fine.
[1] Is this another GitHub issue I should create?