k3s: Not killing pods by default, system OOM takes over
We have a customer cluster that is overloaded. However, if we describe one of the affected nodes, it seems that at no point did the Kubernetes out-of-resource handling kick in.
For example, the events are:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NodeNotReady 25m (x3 over 46m) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeNotReady
Normal NodeHasNoDiskPressure 19m (x7 over 5d1h) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 19m (x7 over 5d1h) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeHasSufficientPID
Normal NodeReady 19m (x6 over 5d1h) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeReady
Normal NodeHasSufficientMemory 19m (x7 over 5d1h) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeHasSufficientMemory
Warning SystemOOM 12m (x4 over 25m) kubelet, kube-node-9f4e System OOM encountered
Warning ContainerGCFailed 12m (x4 over 46m) kubelet, kube-node-9f4e rpc error: code = DeadlineExceeded desc = context deadline exceeded
Normal NodeAllocatableEnforced 9m23s kubelet, kube-node-9f4e Updated Node Allocatable limit across pods
Normal Starting 9m23s kubelet, kube-node-9f4e Starting kubelet.
Normal Starting 9m23s kube-proxy, kube-node-9f4e Starting kube-proxy.
The instance felt stuck (NotReady node status, and we couldn't SSH to it), so we rebooted the VM, and the events continue with:
Warning Rebooted 9m23s kubelet, kube-node-9f4e Node kube-node-9f4e has been rebooted, boot id: 362cf7ed-b89b-44d6-accd-6a840bc56bdc
Warning InvalidDiskCapacity 9m23s kubelet, kube-node-9f4e invalid capacity 0 on image filesystem
Normal NodeHasSufficientPID 66s (x4 over 9m23s) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeHasSufficientPID
Normal NodeHasNoDiskPressure 66s (x4 over 9m23s) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeHasNoDiskPressure
Normal NodeReady 66s (x2 over 9m23s) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeReady
Normal NodeHasSufficientMemory 66s (x4 over 9m23s) kubelet, kube-node-9f4e Node kube-node-9f4e status is now: NodeHasSufficientMemory
Warning SystemOOM 15s (x3 over 67s) kubelet, kube-node-9f4e System OOM encountered
So, is this correct from a k3s point of view, that the events say NodeHasSufficientMemory and then the next thing is System OOM encountered? It feels like at some point there should have been a memory pressure event and a pod eviction before it got to the point that the system's OOM killer took over.
Obviously, we have told the customer to set resource requests and limits for their pods/containers, and they will, but they quite rightly say that the cluster should handle this gracefully rather than letting nodes just (effectively) die completely.
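For anyone else hitting this, here is a minimal sketch of how eviction thresholds and reserved memory can be passed through to the kubelet on a k3s node. The file path follows the usual k3s config-file convention, and the threshold values are purely illustrative, not recommendations:

```yaml
# /etc/rancher/k3s/config.yaml (applies to k3s server or agent)
# Illustrative values only; tune them to the node size.
kubelet-arg:
  - "eviction-hard=memory.available<500Mi"            # evict pods before the node runs out of memory
  - "eviction-soft=memory.available<1Gi"              # softer threshold with a grace period
  - "eviction-soft-grace-period=memory.available=2m"
  - "system-reserved=memory=1Gi"                      # headroom for sshd, kubelet, containerd
```

With reserved memory and sensible eviction thresholds, the kubelet should report MemoryPressure and evict BestEffort/Burstable pods before the kernel OOM killer has to step in.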
About this issue
- State: closed
- Created 5 years ago
- Reactions: 5
- Comments: 16 (7 by maintainers)
I have a similar situation: the pod should be terminated, but it's not.
This appears to not be an issue as of v1.20.15+. Spinning up a "guaranteed" pod, it appears under
/sys/fs/cgroup/memory/kubepods/pod26f3fe02-9448-4434-8d53-fac22702a5d0/
which is the correct location. There is no "guaranteed" directory; pods with that QoS sit directly in the /kubepods/ directory.
Please make sure that swap is disabled on the system; in particular, the OOM killer appears to take swap into consideration when it should not.
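For reference, a minimal sketch of a Guaranteed-QoS pod manifest (pod name and image are placeholders); requests equal to limits for every container is what gives the pod that QoS class, and its memory cgroup should then sit directly under /kubepods/ as described above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo            # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.25              # placeholder image
    resources:
      requests:                    # requests == limits => Guaranteed QoS
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "250m"
        memory: "256Mi"
```

On the node itself, `swapon --show` returning no output (or `free -h` showing 0 swap) confirms swap is off; `swapoff -a` disables it for the running system, and removing the swap entry from /etc/fstab keeps it off across reboots.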