kubernetes: Scheduling fails with "Insufficient Memory" until restart of apiserver/master
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): memory scheduler / memory scheduling
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-25T14:48:12Z", GoVersion:"go1.8.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1+coreos.0", GitCommit:"9212f77ed8c169a0afa02e58dce87913c6387b3e", GitTreeState:"clean", BuildDate:"2017-04-04T00:32:53Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration: Custom/mixed
- OS (e.g. from /etc/os-release): CoreOS 1353.7.0 (stable)
- Kernel (e.g. uname -a): Linux coreos04.kub.do.modio.se 4.9.24-coreos #1 SMP Wed Apr 26 21:44:23 UTC 2017 x86_64 Intel® Xeon® CPU E5-2650L v3 @ 1.80GHz GenuineIntel GNU/Linux
- Install tools: Manual guide / ansible
- Others:
What happened: Scheduling of pods that have a memory limit starts failing after a few pod deployments, and keeps failing until the master node is restarted, after which scheduling works again.
Pods are configured with:
  resources:
    limits:
      memory: "1Gi"
      cpu: "1"
    requests:
      cpu: "100m"
      memory: "30Mi"
While kubectl describe node for the node outputs:
Capacity:
cpu: 2
memory: 2052872Ki
pods: 110
Allocatable:
cpu: 2
memory: 1950472Ki
pods: 110
And further down:
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
150m (7%) 1 (50%) 12Mi (0%) 128Mi (6%)
Events:
The numbers don’t add up, and manually stepping through sizes shows that pods with memory limits up to ~300Mi schedule; above that they fail. This behaviour is consistently reproducible for us.
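To compare what the API actually reports against the scheduler’s decision, something like the following shows the node’s recorded allocatable memory and the pending pod’s scheduling events (node and pod names are placeholders):

# Allocatable memory as recorded on the node object
kubectl get node <node-name> -o jsonpath='{.status.allocatable.memory}'

# Full capacity / allocatable / allocated-resources view
kubectl describe node <node-name>

# The scheduler's stated reason for the pending pod (FailedScheduling events)
kubectl describe pod <pending-pod-name>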
Attachments: node.description.txt, pod.description.txt
What you expected to happen: The resource requirements to limit the number of concurrent jobs on the machine to ~3 before memory pressure prevents further scheduling.
How to reproduce it (as minimally and precisely as possible): Schedule a lot of pods with memory limits and delete them / let them complete.
For us, this is a GitLab CI runner that connects and creates pods for us; after a while all our build machines stand empty and jobs wait in the scheduling queue forever. A rough reproduction script is sketched below.
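A minimal sketch of that reproduction loop, assuming a cluster where pods can be created directly; the image, names, labels, and counts are placeholders, not our actual CI setup:

for i in $(seq 1 50); do
  cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: memtest-$i
  labels:
    app: memtest
spec:
  restartPolicy: Never
  containers:
  - name: work
    image: alpine:3.5
    command: ["sleep", "5"]
    resources:
      limits:
        memory: "1Gi"
        cpu: "1"
      requests:
        memory: "30Mi"
        cpu: "100m"
EOF
done

# Let the pods complete, clean them up, then check whether new pods still schedule.
kubectl delete pods -l app=memtest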
Anything else we need to know: pass.
About this issue
- State: closed
- Created 7 years ago
- Comments: 26 (12 by maintainers)
I’m seeing this with AKS. I have nodes with 8G of RAM and I schedule 1 pod per node with limits and requests of 6.5G memory. Sometimes it works fine; other times it says "insufficient memory" when there is clearly enough. Unfortunately I don’t think I can restart the kube-apiserver on an AKS managed cluster.
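On a managed cluster where the control plane can’t be restarted, about all you can do is capture the scheduler’s stated reason; for example (pod name is a placeholder):

# Events section shows FailedScheduling and the scheduler's reason
kubectl describe pod <pending-pod-name>

# All recent FailedScheduling events across namespaces
kubectl get events --all-namespaces | grep -i FailedScheduling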
Similar issues: #33777, #34920 (with Insufficient CPU)