kubernetes: Memory manager UnexpectedAdmissionError
What happened?
Dual-socket server with 96 threads total (2 sockets × 24 cores × 2 threads), ~192G of RAM, CPU and memory manager static policy, topologyManagerPolicy best-effort, 10Gi of RAM reserved on NUMA node 0, 1 core (2 threads) reserved on NUMA node 0.
kubeadm config:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
...
cpuManagerPolicy: static
reservedSystemCPUs: 0,48
memoryManagerPolicy: Static
reservedMemory:
- numaNode: 0
limits:
# systemReserved memory + evictionHard memory.available
memory: 10340Mi
systemReserved:
memory: 10240Mi
topologyManagerPolicy: best-effort
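For reference, the reservedMemory limit above is the systemReserved memory plus the evictionHard memory.available threshold; assuming the kubelet default of 100Mi for the latter (it is not set explicitly in the snippet above), the arithmetic works out as:
# 10240Mi (systemReserved) + 100Mi (default evictionHard memory.available) = 10340Mi
evictionHard:
  memory.available: "100Mi"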
If I try to allocate two Guaranteed pods with 85Gi of RAM each, one pod is admitted and the second one fails with UnexpectedAdmissionError, even though it would fit using memory from both NUMA nodes.
Since I'm using Deployments, a new pod is recreated right away, and you end up with a large number of failed pods in the UnexpectedAdmissionError state.
The error in the kubelet logs is:
E0918 12:16:32.162481 2865761 memory_manager.go:249] "Allocate error" err="[memorymanager] failed to find NUMA nodes to extend the current topology hint"
What did you expect to happen?
Either the pod should become Pending, or memory from both NUMA nodes should be used.
How can we reproduce it (as minimally and precisely as possible)?
Reserve some memory on NUMA node 0, then launch two identical Guaranteed pods with memory limits close to the per-NUMA-node maximum, so that one pod fits on NUMA node 1 but the second does not fit on NUMA node 0 (even though the sum fits on the server). A sketch of such a workload is shown below.
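A minimal sketch of such a Deployment, assuming the same 46 CPU / 85Gi figures as above (the Deployment name, labels, and image are placeholders, not taken from the original report):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memhog                 # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: memhog
  template:
    metadata:
      labels:
        app: memhog
    spec:
      containers:
      - name: mycontainer
        image: registry.k8s.io/pause:3.9   # placeholder image
        resources:
          # requests == limits -> Guaranteed QoS, so the static CPU and
          # memory managers pin the container to exclusive CPUs and NUMA memory
          requests:
            cpu: "46"
            memory: 85Gi
          limits:
            cpu: "46"
            memory: 85Gi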
Anything else we need to know?
Here are the *_manager_state files.
If I launch one pod with 46 CPUs / 170Gi of RAM, it gets CPUs from one NUMA node and memory from both NUMA nodes:
# jq . <cpu_manager_state
{
"policyName": "static",
"defaultCpuSet": "0,24-48,72-95",
"entries": {
"d6c70f6d-adbe-4901-8b82-9505171b1368": {
"mycontainer": "1-23,49-71"
}
},
"checksum": 2697499474
}
# jq . <memory_manager_state
{
"policyName": "Static",
"machineState": {
"0": {
"numberOfAssignments": 1,
"memoryMap": {
"hugepages-1Gi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"hugepages-2Mi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"memory": {
"total": 99434803200,
"systemReserved": 10842275840,
"allocatable": 88592527360,
"reserved": 88592527360,
"free": 0
}
},
"cells": [
0,
1
]
},
"1": {
"numberOfAssignments": 1,
"memoryMap": {
"hugepages-1Gi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"hugepages-2Mi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"memory": {
"total": 101409165312,
"systemReserved": 0,
"allocatable": 101409165312,
"reserved": 93943582720,
"free": 7465582592
}
},
"cells": [
0,
1
]
}
},
"entries": {
"d6c70f6d-adbe-4901-8b82-9505171b1368": {
"mycontainer": [
{
"numaAffinity": [
0,
1
],
"type": "memory",
"size": 182536110080
}
]
}
},
"checksum": 382005683
}
If I launch a pod with 46 CPUs / 85Gi of RAM, it is properly placed on NUMA node 1:
# jq . <cpu_manager_state
{
"policyName": "static",
"defaultCpuSet": "0-23,47-71,95",
"entries": {
"62f6c733-dae0-4c97-a5a9-bbb5c4757a07": {
"mycontainer": "24-46,72-94"
}
},
"checksum": 3342740372
}
# jq . <memory_manager_state
{
"policyName": "Static",
"machineState": {
"0": {
"numberOfAssignments": 0,
"memoryMap": {
"hugepages-1Gi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"hugepages-2Mi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"memory": {
"total": 99434803200,
"systemReserved": 10842275840,
"allocatable": 88592527360,
"reserved": 0,
"free": 88592527360
}
},
"cells": [
0
]
},
"1": {
"numberOfAssignments": 1,
"memoryMap": {
"hugepages-1Gi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"hugepages-2Mi": {
"total": 0,
"systemReserved": 0,
"allocatable": 0,
"reserved": 0,
"free": 0
},
"memory": {
"total": 101409165312,
"systemReserved": 0,
"allocatable": 101409165312,
"reserved": 91268055040,
"free": 10141110272
}
},
"cells": [
1
]
}
},
"entries": {
"62f6c733-dae0-4c97-a5a9-bbb5c4757a07": {
"mycontainer": [
{
"numaAffinity": [
1
],
"type": "memory",
"size": 91268055040
}
]
}
},
"checksum": 3166358723
}
If I try to launch two pods with 46 CPUs / 85Gi of RAM, one fails with:
E0918 12:16:32.162481 2865761 memory_manager.go:249] "Allocate error" err="[memorymanager] failed to find NUMA nodes to extend the current topology hint"
If I try to launch two pods with 46 CPUs / 80Gi of RAM, everything works (consistent with the state above: NUMA node 0 only has 88592527360 bytes, about 82.5Gi, allocatable after the reservation, so a second 85Gi pod cannot fit on a single NUMA node, while an 80Gi one can).
Kubernetes version
1.26.8
Cloud provider
NONE / bare-metal
OS version
Alma 8.8 base + rpm-ostree
Install tools
kubeadm
Container runtime (CRI) and version (if applicable)
containerd 1.7.5
Related plugins (CNI, CSI, …) and versions (if applicable)
No response
About this issue
- Original URL
- State: open
- Created 9 months ago
- Comments: 17 (17 by maintainers)
NP. I won’t be here next week either, so no rush
/assign @ffromani