kubernetes: After changing the default hugepage size from 2Mi to 1Gi, kubelet can't update node status

What happened:

  • After I change the default hugepage size from 2Mi to 1Gi and reboot, kubelet can’t update the node status:
[root@worker-01 ~]# cat /etc/default/grub
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto default_hugepagesz=1GB rhgb quiet idle=halt biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs"
GRUB_DISABLE_RECOVERY="true"
E0723 03:23:36.757554    2183 kubelet_node_status.go:385] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"allocatable\":{\"devices.kubevirt.io/tun\":\"110\",\"devices.kubevirt.io/vhost-net\":\"110\",\"hugepages-1Gi\":\"16Gi\",\"memory\":\"0\"},\"capacity\":{\"devices.kubevirt.io/kvm\":\"110\",\"devices.kubevirt.io/tun\":\"110\",\"devices.kubevirt.io/vhost-net\":\"110\",\"hugepages-1Gi\":\"16Gi\"},\"conditions\":[{\"lastHeartbeatTime\":\"2019-07-23T03:23:36Z\",\"lastTransitionTime\":\"2019-07-23T03:23:36Z\",\"message\":\"kubelet has sufficient memory available\",\"reason\":\"KubeletHasSufficientMemory\",\"status\":\"False\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2019-07-23T03:23:36Z\",\"lastTransitionTime\":\"2019-07-23T03:23:36Z\",\"message\":\"kubelet has no disk pressure\",\"reason\":\"KubeletHasNoDiskPressure\",\"status\":\"False\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2019-07-23T03:23:36Z\",\"lastTransitionTime\":\"2019-07-23T03:23:36Z\",\"message\":\"kubelet has sufficient PID available\",\"reason\":\"KubeletHasSufficientPID\",\"status\":\"False\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2019-07-23T03:23:36Z\",\"lastTransitionTime\":\"2019-07-23T03:23:36Z\",\"message\":\"kubelet is posting ready status\",\"reason\":\"KubeletReady\",\"status\":\"True\",\"type\":\"Ready\"}],\"nodeInfo\":{\"bootID\":\"35150190-4de9-46d8-b1f5-05dd0cface2b\"},\"volumesInUse\":null}}" for node "worker-01": Node "worker-01" is invalid: [status.capacity.hugepages-1Gi: Invalid value: resource.Quantity{i:resource.int64Amount{value:17179869184, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"", Format:"BinarySI"}: may not have pre-allocated hugepages for multiple page sizes, status.capacity.pods: Invalid value: resource.Quantity{i:resource.int64Amount{value:250, 
scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"250", Format:"DecimalSI"}: may not have pre-allocated hugepages for multiple page sizes, status.capacity.devices.kubevirt.io/kvm: Invalid value: resource.Quantity{i:resource.int64Amount{value:110, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"110", Format:"DecimalSI"}: may not have pre-allocated hugepages for multiple page sizes, status.capacity.cpu: Invalid value: resource.Quantity{i:resource.int64Amount{value:12, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"12", Format:"DecimalSI"}: may not have pre-allocated hugepages for multiple page sizes, status.allocatable.hugepages-2Mi: Invalid value: resource.Quantity{i:resource.int64Amount{value:16030629888, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"", Format:"BinarySI"}: may not have pre-allocated hugepages for multiple page sizes, status.allocatable.devices.kubevirt.io/tun: Invalid value: resource.Quantity{i:resource.int64Amount{value:110, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"110", Format:"DecimalSI"}: may not have pre-allocated hugepages for multiple page sizes]
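
The same message is attached to unrelated resources such as `pods` and `cpu`, which suggests the check runs once per resource entry on the node object after the patch has been merged. The following is a simplified Python model of that kind of check, not the actual API server code. The function name `validate_hugepages` and the `stored` values are assumptions for illustration. It shows how a `hugepages-2Mi` entry that presumably survives from before the reboot, together with the newly reported `hugepages-1Gi` entry, would trip the rule.

```python
HUGEPAGE_PREFIX = "hugepages-"
MESSAGE = "may not have pre-allocated hugepages for multiple page sizes"


def validate_hugepages(resources):
    """Return (sizes, error): the hugepage resources with a positive
    quantity, and an error string if more than one size is present."""
    sizes = sorted(
        name
        for name, quantity in resources.items()
        if name.startswith(HUGEPAGE_PREFIX) and quantity > 0
    )
    error = MESSAGE if len(sizes) > 1 else None
    return sizes, error


# Assumed state stored for the node before the reboot (2Mi pages only).
stored = {"cpu": 12, "pods": 250, "hugepages-2Mi": 16030629888}
# What the kubelet reports after the reboot, as in the patch above.
reported = {"hugepages-1Gi": 17179869184}
# A merge patch only overwrites the keys it names, so the old key survives.
merged = {**stored, **reported}

print(validate_hugepages(merged))
print(validate_hugepages({"cpu": 12, "pods": 250, **reported}))
```

In this model the second call passes because only one page size has a positive value, which is consistent with the report further down that removing the node object and letting it register again clears the error.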

What you expected to happen:

  • kubelet should be able to update the node info, and the node status should be Ready

How to reproduce it (as minimally and precisely as possible):

  • change the node condition validate

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
hyperkube
  • OS (e.g: cat /etc/os-release):
CentOS 7.6
  • Kernel (e.g. uname -a):
[root@worker-01 ~]# uname -a
Linux worker-01 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
kubevirt (it adds the kvm and tun devices to the kubelet, but I don't think it is the cause)
  • Network plugin and version (if this is a network-related bug):
canal
  • Others:
[root@worker-01 ~]# cat /proc/meminfo
MemTotal:       24521676 kB
MemFree:         5402000 kB
MemAvailable:    6477760 kB
Buffers:          118828 kB
Cached:          1218632 kB
SwapCached:            0 kB
Active:           916956 kB
Inactive:        1043992 kB
Active(anon):     628436 kB
Inactive(anon):     1184 kB
Active(file):     288520 kB
Inactive(file):  1042808 kB
Unevictable:       13912 kB
Mlocked:           13912 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               992 kB
Writeback:             0 kB
AnonPages:        637432 kB
Mapped:           235740 kB
Shmem:              2052 kB
Slab:             120568 kB
SReclaimable:      61004 kB
SUnreclaim:        59564 kB
KernelStack:       19392 kB
PageTables:        17148 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     3872228 kB
Committed_AS:    7299148 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       64008 kB
VmallocChunk:   34359569928 kB
HardwareCorrupted:     0 kB
AnonHugePages:    319488 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:      16
HugePages_Free:       16
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
DirectMap4k:      145256 kB
DirectMap2M:     5097472 kB
DirectMap1G:    22020096 kB
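
As a cross-check, the hugepage fields in this output can be multiplied out and compared with the `hugepages-1Gi` quantity in the kubelet patch. This is a minimal sketch: the helper name `hugepage_bytes` is hypothetical, and it reads only the fields shown above from a pasted sample rather than from the live `/proc/meminfo`.

```python
def hugepage_bytes(meminfo_text):
    """Return HugePages_Total * Hugepagesize in bytes from meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token is the number; a trailing "kB" unit is ignored.
            fields[name.strip()] = int(parts[0])
    # Hugepagesize is reported in kB, so convert to bytes.
    return fields["HugePages_Total"] * fields["Hugepagesize"] * 1024


# The relevant lines copied from the /proc/meminfo output above.
sample = """HugePages_Total:      16
HugePages_Free:       16
Hugepagesize:    1048576 kB
"""

total = hugepage_bytes(sample)
print(total, total // 2**30)
```

For the values above this gives 17179869184 bytes, i.e. 16 GiB, which matches the `hugepages-1Gi: 16Gi` that the kubelet tries to report. The node itself therefore only has 1Gi pages allocated.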

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

I am hitting the same problem. I used tuned to change the default hugepage size from 2Mi to 1Gi. After the reboot the node is NotReady and cannot rejoin, giving this error:

Node "test-worker-0" is invalid: [status.capacity.hugepages-2Mi: Invalid value: resource.Quantity{i:resource.int64Amount{value:20971520, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"20Mi", Format:"BinarySI"}: may not have pre-allocated hugepages for multiple page sizes, status.allocatable.hugepages-2Mi: Invalid value: resource.Quantity{i:resource.int64Amount{value:20971520, scale:0},

My system now reports information only for 1Gi, not for 2Mi. Removing the node and rebooting fixed it: the node rejoined and reported the right capacity. But this is not a solution for us; changing sizes should be supported.