kubeadm: Kubelet 'failed to get cgroup stats for "/system.slice/kubelet.service"' error messages

BUG REPORT

Versions

kubeadm version : 1.17.3

Environment:

  • Kubernetes version : 1.17.3
  • Cloud provider or hardware configuration: on prem Dell R740XD
  • OS (e.g. from /etc/os-release): RHEL 7.7
  • Kernel (e.g. uname -a): 3.10.0-1062.12.1.el7.x86_64
  • Others: docker-ce 19.03.7-3.el7.x86_64

What happened?

The kubelet regularly prints the following message to the logs:

summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

This looks like the issue described here: https://github.com/kubernetes/kops/issues/4049

Restarting the kubelet makes the errors stop, but they come back after the machine is rebooted, so I think the problem is related to the systemd start order.

What you expected to happen?

No error messages in the logs, and correct reporting of container stats.

How to reproduce it (as minimally and precisely as possible)?

kubeadm cluster on RHEL 7.7

Anything else we need to know?

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 21 (10 by maintainers)

Most upvoted comments

@cjreyn Did you solve the issue? I have it on CentOS 7.

Yes, I put the following in the file /usr/lib/systemd/system/kubelet.service.d/11-kubeadm.conf:

[Unit]
# After= is an ordering directive; systemd only reads it from the [Unit] section.
After=docker.service

[Service]
ExecStartPre=/bin/sleep 10

In theory, adding just "After=docker.service" should be enough, but in my testing the sleep was also needed.
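
To check whether a change like this actually removed the errors after a reboot, it can help to count how often the message appears in the kubelet log and which cgroup path it names. The following is only a minimal sketch: the function name count_stats_errors and the regular expression are illustrative assumptions, not part of kubelet or kubeadm, and the sample line is copied from the log excerpt in this report.

```python
import re
from collections import Counter

# Matches the kubelet message quoted in this report and captures the cgroup path.
# The pattern is an assumption based on that single sample line.
STATS_ERROR = re.compile(r'Failed to get system container stats for "([^"]+)"')


def count_stats_errors(lines):
    """Return a Counter mapping each cgroup path to how often the error names it."""
    counts = Counter()
    for line in lines:
        match = STATS_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample line taken from the log excerpt in this report.
sample = [
    'summary_sys_containers.go:47] Failed to get system container stats for '
    '"/system.slice/docker.service": failed to get cgroup stats for '
    '"/system.slice/docker.service": failed to get container info for '
    '"/system.slice/docker.service": unknown container "/system.slice/docker.service"',
]

print(dict(count_stats_errors(sample)))
```

Feeding the function the kubelet log lines from the current boot (for example, the saved output of journalctl -u kubelet -b) and getting an empty result would suggest the errors are gone. A non-empty result shows which unit's cgroup the kubelet still cannot find.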