kubernetes: cpu_manager "failed to write a *:* rwm to devices.allow" error message

Is this a BUG REPORT or FEATURE REQUEST?: /kind bug /sig node

What happened:

On a 1.8+ kubelet with --cpu-manager-policy=static, there is a constant stream of log messages every 10 seconds or so in the following formats:

kubelet: E1030 14:13:05.012702 23623 remote_runtime.go:302] UpdateContainerResources "f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986" from runtime service failed: rpc error: code = Unknown desc = failed to update container "f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986": Error response from daemon: Cannot update container f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986: rpc error: code = Unknown desc = failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/kubepods/burstable/pod74095402-a88b-11e7-9aea-90b11c4094cf/f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986/devices.allow: invalid argument

kubelet: E1030 14:13:05.012729 23623 cpu_manager.go:242] [cpumanager] reconcileState: failed to update container (pod: kube-proxy-47dt8, container: kube-proxy, container id: f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986, cpuset: "0-31", error: rpc error: code = Unknown desc = failed to update container "f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986": Error response from daemon: Cannot update container f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986: rpc error: code = Unknown desc = failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/kubepods/burstable/pod74095402-a88b-11e7-9aea-90b11c4094cf/f0f24b4647d8a9694427eb7900ce93c8402f81d5b7724eb2207f1a4755f79986/devices.allow: invalid argument)

This happens for every privileged pod; the update fails, but the failure appears to be expected.

What you expected to happen:

The error appears to be expected and should not be logged continuously.

How to reproduce it (as minimally and precisely as possible):

On a 1.8+ kubelet with --cpu-manager-policy=static, run privileged pods (such as kube-proxy) and watch the system log. A minimal sketch of the kubelet flags involved is shown below.
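
For reference, a minimal sketch of the relevant kubelet flags (the reservation value is illustrative; the static policy also requires a non-zero CPU reservation via --kube-reserved or --system-reserved):

  # kubelet flags relevant to reproducing the log spam (other flags unchanged)
  kubelet \
    --cpu-manager-policy=static \
    --kube-reserved=cpu=500m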

Anything else we need to know?:

This is the second half of a bug that was originally reported in #54804, so there is a little bit of context there.

Environment:

  • Kubernetes version (use kubectl version): 1.8.1
  • Cloud provider or hardware configuration: baremetal/onprem
  • OS (e.g. from /etc/os-release): CentOS 7.4.1708
  • Kernel (e.g. uname -a): 4.13.8-1.el7.elrepo.x86_64
  • Install tools: custom
  • Others:

Most upvoted comments

Similar error, kubelet 1.12.1

Reason

This problem happens when kube-proxy is deployed as a DaemonSet and the static cpu-manager-policy is enabled for the kubelet, on Kubernetes clusters older than 1.16 (I verified 1.14, but not 1.15).

As discussed, this problem relates to the --resource-container parameter of kube-proxy. Although it is deprecated, its default value is still kube-proxy, and kube-proxy before version 1.16 will still create a kube-proxy subdirectory under each of its cgroup subsystem directories. For the devices subsystem, writing to the devices.allow file is not allowed while any subdirectory exists; the kernel rejects the write with an "invalid argument" error.
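
A quick way to confirm this on an affected node is a manual check of the container's devices cgroup (a diagnostic sketch following the explanation above; the <pod-uid> and <container-id> placeholders are illustrative and should be taken from the kubelet log):

  # the kube-proxy child cgroup created via --resource-container sits under the container's devices cgroup
  ls -d /sys/fs/cgroup/devices/kubepods/burstable/pod<pod-uid>/<container-id>/*/

  # repeating the write the runtime attempts fails while that child directory exists
  echo 'a *:* rwm' > /sys/fs/cgroup/devices/kubepods/burstable/pod<pod-uid>/<container-id>/devices.allow
  # write error: Invalid argument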

Solution

Set the --resource-container parameter of kube-proxy to an empty value:

spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-proxy
  template:
    metadata:
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      creationTimestamp: null
      labels:
        k8s-app: kube-proxy
    spec:
      containers:
      - args:
        - --kubeconfig=/var/lib/kube-proxy/config
        - --hostname-override=$(NODE_NAME)
        - --v=2
        - --resource-container=
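
One way to apply this on a running cluster (assuming kube-proxy is the stock DaemonSet in the kube-system namespace):

  kubectl -n kube-system edit daemonset kube-proxy
  # add --resource-container= to the container args; with the default RollingUpdate strategy
  # the DaemonSet then replaces the kube-proxy pods with the new setting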