kubernetes: CPU scheduler can't configure kube-proxy related POD cgroup settings in "static" mode
Is this a BUG REPORT or FEATURE REQUEST?: /kind bug
What happened: Kubelet spams syslog with the following messages:
E0725 08:27:03.880692 55872 remote_runtime.go:302] UpdateContainerResources “8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d” from runtime service failed: rpc error: code = Unknown desc = failed to update container “8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d”: Error response from daemon: Cannot update container 8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d: docker-runc did not terminate sucessfully: failed to write 0-1,4-15 to cpuset.cpus: write /sys/fs/cgroup/cpuset/kubepods/besteffort/podec2753a1-88e4-11e8-8dc3-9cb654aebf72/8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d/cpuset.cpus: device or resource busy
E0725 08:27:03.880718 55872 cpu_manager.go:267] [cpumanager] reconcileState: failed to update container (pod: kube-proxy-dmtlz, container: kube-proxy, container id: 8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d, cpuset: “0-1,4-15”, error: rpc error: code = Unknown desc = failed to update container “8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d”: Error response from daemon: Cannot update container 8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d: docker-runc did not terminate sucessfully: failed to write 0-1,4-15 to cpuset.cpus: write /sys/fs/cgroup/cpuset/kubepods/besteffort/podec2753a1-88e4-11e8-8dc3-9cb654aebf72/8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d/cpuset.cpus: device or resource busy
What you expected to happen: No error messages
How to reproduce it (as minimally and precisely as possible):
- Set up a base environment with kubeadm
- Make sure that kube-proxy is running fine
- Configure kubelet with the --cpu-manager-policy=static flag together with --kube-reserved
- Deploy a Pod with Guaranteed QoS
- Examine the syslog messages
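For the "Guaranteed QoS" step, a minimal sketch of a suitable Pod manifest follows. A Pod is classed as Guaranteed when every container sets CPU and memory requests equal to its limits, and the static policy only hands out exclusive CPUs when the CPU request is an integer. The helper name `guaranteed_pod`, the Pod name, and the image are placeholders chosen for illustration and are not taken from the report.

```python
import json


def guaranteed_pod(name, image, cpus, memory):
    """Build a Pod manifest whose single container has requests == limits."""
    # Integer CPU count plus identical requests/limits -> Guaranteed QoS.
    resources = {"cpu": str(int(cpus)), "memory": memory}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; any long-running image should do.
    pod = guaranteed_pod("cpuset-test", "k8s.gcr.io/pause:3.1", 2, "256Mi")
    print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`. Once such a Pod is admitted, the CPU manager presumably shrinks the shared pool and tries to rewrite the cpuset of the remaining containers, which is when the errors above would start to appear.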
Anything else we need to know?: The cpu_manager can’t set the cpuset.cpus’ value for that POD because kube-proxy configures an additional child group there:
/sys/fs/cgroup/cpuset/kubepods/besteffort/podec2753a1-88e4-11e8-8dc3-9cb654aebf72/8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d/cpuset.cpus
/sys/fs/cgroup/cpuset/kubepods/besteffort/podec2753a1-88e4-11e8-8dc3-9cb654aebf72/8f2acab777a52bb17ff5884908d635fceae63bcc48ec6e17396cdc93269add4d/kube-proxy/cpuset.cpus
Apparently the cpu_manager does not expect an additional child group to be present there, and thus fails to configure that first before attempting to configure its parent. So far I have experienced this behavior with kube-proxy only.
This might be related to an old kube-proxy issue described in https://github.com/kubernetes/kubernetes/issues/17619
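To check whether a nested child group is what blocks the update, the sketch below scans a container's cpuset cgroup directory. It reports any child whose cpuset.cpus contains CPUs missing from the set the kubelet is trying to write. It assumes cgroup v1 cpuset behaviour, where a parent's cpuset.cpus cannot be changed so that a child stops being a subset of it (the write fails with "device or resource busy"). The function names are made up for illustration.

```python
import os
import tempfile


def parse_cpuset(text):
    """Parse a cpuset list such as "0-1,4-15" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def blocking_children(container_dir, target):
    """Return (child, extra_cpus) for children not contained in target."""
    wanted = parse_cpuset(target)
    blockers = []
    for entry in sorted(os.listdir(container_dir)):
        cpus_file = os.path.join(container_dir, entry, "cpuset.cpus")
        # Only child cgroups are directories that hold their own cpuset.cpus.
        if os.path.isfile(cpus_file):
            with open(cpus_file) as f:
                child = parse_cpuset(f.read())
            if not child <= wanted:
                blockers.append((entry, sorted(child - wanted)))
    return blockers


if __name__ == "__main__":
    # Simulated layout: a container cgroup with a nested "kube-proxy" child
    # that still spans all 16 CPUs (an assumption about the real node).
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "kube-proxy"))
    with open(os.path.join(root, "kube-proxy", "cpuset.cpus"), "w") as f:
        f.write("0-15\n")
    print(blocking_children(root, "0-1,4-15"))
```

On a real node, `container_dir` would be the container directory shown in the paths above and `target` the cpuset from the kubelet log line. A non-empty result would suggest that the child group has to be narrowed, or removed, before the parent can be updated.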
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:“1”, Minor:“11”, GitVersion:“v1.11.0”, GitCommit:“91e7b4fd31fcd3d5f436da26c980becec37ceefe”, GitTreeState:“clean”, BuildDate:“2018-06-27T20:17:28Z”, GoVersion:“go1.10.2”, Compiler:“gc”, Platform:“linux/amd64”}
Server Version: version.Info{Major:“1”, Minor:“11”, GitVersion:“v1.11.0”, GitCommit:“91e7b4fd31fcd3d5f436da26c980becec37ceefe”, GitTreeState:“clean”, BuildDate:“2018-06-27T20:08:34Z”, GoVersion:“go1.10.2”, Compiler:“gc”, Platform:“linux/amd64”}
- Cloud provider or hardware configuration: Self managed cluster, consisting of 8 x HP ProLiant SL210t Gen8 servers
- OS (e.g. from /etc/os-release): Ubuntu 18.04 LTS (Bionic Beaver)
- Kernel (e.g. uname -a):
4.15.0-24-generic #26-Ubuntu SMP Wed Jun 13 08:44:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- Install tools: kubeadm
- Others: N/A
About this issue
- State: closed
- Created 6 years ago
- Comments: 23 (13 by maintainers)
Commits related to this issue
- Disable resource containers for kube-proxy This disables resource containers for kube-proxy since that feature is not needed and creates subsequent issues when the cpu manager static policy is enable... — committed to MarioCarrilloA/config by jimgauld 5 years ago
- Disable resource containers for kube-proxy This disables resource containers for kube-proxy since that feature is not needed and creates subsequent issues when the cpu manager static policy is enable... — committed to starlingx-staging/puppet by jimgauld 5 years ago
After adding the --resource-container="" flag to kube-proxy, I’m no longer seeing the reported error, however I still see these:
Though I believe this is a separate issue https://github.com/kubernetes/kubernetes/issues/54967, @P1ng-W1n any ideas?
@dannyk81 - Correct, that’s a separate issue (kube-proxy seems to have quite a lot of them lately 😃)