kind: CrashLoopBackOff Error in kube-proxy with kernel versions 5.12.2.arch1-1 and 5.10.35-1-lts
What happened: After creating the cluster with `kind create cluster`, the kube-proxy pod goes into CrashLoopBackOff. This happens with kernel versions 5.12.2.arch1-1 and 5.10.35-1-lts. With kernel versions 5.12.1.arch1-1 and 5.10.34-1-lts I didn’t have the issue.
What you expected to happen: All pods in the cluster should start without problems.
How to reproduce it (as minimally and precisely as possible): On an Arch Linux install with kernel version 5.12.2.arch1-1 or 5.10.35-1-lts and Docker installed, download the latest version of kind and run `kind create cluster`.
Anything else we need to know?:
- Log of kube-proxy pod:
I0511 11:47:28.906526 1 node.go:172] Successfully retrieved node IP: 172.18.0.2
I0511 11:47:28.906613 1 server_others.go:142] kube-proxy node IP is an IPv4 address (172.18.0.2), assume IPv4 operation
I0511 11:47:28.953210 1 server_others.go:185] Using iptables Proxier.
I0511 11:47:28.953346 1 server_others.go:192] creating dualStackProxier for iptables.
W0511 11:47:28.960804 1 server_others.go:492] detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for I
I0511 11:47:28.962804 1 server.go:650] Version: v1.20.2
I0511 11:47:28.965997 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
F0511 11:47:28.966114 1 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied
- Events from pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 48s default-scheduler Successfully assigned kube-system/kube-proxy-s7w5w to kind-control-plane
Normal Pulled 2s (x4 over 48s) kubelet Container image "k8s.gcr.io/kube-proxy:v1.20.2" already present on machine
Normal Created 2s (x4 over 45s) kubelet Created container kube-proxy
Normal Started 2s (x4 over 45s) kubelet Started container kube-proxy
Warning BackOff 1s (x5 over 42s) kubelet Back-off restarting failed container
- tried it with iptables and nftables, same result with both.
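To pull the decisive line out of a crash-looping kube-proxy pod, a small filter like the one below may help. It is only a sketch: `fatal_lines` is a hypothetical helper name, and it assumes the klog convention visible in the log above, where fatal messages start with `F` followed by the month and day.

```shell
# Keep only klog fatal lines (severity letter F followed by MMDD).
# Reads log text on stdin and prints the matching lines.
fatal_lines() {
  grep -E '^F[0-9]{4} '
}

# Example invocation (needs a running cluster; pod name taken from the events above):
#   kubectl -n kube-system logs --previous kube-proxy-s7w5w | fatal_lines
```

The `--previous` flag asks kubectl for the log of the last terminated container, which is usually the interesting one when a pod is in CrashLoopBackOff.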
Environment:
- kind version: (use `kind version`): Tested both:
  - v0.11.0-alpha+1d4788dd7461b3 go1.16.4
  - v0.10.0 go1.16.4
- Kubernetes version: (use `kubectl version`):
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"archive", BuildDate:"2021-04-09T16:47:30Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-03-11T06:23:38Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
- Docker version: (use `docker info`):
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.5.1-tp-docker)
Server:
Containers: 12
Running: 1
Paused: 0
Stopped: 11
Images: 8
Server Version: 20.10.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8c906ff108ac28da23f69cc7b74f8e7a470d1df0.m
runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
init version: de40ad0
Security Options:
seccomp
Profile: default
cgroupns
Kernel Version: 5.10.35-1-lts
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.666GiB
Name: avocado
ID: ZNGF:FTZV:6BK6:VPE3:ZGAR:A5A2:VYEI:LUQE:AEU6:6MHN:ZGTZ:WR2V
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
- OS (e.g. from `/etc/os-release`):
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
Kernel: 5.10.35-1-lts
CPU: Intel i5-7200U (4) @ 3.100GHz
- iptables version: v1.8.7 (legacy)
- nftables version: v0.9.8 (E.D.S.)
About this issue
- State: closed
- Created 3 years ago
- Reactions: 6
- Comments: 20 (6 by maintainers)
Commits related to this issue
- Bump kind version (#380) It contains a fix for https://github.com/kubernetes-sigs/kind/issues/2240 We've hit when running GitHub actions https://github.com/actions/virtual-environments/issues/3673 — committed to capactio/capact by lukaszo 3 years ago
- old-operator.yaml: update ci - use a ubuntu 20.04 - set nf_conntrack_max to avoid CrashLoopBackOff for kube proxy (see https://github.com/kubernetes-sigs/kind/issues/2240#issuecomment-838510890) - pr... — committed to matt-mazzucato/astarte-kubernetes-operator by matt-mazzucato 3 years ago
- test.yaml: update ci - use ubuntu 20.04 - set nf_conntrack_max to avoid CrashLoopBackOff for kube proxy (see https://github.com/kubernetes-sigs/kind/issues/2240#issuecomment-838510890) - print cluste... — committed to matt-mazzucato/astarte-kubernetes-operator by matt-mazzucato 3 years ago
- helm.yaml: update ci - use ubuntu 20.04 - set nf_conntrack_max to avoid CrashLoopBackOff for kube proxy (see https://github.com/kubernetes-sigs/kind/issues/2240#issuecomment-838510890) - print cluste... — committed to matt-mazzucato/astarte-kubernetes-operator by matt-mazzucato 3 years ago
- Makefile: fix e2e tests We seem to be running into https://github.com/kubernetes-sigs/kind/issues/2240: kube-proxy is crashlooping, which in turn causes CoreDNS to fail to connect to the API server o... — committed to squat/kilo by squat 3 years ago
- [e2e] Add workaround for issue https://github.com/kubernetes-sigs/kind/issues/2240 This workaround has been added in GitHub Action Workflow config and the workaround command sysctl works only in Linu... — committed to karuppiah7890/community-edition-bak by karuppiah7890 3 years ago
- [e2e] Add workaround for issue https://github.com/kubernetes-sigs/kind/issues/2240 Fixes #1014 This workaround has been added in GitHub Action Workflow config and the workaround command sysctl works... — committed to karuppiah7890/community-edition-bak by karuppiah7890 3 years ago
- Tell kube-proxy not to try to set nf_conntrack_max For kube-proxy not becoming ready, like this: semaphore@semaphore-vm:~$ kubectl logs kube-proxy-42v55 -n kube-system I0727 19:55:26.230888 ... — committed to projectcalico/node by deleted user 3 years ago
- Bump kind to v0.11.1 (#313) Resolves an crash issue on linux machines noted here: https://github.com/kubernetes-sigs/kind/issues/2240 Co-authored-by: Ken Sipe <kensipe@gmail.com> — committed to kudobuilder/kuttl by croomes 3 years ago
- Inject sysctl changing nf_conntrack_max to 131072. This addresses https://github.com/SovereignCloudStack/k8s-cluster-api-provider/issues/18 https://github.com/kubernetes-sigs/kind/issues/2240 Signed... — committed to SovereignCloudStack/k8s-cluster-api-provider by garloff 3 years ago
- Fix/conntrack sysctl2 (#20) * Inject sysctl changing nf_conntrack_max to 131072. This addresses https://github.com/SovereignCloudStack/k8s-cluster-api-provider/issues/18 https://github.com/kuber... — committed to SovereignCloudStack/k8s-cluster-api-provider by garloff 3 years ago
- k8s: bump up kind version to v0.11.1 We also update the base image version to v1.21.1. https://github.com/kubernetes-sigs/kind/issues/2240 Signed-off-by: Hajime Tazaki <thehajime@gmail.com> — committed to ukontainer/runu by thehajime 3 years ago
I’m getting the same results with 5.12.2-arch1-1.
Quick workaround if a cluster is needed fast: manually set the parameter with `sudo sysctl net/netfilter/nf_conntrack_max=131072` before creating the Kind cluster.
Thanks for your response, Ben. On further investigation, the old Kind executable was taking precedence in the PATH on that particular environment. After removing it there were no issues; the cluster is up and running as expected. Kind 0.11.1 with node image 1.20.7 works without the additional settings.
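The sysctl workaround quoted in this thread can be wrapped so that it only touches the host when needed. This is a sketch, not an official fix: `needs_update` and `apply_workaround` are hypothetical helper names, and the target value 131072 is simply the number kube-proxy tried to set in the log of this issue. Presumably the workaround works because kube-proxy then finds the value already in place and does not attempt the write.

```shell
# Value kube-proxy tried to write, taken from the log in this issue.
TARGET=131072
CONNTRACK_FILE=/proc/sys/net/netfilter/nf_conntrack_max

# needs_update CURRENT TARGET: succeeds when the two values differ.
needs_update() {
  [ "$1" -ne "$2" ]
}

# Set nf_conntrack_max on the host if it does not already match TARGET.
# Requires root via sudo; fails if the conntrack sysctl file is not readable.
apply_workaround() {
  current=$(cat "$CONNTRACK_FILE") || return 1
  if needs_update "$current" "$TARGET"; then
    sudo sysctl "net/netfilter/nf_conntrack_max=$TARGET"
  fi
}
```

A possible use is `apply_workaround && kind create cluster`. Note that this overwrites a higher host value too, just as the one-line workaround does.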
@yharish991 run `brew upgrade kind`, which will upgrade your kind version to 0.11.1 and fix the issue.
For anyone testing stuff with old releases of Kubernetes using `kind`: since the fix for the `nf_conntrack_max` kernel change, you can work around the issue by using the latest version of `kind`. As of 9 July 2021, that’s kind v0.11.1. Then, look at the Kubernetes images built for that kind version on the GitHub Releases page. For example:
thanks all, #2241 should be in shortly, and since we’re quite overdue for a release it should be released soon.
How do I fix this issue on macOS?
Man! Thanks! That worked, I should have thought of that myself.
@hyutota @BenTheElder I don’t think this is an Arch Linux-only issue.
According to the changelog of Linux 5.12.2, this commit (torvalds/linux@671c54ea8c7ff47bd88444f3fffb65bf9799ce43) changed the behaviour of netfilter conntrack. I believe this is the commit that caused this issue after upgrading to Linux 5.12.2.
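To check quickly whether a host kernel matches the versions reported in this issue, something like the following could be used. It is a sketch that encodes only the data points from this thread (5.12.2 and 5.10.35 failing, 5.12.1 and 5.10.34 working); other kernel series may have received the same change, so they are reported as `unknown`. The function names are hypothetical, and `sort -V` assumes GNU coreutils.

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_status RELEASE: prints reported-failing, reported-ok, or unknown.
# RELEASE is a string such as the output of `uname -r`.
kernel_status() {
  # Keep only the leading numeric part, e.g. "5.10.35-1-lts" -> "5.10.35".
  v=$(printf '%s\n' "$1" | sed -E 's/^([0-9]+(\.[0-9]+){1,2}).*/\1/')
  case "$v" in
    5.10.*) threshold=5.10.35 ;;
    5.12.*) threshold=5.12.2 ;;
    *) echo unknown; return 0 ;;
  esac
  if version_ge "$v" "$threshold"; then
    echo reported-failing
  else
    echo reported-ok
  fi
}

# Example invocation:
#   kernel_status "$(uname -r)"
```

A `reported-failing` result only means the kernel is at or past the first version seen failing here with older kind releases; kind v0.11.1 is reported in this thread to work on such kernels.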