kubernetes: DualStack: Fail to deploy dualstack cluster, kube-proxy panics
What happened: I deployed a dual-stack cluster with a config file. First, kube-controller-manager went into CrashLoopBackOff because it adds a default option `--node-cidr-mask-size=24`; I deleted it from `/etc/kubernetes/manifests/kube-controller-manager.yaml`. I think that in dual-stack mode kube-controller-manager should ignore `--node-cidr-mask-size`. Then kube-proxy went into CrashLoopBackOff:

```
[root@master ~]# kubectl logs -f kube-proxy-jpnl6 -n kube-system
I0102 09:57:44.553192 1 node.go:135] Successfully retrieved node IP: 172.18.130.251
I0102 09:57:44.553270 1 server_others.go:172] Using ipvs Proxier.
I0102 09:57:44.553287 1 server_others.go:174] creating dualStackProxier for ipvs.
W0102 09:57:44.555671 1 proxier.go:420] IPVS scheduler not specified, use rr by default
W0102 09:57:44.556213 1 proxier.go:420] IPVS scheduler not specified, use rr by default
W0102 09:57:44.556278 1 ipset.go:107] ipset name truncated; [KUBE-6-LOAD-BALANCER-SOURCE-CIDR] -> [KUBE-6-LOAD-BALANCER-SOURCE-CID]
W0102 09:57:44.556303 1 ipset.go:107] ipset name truncated; [KUBE-6-NODE-PORT-LOCAL-SCTP-HASH] -> [KUBE-6-NODE-PORT-LOCAL-SCTP-HAS]
I0102 09:57:44.556606 1 server.go:571] Version: v1.17.0
I0102 09:57:44.557622 1 config.go:313] Starting service config controller
I0102 09:57:44.557654 1 shared_informer.go:197] Waiting for caches to sync for service config
I0102 09:57:44.557717 1 config.go:131] Starting endpoints config controller
I0102 09:57:44.557753 1 shared_informer.go:197] Waiting for caches to sync for endpoints config
W0102 09:57:44.560310 1 meta_proxier.go:106] failed to add endpoints kube-system/kube-scheduler with error failed to identify ipfamily for endpoints (no subsets)
W0102 09:57:44.560337 1 meta_proxier.go:106] failed to add endpoints kube-system/kube-dns with error failed to identify ipfamily for endpoints (no subsets)
W0102 09:57:44.560428 1 meta_proxier.go:106] failed to add endpoints kube-system/kube-controller-manager with error failed to identify ipfamily for endpoints (no subsets)
E0102 09:57:44.560646 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 29 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1682120, 0x27f9a40)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x1682120, 0x27f9a40)
    /usr/local/go/src/runtime/panic.go:679 +0x1b2
k8s.io/kubernetes/pkg/proxy/ipvs.(*metaProxier).OnServiceAdd(0xc0003ba330, 0xc0001c3200)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/proxy/ipvs/meta_proxier.go:65 +0x2b
k8s.io/kubernetes/pkg/proxy/config.(*ServiceConfig).handleAddService(0xc0003352c0, 0x1869ac0, 0xc0001c3200)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/proxy/config/config.go:333 +0x82
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(…)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/controller.go:198
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0xf, 0xc00031a1c0, 0x0)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/shared_informer.go:658 +0x218
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000594dd8, 0xc000557610, 0xf)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:292 +0x51
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1()
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/shared_informer.go:652 +0x79
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00046b740)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000594f40, 0xdf8475800, 0x0, 0xc000686601, 0xc00009a240)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(…)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc000478100)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/shared_informer.go:650 +0x9b
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc0003be840, 0xc000428580)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x59
created by k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x14be59b]

goroutine 29 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x105
panic(0x1682120, 0x27f9a40)
    /usr/local/go/src/runtime/panic.go:679 +0x1b2
k8s.io/kubernetes/pkg/proxy/ipvs.(*metaProxier).OnServiceAdd(0xc0003ba330, 0xc0001c3200)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/proxy/ipvs/meta_proxier.go:65 +0x2b
k8s.io/kubernetes/pkg/proxy/config.(*ServiceConfig).handleAddService(0xc0003352c0, 0x1869ac0, 0xc0001c3200)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/proxy/config/config.go:333 +0x82
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(…)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/controller.go:198
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0xf, 0xc00031a1c0, 0x0)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/shared_informer.go:658 +0x218
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000594dd8, 0xc000557610, 0xf)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:292 +0x51
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1()
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/shared_informer.go:652 +0x79
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00046b740)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000594f40, 0xdf8475800, 0x0, 0xc000686601, 0xc00009a240)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(…)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc000478100)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/shared_informer.go:650 +0x9b
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc0003be840, 0xc000428580)
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x59
created by k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
    /workspace/anago-v1.17.0-rc.2.10+70132b0f130acc/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62
```
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:12:17Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
- OS (e.g: `cat /etc/os-release`): CentOS Linux release 7.7.1908 (Core)
- Kernel (e.g. `uname -a`): Linux master 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Network plugin and version (if this is a network-related bug):
- kubeadm init config file: kubeadm-conf.txt
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 4
- Comments: 46 (31 by maintainers)
@aojea
Migrating ipv4 -> dual-stack can be reduced to enabling dual-stack in k8s >=v1.17.0. The upgrade of an ipv4 cluster to >=v1.17.0 must work, so that is a non-issue. Once you are on k8s >=v1.17.0, I think the best way is to first enable dual-stack on the master(s), updating CIDRs etc., and let the workers stay with IPv6DualStack:false. Then reboot them with IPv6DualStack:true one by one.
Then the case is the reverse as commented above https://github.com/kubernetes/kubernetes/issues/86773#issuecomment-570521112.
But this has to be discussed some place else 😃
The reason for the panic is not hard to see;
https://github.com/kubernetes/kubernetes/blob/65ef5dcc513ccfd60436bf4d04652224c9b6036f/pkg/proxy/ipvs/meta_proxier.go#L64-L66
There is no check for `nil`. The reason why IPFamily is nil is less clear. I tried to set `IPv6DualStack:false` for the "master" K8s processes but keep `IPv6DualStack:true` on `kube-proxy`, and I got exactly the panic described in this issue. So I think the problem is cluster misconfiguration.
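To make that concrete, here is a small, self-contained sketch in plain Go (not the actual kube-proxy source; the type and constant names only mimic the core/v1 API) of why dereferencing an unset IPFamily pointer panics, and what a nil guard at that spot would look like:

```go
package main

import "fmt"

// Stand-ins for the real k8s.io/api/core/v1 types; used here only to
// illustrate the nil dereference reported at meta_proxier.go:65.
type IPFamily string

const IPv4Protocol IPFamily = "IPv4"

type ServiceSpec struct {
	IPFamily *IPFamily // nil when the API server never defaulted the field
}

// dispatch mimics the family-based routing in metaProxier.OnServiceAdd:
// without the nil check, evaluating *spec.IPFamily is exactly the
// "invalid memory address or nil pointer dereference" seen in the logs.
func dispatch(spec ServiceSpec) {
	if spec.IPFamily == nil {
		fmt.Println("IPFamily is nil: dual-stack is only half-enabled in this cluster")
		return
	}
	if *spec.IPFamily == IPv4Protocol {
		fmt.Println("hand the Service to the IPv4 proxier")
		return
	}
	fmt.Println("hand the Service to the IPv6 proxier")
}

func main() {
	dispatch(ServiceSpec{}) // would panic without the guard above
	f := IPv4Protocol
	dispatch(ServiceSpec{IPFamily: &f}) // normal dual-stack case
}
```

Whether kube-proxy should merely log and skip such a Service instead of crashing is the trade-off discussed next.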
I am unsure if the panic is acceptable. The error indication could be better, of course, but IMHO `kube-proxy` shall not "help" the user in this case by making some assumption of ipv4, for instance. That would hide a serious misconfiguration.

Same error. kube-proxy is v1.17.0, using `mode: ipvs` for dual-stack.
Referring to the validation guide (https://kubernetes.io/docs/tasks/network/validate-dual-stack/#validate-pod-addressing), pods, nodes, and services work well, but the same errors appear in the kube-proxy logs.
Is there some way to specify the ipfamily? Is this error caused by "W0304 03:43:47.485272 1 proxier.go:420] IPVS scheduler not specified, use rr by default"?
@Richard87
that’s the 1M dollar question 😉 https://github.com/kubernetes/kubernetes/pull/86895
seems we are getting closer to solving this
Because it is a configuration error. Since the user has enabled the feature-gate halfway, he/she expects dual-stack to work, but it can't. If this faulty configuration is just accepted, this issue will be the first in an endless stream of duplicates.
An unspecified family will be set to the "main" family of the cluster (which may be ipv6) by the master processes (api-server?) when the feature-gate is enabled, which ensures backward compatibility. But the decision about which family to use is made by the master, not by kube-proxy.
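For anyone who wants to check their own cluster, a rough illustrative sketch is below: it lists Services and reports any whose `spec.ipFamily` was never set, which is the condition that trips this kube-proxy. It assumes the v1.17-era client-go and core/v1 API (context-free `List`, single `IPFamily` pointer field); inspecting `kubectl get svc --all-namespaces -o yaml` for `ipFamily` gives the same answer.

```go
package main

import (
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (path is just an example).
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("HOME")+"/.kube/config")
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// List all Services; with client-go v0.17.x, List takes only ListOptions.
	svcs, err := client.CoreV1().Services(metav1.NamespaceAll).List(metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	// A nil spec.ipFamily means the API server never defaulted the field,
	// i.e. the dual-stack feature gate is not (fully) enabled server-side.
	for _, svc := range svcs.Items {
		if svc.Spec.IPFamily == nil {
			fmt.Printf("%s/%s has no ipFamily set\n", svc.Namespace, svc.Name)
		} else {
			fmt.Printf("%s/%s -> %s\n", svc.Namespace, svc.Name, *svc.Spec.IPFamily)
		}
	}
}
```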
You seem to be using a pre-release version of kube-proxy (v1.17-rc.2.10+70132b0f130acc); try v1.17.0.
If v1.17.0 also does not work, try using `mode: iptables` instead of IPVS.

@kubernetes/sig-network-bugs