kubernetes: gce-1.9-1.8-downgrade fails due to etcd crashlooping
/kind bug
/priority failing-test
/sig cluster-lifecycle
/sig testing
This test fails: https://k8s-testgrid.appspot.com/sig-release-1.9-all#gce-1.9-1.8-downgrade
Some clues:
- waiting for new master timeout: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-beta-stable1-downgrade-cluster/33/build-log.txt
== Waiting for new master to respond to API requests ==
W1204 16:55:41.879] 2017/12/04 16:54:57 util.go:196: Interrupt after 15h0m0s timeout during kubetest --test --test_args=--ginkgo.focus=\[Feature:ClusterDowngrade\] --upgrade-target=ci/k8s-stable1 --upgrade-image=gci --report-dir=/workspace/_artifacts --disable-log-dump=true --report-prefix=upgrade --v=true --check-version-skew=false. Will terminate in another 15m
- master healthz check failed: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-beta-stable1-downgrade-cluster/33/artifacts/bootstrap-e2e-master/kube-apiserver.log
logging error output: "[+]ping ok
[-]etcd failed: reason withheld
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[-]poststarthook/bootstrap-controller failed: reason withheld
[-]poststarthook/rbac/bootstrap-roles failed: reason withheld
[-]poststarthook/ca-registration failed: reason withheld
[+]poststarthook/start-kube-apiserver-informers ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[-]autoregister-completion failed: reason withheld
healthz check failed
"
[[curl/7.38.0] 35.184.43.102:54282]
- etcd is crashlooping: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-beta-stable1-downgrade-cluster/33/artifacts/bootstrap-e2e-master/etcd.log
2017-12-04 02:24:01.495921 I | etcdserver: starting server... [version: 3.0.17, cluster version: to_be_decided]
2017-12-04 02:24:01.506968 I | membership: added member a97d80ddc090126a [https://bootstrap-e2e-master:2380] to cluster 91078bdbe2ed539b
2017-12-04 02:24:01.507103 N | membership: set the initial cluster version to 3.1
2017-12-04 02:24:01.507116 C | membership: cluster cannot be downgraded (current version: 3.0.17 is lower than determined cluster version: 3.1).
I suspect the cause is an etcd version-compatibility problem: the data directory's cluster version has already been raised to 3.1, so the downgraded etcd 3.0.17 server refuses to start against it. @xiang90 @hongchaodeng any thoughts or suggestions? Thanks!
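For context, the mismatch can be seen directly from etcd's `/version` endpoint before attempting the downgrade. A minimal check, assuming etcd listens without TLS on the master's localhost:2379 (the default GCE master layout of that era):

```
# Query the running etcd server for its own version and the persisted cluster version.
# On a 1.9 master this would show something like {"etcdserver":"3.1.10","etcdcluster":"3.1.0"};
# a 3.0.17 server cannot open a data directory whose cluster version is already 3.1,
# which matches the fatal "cluster cannot be downgraded" line in the log above.
curl -s http://127.0.0.1:2379/version
```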
This is also posted in https://groups.google.com/forum/#!topic/kubernetes-sig-cluster-lifecycle/c4MW3R5v4v0
/cc @enisoc @luxas @krzyzacy @kubernetes/sig-cluster-lifecycle-test-failures @kubernetes/sig-release-test-failures
About this issue
- State: closed
- Created 7 years ago
- Comments: 23 (23 by maintainers)
I manually verified that the following works as expected to downgrade from 1.9 to 1.8:
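(The exact commands are not preserved in this excerpt. As a hypothetical illustration of the approach only, pinning etcd so the downgraded 1.8 master keeps running a 3.1 server instead of falling back to 3.0.17; the variable names and flags below are assumptions about how cluster/gce/upgrade.sh and the GCE configure scripts consume the etcd version, not the verified invocation:)

```
# Hypothetical sketch: keep etcd at 3.1.x across the master downgrade so the
# 3.1-formatted data directory is never opened by a 3.0.17 server.
export ETCD_IMAGE=3.1.10      # assumed variable name; check cluster/gce for the real one
export ETCD_VERSION=3.1.10    # assumed variable name
cluster/gce/upgrade.sh -M v1.8.5   # -M (master only) and the target release are assumed/example values
```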
I’ll work on adding a prompt to gce/upgrade.sh if these are unset. (A rough sketch of what that guard could look like is at the end of this section.)

@xiangpengzhao great question. We would probably need to set up tests to actually verify it, but I really believe it should work fine (k8s 1.8 already ships the 3.1.10 etcd client). Also, we don’t really have much choice here; that’s the only thing we can do/recommend doing.
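For reference, a rough sketch of the kind of prompt mentioned above, assuming the variables are named ETCD_VERSION and ETCD_IMAGE (the actual names and wording in cluster/gce/upgrade.sh may differ):

```
# Hypothetical guard: warn and prompt before a master downgrade if the etcd
# version/image are not pinned, since the script's default may be older than
# the cluster version persisted in the etcd data directory.
if [[ -z "${ETCD_VERSION:-}" || -z "${ETCD_IMAGE:-}" ]]; then
  echo "ETCD_VERSION and/or ETCD_IMAGE are unset; a downgrade may start an etcd" >&2
  echo "server older than the existing cluster version and crashloop." >&2
  read -r -p "Continue anyway? [y/N] " reply
  if [[ ! "${reply}" =~ ^[yY]$ ]]; then
    exit 1
  fi
fi
```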