cluster-api: CI failure: capi-e2e-release-1-2-1-22-1-23 and capi-e2e-release-1-2-1-23-1-24 failing consistently
The capi-e2e-release-1-2-1-22-1-23 and capi-e2e-release-1-2-1-23-1-24 jobs, which run on the release-1.2 branch, have been failing consistently since December 8th and 9th respectively.
Both jobs (prow logs: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/periodic-cluster-api-e2e-workload-upgrade-1-22-1-23-release-1-2/1602416045765693440 & https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/periodic-cluster-api-e2e-workload-upgrade-1-23-1-24-release-1-2/1602550432566087680) are failing with the exact same error message:
When upgrading a workload cluster using ClusterClass and testing K8S conformance [Conformance] [K8s-Upgrade] [ClusterClass]
/home/prow/go/src/sigs.k8s.io/cluster-api/test/e2e/cluster_upgrade_test.go:29
Should create and upgrade a workload cluster and eventually run kubetest [It]
/home/prow/go/src/sigs.k8s.io/cluster-api/test/e2e/cluster_upgrade.go:118
Timed out after 1200.003s.
Expected
<bool>: false
to be true
/home/prow/go/src/sigs.k8s.io/cluster-api/test/framework/daemonset_helpers.go:66
Full Stack Trace
sigs.k8s.io/cluster-api/test/framework.WaitForKubeProxyUpgrade({0x252c150?, 0xc0004b2f40}, {{0x7f4de407bce0?, 0xc00040c7e0?}, {0xc00005810e?, 0xc0018d79e0?}}, {0xc0019aa1c0, 0x2, 0x2})
/home/prow/go/src/sigs.k8s.io/cluster-api/test/framework/daemonset_helpers.go:66 +0x4ca
sigs.k8s.io/cluster-api/test/framework.UpgradeClusterTopologyAndWaitForUpgrade({0x252c150?, 0xc0004b2f40}, {{0x2537c58, 0xc00196a980}, 0xc0020f2700, 0xc000744c00, {0xc000054158, 0x7}, {0xc00005840b, 0x6}, ...})
/home/prow/go/src/sigs.k8s.io/cluster-api/test/framework/cluster_topology_helpers.go:126 +0x918
sigs.k8s.io/cluster-api/test/e2e.ClusterUpgradeConformanceSpec.func2()
/home/prow/go/src/sigs.k8s.io/cluster-api/test/e2e/cluster_upgrade.go:145 +0x9ac
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runSync(0x7f4dd4fbcd98?)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/leafnodes/runner.go:113 +0xb1
github.com/onsi/ginkgo/internal/leafnodes.(*runner).run(0x0?)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/leafnodes/runner.go:64 +0x125
github.com/onsi/ginkgo/internal/leafnodes.(*ItNode).Run(0x0?)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/leafnodes/it_node.go:26 +0x7b
github.com/onsi/ginkgo/internal/spec.(*Spec).runSample(0xc0000290e0, 0xc001d49998?, {0x250e4c0, 0xc000066900})
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/spec/spec.go:215 +0x28a
github.com/onsi/ginkgo/internal/spec.(*Spec).Run(0xc0000290e0, {0x250e4c0, 0xc000066900})
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/spec/spec.go:138 +0xe7
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpec(0xc0003329a0, 0xc0000290e0)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/specrunner/spec_runner.go:200 +0xe8
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpecs(0xc0003329a0)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/specrunner/spec_runner.go:170 +0x1a5
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).Run(0xc0003329a0)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/specrunner/spec_runner.go:66 +0xc5
github.com/onsi/ginkgo/internal/suite.(*Suite).Run(0xc000198af0, {0x7f4dd46899b8, 0xc000682680}, {0x21c41fb, 0x8}, {0xc0005265c0, 0x2, 0x2}, {0x252db18, 0xc000066900}, ...)
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/suite/suite.go:79 +0x4d2
github.com/onsi/ginkgo.runSpecsWithCustomReporters({0x25111a0?, 0xc000682680}, {0x21c41fb, 0x8}, {0xc0005265a0, 0x2, 0x21e2821?})
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/ginkgo_dsl.go:245 +0x189
github.com/onsi/ginkgo.RunSpecsWithDefaultAndCustomReporters({0x25111a0, 0xc000682680}, {0x21c41fb, 0x8}, {0xc00009df10, 0x1, 0x1})
/home/prow/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/ginkgo_dsl.go:228 +0x1b6
sigs.k8s.io/cluster-api/test/e2e.TestE2E(0x0?)
/home/prow/go/src/sigs.k8s.io/cluster-api/test/e2e/e2e_suite_test.go:109 +0x232
testing.tRunner(0xc000682680, 0x22de800)
/usr/local/go/src/testing/testing.go:1439 +0x102
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:1486 +0x35f
------------------------------
STEP: Dumping logs from the bootstrap cluster
Failed to get logs for the bootstrap cluster node test-xbf4hn-control-plane: exit status 2
STEP: Tearing down the management cluster
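For context, the assertion that times out at test/framework/daemonset_helpers.go:66 is the kube-proxy upgrade check. Below is a minimal sketch of what that wait does, assuming the usual Gomega/controller-runtime style of the e2e framework; the input type, field names, and the expected registry prefix are assumptions for illustration, not copied from the release-1.2 source:

```go
package framework

import (
	"context"
	"fmt"

	. "github.com/onsi/gomega"
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// WaitForKubeProxyUpgradeInput is illustrative; field names are assumptions.
type WaitForKubeProxyUpgradeInput struct {
	Getter            client.Client
	KubernetesVersion string
}

// WaitForKubeProxyUpgrade polls the kube-proxy DaemonSet in kube-system until
// its container image matches the target Kubernetes version.
func WaitForKubeProxyUpgrade(ctx context.Context, input WaitForKubeProxyUpgradeInput, intervals ...interface{}) {
	// The registry prefix here is an assumption for the sketch.
	wantImage := fmt.Sprintf("registry.k8s.io/kube-proxy:%s", input.KubernetesVersion)
	Eventually(func() (bool, error) {
		ds := &appsv1.DaemonSet{}
		key := client.ObjectKey{Namespace: metav1.NamespaceSystem, Name: "kube-proxy"}
		if err := input.Getter.Get(ctx, key, ds); err != nil {
			return false, err
		}
		// If the observed image never matches the expected one, this keeps
		// returning false until the timeout, producing exactly the failure
		// above: "Expected <bool>: false to be true".
		return ds.Spec.Template.Spec.Containers[0].Image == wantImage, nil
	}, intervals...).Should(BeTrue())
}
```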
This could be related to the recent changes to the e2e framework.
Environment:
- Cluster-api version:
- minikube/kind version:
- Kubernetes version: (use kubectl version):
- OS (e.g. from /etc/os-release):
/kind bug
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 15 (15 by maintainers)
https://github.com/kubernetes/test-infra/pull/28243 landed and the CI signal looks good, I think this issue can be closed.
/close
Let’s keep this open until we get a signal with the new kubekins images after the revert https://github.com/kubernetes/test-infra/pull/28243 lands.
This issue is now resolved. TL;DR: a registry change fix was missing from the release-1.2 branch. The issue was fixed by merging this cherry-pick: https://github.com/kubernetes-sigs/cluster-api/pull/7505.
More details can be found in the CAPI slack discussion.
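For anyone hitting the same symptom, a minimal sketch of the kind of change the cherry-pick carries is below. This is illustrative only (expectedKubeProxyImages is a hypothetical helper, not the actual diff; see the linked PR): Kubernetes images moved from k8s.gcr.io to registry.k8s.io, so an expected-image comparison pinned to the old registry can never match newer patch releases.

```go
package framework

import "fmt"

// expectedKubeProxyImages is a hypothetical helper: because Kubernetes images
// moved from k8s.gcr.io to registry.k8s.io, an upgrade check pinned to a
// single registry can stop matching. Accepting both prefixes keeps the
// comparison working for releases published on either side of the move.
func expectedKubeProxyImages(kubernetesVersion string) []string {
	return []string{
		fmt.Sprintf("registry.k8s.io/kube-proxy:%s", kubernetesVersion),
		fmt.Sprintf("k8s.gcr.io/kube-proxy:%s", kubernetesVersion),
	}
}
```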
I’ve not been able to replicate this failure locally - both upgrade jobs are running successfully on a local CAPD testbed.
Looking again at the failures, it seems clear this isn't an issue in the CAPI code itself: the code didn't change, yet the tests started failing consistently 5 days ago.
However, this change updating the kubekins image used in the CAPI test jobs (https://github.com/kubernetes/test-infra/commit/64d7bee707522937a12673c173fb6048cb2aa802) was merged 5 days ago, so the most likely candidate is some change in that image.
Issue opened in the test-infra repo: https://github.com/kubernetes/test-infra/issues/28233