calico: Cannot upgrade to v3.23.2
With a Typha-based setup, upgrading Calico from 3.22.1 to 3.23.2 is not possible (at least not without degrading the CNI).
Expected Behavior
- Upgrade everything except calico-node to 3.23.2
- Cluster is healthy and calico-node 3.22.1 continues to function
Current Behavior
- Upgrade everything except calico-node (crds, calico-typha, etc) to 3.23.2
- Calico-node goes unready
- Networking for newly created pods is not functional
Speculation
Looking into this a bit, part of the problem seems to be incompatible changes in libcalico-go, such that typha and calico-node have different views of the world and of what is valid. For example, here are some log lines from calico-node 3.22.1 after the calico-typha upgrade:
2022-06-28 04:19:15.280 [ERROR][133] felix/sync_proto.go 302: BUG: cannot parse key. key="/calico/resources/v3/projectcalico.org/kubernetesendpointslices/kube-applier-8t5xd"
This looks like a potential bug where endpoint slices are no longer being treated as namespaced, potentially fixed with the following:
diff --git a/libcalico-go/lib/namespace/resource.go b/libcalico-go/lib/namespace/resource.go
index ebe909024..79e1fade8 100644
--- a/libcalico-go/lib/namespace/resource.go
+++ b/libcalico-go/lib/namespace/resource.go
@@ -39,6 +39,7 @@ func IsNamespaced(kind string) bool {
 	case KindKubernetesEndpointSlice:
 		// KindKubernetesEndpointSlice is a special-case resource. We don't expose it over the
 		// v3 API, but it is used in the felix syncer.
+		return true
 	case KindKubernetesService:
 		return true
 	}
However, it also looks like KubernetesServices were changed to be namespaced, so they are also not parsed by calico-node 3.22.1 when it talks to typha 3.23.2: https://github.com/projectcalico/calico/pull/5813/files#diff-0c1fa0f118bec26553d1dbbeb19112f956b9139f8b005d57cfd62b3cd4945c35
There may be more at play here, as I had mainly gravitated to the BUG log lines…
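To illustrate the kind of mismatch suspected here, below is a minimal sketch of how two components that disagree on whether a kind is namespaced would fail to exchange keys. This is not the libcalico-go parser: `parseResourceKey` and its `namespaced` argument are hypothetical, and the sketch assumes that namespaced keys carry an extra namespace segment between the kind and the name. Only the key itself is taken from the log line above.

```go
package main

import (
	"fmt"
	"strings"
)

// resourcePrefix is the key prefix seen in the calico-node log line.
const resourcePrefix = "/calico/resources/v3/projectcalico.org/"

// parseResourceKey is a hypothetical stand-in for a key parser. The
// namespaced argument plays the role of IsNamespaced(kind) as seen by
// the component doing the parsing.
//
// Assumed layouts (an assumption of this sketch):
//   namespaced:     <prefix><kind>/<namespace>/<name>
//   non-namespaced: <prefix><kind>/<name>
func parseResourceKey(key string, namespaced bool) (kind, namespace, name string, err error) {
	if !strings.HasPrefix(key, resourcePrefix) {
		return "", "", "", fmt.Errorf("unexpected prefix in key %q", key)
	}
	parts := strings.Split(strings.TrimPrefix(key, resourcePrefix), "/")
	switch {
	case namespaced && len(parts) == 3:
		return parts[0], parts[1], parts[2], nil
	case !namespaced && len(parts) == 2:
		return parts[0], "", parts[1], nil
	}
	return "", "", "", fmt.Errorf("cannot parse key %q (namespaced=%t)", key, namespaced)
}

func main() {
	// Key copied from the log line; it has no namespace segment.
	key := "/calico/resources/v3/projectcalico.org/kubernetesendpointslices/kube-applier-8t5xd"

	// A reader that treats the kind as namespaced rejects the key.
	_, _, _, err := parseResourceKey(key, true)
	fmt.Println("expecting a namespace:", err)

	// A reader that treats the kind as non-namespaced accepts it.
	_, _, name, err := parseResourceKey(key, false)
	fmt.Println("expecting no namespace:", name, err)
}
```

Under these assumptions, a sender and a receiver that disagree about namespacing for a kind cannot round-trip its keys, which would match the "cannot parse key" errors seen when versions are mixed.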
Steps to Reproduce (for bugs)
See expected behavior above.
Context
- Running with typha
- Upgrade from 3.22.1 to 3.23.2
P.S. Happy to dig up extra details if necessary, but figured this was worth posting sooner rather than later.
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 16 (11 by maintainers)
Removing the defaulting is actually all that should be needed, e.g., this PR: https://github.com/projectcalico/calico/pull/6415
Gotcha, OK. I’ll do some more investigation on this then.