istio: Istio Operator Error Loops

Bug Description

We’re upgrading our Istio cluster from v1.12.7 to v1.14.1 using the Istio Operator. The Istio cluster itself is fine, and the ingressgateway and Istio proxy are updated, but the Istio Operator keeps logging the same error in a loop and its memory usage keeps increasing.

Version

$ istioctl version
client version: 1.14.1
control plane version: 1.14.1
$ kubectl version --short
Client Version: v1.23.3
Server Version: v1.21.11-gke.900

Additional Information

error analysis error setting up error handling for kube crdclient: 2 errors occurred:
	* informer has already started
	* informer has already started

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 18
  • Comments: 17 (6 by maintainers)

Most upvoted comments

I reviewed the configs in our IOP. One of our k8s.overlay.patches was a valid current configuration and worked with 1.13.x operator logic but did not with 1.15.1 operator logic. We found a way to tweak that overlay patch so it did not confuse the 1.15.1 operator. That fixed it for us.

We were setting a value to key:{} (an empty map) in the overlay patch. Now in 1.15.1 we set it to key: (no value).
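The thread does not show the actual patch, so the fragment below is only a sketch of the kind of change described. The overlay target, the path, and exampleKey are placeholders, not the commenter's real configuration. Whether a null value is acceptable depends on the field being patched.

```yaml
# Hypothetical fragment of an IstioOperator component spec.
# The path and key names are placeholders; the thread does not give the real ones.
k8s:
  overlays:
  - kind: Deployment
    name: istio-ingressgateway            # assumed overlay target
    patches:
    - path: spec.template.spec.examplePath   # placeholder path
      value:
        exampleKey:          # null value: the form reported to work with 1.15.1
        # exampleKey: {}     # empty map: the form that worked with 1.13.x
```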

Istio 1.15.1 has the change zirain added. I still get a continuous reconcile loop in my istio/operator pod but the logs are different now.

- Pruning removed resources
2022-09-29T12:57:48.412711Z info installer Reconciling IstioOperator
-snip for length-
2022-09-29T12:57:49.690730Z info installer Generated manifest objects are the same as cached for component EgressGateways.
2022-09-29T12:57:49.706047Z info installer Generated manifest objects are the same as cached for component IngressGateways.
- Pruning removed resources
2022-09-29T12:57:51.754029Z info installer Reconciling IstioOperator

We have observed the same behavior but have also managed to resolve it. We installed the Istio components using the operator approach, i.e. we run istioctl operator init and then apply the desired IstioOperator CR, after which the operator deploys the necessary components.

After revisiting our IstioOperator CR, we found that the above behavior occurred when we used a specific profile (in our case “demo”) and at the same time specified all generated fields in the CR. For this reason we now specify only the fields that we override, e.g. adding nodeports to the ingressGateway, HPA, resource requests and limits, and pod affinity.
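As a rough illustration of that approach, a trimmed-down CR might look like the sketch below. The metadata name, port numbers, replica counts, resource sizes, and label selector are all assumed values chosen for illustration; they are not taken from the commenter's setup. Whether each override takes effect can also depend on the chosen profile's defaults.

```yaml
# Minimal sketch: select a profile and list only the fields being overridden.
# All concrete values here are illustrative assumptions.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: example-istiocontrolplane   # assumed name
  namespace: istio-system
spec:
  profile: demo
  components:
    ingressGateways:
    - name: istio-ingressgateway
      enabled: true
      k8s:
        # Pin a node port on the gateway Service
        service:
          ports:
          - name: http2
            port: 80
            targetPort: 8080
            nodePort: 30080
        # Horizontal Pod Autoscaler bounds
        hpaSpec:
          minReplicas: 2
          maxReplicas: 5
        # Resource requests and limits
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
        # Prefer spreading gateway pods across nodes
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    istio: ingressgateway
```

Everything not listed is left to the profile, so the CR does not repeat values the operator generates itself.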

Hope it helps.