istio: Istio Gateway does not have readiness/liveness probes enabled

Describe the bug We rely on the Istio Gateway to route traffic into the cluster. During testing, we found that an Envoy process that is not yet ready can be added to the fleet and start receiving traffic. This results in Connection Refused errors on the client side.

More alarmingly, we also found that the Envoy process sometimes fails to establish a connection with Pilot; see the log from the gateway pod below:

[image: log from the gateway pod]

Expected behavior Only healthy gateways should be added to the fleet to receive traffic.

Steps to reproduce the bug Deploy the Istio gateway and expose it as a NodePort service. Scale up the gateway several times while sending traffic to it; you should be able to reproduce the bug.
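One way to observe the failure while scaling is to sample TCP connections against the NodePort and count how many are refused. This is a minimal Python sketch, not the reporter's actual test setup; the function names and the choice to probe at the TCP level are assumptions made for illustration.

```python
import socket
import time


def try_connect(host: str, port: int, timeout: float = 1.0) -> str:
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The symptom described in this issue: nothing is listening yet.
        return "refused"
    except OSError:
        # Timeouts, unreachable hosts, and other socket errors.
        return "error"


def sample(host: str, port: int, attempts: int, interval: float = 0.1) -> dict:
    """Connect repeatedly and return a count of each outcome."""
    counts = {"ok": 0, "refused": 0, "error": 0}
    for _ in range(attempts):
        counts[try_connect(host, port)] += 1
        time.sleep(interval)
    return counts
```

Run `sample(node_ip, node_port, attempts=600)` in one terminal while scaling the gateway deployment in another; `node_ip` and `node_port` are placeholders for your own node address and the Service's NodePort. A non-zero `refused` count during the scale-up would match the behavior reported here.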

I added a readinessProbe manually and it reduced the Connection Refused errors, but sometimes, as the error log above shows, the Envoy proxy fails to start, which still results in Connection Refused.
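The issue does not show the probe that was added, so the following is only a sketch of what a manual readinessProbe patch could look like. It builds a strategic merge patch as a Python dict and prints it as JSON. The container name `istio-proxy`, port `15020`, path `/healthz/ready`, and the timing values are all assumptions; check which health endpoint, if any, your Istio version exposes on the gateway before using them.

```python
import json


def readiness_probe_patch(container: str = "istio-proxy",
                          port: int = 15020,
                          path: str = "/healthz/ready") -> dict:
    """Build a Deployment patch that adds an HTTP readinessProbe.

    All defaults are assumed values, not taken from the issue.
    """
    probe = {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": 1,
        "periodSeconds": 2,
        "failureThreshold": 30,
    }
    # Strategic merge patches match entries in `containers` by name,
    # so only the named container is modified.
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": container, "readinessProbe": probe}
                    ]
                }
            }
        }
    }


print(json.dumps(readiness_probe_patch(), indent=2))
```

The printed JSON can be passed to `kubectl patch deployment <gateway-deployment> --patch '<json>'`, where the deployment name is a placeholder. As noted above, a probe like this reduces the errors but does not cover the case where Envoy never receives its configuration.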

Version Kubernetes: 1.12 Istio: 1.1-snapshot-4

Installation Using Helm

Environment 4.14.67-coreos

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 20 (16 by maintainers)

Most upvoted comments

@berstend Hi. As @tcnghia said, we just add the probe manually. But please read my comments in this thread: adding a readinessProbe does not address the problem entirely. You still run a risk when, for example, simply scaling up the gateway deployment, because there is no guarantee which will arrive first: the configuration streamed from Pilot or the actual production requests.

My suggestion is to prepare a new deployment and use istioctl proxy-status to check that it is in sync. Once all the tests have passed, update the gateway service manifest.
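The sync check suggested above could be automated roughly as follows. This is a hedged sketch rather than the commenter's actual procedure: it assumes `istioctl proxy-status` prints one row per proxy with the proxy name in the first column and per-resource states such as `SYNCED`, `STALE`, or `NOT SENT`; the exact columns vary by Istio version, and the helper names are hypothetical.

```python
import subprocess

# States treated as "not ready" in this sketch (an assumption).
BAD_STATES = ("STALE", "NOT SENT")


def gateways_synced(status_output: str, name_prefix: str) -> bool:
    """Return True only if every proxy whose name starts with
    name_prefix reports SYNCED and none of the bad states."""
    rows = [
        line for line in status_output.splitlines()
        if line.split() and line.split()[0].startswith(name_prefix)
    ]
    if not rows:
        # No matching proxy has registered with Pilot yet.
        return False
    return all(
        "SYNCED" in row and not any(bad in row for bad in BAD_STATES)
        for row in rows
    )


def fetch_proxy_status() -> str:
    """Run `istioctl proxy-status` and return its standard output."""
    result = subprocess.run(
        ["istioctl", "proxy-status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A rollout script could poll `gateways_synced(fetch_proxy_status(), "<new-gateway-name>")` until it returns `True` and only then update the service. The check is deliberately strict: a gateway with no routes configured may legitimately show a resource as not sent, so relax `BAD_STATES` if that applies to your setup.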

Hopefully it helps.