helm: helm install --wait does not wait for deployment pod readiness properly

I recently discovered and started using --wait:

--wait                   if set, will wait until all Pods, PVCs, Services, and minimum number of Pods of a Deployment are in a ready state before marking the release as successful. It will wait for as long as --timeout

It’s great!

However, when I run helm install --wait, the command exits successfully before the pods in my deployments pass their configured readiness probe.

I did a little spelunking to confirm that my expectations were correct, and found this suspicious-looking code attempting to do the deployment readiness check:

func (c *Client) deploymentsReady(deployments []deployment) bool {
	for _, v := range deployments {
		if !(v.replicaSets.Status.ReadyReplicas >= *v.deployment.Spec.Replicas-deploymentutil.MaxUnavailable(*v.deployment)) {
			c.Log("Deployment is not ready: %s/%s", v.deployment.GetNamespace(), v.deployment.GetName())
			return false
		}
	}
	return true
}

permalink: https://github.com/kubernetes/helm/blob/d790f7d843182f1c126cfa64989ffdd4e9583747/pkg/kube/wait.go#L174

I’m happy to send a PR to fix the if-check, but might need some advice on testing the fix.

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 22
  • Comments: 37 (13 by maintainers)

Most upvoted comments

@thomastaylor312 I still have this problem with Helm 3. I install a Helm chart (the service catalog, for example, but this is not important) with the --wait flag, and helm terminates before the pods are ready. The chart contains one deployment of one pod, and the update strategy is set to type RollingUpdate. The problem seems to be in the deploymentReady function: https://github.com/helm/helm/blob/ec1d1a3d3eb672232f896f9d3b3d0797e4f519e3/pkg/kube/wait.go#L221-L228

The deployment does not provide any rolling update configuration, so the default values are used. A kubectl describe shows these values:

replicas: 1
strategy:
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1
  type: RollingUpdate

The calculation of the expectedReady variable in the deploymentReady function gives a value of 0, since replicas is 1 and maxUnavailable is also 1. So the condition rs.Status.ReadyReplicas >= expectedReady is always true, even when no pod has started yet. I think the expectedReady variable should have a minimum value of 1 when it computes to 0.
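For illustration, here is a minimal, self-contained Go sketch of that clamp. The expectedReady helper below mirrors the arithmetic in deploymentReady but is a standalone function, not the actual Helm signature:

package main

import "fmt"

// expectedReady mirrors the arithmetic in Helm's deploymentReady
// (pkg/kube/wait.go): a deployment counts as ready once
// readyReplicas >= replicas - maxUnavailable. The minimum-of-one
// clamp is the fix proposed above, not current Helm behaviour.
func expectedReady(replicas, maxUnavailable int32) int32 {
	expected := replicas - maxUnavailable
	if expected < 1 && replicas > 0 {
		// Never treat a deployment that wants pods as ready
		// while zero pods are ready.
		expected = 1
	}
	return expected
}

func main() {
	fmt.Println(expectedReady(1, 1)) // the defaults above: 0 without the clamp, 1 with it
	fmt.Println(expectedReady(3, 1)) // 2 either way
}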

Please reopen the issue @Crazybus. I am having this problem with Helm 2.14.

To reproduce just try the script:

helm repo add storageos https://charts.storageos.com
helm install storageos/storageos-operator --namespace storageos-operator --wait
kubectl patch storageclass fast -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

It installs an apps/v1 Deployment: https://github.com/storageos/charts/blob/master/stable/storageos-operator/templates/operator.yaml

If you wait a few seconds, you can patch the created StorageClass.

I can also reproduce it when deploying the nginx operator: the chart is considered deployed while the pod is still in “Creating container…” status.

The --wait option was enabled and the timeout increased to 15 minutes, with no effect (I also tried the wait options for jobs).

Hello @thomastaylor312

I understand this issue is closed, but I have a few questions about it; maybe there is some misunderstanding on my side.

The wait option waits until the pods are ready according to the rule Status.ReadyReplicas >= expectedReady. But how can we separate the behaviour during the deployment from the expected final state? For instance, with a strategy like the one below:

replicas: 2
strategy:
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 1
  type: RollingUpdate

We accept having a pod unavailable during the deployment, since the pods are restarted one after the other (for a zero-downtime scenario), but we would like the deployment to be considered finished and OK only when all pods are up and ready. If the two pods are not restarted correctly, the deployment should not be considered OK.

For now, we use kubectl rollout status, as it waits until all pods are ready; if they are not, we roll the deployment back.

Is this behaviour possible, or planned?
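For what it's worth, a strict final-state wait can be built with client-go: poll until every replica is updated and ready, instead of the replicas-minus-maxUnavailable rule that --wait applies. This is only a sketch; waitAllReady is a hypothetical helper, not part of Helm or kubectl:

package deploywait

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitAllReady polls a Deployment until all replicas are updated and
// ready (the strict final-state condition discussed above), rather
// than the looser condition used by helm --wait.
func waitAllReady(ctx context.Context, cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
	return wait.PollImmediate(2*time.Second, timeout, func() (bool, error) {
		dep, err := cs.AppsV1().Deployments(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		want := int32(1) // Kubernetes defaults spec.replicas to 1 when unset
		if dep.Spec.Replicas != nil {
			want = *dep.Spec.Replicas
		}
		// Require the controller to have observed the latest spec, so an
		// old, fully-ready generation is not mistaken for the new rollout.
		done := dep.Status.ObservedGeneration >= dep.Generation &&
			dep.Status.UpdatedReplicas == want &&
			dep.Status.ReadyReplicas == want
		return done, nil
	})
}

Calling something like waitAllReady(ctx, clientset, "my-namespace", "my-deployment", 5*time.Minute) after helm install approximates what kubectl rollout status does (the namespace and deployment name here are placeholders).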