keel: Unversioned deployment with multiple pods does not update all pods

I have a deployment with 5 replicas following a :latest tag. From the logs, I can see that keel resets the image to 0.0.0 and after 5 seconds applies :latest.

The deployment seems to revert to the previous ReplicaSet, and the rollout does not continue for the remaining pods. The single pod that was recreated belongs to the same rc as before, but it pulled the latest image because of imagePullPolicy. During the reset, I can see 2 pods in the ErrImagePull state.

I’m running keel 0.5.0-rc.1 with native webhooks and keel.sh/policy: force on Kubernetes 1.7.x.

The events from the deployment are:

Normal  ScalingReplicaSet  2m     deployment-controller  Scaled up replica set app-1687492293 to 1
Normal  ScalingReplicaSet  2m     deployment-controller  Scaled down replica set app-3177265212 to 4
Normal  ScalingReplicaSet  2m     deployment-controller  Scaled up replica set app-1687492293 to 2
Normal  ScalingReplicaSet  2m     deployment-controller  Scaled up replica set app-3177265212 to 5
Normal  ScalingReplicaSet  2m     deployment-controller  Scaled down replica set app-1687492293 to 0

and in the end, the pods are:

Name                        RC                             AGE
app-3177265212-40zzc    app-3177265212   2017-10-27T13:26:34Z
app-3177265212-85jsp    app-3177265212   2017-10-27T13:43:07Z   <- This pod has changed
app-3177265212-b2mws    app-3177265212   2017-10-27T13:26:34Z
app-3177265212-p9mkc    app-3177265212   2017-10-27T13:26:34Z
app-3177265212-qth3s    app-3177265212   2017-10-27T13:26:34Z
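
To make the rollback visible, the deployment events listed earlier can be replayed to find the last replica count reported for each ReplicaSet. This is only a sketch: it assumes the events are listed in chronological order (they all show the same 2m age), and the helper name `final_replica_counts` is made up for illustration.

```python
import re
from typing import Dict, Iterable

# Matches the message part of a ScalingReplicaSet event, e.g.
# "Scaled up replica set app-1687492293 to 1".
_SCALE_RE = re.compile(r"Scaled (?:up|down) replica set (\S+) to (\d+)")


def final_replica_counts(event_lines: Iterable[str]) -> Dict[str, int]:
    """Return the last replica count reported for each ReplicaSet.

    Assumes event_lines are in chronological order.
    """
    counts: Dict[str, int] = {}
    for line in event_lines:
        match = _SCALE_RE.search(line)
        if match:
            # Each event states an absolute target, so a later event
            # simply overwrites the earlier value.
            counts[match.group(1)] = int(match.group(2))
    return counts


# The five events from the report, copied as-is.
EVENTS = [
    "Normal  ScalingReplicaSet  2m  deployment-controller  Scaled up replica set app-1687492293 to 1",
    "Normal  ScalingReplicaSet  2m  deployment-controller  Scaled down replica set app-3177265212 to 4",
    "Normal  ScalingReplicaSet  2m  deployment-controller  Scaled up replica set app-1687492293 to 2",
    "Normal  ScalingReplicaSet  2m  deployment-controller  Scaled up replica set app-3177265212 to 5",
    "Normal  ScalingReplicaSet  2m  deployment-controller  Scaled down replica set app-1687492293 to 0",
]

if __name__ == "__main__":
    print(final_replica_counts(EVENTS))
```

Under that ordering assumption, the replica set created during the reset (app-1687492293) ends at 0 and the original one (app-3177265212) is back at 5, which matches the pod listing above.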

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 22 (7 by maintainers)

Most upvoted comments

No, I didn’t see it. Yeah, totally forgot that perms are required for deletion. Only pod deletion permissions are required, thanks!
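
As a rough illustration of the permission mentioned here, the sketch below checks whether a list of RBAC rules (written as plain dicts in the shape of a Role's `rules` field) allows deleting pods. The full set of rules keel needs is not stated in this thread, so `POD_DELETE_RULE` is an assumption covering only pod deletion, and the check is simplified: it ignores `resourceNames` and non-resource rules.

```python
from typing import Any, Dict, List


def _matches(values: List[str], wanted: str) -> bool:
    """True if the list names the wanted item or uses the '*' wildcard."""
    return wanted in values or "*" in values


def allows_pod_delete(rules: List[Dict[str, Any]]) -> bool:
    """Check whether any rule grants 'delete' on pods in the core API group."""
    for rule in rules:
        if (
            _matches(rule.get("apiGroups", []), "")  # pods live in the core ("") group
            and _matches(rule.get("resources", []), "pods")
            and _matches(rule.get("verbs", []), "delete")
        ):
            return True
    return False


# Assumed minimal rule for pod deletion only; not taken from keel's manifests.
POD_DELETE_RULE = {"apiGroups": [""], "resources": ["pods"], "verbs": ["delete"]}

if __name__ == "__main__":
    print(allows_pod_delete([POD_DELETE_RULE]))
```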

Regarding the quay:

After writing a simple unit test that does pretty much the same thing as the one for the Zalando registry, I can see that Quay returns an error (every registry wants to be unique). Will get it fixed.

That would work for non-production workloads, which is the case when working with the latest tag anyway.

The other option would be to set spec.strategy.type to ‘Recreate’ in the deployment, which results in some downtime as well but wouldn’t require changes in keel.
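
A minimal sketch of that change, applied to a deployment manifest held as a plain dict (for example one loaded from JSON). The helper name `with_recreate_strategy` is made up for illustration. The whole strategy block is replaced because Kubernetes does not accept rollingUpdate parameters together with the Recreate type.

```python
import copy
from typing import Any, Dict


def with_recreate_strategy(deployment: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the manifest with spec.strategy.type set to Recreate."""
    updated = copy.deepcopy(deployment)
    # Replace the whole block so no leftover rollingUpdate settings remain.
    updated.setdefault("spec", {})["strategy"] = {"type": "Recreate"}
    return updated


if __name__ == "__main__":
    manifest = {
        "kind": "Deployment",
        "spec": {
            "replicas": 5,
            "strategy": {"type": "RollingUpdate", "rollingUpdate": {"maxSurge": 1}},
        },
    }
    print(with_recreate_strategy(manifest)["spec"]["strategy"])
```

With Recreate, all old pods are terminated before new ones are created, which is where the downtime comes from.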

I’m currently trying out a very rough patch where no reset is performed, but an ENV variable is set on each container, resulting in a new rc each time. What is your opinion on this? I remember seeing some discussion earlier on the force-policy feature ticket.
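
The idea can be sketched roughly as follows. This is not the actual patch: it works on a manifest held as a plain dict, and the variable name `FORCE_UPDATE_AT` and the helper `stamp_containers` are hypothetical. Any change to the pod template, including an env value, makes the deployment controller create a new ReplicaSet, so passing a fresh value (a timestamp, for instance) on every update would trigger a normal rollout without resetting the image.

```python
import copy
from typing import Any, Dict

# Hypothetical variable name, chosen only for this illustration.
FORCE_ENV_NAME = "FORCE_UPDATE_AT"


def stamp_containers(deployment: Dict[str, Any], value: str) -> Dict[str, Any]:
    """Return a copy of the manifest with FORCE_ENV_NAME set on every container."""
    updated = copy.deepcopy(deployment)
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        env = container.setdefault("env", [])
        for entry in env:
            if entry.get("name") == FORCE_ENV_NAME:
                # Update in place so repeated calls do not add duplicates.
                entry["value"] = value
                break
        else:
            env.append({"name": FORCE_ENV_NAME, "value": value})
    return updated


if __name__ == "__main__":
    manifest = {
        "spec": {
            "template": {
                "spec": {"containers": [{"name": "app", "image": "app:latest"}]}
            }
        }
    }
    stamped = stamp_containers(manifest, "2017-10-27T13:43:07Z")
    print(stamped["spec"]["template"]["spec"]["containers"][0]["env"])
```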