kubernetes: Upgrading to v1.5.2 creates duplicate replica sets

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): multiple replicaset, duplicate replicaset


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Ubuntu 14.04.3 LTS
  • Kernel (e.g. uname -a): Linux ip-10-0-10-162 3.13.0-87-generic #133-Ubuntu SMP Tue May 24 18:32:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Custom terraform+ansible setup; using hyperkube binary
  • Others: This cluster has disabled auth and serviceaccounts

What happened: Upgrading the control plane to v1.5.2 (from v1.4.7) created duplicate replicasets (and hence duplicate pods) for some deployments.

What you expected to happen: Existing deployments/replicasets/pods to stay the same OR be replaced - not duplicated.

How to reproduce it (as minimally and precisely as possible): Keep a deployment with volume mounts (probably EBS) running on v1.4.7, then upgrade the control plane components to v1.5.2.
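
For illustration, a minimal deployment of that shape might look like the sketch below (the name, image, and EBS volume ID are placeholders, not taken from our setup):

kubectl apply -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ebs-example                       # hypothetical name
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: ebs-example
    spec:
      containers:
      - name: app
        image: nginx:1.11                 # any long-running image works for the repro
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        awsElasticBlockStore:
          volumeID: vol-0123456789abcdef0 # placeholder EBS volume ID
          fsType: ext4
EOF

With that running against v1.4.7, swap the control plane components (apiserver, controller-manager, scheduler) over to the v1.5.2 hyperkube binaries.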

Anything else we need to know: Interestingly, this seems to have happened only for deployments that use shared volume mounts (either EBS or otherwise).
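
One way to see the duplication, assuming the labels from the sketch above: every ReplicaSet created by the deployment controller carries a pod-template-hash label, so the symptom here is two ReplicaSets with non-zero desired replicas for a template that has not changed.

# Healthy state: exactly one ReplicaSet with DESIRED > 0 per deployment
kubectl get rs -l app=ebs-example --show-labels

# After the control plane upgrade to v1.5.2, a second ReplicaSet (with a
# different pod-template-hash) shows up alongside the old one, plus its pods
kubectl get pods -l app=ebs-example -o wide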

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 11
  • Comments: 65 (52 by maintainers)

Most upvoted comments

We have the same issue. After the 1.5.2 upgrade on GKE (from 1.4.1, 1.4.2, or 1.4.7 depending on the env; no volumes though), it started happening. However, it persists sporadically on certain deployments, even long after the upgrade is complete and after we push new deployments.

To note, we use the ‘rolling restart’ hack of injecting a ‘date’ key into the pod template’s metadata.labels, which forces the rolling restart. This is because our images have ‘test’ and ‘prod’ tags that we promote to. (kubectl patch deployment my-deployment -p '{"spec":{"template":{"metadata":{"labels":{"date":"datehere"}}}}}')
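
For reference, a shell-friendly form of that patch (the deployment name is illustrative; the value just has to change on every run):

# Inject/refresh a 'date' label on the pod template to force a rolling update
kubectl patch deployment my-deployment \
  -p '{"spec":{"template":{"metadata":{"labels":{"date":"'"$(date +%s)"'"}}}}}'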

What’s odd is that after scaling down the old RS, k8s immediately scales it back up to whatever it was stuck at (usually 1 or 2 replicas, for a 4-6 replica deployment). Only deleting the old RS entirely gets it into the desired state.
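
Phrased as commands, the only thing that converges it for us looks roughly like this (the label selector and ReplicaSet name are whatever applies in your cluster):

# Find the stuck ReplicaSet: the one whose pod-template-hash no longer
# matches the deployment's current pod template
kubectl get rs -l app=my-app

# Scaling it to 0 does not stick -- the controller bumps it back up.
# Deleting it (which also removes its pods) finally converges the deployment.
kubectl delete rs <old-replicaset-name>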

It’s easily reproducible by running the rolling restart hack and then running it again a few seconds later, while the first rollout is still in progress. Our deployment pipeline only does the restart hack once, though, and this still happens.
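
A sketch of that double restart, using the same patch form as above (the deployment name and the 5-second gap are illustrative):

# First rolling restart
kubectl patch deployment my-deployment \
  -p '{"spec":{"template":{"metadata":{"labels":{"date":"'"$(date +%s)"'"}}}}}'

sleep 5

# Second restart while the first rollout is still progressing
kubectl patch deployment my-deployment \
  -p '{"spec":{"template":{"metadata":{"labels":{"date":"'"$(date +%s)"'"}}}}}'

# Watch the ReplicaSets; the old one ends up stuck at a partial replica count
kubectl get rs -l app=my-app -w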