kubernetes: Garbage collector deleted wrong resources

What happened: Today I ran into what appears to be a race condition in the way kube-controller-manager primes its cache before starting garbage collection. We have a cluster with hundreds of namespaces and thousands of custom resources with parent-child relationships. During a cluster upgrade, while the control plane nodes were being replaced, the garbage collector deleted some resources that it was not supposed to: they had proper owner references, and the owners were alive and well.

Looking at the audit logs, I can see a spike in the number of objects being deleted by the garbage collector during the upgrade.

My theory is that the garbage collector started looking at the child resources before their parents were cached, would this be possible?
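To make the theory concrete, here is a toy Python model of the suspected race. This is not the actual kube-controller-manager code; all names (`Cache`, `observe`, `is_orphan`) are invented for illustration. The point is just that if a child is evaluated before its owner has been cached, a naive orphan check would flag a perfectly healthy child for deletion:

```python
# Toy model of a GC dependency-graph cache, illustrating the suspected race.
# This is NOT the real garbage-collector code; all names are invented.

class Cache:
    def __init__(self):
        self.objects = {}  # uid -> object

    def observe(self, obj):
        self.objects[obj["uid"]] = obj

    def is_orphan(self, obj):
        # Treat an object as orphaned if it has owner references
        # but none of its owners are present in the cache.
        owners = obj.get("ownerReferences", [])
        return bool(owners) and not any(o in self.objects for o in owners)


parent = {"uid": "parent-uid"}
child = {"uid": "child-uid", "ownerReferences": ["parent-uid"]}

# Correct ordering: the parent is cached before the child is evaluated.
ok = Cache()
ok.observe(parent)
ok.observe(child)
print(ok.is_orphan(child))   # False: owner is known, child is kept

# Racy ordering: the child is evaluated before its parent is cached.
racy = Cache()
racy.observe(child)
print(racy.is_orphan(child))  # True: owner not yet cached, child looks orphaned
```

In the real controller the logic is more involved than this sketch, but if anything like the second ordering can happen during a leader change, it would match the deletions we saw.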

What you expected to happen: Healthy, non-orphaned resources should not have been deleted.

How to reproduce it (as minimally and precisely as possible): I'm not entirely sure, but since this seems to be a race condition, I think killing the leader kube-controller-manager in a cluster with a large number of custom resources that have parent-child relationships is the way to reproduce it. I was able to reproduce it 3 times, but not consistently.

Environment:

  • Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"clean", BuildDate:"2020-01-18T23:24:23Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Azure VMs/Scale sets
  • OS (e.g: cat /etc/os-release):
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
  • Kernel (e.g. uname -a):
Linux cpscu0c52000000 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 19 (10 by maintainers)

Most upvoted comments

Shouldn’t this cache be namespace scoped?

That fix is in progress in https://github.com/kubernetes/kubernetes/pull/92743

/remove-lifecycle rotten

Have the same problem with 1.19.0: rook resources are getting deleted, namely the mon deployments and services, the mgr deployments, the osd deployments, and, most annoyingly, the mon endpoint ConfigMap and dashboard password Secret. This makes rook create a new cluster from scratch and creates a lot of manual work for me to recover the cluster.

Tell me if you need more information.

Some owned resources were incorrectly pointing to owner resources in another namespace, and those cross-namespace owners were the owners of the deleted resources. Is that a possible case that #92743 covers?

Yes, that is one of the primary issues addressed by that PR
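To illustrate that failure mode, here is a toy Python sketch (invented names, not the PR's actual code). Because an ownerReference carries only a UID and no namespace, a lookup that is effectively scoped to the child's own namespace can never resolve an owner living elsewhere, so a live child can look orphaned:

```python
# Toy model of a namespace-scoped owner lookup, illustrating why a
# cross-namespace ownerReference can make a live child look orphaned.
# Invented names; this is not the actual garbage-collector code.

# (namespace, uid) -> object
store = {
    ("ns-a", "owner-uid"): {"name": "parent"},
    ("ns-b", "child-uid"): {"name": "child", "ownerReferences": ["owner-uid"]},
}

def owner_exists(child_namespace, owner_uid):
    # ownerReferences carry no namespace field, so in this sketch the
    # lookup is implicitly scoped to the child's own namespace.
    return (child_namespace, owner_uid) in store

# The owner really exists, but in ns-a, not in the child's namespace ns-b.
print(owner_exists("ns-b", "owner-uid"))  # False: lookup misses, child looks orphaned
print(owner_exists("ns-a", "owner-uid"))  # True: the owner does exist in ns-a
```

Note that cross-namespace owner references are invalid in Kubernetes in the first place; the behavior when they exist was previously undefined, which is part of what made this so confusing to debug.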

Will this PR be backported to 1.19?

Thanks.

EDIT: This was a mistake on our part and there were cross-namespace owner references we were previously unaware of (due to copying resources between namespaces)

~I believe we are also running into this issue in 1.15.3. We have (namespace-scoped) CRs with a lot of child objects in the same namespace. During k8s cluster maintenance, we see in the API server audit logs that the GC deletes many of the child objects, even though the parent CR has not been deleted. We have been trying to reproduce it but have been unsuccessful so far; still, we wanted to chime in on this issue.~