kubernetes: eviction doesn't work with cronjob pods if PodDisruptionBudget specifies percentage minAvailable/maxUnavailable

What happened: PodDisruptionBudget is not working when a CronJob and a Deployment have the same labels. Error:

kubectl get pdb
NAME                  MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
{PDB name}            50%             N/A               0                     3m

kubectl describe pdb
Events:
  Type     Reason                           Age   From               Message
  ----     ------                           ----  ----               -------
  Warning  NoControllers                    16s   controllermanager  found no controllers for pod {CronJob pod name}
  Warning  CalculateExpectedPodCountFailed  16s   controllermanager  Failed to calculate the number of expected pods: found no controllers for pod {CronJob pod name}

What you expected to happen: the PDB should work, with allowed disruptions computed correctly.

How to reproduce it (as minimally and precisely as possible):

  • Create a CronJob and a Deployment with the same labels
  • Use those labels in the PDB selector (example manifests below)
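
For example, a minimal set of manifests along these lines reproduces the error. The names, labels, schedule, and images are illustrative, and the API versions (batch/v1beta1, policy/v1beta1) match clusters of this vintage; newer clusters use batch/v1 and policy/v1:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo              # shared label
    spec:
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.1
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: demo-cron
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: demo          # same label as the Deployment's pods
        spec:
          restartPolicy: Never
          containers:
          - name: task
            image: busybox
            command: ["sh", "-c", "sleep 60"]
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: demo-pdb
spec:
  minAvailable: "50%"          # the percentage forces an expected-pod-count lookup
  selector:
    matchLabels:
      app: demo                # matches pods from both controllers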

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-13T23:15:13Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.8-gke.6", GitCommit:"394ee507d00f15a63cef577a14026096c310698e", GitTreeState:"clean", BuildDate:"2019-03-30T19:31:43Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: GKE
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 24 (12 by maintainers)

Most upvoted comments

I can’t think of many good reasons why someone would want to set PDBs on a CronJob.

How about setting a PDB on a CronJob to ensure that the job runs to completion if another process tries to drain the node it’s running on before it finishes?

After looking at the disruption controller in more detail, I think we can handle this better. In the situation described in this issue, the controller can ignore the pods from the CronJob when computing the allowed disruptions, but create an event to notify the user that something isn’t quite right. I think events are better than conditions for reporting this kind of issue to the user. I have created #85553 for this. More eyes on this would be great to make sure there aren’t any scenarios I haven’t considered.

The disruption controller also has failsafe behavior where allowedDisruptions is set to 0 if the controller encounters any issues. I think we can use conditions to make it clearer to users when this has happened. Currently, there is no good way to see from the resource itself that allowedDisruptions is 0 for this reason. I will try to address this in a separate PR.
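
For reference, later releases (the policy/v1 work in v1.21) did add a DisruptionAllowed condition to the PDB status. A sketch of how the failsafe shows up there, with field values illustrative:

status:
  disruptionsAllowed: 0
  conditions:
  - type: DisruptionAllowed
    status: "False"
    reason: SyncFailed         # the controller hit an error and fell back to 0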

So the issue here is that the PDB uses minAvailable: 50% and also matches pods that belong to a controller that doesn’t have a concept of scale. Removing the Deployment from the example would result in the same error.

When a PDB specifies either maxUnavailable or minAvailable as a percentage, the disruption controller needs to look up the “expected” number of pods: https://github.com/kubernetes/kubernetes/blob/29a2b201942ddb5404d5b11ca469a2cd51e79f1c/pkg/controller/disruption/disruption.go#L580-L612 This is not possible with a CronJob; it only works with ReplicationControllers, Deployments, ReplicaSets, StatefulSets, and custom resources that implement the scale subresource.
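
For the custom-resource case, “implements the scale subresource” means the CRD declares where its replica counts live. A minimal sketch, with the group and kind purely illustrative:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
          status:
            type: object
            properties:
              replicas:
                type: integer
    subresources:
      scale:
        specReplicasPath: .spec.replicas      # desired replica count
        statusReplicasPath: .status.replicas  # observed replica count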

If you want to set a PDB on a CronJob, it needs to use minAvailable with an absolute number of pods rather than a percentage.
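
A sketch of a PDB that works for CronJob pods, assuming the pods get their own dedicated label (names and counts illustrative):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: cron-pdb
spec:
  minAvailable: 1              # absolute count, so no expected-pod-count lookup is needed
  selector:
    matchLabels:
      app: cron-task           # label only the CronJob's pods carry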