kubernetes: DaemonSet status is not updated when Pod creation fails

What happened:

A DaemonSet's status field was not updated after it failed to create Pods.

What you expected to happen:

A DaemonSet's .status.desiredNumberScheduled should be set to the number of matching nodes, and .status.numberUnavailable should reflect nodes where the daemon pod is not running or available, even when a Pod cannot be created (e.g. due to exhausted pod quota).

How to reproduce it (as minimally and precisely as possible):

$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
spec:
  hard:
    pods: '0'
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: kuard
  name: kuard
spec:
  selector:
    matchLabels:
      app: kuard
  template:
    metadata:
      labels:
        app: kuard
    spec:
      containers:
      - image: gcr.io/kuar-demo/kuard-amd64:1
        imagePullPolicy: IfNotPresent
        name: kuard
        resources: {}
EOF
$ kubectl get ds kuard -o json | jq .status
{
  "currentNumberScheduled": 0,
  "desiredNumberScheduled": 0,
  "numberMisscheduled": 0,
  "numberReady": 0
}

The above should have numberUnavailable set to 1 and desiredNumberScheduled set to 1.
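Since the successful run further below schedules onto a single node, the expected status under the quota would look roughly like the following (field values assumed from the expectation above; the exact set of fields emitted may differ):

{
  "currentNumberScheduled": 0,
  "desiredNumberScheduled": 1,
  "numberMisscheduled": 0,
  "numberReady": 0,
  "numberUnavailable": 1
}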

$ kubectl get events
20s         Warning   FailedCreate             daemonset/kuard              Error creating: pods "kuard-hdpxb" is forbidden: exceeded quota: compute-resources, requested: pods=1, used: pods=0, limited: pods=0
1s          Warning   FailedCreate             daemonset/kuard              (combined from similar events): Error creating: pods "kuard-xcpgm" is forbidden: exceeded quota: compute-resources, requested: pods=1, used: pods=0, limited: pods=0

It is only after pod creation is unblocked that the status gets updated:

$ kubectl delete quota quota
resourcequota "quota" deleted
$ # Wait some time for retry
$ kubectl get ds kuard -o json | jq .status
{
  "currentNumberScheduled": 1,
  "desiredNumberScheduled": 1,
  "numberAvailable": 1,
  "numberMisscheduled": 0,
  "numberReady": 1,
  "observedGeneration": 1,
  "updatedNumberScheduled": 1
}

Anything else we need to know?:

Just some conjecture, but it seems that updateDaemonSetStatus should be called even when a pod error occurs during dsc.manage, similar to how the ReplicaSet controller calls updateReplicaSetStatus when rsc.manageReplicas returns an error. A rough sketch of that idea follows.
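The sketch below illustrates the shape of such a change to syncDaemonSet in pkg/controller/daemon/daemon_controller.go, mirroring how the ReplicaSet controller always updates status and only then returns the manage error. The surrounding lookups are elided and the exact signatures are approximations, so treat this as an illustration rather than a patch:

// Illustrative sketch only: capture the error from manage() instead of
// returning early, so updateDaemonSetStatus still runs and records
// desiredNumberScheduled / numberUnavailable even when Pod creation fails.
func (dsc *DaemonSetsController) syncDaemonSet(key string) error {
	// ... existing retrieval of ds, nodeList, and the template hash elided ...

	manageErr := dsc.manage(ds, nodeList, hash)

	// Always update status, the way the ReplicaSet controller always calls
	// updateReplicaSetStatus with the error returned by manageReplicas.
	if statusErr := dsc.updateDaemonSetStatus(ds, nodeList, hash, true); statusErr != nil {
		return statusErr
	}
	return manageErr
}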

Environment:

  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:28:09Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.7", GitCommit:"1dd5338295409edcfff11505e7bb246f0d325d15", GitTreeState:"clean", BuildDate:"2021-01-13T13:15:20Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 2
  • Comments: 28 (5 by maintainers)

Most upvoted comments

@pacoxu Having the DaemonSet controller take quota into account doesn't make sense: quota is a namespace-level concept, and the controller cannot know whether the quota will restrict its own pods or pods owned by other controllers in that namespace.