kubernetes: Pods stuck terminating (MountVolume.SetUp failed)

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
While updating the ingress, I deleted the old service, daemon set, service account, cluster role binding, and config map, in that order, and then deployed the ingress pods into a new namespace. The new pods could never start because the old pods got stuck in Terminating, and they have now been stuck for hours.

What you expected to happen:
The old pods to die and the new pods to take their place, with some downtime.

How to reproduce it (as minimally and precisely as possible):
I can provide yaml if valuable, but I’m using juju to deploy, so I just did the following:

juju deploy canonical-kubernetes
juju config kubernetes-worker ingress=false
kubectl apply -f k8s_1.12_ingress.yaml
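
For reference, a minimal diagnostic sketch I can run against one of the stuck pods (the pod and node names are the ones from the kubectl output below; the kubelet unit name is a guess and may differ on a snap-based install like this one):

# confirm the pod is marked for deletion and check whether any finalizers are blocking it
$ kubectl get pod nginx-ingress-kubernetes-worker-controller-9rp7h -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
# look at the FailedMount events
$ kubectl describe pod nginx-ingress-kubernetes-worker-controller-9rp7h
# on the worker node the pod is scheduled to (juju-fdbd8c-7); the unit name may differ
$ journalctl -u kubelet | grep -iE 'nginx-ingress|MountVolume'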

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.12.1
  • Cloud provider or hardware configuration: gce and aws
  • OS (e.g. from /etc/os-release): Ubuntu 18.04.1 LTS
  • Kernel (e.g. uname -a): 4.15.0-1021-gcp
  • Install tools: juju
  • Others:
$ kubectl get po --all-namespaces
NAMESPACE                         NAME                                                      READY   STATUS        RESTARTS   AGE
default                           microbot-66cb4987b7-h4wkd                                 1/1     Running       0          8m7s
default                           microbot-66cb4987b7-sn6dm                                 1/1     Running       0          8m7s
default                           microbot-66cb4987b7-tgrmc                                 1/1     Running       0          8m7s
default                           nginx-ingress-kubernetes-worker-controller-9rp7h          0/1     Terminating   0          45m
default                           nginx-ingress-kubernetes-worker-controller-b2qn2          0/1     Terminating   0          44m
default                           nginx-ingress-kubernetes-worker-controller-mnhgq          0/1     Terminating   0          42m
ingress-nginx-kubernetes-worker   default-http-backend-kubernetes-worker-5d9bb77bc5-g8lcz   1/1     Running       0          22m
ingress-nginx-kubernetes-worker   nginx-ingress-controller-kubernetes-worker-6gzw8          0/1     Pending       0          22m
ingress-nginx-kubernetes-worker   nginx-ingress-controller-kubernetes-worker-sksm6          0/1     Pending       0          22m
ingress-nginx-kubernetes-worker   nginx-ingress-controller-kubernetes-worker-tmjzv          0/1     Pending       0          22m
kube-system                       heapster-v1.6.0-beta.1-6db4b87d-c5cws                     4/4     Running       0          42m
kube-system                       kube-dns-596fbb8fbd-dr692                                 3/3     Running       0          46m
kube-system                       kubernetes-dashboard-67d4c89764-bgt6k                     1/1     Running       0          46m
kube-system                       metrics-server-v0.3.1-67bb5c8d7-t7lh7                     2/2     Running       0          44m
kube-system                       monitoring-influxdb-grafana-v4-65cc9bb8c8-pfsxn           2/2     Running       0          46m
$ kubectl describe po/nginx-ingress-kubernetes-worker-controller-9rp7h
Name:                      nginx-ingress-kubernetes-worker-controller-9rp7h
Namespace:                 default
Node:                      juju-fdbd8c-7/252.0.112.1
Start Time:                Fri, 19 Oct 2018 17:16:17 -0400
Labels:                    controller-revision-hash=5dc6fb876
                           name=nginx-ingress-kubernetes-worker
                           pod-template-generation=1
Annotations:               <none>
Status:                    Terminating (lasts <invalid>)
Termination Grace Period:  60s
IP:                        252.0.112.1
Controlled By:             DaemonSet/nginx-ingress-kubernetes-worker-controller
Containers:
  nginx-ingress-kubernetes-worker:
    Container ID:  docker://42a947b16172b00f53a8655e03e4d979ef810c851ddc146e8855583632330ea3
    Image:         quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.1
    Image ID:      docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:2fa84bfa338fbc84672521c443074f0b2ab30ad0b6bea50c4c29ee2d012fbcba
    Ports:         80/TCP, 443/TCP
    Host Ports:    80/TCP, 443/TCP
    Args:
      /nginx-ingress-controller
      --default-backend-service=$(POD_NAMESPACE)/default-http-backend
      --configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf
      --enable-ssl-chain-completion=False
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:10254/healthz delay=30s timeout=5s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       nginx-ingress-kubernetes-worker-controller-9rp7h (v1:metadata.name)
      POD_NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  juju-application=kubernetes-worker
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason       Age                From                    Message
  ----     ------       ----               ----                    -------
  Normal   Scheduled    45m                default-scheduler       Successfully assigned default/nginx-ingress-kubernetes-worker-controller-9rp7h to juju-fdbd8c-7
  Normal   Pulling      45m                kubelet, juju-fdbd8c-7  pulling image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.1"
  Normal   Pulled       45m                kubelet, juju-fdbd8c-7  Successfully pulled image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.1"
  Normal   Created      45m                kubelet, juju-fdbd8c-7  Created container
  Normal   Started      45m                kubelet, juju-fdbd8c-7  Started container
  Warning  FailedMount  44m (x4 over 45m)  kubelet, juju-fdbd8c-7  MountVolume.SetUp failed for volume "nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw" : couldn't propagate object cache: timed out waiting for the condition
  Warning  FailedMount  22m (x4 over 22m)  kubelet, juju-fdbd8c-7  MountVolume.SetUp failed for volume "nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw" : secret "nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw" not found
  Normal   Killing      22m                kubelet, juju-fdbd8c-7  Killing container with id docker://nginx-ingress-kubernetes-worker:Need to kill Pod

Waited a few hours and then

$ kubectl get po
NAME                                               READY   STATUS        RESTARTS   AGE
microbot-66cb4987b7-h4wkd                          1/1     Running       0          4h3m
microbot-66cb4987b7-sn6dm                          1/1     Running       0          4h3m
microbot-66cb4987b7-tgrmc                          1/1     Running       0          4h3m
nginx-ingress-kubernetes-worker-controller-9rp7h   0/1     Terminating   0          4h40m
nginx-ingress-kubernetes-worker-controller-b2qn2   0/1     Terminating   0          4h39m
nginx-ingress-kubernetes-worker-controller-mnhgq   0/1     Terminating   0          4h37m
...
$ kubectl describe po/nginx-ingress-kubernetes-worker-controller-9rp7h
Name:                      nginx-ingress-kubernetes-worker-controller-9rp7h
Namespace:                 default
Node:                      juju-fdbd8c-7/252.0.112.1
Start Time:                Fri, 19 Oct 2018 17:16:17 -0400
Labels:                    controller-revision-hash=5dc6fb876
                           name=nginx-ingress-kubernetes-worker
                           pod-template-generation=1
Annotations:               <none>
Status:                    Terminating (lasts <invalid>)
Termination Grace Period:  60s
IP:                        252.0.112.1
Controlled By:             DaemonSet/nginx-ingress-kubernetes-worker-controller
Containers:
  nginx-ingress-kubernetes-worker:
    Container ID:  docker://42a947b16172b00f53a8655e03e4d979ef810c851ddc146e8855583632330ea3
    Image:         quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.1
    Image ID:      docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:2fa84bfa338fbc84672521c443074f0b2ab30ad0b6bea50c4c29ee2d012fbcba
    Ports:         80/TCP, 443/TCP
    Host Ports:    80/TCP, 443/TCP
    Args:
      /nginx-ingress-controller
      --default-backend-service=$(POD_NAMESPACE)/default-http-backend
      --configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf
      --enable-ssl-chain-completion=False
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:10254/healthz delay=30s timeout=5s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       nginx-ingress-kubernetes-worker-controller-9rp7h (v1:metadata.name)
      POD_NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nginx-ingress-kubernetes-worker-serviceaccount-token-w7kcw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  juju-application=kubernetes-worker
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:          <none>

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 21
  • Comments: 56 (13 by maintainers)

Most upvoted comments

I’m facing this issue. Is there any progress on it?

/remove-triage needs-information
/triage accepted

(assuming the repro works)

/remove-lifecycle rotten

couldn’t propagate object cache: timed out waiting for the condition

I’ve noticed this can happen when the apiserver is under high load, but the kubelet should retry and eventually mount the volume successfully.
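
If you want to confirm whether the kubelet is still retrying, something like this shows the FailedMount events as they recur (just a sketch; substitute the name of the affected pod):

$ kubectl get events --field-selector involvedObject.name=<pod-name> --watch
$ kubectl describe pod <pod-name> | grep FailedMount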

Pods stuck terminating

In my case, the pod was stuck terminating because the docker daemon reported the wrong status. The kubelet tries to stop, then kill, and finally delete the container, but docker reports the container as live even after a stop or kill. The kubelet only calls docker rm container_id, but in this situation the container can only be deleted with docker rm -f container_id (and yes, this is a docker bug).

[root@node-192-168-134-14 ~]$ docker ps | grep 9cbf95577c3d
9cbf95577c3d        test.caicloudprivatetest.com/release/alert-notifier@sha256:ab92f98591e1e389e428ed27390482785e4ea9b8fec3cadd2de623e69aa4116d                                            "./notifier -confi..."   17 hours ago        Up 17 hours                             k8s_c0_alerting-notifier-notifier-v1-0-b95989757-fg9lr_default_45dd72af-3e42-11e9-a0d4-525400183fcd_118
[root@node-192-168-134-14 ~]$ docker inspect 9cbf95577c3d | grep -i pid
            "Pid": 802661,
            "PidMode": "",
            "PidsLimit": 0,
[root@node-192-168-134-14 ~]$ ps aux | grep 802661
root     1478718  0.0  0.0   9088   672 pts/0    S+   20:06   0:00 grep --color=auto 802661

[root@node-192-168-134-14 ~]$ docker stop 9cbf95577c3d
9cbf95577c3d
[root@node-192-168-134-14 ~]$ docker ps | grep 9cbf95577c3d
9cbf95577c3d        test.caicloudprivatetest.com/release/alert-notifier@sha256:ab92f98591e1e389e428ed27390482785e4ea9b8fec3cadd2de623e69aa4116d                                            "./notifier -confi..."   17 hours ago        Up 17 hours                             k8s_c0_alerting-notifier-notifier-v1-0-b95989757-fg9lr_default_45dd72af-3e42-11e9-a0d4-525400183fcd_118

In my experience, the kubelet logs on the node are always helpful for explaining why a pod is stuck terminating.
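
A possible manual workaround in that situation (just a sketch, and only for containers docker genuinely refuses to stop; the container ID and pod name are the ones from the docker ps output above):

# force-remove the container that docker will not stop or kill
$ docker rm -f 9cbf95577c3d
# if the pod object still lingers in the API afterwards, force-delete it;
# note this only removes the API object, it does not clean anything up on the node
$ kubectl delete pod alerting-notifier-notifier-v1-0-b95989757-fg9lr --grace-period=0 --force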

I’ve been spending quite a lot of time attempting to debug/understand why pods can get stuck in a terminating state like this (see https://github.com/longhorn/longhorn/issues/2629 for reference; the fundamental issue is the same, but it has a different cause).

I believe I’ve identified one path that gets to this state, but it specifically involves a restarted kubelet that ends up with a mismatch between its desired and actual state. It’s quite rare that someone would hit this, but it’s extremely reproducible if set up deliberately.

Essentially, if a pod is created and uses a secret that is deleted after the pod is created/running, and the kubelet is then restarted, the kubelet will emit the MountVolume.SetUp error messages as it attempts to reconcile the desired state against the contents on disk.

The reconciler has a syncStates function: https://github.com/kubernetes/kubernetes/blob/v1.21.1/pkg/kubelet/volumemanager/reconciler/reconciler.go#L385 that inspects the kubelet directory, and attempts to reconcile the desired/actual state of the world with it. If you look at https://github.com/kubernetes/kubernetes/blob/v1.21.1/pkg/kubelet/volumemanager/reconciler/reconciler.go#L415-L424 you can see that if the volume exists in the desired state of the world (which it should, because the pod is there and is using the volume), then it marks the volume InUse, and the kubelet will then attempt to MountVolume.SetUp the volume again. The issue is, if the secret was deleted, then the MountVolume.SetUp will fail and the Actual State of the World will never be populated with the secret volume.

The problem is that on termination of the pod, when the kubelet is determining which volumes to remove in order to finish terminating it, it doesn’t attempt to unmount the secret, because the secret isn’t in the actual state of the world: https://github.com/kubernetes/kubernetes/blob/v1.21.1/pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go#L281-L289
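
To make that path concrete, here is a minimal repro sketch (the names are made up for illustration, and restarting the kubelet assumes a systemd-managed node; adjust the unit name as needed):

$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: demo-secret
stringData:
  key: value
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: demo
    image: nginx
    volumeMounts:
    - name: token
      mountPath: /etc/demo
  volumes:
  - name: token
    secret:
      secretName: demo-secret
EOF
# once the pod is Running, delete the secret out from under it
$ kubectl delete secret demo-secret
# restart the kubelet on the node the pod landed on
$ systemctl restart kubelet
# the kubelet now logs MountVolume.SetUp failures: the volume is in the desired
# state of the world but can never be added to the actual state of the world
# finally, delete the pod; it should get stuck in Terminating because the secret
# volume was never recorded as mounted, so it is never unmounted either
$ kubectl delete pod demo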

Interestingly, if I umount the tmpfs mount of the secret it is complaining about, the kubelet will terminate the pod, because it considers it “orphaned”, i.e.

With my pod stuck in terminating, I see logs like:

I0603 02:13:36.816846   13556 kubelet_volumes.go:70] "Pod found, but volumes are still mounted on disk" podUID=45446e4b-dd76-4339-96fa-0d6881250103 paths=[/var/lib/kubelet/pods/45446e4b-dd76-4339-96fa-0d6881250103/volumes/kubernetes.io~secret/rke2-ingress-nginx-v3-token-6m4fk]
I0603 02:13:36.816858   13556 kubelet_pods.go:2014] "Orphaned pod found, but volumes not yet removed.  Reducing cpu to minimum" podUID=45446e4b-dd76-4339-96fa-0d6881250103

then after a umount /var/lib/kubelet/pods/45446e4b-dd76-4339-96fa-0d6881250103/volumes/kubernetes.io~secret/rke2-ingress-nginx-v3-token-6m4fk I see logs like:

I0603 02:13:38.815434   13556 kubelet_pods.go:2020] "Orphaned pod found, removing pod cgroups" podUID=45446e4b-dd76-4339-96fa-0d6881250103
I0603 02:13:38.817361   13556 noop.go:30] No-op Destroy function called
I0603 02:13:38.817370   13556 manager.go:1044] Destroyed container: "/kubepods/burstable/pod45446e4b-dd76-4339-96fa-0d6881250103" (aliases: [], namespace: "")
I0603 02:13:38.817379   13556 handler.go:325] Added event &{/kubepods/burstable/pod45446e4b-dd76-4339-96fa-0d6881250103 2021-06-03 02:13:38.817374768 +0000 UTC m=+1973.569615324 containerDeletion {<nil>}}
--
I0603 02:13:42.815109   13556 kubelet_volumes.go:170] "Orphaned pod found, removing" podUID=45446e4b-dd76-4339-96fa-0d6881250103

and the pod disappears.
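
In case it helps anyone else, the leftover mounts can be located from the pod UID in those log lines (a sketch; the UID and path here are the ones from my logs above, and this should only be done for pods the kubelet already reports as orphaned):

# list the secret tmpfs mounts still present for the orphaned pod UID
$ findmnt -t tmpfs | grep 45446e4b-dd76-4339-96fa-0d6881250103
# or, equivalently:
$ mount | grep 'kubernetes.io~secret' | grep 45446e4b-dd76-4339-96fa-0d6881250103
# unmount them; on its next housekeeping pass the kubelet removes the orphaned pod
$ umount /var/lib/kubelet/pods/45446e4b-dd76-4339-96fa-0d6881250103/volumes/kubernetes.io~secret/rke2-ingress-nginx-v3-token-6m4fk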

@ehashman can we unstale this?

I’m also getting this issue, and nothing is fixing it. Everything was working fine until the last week or so, and I’ve made no changes to any of my configs. Literally the only thing I’ve done on the affected machine is apt-get update for the latest packages on Ubuntu 18.04.

I’ve tried:

  • Rebooting
  • Restarting the docker daemon
  • Deleting and re-applying all Kubernetes resources
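
One last resort, sketched below, is a force delete; note that it only removes the pod object from the API server and does not clean up anything still mounted on the node, so it hides rather than fixes the underlying problem:

$ kubectl delete pod <stuck-pod> --grace-period=0 --force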

Thanks for the help, @fntlnz!

This should be fixed via https://github.com/kubernetes/kubernetes/pull/110670 now.

/close

@gnufied to be fair, https://github.com/kubernetes/kubernetes/issues/96635 is a duplicate of this one. This one existed almost 2 years earlier.

@brandond you mean this bit from @Oats87? Is this the simplest repro?

Essentially, if a pod is created and uses a secret that is deleted after the pod is created/running, and the kubelet is then restarted, the kubelet will emit the MountVolume.SetUp error messages as it attempts to reconcile the desired state against the contents on disk.