kubernetes: Regression: mirror pods get deleted and recreated repeatedly due to spec mismatch

While debugging #8637, I noticed that some of the service pods, namely the monitoring and logging pods, get stuck in Pending, sometimes forever:

$ cluster/kubectl.sh get pods --namespace="default"
fluentd-elasticsearch-kubernetes-minion-9q50   kubernetes-minion-9q50/   <none>   Pending   About an hour
                                               fluentd-elasticsearch   gcr.io/google_containers/fluentd-elasticsearch:1.5
...

$ cluster/kubectl.sh get pods -o yaml fluentd-elasticsearch-kubernetes-minion-9q50
apiVersion: v1beta3
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.mirror: mirror
    kubernetes.io/config.source: file
  creationTimestamp: 2015-05-21T21:01:21Z
  name: fluentd-elasticsearch-kubernetes-minion-9q50
  namespace: default
  resourceVersion: "33083"
  selfLink: /api/v1beta3/namespaces/default/pods/fluentd-elasticsearch-kubernetes-minion-9q50
  uid: 884b09be-fffc-11e4-92b6-42010af084a4
spec:
  containers:
  - capabilities: {}
    env:
    - name: FLUENTD_ARGS
      value: -qq
    image: gcr.io/google_containers/fluentd-elasticsearch:1.5
    imagePullPolicy: IfNotPresent
    name: fluentd-elasticsearch
    resources:
      limits:
        cpu: 100m
    securityContext:
      capabilities: {}
      privileged: false
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /varlog
      name: varlog
    - mountPath: /var/lib/docker/containers
      name: containers
  dnsPolicy: ClusterFirst
  host: kubernetes-minion-9q50
  restartPolicy: Always
  serviceAccount: ""
  volumes:
  - hostPath:
      path: /var/log
    name: varlog
  - hostPath:
      path: /var/lib/docker/containers
    name: containers
status:
  phase: Pending

I logged into the node and found that the container is running happily there:

# docker ps -a | grep fluentd-elasticsearch-kubernetes-minion-9q50
5219fed9570b        gcr.io/google_containers/fluentd-elasticsearch:1.5   "\"/bin/sh -c '/usr/   2 hours ago         Up 2 hours                                     k8s_fluentd-elasticsearch.c99175f6_fluentd-elasticsearch-kubernetes-minion-9q50_default_a8d3815def5de29fd315adf5d9fc5acc_3feacf95   
68e799daa598        gcr.io/google_containers/pause:0.8.0                 "/pause"               2 hours ago         Up 2 hours                                     k8s_POD.e4cc795_fluentd-elasticsearch-kubernetes-minion-9q50_default_a8d3815def5de29fd315adf5d9fc5acc_e6a32500                      
a029f8710c93        gcr.io/google_containers/pause:0.8.0                 "/pause"               2 hours ago                                                        k8s_POD.e4cc795_fluentd-elasticsearch-kubernetes-minion-9q50_default_a8d3815def5de29fd315adf5d9fc5acc_82b1abf7                      
7b7c2c97aa50        gcr.io/google_containers/fluentd-elasticsearch:1.5   "\"/bin/sh -c '/usr/   2 hours ago         Exited (143) 2 hours ago                       k8s_fluentd-elasticsearch.c99175f6_fluentd-elasticsearch-kubernetes-minion-9q50_default_a8d3815def5de29fd315adf5d9fc5acc_f47f189b   
fe98a27d7687        gcr.io/google_containers/pause:0.8.0                 "/pause"               2 hours ago         Exited (0) 2 hours ago                         k8s_POD.e4cc795_fluentd-elasticsearch-kubernetes-minion-9q50_default_a8d3815def5de29fd315adf5d9fc5acc_d4c77781                      
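
To map these containers back to a pod, the container names can be split into fields. The sketch below assumes the layout k8s_<container>.<hash>_<pod>_<namespace>_<uid>_<suffix>, which is inferred only from the five names in the output above and is not a documented contract; `containerRef` and `parseContainerName` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// containerRef holds the fields that appear to be encoded in a container name.
// The field meanings are an assumption based on the docker ps output above.
type containerRef struct {
	Container string // container name within the pod, e.g. "POD"
	Hash      string // value after the dot in the second field
	Pod       string
	Namespace string
	PodUID    string
	Suffix    string // trailing per-container value
}

// parseContainerName splits a name of the assumed form
// k8s_<container>.<hash>_<pod>_<namespace>_<uid>_<suffix>.
func parseContainerName(name string) (containerRef, error) {
	parts := strings.Split(name, "_")
	if len(parts) != 6 || parts[0] != "k8s" {
		return containerRef{}, fmt.Errorf("unexpected container name %q", name)
	}
	ch := strings.SplitN(parts[1], ".", 2)
	if len(ch) != 2 {
		return containerRef{}, fmt.Errorf("missing hash in %q", parts[1])
	}
	return containerRef{
		Container: ch[0],
		Hash:      ch[1],
		Pod:       parts[2],
		Namespace: parts[3],
		PodUID:    parts[4],
		Suffix:    parts[5],
	}, nil
}

func main() {
	// Name copied from the docker ps output above.
	name := "k8s_POD.e4cc795_fluentd-elasticsearch-kubernetes-minion-9q50_default_a8d3815def5de29fd315adf5d9fc5acc_e6a32500"
	ref, err := parseContainerName(name)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pod=%s namespace=%s uid=%s container=%s\n",
		ref.Pod, ref.Namespace, ref.PodUID, ref.Container)
}
```

Note that the UID embedded in these names (a8d3815def5de29fd315adf5d9fc5acc) is not the same as the `uid` of the pod object returned by the API server above (884b09be-fffc-11e4-92b6-42010af084a4).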

It looks like the status is never reported by the kubelet. I checked the kubelet log and found:

E0521 21:01:18.661917    2486 kubelet.go:1074] Deleting mirror pod "fluentd-elasticsearch-kubernetes-minion-9q50_default" because it is outdated
W0521 21:01:19.175759    2486 status_manager.go:60] Failed to updated pod status: error updating status for pod "fluentd-elasticsearch-kubernetes-minion-9q50": pods "fluentd-elasticsearch-kubernetes-minion-9q50" not found

Why can't we delete the outdated mirror pod and then create a new one? Because of this regression, once we run into this state, the rest of the test is never triggered, and the run eventually times out and fails.
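
For readers unfamiliar with the failure mode, here is a minimal Go sketch of how a spec mismatch can turn into a delete/recreate loop. It assumes the mismatch comes from fields that the API server fills in on creation (for example `dnsPolicy`); the report does not confirm which fields actually differed. `PodSpec`, `applyDefaults`, `isOutdated`, and `syncMirror` are hypothetical names and this is not the kubelet's code.

```go
package main

import (
	"fmt"
	"reflect"
)

// PodSpec is a hypothetical, heavily reduced stand-in for a real pod spec.
type PodSpec struct {
	Image     string
	DNSPolicy string
}

// applyDefaults imitates a server filling in unset fields on creation.
// Assumption: defaulting is the source of the mismatch.
func applyDefaults(s PodSpec) PodSpec {
	if s.DNSPolicy == "" {
		s.DNSPolicy = "ClusterFirst"
	}
	return s
}

// isOutdated reports whether the mirror pod differs from the expected spec.
func isOutdated(want, mirror PodSpec) bool {
	return !reflect.DeepEqual(want, mirror)
}

// syncMirror runs `rounds` sync iterations and returns how many times the
// mirror pod was deleted. With normalize=false the raw local spec is compared
// against the defaulted mirror; with normalize=true both sides are defaulted.
func syncMirror(static PodSpec, rounds int, normalize bool) int {
	deletions := 0
	mirror := applyDefaults(static) // initial creation on the server
	for i := 0; i < rounds; i++ {
		want := static
		if normalize {
			want = applyDefaults(static)
		}
		if isOutdated(want, mirror) {
			deletions++
			mirror = applyDefaults(static) // recreated, defaulted again
		}
	}
	return deletions
}

func main() {
	static := PodSpec{Image: "gcr.io/google_containers/fluentd-elasticsearch:1.5"}
	fmt.Println("deletions, raw comparison:       ", syncMirror(static, 5, false))
	fmt.Println("deletions, normalized comparison:", syncMirror(static, 5, true))
}
```

In this toy model the raw comparison deletes the mirror pod on every round, because the recreated object is defaulted again and never equals the local spec, while comparing normalized specs deletes nothing.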

About this issue

  • State: closed
  • Created 9 years ago
  • Comments: 19 (16 by maintainers)

Most upvoted comments

Although this issue was closed, it still appears in my environment:

Apr 16 05:17:15 ubuntu kubelet[1847]: W0416 05:17:15.477579    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f335eecd-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:17 ubuntu kubelet[1847]: W0416 05:17:17.487889    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(f4672165-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:17 ubuntu kubelet[1847]: W0416 05:17:17.487964    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(f4489e78-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:17 ubuntu kubelet[1847]: W0416 05:17:17.488107    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(f42a158a-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:18 ubuntu kubelet[1847]: W0416 05:17:18.493343    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(f5b6d0f6-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:18 ubuntu kubelet[1847]: W0416 05:17:18.493521    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f4c2aaa6-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:19 ubuntu kubelet[1847]: W0416 05:17:19.500984    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(f598480b-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:19 ubuntu kubelet[1847]: W0416 05:17:19.501080    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(f5d55a78-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:20 ubuntu kubelet[1847]: W0416 05:17:20.505977    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f64f6755-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:20 ubuntu kubelet[1847]: W0416 05:17:20.506096    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(f630df99-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:22 ubuntu kubelet[1847]: W0416 05:17:22.519575    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(f72505bd-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:22 ubuntu kubelet[1847]: W0416 05:17:22.519738    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(f7438f91-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:23 ubuntu kubelet[1847]: W0416 05:17:23.524652    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f7dc2807-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:23 ubuntu kubelet[1847]: W0416 05:17:23.524734    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(f7faa454-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:24 ubuntu kubelet[1847]: W0416 05:17:24.529879    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(f968e218-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:24 ubuntu kubelet[1847]: W0416 05:17:24.529950    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(f8b1c4f8-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:24 ubuntu kubelet[1847]: W0416 05:17:24.530095    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(f89339bb-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:25 ubuntu kubelet[1847]: W0416 05:17:25.534860    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f94a5849-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:26 ubuntu kubelet[1847]: W0416 05:17:26.539998    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(fa7b83f7-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:27 ubuntu kubelet[1847]: W0416 05:17:27.545168    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(fa5cfe8c-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:27 ubuntu kubelet[1847]: W0416 05:17:27.545233    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(fa3e7bef-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:28 ubuntu kubelet[1847]: W0416 05:17:28.550372    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(fbcb35c7-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:28 ubuntu kubelet[1847]: W0416 05:17:28.550439    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(fad71441-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:29 ubuntu kubelet[1847]: W0416 05:17:29.555604    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(fb51241b-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:30 ubuntu kubelet[1847]: W0416 05:17:30.560627    1847 kubelet.go:1602] Deleting mirror pod "kube-apiserver-ubuntu_kube-system(fcfc6290-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:30 ubuntu kubelet[1847]: W0416 05:17:30.560701    1847 kubelet.go:1602] Deleting mirror pod "kube-controller-manager-ubuntu_kube-system(fc45466a-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:30 ubuntu kubelet[1847]: W0416 05:17:30.560844    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(fc63cf79-4156-11e8-84f4-0050569b7580)" because it is outdated
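
To quantify a loop like this, the log can be summarized per pod. The following Go sketch counts "Deleting mirror pod" lines and the distinct UIDs seen for each pod; a new UID on every deletion is consistent with the mirror pod being recreated each time. The regular expression is written against the message format shown in the logs in this thread only, and `deleteRE`, `podStats`, `countDeletions`, and `sampleLog` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// deleteRE matches the message seen above. The "(uid)" part is optional
// because the 2015 log line in this thread does not include it.
var deleteRE = regexp.MustCompile(
	`Deleting mirror pod "([^"_]+)_([^"(]+)(?:\(([^)]+)\))?" because it is outdated`)

// podStats accumulates deletions and distinct UIDs for one pod.
type podStats struct {
	Deletions int
	UIDs      map[string]bool
}

// countDeletions scans log text line by line and groups matches
// by "namespace/name".
func countDeletions(log string) map[string]*podStats {
	stats := make(map[string]*podStats)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		m := deleteRE.FindStringSubmatch(sc.Text())
		if m == nil {
			continue
		}
		key := m[2] + "/" + m[1]
		st, ok := stats[key]
		if !ok {
			st = &podStats{UIDs: make(map[string]bool)}
			stats[key] = st
		}
		st.Deletions++
		if m[3] != "" {
			st.UIDs[m[3]] = true
		}
	}
	return stats
}

// sampleLog holds three lines copied from the log above.
const sampleLog = `Apr 16 05:17:17 ubuntu kubelet[1847]: W0416 05:17:17.487889    1847 kubelet.go:1602] Deleting mirror pod "kube-scheduler-ubuntu_kube-system(f4672165-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:18 ubuntu kubelet[1847]: W0416 05:17:18.493521    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f4c2aaa6-4156-11e8-84f4-0050569b7580)" because it is outdated
Apr 16 05:17:20 ubuntu kubelet[1847]: W0416 05:17:20.505977    1847 kubelet.go:1602] Deleting mirror pod "etcd-ubuntu_kube-system(f64f6755-4156-11e8-84f4-0050569b7580)" because it is outdated`

func main() {
	stats := countDeletions(sampleLog)
	keys := make([]string, 0, len(stats))
	for k := range stats {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output order
	for _, k := range keys {
		fmt.Printf("%s: %d deletions, %d distinct UIDs\n",
			k, stats[k].Deletions, len(stats[k].UIDs))
	}
}
```

In practice the input would be the full kubelet log (for example, the journal output pasted above) rather than the embedded sample.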