kubernetes: "Terminated" pod on shutdown node listed in service endpoints.

What happened?

On our GKE cluster (using preemptible nodes), pods are correctly marked as “Terminated”, but they still show as “ready” in the endpoints list, so traffic is routed to them.

Deleting the Terminated pods removes them from the endpoints list.

The Terminated pods reappear after a node preemption.
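
The manual cleanup described above amounts to selecting the pods whose status matches the node-shutdown case and deleting them. A minimal Python sketch of that selection follows, assuming the input has the shape of `kubectl get pods -o json` output (a dict with an `items` list). The function name `terminated_pod_names` and the sample data are illustrative and not part of any Kubernetes tooling.

```python
def terminated_pod_names(pod_list):
    """Return names of pods left in phase Failed with reason Terminated."""
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed" and status.get("reason") == "Terminated":
            names.append(pod["metadata"]["name"])
    return names


# Sample data: the first pod mirrors the one in this report;
# the second pod's phase is an assumption made for illustration.
pods = {
    "items": [
        {"metadata": {"name": "docs-default-7c4ff6b58f-kqz9p"},
         "status": {"phase": "Failed", "reason": "Terminated"}},
        {"metadata": {"name": "docs-default-7c4ff6b58f-6l979"},
         "status": {"phase": "Running"}},
    ]
}

print(terminated_pod_names(pods))
```

The resulting names could then be passed to `kubectl delete pod`. This is only a stopgap, since the Terminated pods come back after the next preemption.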

What did you expect to happen?

Terminated pods should not be listed in endpoints.

How can we reproduce it (as minimally and precisely as possible)?

Create a GKE cluster using k8s 1.22.8-gke.200. Add a node group using preemptible nodes, on Container-Optimized OS with containerd. Create a deployment and a service. Check the endpoints list after a preemption event.

Anything else we need to know?

Here is the state of the pod, Endpoints, and EndpointSlice, and the service definition.

Example pod (the reason shows it was terminated due to node shutdown); image names and some args and command details are redacted:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu request for container
      docs; cpu request for init container docs-beanbag-init'
    prometheus.io/scrape: "true"
  creationTimestamp: "2022-04-26T10:48:14Z"
  generateName: docs-default-7c4ff6b58f-
  labels:
    app: docs
    dnsName: docs-staging.qubit.com
    externalAccess: public
    pod-template-hash: 7c4ff6b58f
    slack_channel: eng-frontend-alerts
    team: product_eng
    track: default
  name: docs-default-7c4ff6b58f-kqz9p
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: docs-default-7c4ff6b58f
    uid: 9ea90c14-9c18-48ce-beea-43772405acc0
  resourceVersion: "820041662"
  uid: d7938ddc-c167-4c76-848e-eb373b4f8812
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - docs
          topologyKey: kubernetes.io/hostname
        weight: 1
  containers:
  - env:
    - name: KUBERNETES_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: KUBERNETES_POD_IP_ADDRESS
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: KUBERNETES_HOST_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP
    - name: KUBERNETES_CPU_REQUEST
      valueFrom:
        resourceFieldRef:
          containerName: docs
          divisor: "0"
          resource: requests.cpu
    - name: KUBERNETES_MEMORY_REQUEST
      valueFrom:
        resourceFieldRef:
          containerName: docs
          divisor: "0"
          resource: requests.memory
    image: XXXX
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /status
        port: http
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: docs
    ports:
    - containerPort: 1028
      name: http
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /status
        port: http
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /config
      name: templatesdir-docs
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-nqg5n
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - args:
    - XXXX
    command:
    - /XXXX
    env:
    - name: KUBERNETES_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: KUBERNETES_HOST_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP
    - name: TLS_DOMAIN
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: TLS_IP_ADDRESS
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    image: XXXX
    imagePullPolicy: Always
    name: beanbag-init
    ports:
    - containerPort: 5000
      name: config-http
      protocol: TCP
    resources:
      requests:
        cpu: 10m
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /config
      name: templatesdir-docs
    - mountPath: /token-data
      name: tokendata-docs
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-nqg5n
      readOnly: true
  nodeName: gke-europe-west1-pre-n1-4cpu-26gb-0298d3a1-fcsj
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: gke.io/optimize-utilization-scheduler
  securityContext: {}
  serviceAccount: docs
  serviceAccountName: docs
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir:
      medium: Memory
    name: templatesdir-docs
  - emptyDir:
      medium: Memory
    name: tokendata-docs
  - name: kube-api-access-nqg5n
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-04-27T10:53:47Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-04-27T10:53:56Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-04-27T10:53:56Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-04-26T10:48:14Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://2bf2d59e54e86e4fa0921813eb58e26defdbb18c731e00519645376dd7f6a024
    image: XXXX
    imageID: XXXX
    lastState: {}
    name: docs
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-04-27T10:53:47Z"
  hostIP: 10.132.0.11
  initContainerStatuses:
  - containerID: containerd://0a0f5165e9b3126c50efe478bee822f8ea773f7d07c1bc94ab63a4e33b7bb8fc
    image: XXXX
    imageID: XXXX
    lastState: {}
    name: docs-beanbag-init
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://0a0f5165e9b3126c50efe478bee822f8ea773f7d07c1bc94ab63a4e33b7bb8fc
        exitCode: 0
        finishedAt: "2022-04-27T10:53:29Z"
        reason: Completed
        startedAt: "2022-04-27T10:53:28Z"
  message: Pod was terminated in response to imminent node shutdown.
  phase: Failed
  podIP: 10.4.1.7
  podIPs:
  - ip: 10.4.1.7
  qosClass: Burstable
  reason: Terminated
  startTime: "2022-04-26T10:48:14Z"
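
The contradiction in the status above is that `phase` is `Failed` while the `Ready` condition is still `"True"`. A small Python sketch of a check for that combination is shown here, operating on the `status` block as a plain dict. The helper names are hypothetical, and the rule "a pod in a terminal phase should never count as ready" is this report's expectation, not a quote of the endpoints controller's logic.

```python
TERMINAL_PHASES = {"Succeeded", "Failed"}


def ready_condition(status):
    """Return True if the pod's Ready condition has status "True"."""
    for cond in status.get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False


def is_stale_ready(status):
    """A pod in a terminal phase that still reports Ready is stale."""
    return status.get("phase") in TERMINAL_PHASES and ready_condition(status)


# Fields copied from the pod status in this report.
pod_status = {
    "phase": "Failed",
    "reason": "Terminated",
    "message": "Pod was terminated in response to imminent node shutdown.",
    "conditions": [
        {"type": "Initialized", "status": "True"},
        {"type": "Ready", "status": "True"},
        {"type": "ContainersReady", "status": "True"},
        {"type": "PodScheduled", "status": "True"},
    ],
}

print(is_stale_ready(pod_status))  # prints True
```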

endpoints:

apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2022-04-28T11:03:10Z"
  creationTimestamp: "2020-09-22T10:23:21Z"
  labels:
    app: docs
    app.kubernetes.io/managed-by: Helm
    chart: baton-docs
    release: docs
    revision: "7"
  name: docs
  namespace: default
  resourceVersion: "820049716"
  uid: f0513b7c-cae3-4186-88c0-07297ca03932
subsets:
- addresses:
  - ip: 10.4.0.34
    nodeName: gke-europe-west1-pre-n1-4cpu-26gb-154195a1-qltt
    targetRef:
      kind: Pod
      name: docs-default-7c4ff6b58f-6l979
      namespace: default
      resourceVersion: "820049714"
      uid: 2323de45-c368-44ff-b2ae-fa7b47524c20
  - ip: 10.4.1.7
    nodeName: gke-europe-west1-pre-n1-4cpu-26gb-0298d3a1-fcsj
    targetRef:
      kind: Pod
      name: docs-default-7c4ff6b58f-kqz9p
      namespace: default
      resourceVersion: "820041662"
      uid: d7938ddc-c167-4c76-848e-eb373b4f8812
  ports:
  - name: http
    port: 1028
    protocol: TCP

EndpointSlice:

addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
  - 10.4.1.7
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: gke-europe-west1-pre-n1-4cpu-26gb-0298d3a1-fcsj
  targetRef:
    kind: Pod
    name: docs-default-7c4ff6b58f-kqz9p
    namespace: default
    resourceVersion: "820041662"
    uid: d7938ddc-c167-4c76-848e-eb373b4f8812
  zone: europe-west1-b
- addresses:
  - 10.4.0.34
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: gke-europe-west1-pre-n1-4cpu-26gb-154195a1-qltt
  targetRef:
    kind: Pod
    name: docs-default-7c4ff6b58f-6l979
    namespace: default
    resourceVersion: "820049714"
    uid: 2323de45-c368-44ff-b2ae-fa7b47524c20
  zone: europe-west1-c
kind: EndpointSlice
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2022-04-28T11:03:10Z"
  creationTimestamp: "2021-01-25T08:54:44Z"
  generateName: docs-
  generation: 2217
  labels:
    app: docs
    app.kubernetes.io/managed-by: Helm
    chart: baton-docs
    endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
    kubernetes.io/service-name: docs
    release: docs
    revision: "7"
  name: docs-qh77b
  namespace: default
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Service
    name: docs
    uid: b56dc06a-987e-4b78-b3ef-5ea61c4ec124
  resourceVersion: "820049718"
  uid: 7118f7a9-6c2e-4088-9b59-ac8fae154831
ports:
- name: http
  port: 1028
  protocol: TCP
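
To spot the problem from the EndpointSlice side, each endpoint's `targetRef.uid` can be cross-checked against the phase of the pod it points to. The Python sketch below does this on plain dicts. The function name `stale_endpoints` and the `pod_phases` mapping are hypothetical, and the `Running` phase of the second pod is an assumption, since its status is not shown in this report. A missing `ready` condition is treated as ready here.

```python
def stale_endpoints(endpoint_slice, pod_phases):
    """Return (pod name, addresses) for ready endpoints whose pod is in a terminal phase.

    pod_phases maps pod UID -> phase string (e.g. "Running", "Failed").
    """
    stale = []
    for ep in endpoint_slice.get("endpoints", []):
        ref = ep.get("targetRef") or {}
        if ref.get("kind") != "Pod":
            continue
        # Treat an absent "ready" value as ready.
        ready = (ep.get("conditions") or {}).get("ready") is not False
        phase = pod_phases.get(ref.get("uid"))
        if ready and phase in ("Succeeded", "Failed"):
            stale.append((ref.get("name"), ep.get("addresses", [])))
    return stale


# Trimmed copy of the EndpointSlice in this report.
slice_obj = {
    "endpoints": [
        {"addresses": ["10.4.1.7"],
         "conditions": {"ready": True, "serving": True, "terminating": False},
         "targetRef": {"kind": "Pod", "name": "docs-default-7c4ff6b58f-kqz9p",
                       "uid": "d7938ddc-c167-4c76-848e-eb373b4f8812"}},
        {"addresses": ["10.4.0.34"],
         "conditions": {"ready": True, "serving": True, "terminating": False},
         "targetRef": {"kind": "Pod", "name": "docs-default-7c4ff6b58f-6l979",
                       "uid": "2323de45-c368-44ff-b2ae-fa7b47524c20"}},
    ]
}

pod_phases = {
    "d7938ddc-c167-4c76-848e-eb373b4f8812": "Failed",   # from the pod status above
    "2323de45-c368-44ff-b2ae-fa7b47524c20": "Running",  # assumed, not shown in the report
}

print(stale_endpoints(slice_obj, pod_phases))
```

With the data above, only the endpoint for `10.4.1.7` is reported, which matches the pod that was terminated by the node shutdown.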

and the service

apiVersion: v1
kind: Service
metadata:
  annotations:
    baton.qutics.com/lazy-tristan: makes the template easier
    meta.helm.sh/release-name: docs
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2020-09-22T10:23:21Z"
  labels:
    app: docs
    app.kubernetes.io/managed-by: Helm
    chart: baton-docs
    release: docs
    revision: "7"
  name: docs
  namespace: default
  resourceVersion: "803310896"
  uid: b56dc06a-987e-4b78-b3ef-5ea61c4ec124
spec:
  clusterIP: 10.69.11.121
  clusterIPs:
  - 10.69.11.121
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 1028
  selector:
    app: docs
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

Kubernetes version

1.22.8-gke.200
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.6", GitCommit:"f59f5c2fda36e4036b49ec027e556a15456108f0", GitTreeState:"clean", BuildDate:"2022-01-19T17:33:06Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8-gke.200", GitCommit:"87bd5277f85536686030dd1fd7d6e39776dd44d1", GitTreeState:"clean", BuildDate:"2022-03-17T20:42:25Z", GoVersion:"go1.16.14b7", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

GKE

OS version

Nodes are cos_containerd 1.22.8-gke.200

Install tools

GKE

Container runtime (CRI) and version (if applicable)

containerd

Related plugins (CNI, CSI, …) and versions (if applicable)

GKE defaults

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 36 (15 by maintainers)

Most upvoted comments

This is not included in the changelog because the cherry-pick PRs were missing a release note. The PR above has been cherry-picked. I'm not sure whether we need to update the changelog.

FWIW, https://github.com/kubernetes/kubernetes/issues/108594 sounds more closely related to the description in this ticket. It was cherry-picked into 1.22.9, but the initial description here mentions 1.22.8.