kubernetes: kubectl get pods --watch doesn't always report terminal status with exitCode during Deployment update

What happened:

When updating an existing Deployment (kubectl replace -f deployment.yaml), new pods are created and old pods are killed. We continuously run kubectl get pods --watch for internal monitoring; however, sometimes (roughly 30-50% of the cases) we don’t receive the final update for an old pod that is being stopped. We expect something like:

"state": {
  "terminated": {
    "containerID": "docker://86714d667e1468048d982f2776020e4dc572985031eee75246f3c7f5a0280734",
    "exitCode": 0,
    "finishedAt": "2021-08-05T07:48:13Z",
    "reason": "Completed",
    "startedAt": "2021-08-05T07:45:56Z"
  }
}

Instead, the last status update we receive for such a pod is weird:

"state": {
  "waiting": {
    "reason": "ContainerCreating"
  }
}

(Complete logs are attached below.) We really need to know the exitCode of every terminated container/pod during the upgrade.
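
For illustration, here is a minimal sketch of the kind of exit-code monitoring we mean, written against the official Kubernetes Python client rather than our actual kubectl-based tooling (the client library and the default namespace are assumptions, not what we run in production):

from kubernetes import client, config, watch

# Illustrative only: watch pods and log the terminated state (with exitCode)
# that we expect to eventually see for every old pod during an update.
config.load_kube_config()  # or config.load_incluster_config() when run in-cluster
v1 = client.CoreV1Api()

for event in watch.Watch().stream(v1.list_namespaced_pod, namespace="default"):
    pod = event["object"]
    for cs in pod.status.container_statuses or []:
        terminated = cs.state.terminated
        if terminated is not None:
            print(pod.metadata.name, cs.name,
                  "exitCode:", terminated.exit_code,
                  "reason:", terminated.reason)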

What you expected to happen:

Receive the exitCode (included in the final pod status update notification) for all pods during a Deployment upgrade.

How to reproduce it (as minimally and precisely as possible):

It is hard to reproduce consistently, but it happens to us on almost every Deployment update, for at least one pod.

Deployment spec: deployment.yml. I am attaching 3 logs (all captured via kubectl get pods --output json --watch-only --output-watch-events | jq 'del(.object.metadata.managedFields)'):

  • Initial creation of the Deployment (not really interesting).

  • Logs from a deployment upgrade that went OK (changed resources.limits.memory and applied it with kubectl replace to trigger the upgrade). All pods correctly reported the final terminated status with exitCode. This log is mainly for the complete picture; nothing much interesting there. All pods start terminating (a MODIFIED event with deletionTimestamp appears), then they shift from state.running to state.terminated with the exit code attached.

  • Logs from the second (faulty) upgrade. <— THIS is the important one. Again I just changed the memory using kubectl replace, and one pod doesn’t report the exitCode at all. This one is interesting:

    • 3 pods are running before the upgrade: example-deployment-5f846c8bc5-kzht5, example-deployment-5f846c8bc5-n49g2, example-deployment-5f846c8bc5-k8ppq
    • The first two of them start terminating (deletionTimestamp appears, then their containerStatuses change to state.terminated)
    • The last one (k8ppq) starts terminating as well (deletionTimestamp appears), but it never reports how it ended: its last status is the weird ContainerCreating (line 3375 of the log), with no state.terminated and no exit code. (A small helper for replaying these watch logs for a single pod is sketched after this list.)
  • I am also attaching one additional log of events (captured via kubectl get events --watch-only --output-watch-events --output json | jq 'del(.object.metadata.managedFields)'), but there is nothing useful there (at least nothing I am aware of).
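
For convenience, here is a small helper (hypothetical, not one of the attached files) that replays one of these watch logs and prints the state transitions of a single pod such as k8ppq; it assumes the log is the concatenated JSON produced by the kubectl | jq pipeline above:

import json
import sys

# Hypothetical usage: python follow_pod.py watch-log.json example-deployment-5f846c8bc5-k8ppq
log_path, pod_name = sys.argv[1], sys.argv[2]

with open(log_path) as f:
    buf = f.read()

decoder = json.JSONDecoder()
pos = 0
while pos < len(buf):
    # Skip whitespace between the concatenated JSON watch events.
    while pos < len(buf) and buf[pos].isspace():
        pos += 1
    if pos >= len(buf):
        break
    event, pos = decoder.raw_decode(buf, pos)
    pod = event["object"]
    if pod["metadata"]["name"] != pod_name:
        continue
    deletion = pod["metadata"].get("deletionTimestamp")
    states = [list(cs.get("state", {}).keys())
              for cs in pod.get("status", {}).get("containerStatuses", [])]
    print(event["type"], "deletionTimestamp:", deletion, "states:", states)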

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"", GoVersion:"go1.16.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.13-eks-8df270", GitCommit:"8df2700a72a2598fa3a67c05126fa158fd839620", GitTreeState:"clean", BuildDate:"2021-07-31T01:36:57Z", GoVersion:"go1.15.14", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
AWS EKS, self-managed worker nodes on EC2 instances with https://cloud-images.ubuntu.com/docs/aws/eks/
  • OS (e.g: cat /etc/os-release):
Ubuntu 20.04.2 LTS
  • Kernel (e.g. uname -a):
5.4.0-1037-aws
  • Install tools:
AWS CloudFormation templates - mostly from official AWS docs
  • Network plugin and version (if this is a network-related bug):
Calico CNI
calico-node - docker.io/calico/cni:v3.19.1
tigera-operator - quay.io/tigera/operator:v1.17.4
  • Others:

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 1
  • Comments: 23 (12 by maintainers)

Most upvoted comments

@SergeyKanzhelev @bydga I have scripted a reproduction of the issue and confirmed it on 1.21 and 1.22. However, it appears to be fixed on 1.23 and 1.25.

To reproduce, I run a loop of 10 iterations of create / replace / delete of a deployment and check whether the output (out.txt, see below) contains “The container could not be located when the pod was deleted” or “ContainerStatusUnknown”.

On one terminal I run:

kubectl get pods --output json --watch-only --output-watch-events | jq 'del(.object.metadata.managedFields)' | tee out.txt

On the other terminal I run python repro.py against a cluster with 3 nodes:

import os, time

# Scratch file used to capture `kubectl get pods` output between checks.
TMP_KUBE_BUILD = "/tmp/cmd-0123"

def exec_cmd(cmd):
    # Run a shell command, echo it, and return the status reported by os.system
    # (0 means success).
    print("----")
    print(cmd)
    exitCode = os.system(cmd)
    print("exit code: ", exitCode)
    print("----")
    return exitCode

def is_clean(exp_running = 3):
    # True once exactly `exp_running` pods are Running and the listing contains
    # only the header line plus those pods.
    exec_cmd("kubectl get pods | tee %s" % TMP_KUBE_BUILD)
    with open(TMP_KUBE_BUILD) as file:
        lines = file.readlines()
        running_count = 0
        for line in lines:
            if "Running" in line:
                running_count += 1
        return running_count == exp_running and len(lines) == exp_running + 1

def await_clean(exp_running = 3):
    # Poll once per second until the cluster settles into the expected state.
    is_clean_tmp = False
    while not is_clean_tmp:
        is_clean_tmp = is_clean(exp_running)
        time.sleep(1.0)

if __name__ == "__main__":
    # 10 iterations of create / replace / delete, waiting for 3 Running pods
    # between the steps.
    for i in range(0, 10):
        print("=============================== START: " + str(i))
        if exec_cmd("kubectl create -f deployment.yaml") != 0:
            continue
        await_clean()
        if exec_cmd("kubectl replace -f deployment-up.yaml") != 0:
            continue
        await_clean()
        if exec_cmd("kubectl delete -f deployment.yaml") != 0:
            continue
        print("=============================== END: " + str(i))
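
The check mentioned above is then just a scan of out.txt for the two failure markers; a minimal sketch (the file name matches the tee above, the rest is illustrative):

# Scan the watch output for the two markers that indicate the bad terminal status.
MARKERS = (
    "The container could not be located when the pod was deleted",
    "ContainerStatusUnknown",
)

with open("out.txt") as f:
    content = f.read()

found = [m for m in MARKERS if m in content]
if found:
    print("Issue reproduced, found:", "; ".join(found))
else:
    print("No failure markers in out.txt")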

@codearky here is the syslog from the “faulty” upgrade

quick summary:

  • 3 pods were running:

    • hera-example-api-kube-bydga-daemon-695b4c6f9-4hlss
    • hera-example-api-kube-bydga-daemon-695b4c6f9-6vt29
    • hera-example-api-kube-bydga-daemon-695b4c6f9-t594g
  • they were replaced during the upgrade with 3 new ones:

    • hera-example-api-kube-bydga-daemon-85949bbbc4-v5955
    • hera-example-api-kube-bydga-daemon-85949bbbc4-rzwfl
    • hera-example-api-kube-bydga-daemon-85949bbbc4-l9b5k
  • the one which failed to report the exitCode is hera-example-api-kube-bydga-daemon-695b4c6f9-6vt29

I’m attaching 2 syslogs, because we had two servers for this test. The faulty pod is in the first one: syslog1.txt syslog2.txt

Also attaching the output of the corresponding kubectl get pods -w: watchpods.txt

I was able to reproduce this, but it feels like these false container status reports originate in the kubelet (?). Will keep investigating and update. /assign