kubernetes: [Flaky Test] [sig-cli] Kubectl client Simple pod should return command exit codes

Which jobs are flaking: ci-kubernetes-kind-ipv6-e2e-parallel

Which test(s) are flaking:

  • Kubectl client Simple pod should return command exit codes
  • Kubectl client Simple pod should support inline execution and attach

Testgrid link: https://testgrid.k8s.io/sig-release-master-blocking#kind-ipv6-master-parallel&include-filter-by-regex= Kubectl client Simple pod should return command exit codes

Reason for failure:

• Failure [83.784 seconds]
[sig-cli] Kubectl client
test/e2e/kubectl/framework.go:23
  Simple pod
  test/e2e/kubectl/kubectl.go:382
    should return command exit codes [It]
    test/e2e/kubectl/kubectl.go:502
    Feb  7 16:23:48.033: Unexpected error:
      
        <exec.CodeExitError>: {
            Err: {
                s: "error running /home/prow/go/src/k8s.io/kubernetes/bazel-out/k8-fastbuild-ST-5e46445d989a/bin/cmd/kubectl/kubectl_/kubectl --server=https://[::1]:37209 --kubeconfig=/root/.kube/kind-test-config --namespace=kubectl-7185 run -i --image=docker.io/library/busybox:1.29 --restart=Never success -- /bin/sh -c exit 0:\nCommand stdout:\n\nstderr:\nerror: timed out waiting for the condition\n\nerror:\nexit status 1",
            },
            Code: 1,
        }
        error running /home/prow/go/src/k8s.io/kubernetes/bazel-out/k8-fastbuild-ST-5e46445d989a/bin/cmd/kubectl/kubectl_/kubectl --server=https://[::1]:37209 --kubeconfig=/root/.kube/kind-test-config --namespace=kubectl-7185 run -i --image=docker.io/library/busybox:1.29 --restart=Never success -- /bin/sh -c exit 0:
        Command stdout:
        
        stderr:
        error: timed out waiting for the condition
        
        error:
        exit status 1
    occurred
skipped 55993 lines
[Fail] [sig-cli] Kubectl client Simple pod [It] should return command exit codes 
test/e2e/kubectl/kubectl.go:515

Anything else we need to know:

NOTE: 7 clusters of 86 failures (8 in last day) out of 146571 builds from 1/23/2021

Test Owner: yifan-gu -> https://github.com/kubernetes-csi/driver-registrar/blob/master/vendor/k8s.io/kubernetes/test/test_owners.csv#L216

Triage: https://storage.googleapis.com/k8s-gubernator/triage/index.html?test=Kubectl client Simple pod should return command exit codes

Spyglass failures: https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-kind-ipv6-e2e-parallel/1358448260980674560

/sig cli

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 24 (24 by maintainers)

Most upvoted comments

These failures appear to be the pod failing to start.

The assertions are very basic:

  1. Run pod with sh -c "exit 42"
  2. Check if exit code is 42
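
For illustration, the two steps above can be sketched locally in Go. This is not the e2e test code: it assumes a local /bin/sh in place of a pod started through kubectl, and exitCode is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// exitCode runs a shell snippet locally and returns its exit code.
// Hypothetical helper: the real e2e test runs the snippet inside a pod
// through kubectl, not through a local shell.
func exitCode(script string) (int, error) {
	err := exec.Command("/bin/sh", "-c", script).Run()
	if err == nil {
		return 0, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// The command started and exited with a non-zero status.
		return exitErr.ExitCode(), nil
	}
	// The command could not be started at all (for example, sh is missing).
	return -1, err
}

func main() {
	code, err := exitCode("exit 42")
	if err != nil {
		fmt.Println("could not run command:", err)
		return
	}
	fmt.Println("exit code:", code)
}
```

Note that an exit code alone cannot tell "the script exited 1" apart from "the client gave up and exited 1", which is why the stderr text in the log above matters when reading this failure.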

It’s exiting with a code of 1, which probably means the pod is failing to start. I’ll add some logging of the output, and then we can figure it out from there. Most likely a timeout issue with the test cluster?

I would say this is not release blocking.

Failures appear to be down, apart from mass group failures. If this creeps back up, this is our next option.

/close

Will continue to monitor this one. We started with a slightly higher grace period for pods to start (2 mins, up from 1 min). If it still flakes, we can adjust.

https://github.com/kubernetes/kubernetes/pull/101295/files#r618594831
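
As a rough sketch of why a longer grace period can help a slow-starting pod, here is a poll-with-timeout loop in plain Go. It is not the kubectl or e2e framework implementation: pollUntil and simulate are hypothetical names, the durations are scaled down to milliseconds, and the error text simply mirrors the stderr line in the report above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimedOut mirrors the message seen in the failure log (assumption: used
// here only for illustration).
var errTimedOut = errors.New("timed out waiting for the condition")

// pollUntil calls cond every interval until it returns true or the timeout
// has elapsed. Hypothetical helper, stdlib only.
func pollUntil(interval, timeout time.Duration, cond func() bool) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if !time.Now().Before(deadline) {
			return errTimedOut
		}
		time.Sleep(interval)
	}
}

// simulate pretends a pod becomes ready startupMs milliseconds after launch
// and waits for it with a grace period of timeoutMs milliseconds.
func simulate(startupMs, timeoutMs int) error {
	start := time.Now()
	ready := func() bool {
		return time.Since(start) >= time.Duration(startupMs)*time.Millisecond
	}
	return pollUntil(5*time.Millisecond, time.Duration(timeoutMs)*time.Millisecond, ready)
}

func main() {
	fmt.Println("short grace period:", simulate(30, 10))
	fmt.Println("longer grace period:", simulate(30, 200))
}
```

With the shorter grace period the simulated pod is not ready in time and the wait fails; with the longer one the same pod is picked up once it becomes ready.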

thank you!!