kubernetes: [Failing tests] Multiple job failures on the BeforeSuite step while waiting for CoreDNS to be ready

Which jobs are failing: Multiple jobs on master-informing.

Which test(s) are failing: Before Suite on the following jobs:

Since when has it been failing: Since 5/17.

Testgrid links: See above.

Reason for failure: All the failures mentioned above seem to have the same root cause: a failure while waiting for CoreDNS to be ready. Example: https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-stable-master/1129638974428549121

May 18 07:09:00.075: Error waiting for all pods to be running and ready: 2 / 14 pods in namespace "kube-system" are NOT in RUNNING and READY state in 10m0s
POD                      NODE                   PHASE   GRACE CONDITIONS
coredns-65546fffc9-j9p6b kinder-upgrade-worker2 Running       [{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [coredns]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [coredns]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason: Message:}]
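
To make the log above easier to read, here is a minimal sketch of the kind of check that produces it: a pod counts as "RUNNING and READY" only when its phase is Running and its `Ready` condition is `True`. This is a simplified model for illustration, not the e2e framework's actual code. The `Pod` and `Condition` classes, the function names, and the second pod are all hypothetical; only the CoreDNS pod's name, phase, and conditions come from the log.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    type: str
    status: str
    reason: str = ""


@dataclass
class Pod:
    name: str
    phase: str
    conditions: list = field(default_factory=list)


def is_running_and_ready(pod):
    """Return True only if the pod is Running and its Ready condition is True."""
    if pod.phase != "Running":
        return False
    return any(c.type == "Ready" and c.status == "True" for c in pod.conditions)


def not_ready(pods):
    """Return the names of pods that fail the running-and-ready check."""
    return [p.name for p in pods if not is_running_and_ready(p)]


pods = [
    # Conditions copied from the failing CoreDNS pod in the log above.
    Pod("coredns-65546fffc9-j9p6b", "Running", [
        Condition("Initialized", "True"),
        Condition("Ready", "False", "ContainersNotReady"),
        Condition("ContainersReady", "False", "ContainersNotReady"),
        Condition("PodScheduled", "True"),
    ]),
    # Hypothetical healthy pod, included only for contrast.
    Pod("hypothetical-healthy-pod", "Running", [Condition("Ready", "True")]),
]

print(not_ready(pods))
```

The CoreDNS pod is scheduled, initialized, and in the Running phase, but it still fails the check because its `coredns` container never reports ready. That is why the suite times out after 10m0s rather than failing immediately.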

/sig testing
/priority critical-urgent
/kind failing test
/milestone v1.15
/cc @jimangel @alejandrox1 @rarchk @alenkacz

About this issue

State: closed
Created 5 years ago
Comments: 18 (16 by maintainers)

Most upvoted comments

@smourapina thanks for the heads up. @rajansandeep let me know if you don’t have the time for the https://github.com/kubernetes/kubernetes/pull/78033 refactor that I requested and I can try to take over there. The alternative is to roll back the CoreDNS version, which I assume is not ideal, due to a variety of fixes in the new version.

neolit123 on May 28, 2019

I have opened https://github.com/kubernetes/kubernetes/pull/78302 which aims to fix all the failing tests except https://testgrid.k8s.io/sig-release-master-informing#kubeadm-kinder-upgrade-stable-master, which will be fixed via https://github.com/kubernetes/kubernetes/pull/78033.

rajansandeep on May 24, 2019

Some of this (possibly all of it) is due to #78030, which is dependent on #78033 being merged.

chrisohaver on May 20, 2019