kubernetes: [Failing tests] Multiple job failures on the BeforeSuite step while waiting for CoreDNS to be ready

Which jobs are failing: Multiple jobs on master-informing.

Which test(s) are failing: Before Suite on the following jobs:

Since when has it been failing: Since 5/17.

Testgrid links: See above.

Reason for failure: All the failures mentioned above seem to have the same root cause, namely a failure while waiting for CoreDNS to be ready. Example: https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-stable-master/1129638974428549121

May 18 07:09:00.075: Error waiting for all pods to be running and ready: 2 / 14 pods in namespace "kube-system" are NOT in RUNNING and READY state in 10m0s
POD                      NODE                   PHASE   GRACE CONDITIONS
coredns-65546fffc9-j9p6b kinder-upgrade-worker2 Running       [{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [coredns]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [coredns]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-05-18 06:52:59 +0000 UTC Reason: Message:}]
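
For anyone reading the log above: the pod is in phase Running, but its Ready condition is False, so it does not count as "running and ready". A minimal sketch of that check, assuming simplified hypothetical types (`Condition`, `Pod`, `is_running_and_ready`) rather than the e2e framework's actual code:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Condition:
    # Hypothetical, simplified stand-in for a pod condition entry.
    type: str
    status: str
    reason: str = ""


@dataclass
class Pod:
    # Hypothetical, simplified stand-in for a pod status.
    name: str
    phase: str
    conditions: List[Condition] = field(default_factory=list)


def is_running_and_ready(pod: Pod) -> bool:
    """Return True only if the pod is Running and its Ready condition is True."""
    if pod.phase != "Running":
        return False
    return any(c.type == "Ready" and c.status == "True" for c in pod.conditions)


# Conditions copied from the log line for the failing CoreDNS pod.
coredns = Pod(
    name="coredns-65546fffc9-j9p6b",
    phase="Running",
    conditions=[
        Condition("Initialized", "True"),
        Condition("Ready", "False", "ContainersNotReady"),
        Condition("ContainersReady", "False", "ContainersNotReady"),
        Condition("PodScheduled", "True"),
    ],
)

print(is_running_and_ready(coredns))  # prints False
```

The pod was scheduled and initialized, so the problem is the `coredns` container itself never reporting ready, not scheduling or startup of the pod.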

/sig testing
/priority critical-urgent
/kind failing test
/milestone v1.15
/cc @jimangel @alejandrox1 @rarchk @alenkacz

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 18 (16 by maintainers)

Most upvoted comments

@smourapina thanks for the heads up. @rajansandeep let me know if you don’t have the time for the https://github.com/kubernetes/kubernetes/pull/78033 refactor that i requested and i can try to take over there. the alternative is to roll back the coredns version, which i assume is not ideal given the variety of fixes in the new version.

Some of this (possibly all of it) is due to #78030, which is dependent on #78033 being merged.