test-infra: Prow jobs are failing with 'Could not resolve host: github.com'

What happened: Many Prow jobs started failing with errors like:

Cloning into 'test-infra'...
fatal: unable to access 'https://github.com/kubernetes/test-infra/': Could not resolve host: github.com

e.g.
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-scalability/1356950100881969152
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-build-fast/1356957645537284096/

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Please provide links to example occurrences, if any:

Anything else we need to know?: https://k8s-testgrid.appspot.com/sig-scalability-gce#gce-cos-master-scalability-100 suggests that this started happening between 03:43 PST and 04:27 PST.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 28 (28 by maintainers)

Most upvoted comments

Sorry, a typo in my comment. I’ve fixed it. Thanks again for handling this issue.

s/not/now changes meaning a lot 😉

The most recent failure on https://testgrid.k8s.io/sig-scalability-gce#gce-cos-1.19-scalability-100 was from last Friday.

The only red column was last Friday at 4 AM (https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-scalability-stable1/1362733725674115072), which was before @spiffxp updated the node pool with the fixes (see the timestamp at https://github.com/kubernetes/k8s.io/pull/1686#issuecomment-782260287).

@jprzychodzen, did you see anything different there?

/cc @cjwagner as test-infra-oncall

Cole, could you edit the kube-dns configmap in cluster k8s-infra-prow-build and scale the kube-dns pods? I guess something like this would help:

"data": {
        "linear": "{\"coresPerReplica\":256,\"nodesPerReplica\":8,\"min\":4,\"preventSinglePointFailure\":true}"
    },

Regarding NodeLocalDNS (kubernetes/kubernetes#56903): it seems this would increase DNS reliability, so it should be enabled.
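For a rough sense of what the proposed "linear" parameters would do, here is a minimal sketch of the replica calculation. It assumes the configmap is consumed by cluster-proportional-autoscaler in linear mode (the thread does not say so explicitly) and follows that project's documented formula as I understand it. The function name `linear_replicas`, the default minimum of 1, and the cluster sizes in the usage lines are hypothetical, chosen only for illustration.

```python
import json
import math


def linear_replicas(params: dict, cores: int, nodes: int) -> int:
    """Estimate the kube-dns replica count under linear scaling.

    Assumption: replicas = max(ceil(cores / coresPerReplica),
                               ceil(nodes / nodesPerReplica)),
    then clamped by the optional "max" and "min" values.
    """
    by_cores = 0
    by_nodes = 0
    if "coresPerReplica" in params:
        by_cores = math.ceil(cores / params["coresPerReplica"])
    if "nodesPerReplica" in params:
        by_nodes = math.ceil(nodes / params["nodesPerReplica"])
    replicas = max(by_cores, by_nodes)

    if "max" in params:
        replicas = min(replicas, params["max"])
    # Assumption: the minimum defaults to 1 when "min" is not set.
    replicas = max(replicas, params.get("min", 1))

    # With preventSinglePointFailure, keep at least 2 replicas
    # whenever the cluster has more than one node.
    if params.get("preventSinglePointFailure") and nodes > 1:
        replicas = max(replicas, 2)
    return replicas


# The parameters suggested in the comment above.
params = json.loads(
    '{"coresPerReplica":256,"nodesPerReplica":8,'
    '"min":4,"preventSinglePointFailure":true}'
)

# Hypothetical cluster sizes, not taken from k8s-infra-prow-build.
small = linear_replicas(params, cores=80, nodes=10)
large = linear_replicas(params, cores=800, nodes=100)
print(small, large)
```

Under these assumptions, a small cluster is held at the "min" of 4 replicas, while a larger one scales with node count (one replica per 8 nodes), which is the point of lowering nodesPerReplica.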

This is exactly the patch I mentioned in https://github.com/kubernetes/test-infra/issues/20816#issuecomment-777959845. + @BenTheElder for awareness
