kubernetes: [Failing Test] [sig-apps] ReplicaSet should serve a basic image on each replica with a private image, ReplicationController should serve a basic image on each replica with a private image

Which jobs are failing: ci-kubernetes-e2e-gci-gce ci-kubernetes-e2e-gce-cos-k8sbeta-default

Which test(s) are failing: [sig-apps] ReplicaSet should serve a basic image on each replica with a private image [sig-apps] ReplicationController should serve a basic image on each replica with a private image

Since when has it been failing: Started failing between 2:04 and 2:40PM PST Dec 1

Testgrid link: https://k8s-testgrid.appspot.com/sig-release-master-blocking#gce-cos-master-default https://k8s-testgrid.appspot.com/sig-release-1.20-blocking#gce-cos-k8sbeta-default

Reason for failure: "pod never run"? Both tests appear to time out waiting for their containers to become ready.

ReplicaSet should serve a basic image on each replica with a private image:

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/apps/replica_set.go:98
Dec  1 22:54:13.321: Unexpected error:
    <*errors.errorString | 0xc0036f8ef0>: {
        s: "pod \"my-hostname-private-cd2ec0df-be38-465e-a00f-f868f9674320-rknrl\" never run (phase: Pending, conditions: [{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-cd2ec0df-be38-465e-a00f-f868f9674320]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-cd2ec0df-be38-465e-a00f-f868f9674320]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason: Message:}]): timed out waiting for the condition",
    }
    pod "my-hostname-private-cd2ec0df-be38-465e-a00f-f868f9674320-rknrl" never run (phase: Pending, conditions: [{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-cd2ec0df-be38-465e-a00f-f868f9674320]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-cd2ec0df-be38-465e-a00f-f868f9674320]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 22:49:07 +0000 UTC Reason: Message:}]): timed out waiting for the condition
occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/apps/replica_set.go:156

ReplicationController should serve a basic image on each replica with a private image:

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/apps/rc.go:68
Dec  1 23:07:02.794: Unexpected error:
    <*errors.errorString | 0xc00348b1f0>: {
        s: "pod \"my-hostname-private-3071f600-7524-41d9-b7ea-f7a5cf5011e7-xz94v\" never run (phase: Pending, conditions: [{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-3071f600-7524-41d9-b7ea-f7a5cf5011e7]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-3071f600-7524-41d9-b7ea-f7a5cf5011e7]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason: Message:}]): timed out waiting for the condition",
    }
    pod "my-hostname-private-3071f600-7524-41d9-b7ea-f7a5cf5011e7-xz94v" never run (phase: Pending, conditions: [{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-3071f600-7524-41d9-b7ea-f7a5cf5011e7]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [my-hostname-private-3071f600-7524-41d9-b7ea-f7a5cf5011e7]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-12-01 23:02:02 +0000 UTC Reason: Message:}]): timed out waiting for the condition
occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/apps/rc.go:459
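
Both errors above dump the same four pod conditions. As a rough aid for reading them, here is a minimal Go sketch that keeps only the conditions whose status is not `True`. The `podCondition` struct and the `blockingConditions` helper are hypothetical stand-ins written for this illustration (they are not the real `k8s.io/api/core/v1` types or anything in the e2e framework), and the sample values are copied from the error text.

```go
package main

import "fmt"

// podCondition holds only the fields printed in the error above.
// It is an illustrative stand-in, not the real v1.PodCondition type.
type podCondition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// blockingConditions returns the conditions whose Status is not "True",
// i.e. the ones that keep the pod from being considered ready.
func blockingConditions(conds []podCondition) []podCondition {
	var out []podCondition
	for _, c := range conds {
		if c.Status != "True" {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// Values taken from the ReplicaSet failure in this report.
	conds := []podCondition{
		{Type: "Initialized", Status: "True"},
		{Type: "Ready", Status: "False", Reason: "ContainersNotReady"},
		{Type: "ContainersReady", Status: "False", Reason: "ContainersNotReady"},
		{Type: "PodScheduled", Status: "True"},
	}
	for _, c := range blockingConditions(conds) {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```

Read this way, the pod was scheduled and initialized but its container never became ready, which would be consistent with a problem on the node (for example an image pull) rather than with scheduling.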

Anything else we need to know:

Example Spyglass links:

Having trouble finding a good Triage link; will drop one here if I can find one.

Wondering whether this has anything to do with the Pod pending timeout errors happening on some of the jobs on the 1.20 boards now?

/sig apps /cc @kubernetes/ci-signal @kubernetes/sig-apps-test-failures

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 21 (21 by maintainers)

Most upvoted comments

@spiffxp – CI Signal should continue to monitor.

I think since @krzyzacy restored the project, y’all are in the clear for the time being. 🙃

/assign @justaugustus @hasheddan (Dan and I will be watching from the shadows. 😃)

We should create a community-owned equivalent project, I’ll open a followup issue for that

@spiffxp – Opened one here: https://github.com/kubernetes/k8s.io/issues/1458

https://kubernetes.slack.com/archives/C09QZ4DQB/p1606896985218000

The project hosting the GCR repo was swept up by a security audit because it hadn’t been properly accounted for. That change has been reverted. Now waiting to see affected jobs go back to green.

We should create a community-owned equivalent project, I’ll open a followup issue for that

Now that I have access to the project, I’m working on restoring permissions. I had hoped this would be a 10min fix, but it’s taking longer than I expected. I can currently list the backing bucket, but cannot list images.

@spiffxp I think we likely just need to add permissions to prow-build@k8s-infra-prow-build.iam.gserviceaccount.com to access the bucket in the restored project where the GCR images are hosted.

Hrm, I’m seeing this fail still in downstream repo tests. Are the tests injecting a secret (the only hardcoded GCR secret I see is in k8s.io/kubernetes/test/e2e/common/runtime.go but that is not called by those referenced tests), or is the auth rule on this repo limited to a set of projects now vs all projects on GCP before (since these tests passed in our GCP projects yesterday but not now, after access was supposedly restored)?

Who is able to access that repo? If it was previously “all projects” then I think that wasn’t restored correctly. https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/24887/pull-ci-openshift-origin-master-e2e-gcp/1334318300561149952 is a 1.19 codebase trying to run in the openshift-gce-devel-ci GCP project, but is getting access denied.

EDIT: This looks like it started passing again at midnight EST? Maybe some sort of weird perms propagation issue. DISREGARD

It seems this is the problem: failed to resolve reference "gcr.io/k8s-authenticated-test/agnhost:2.6": failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden

Dec 1 22:58:38.068: INFO: At 2020-12-01 22:53:38 +0000 UTC - event for my-hostname-private-1b815588-3c0e-49b2-bad4-77d652224eb8-tqctm: {kubelet bootstrap-e2e-minion-group-6s0c} Failed: Failed to pull image "gcr.io/k8s-authenticated-test/agnhost:2.6": rpc error: code = Unknown desc = failed to pull and unpack image "gcr.io/k8s-authenticated-test/agnhost:2.6": failed to resolve reference "gcr.io/k8s-authenticated-test/agnhost:2.6": failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden
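
The event above shows the pull failing at the registry authorization step (403 on the OAuth token fetch), not at scheduling. A minimal Go sketch, assuming the event message is already available as a string, that sorts such messages into rough buckets. `classifyPullFailure` and the bucket names are hypothetical helpers for illustration only; they are not part of the kubelet or the e2e framework, and the substring checks are heuristics based solely on the message quoted in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// pullFailureKind is a hypothetical label for triage; not a Kubernetes type.
type pullFailureKind string

const (
	notPullFailure pullFailureKind = "not-a-pull-failure"
	pullAuthDenied pullFailureKind = "auth-denied"
	pullOther      pullFailureKind = "other-pull-failure"
)

// classifyPullFailure inspects an event message and guesses whether it
// describes an image pull that was rejected for authorization reasons.
// The matched substrings come from the message in this issue and are
// assumptions about what other runtimes or registries might print.
func classifyPullFailure(msg string) pullFailureKind {
	lower := strings.ToLower(msg)
	isPull := strings.Contains(lower, "failed to pull image") ||
		strings.Contains(lower, "failed to resolve reference")
	if !isPull {
		return notPullFailure
	}
	if strings.Contains(lower, "failed to authorize") ||
		strings.Contains(lower, "403 forbidden") {
		return pullAuthDenied
	}
	return pullOther
}

func main() {
	msg := `Failed to pull image "gcr.io/k8s-authenticated-test/agnhost:2.6": ` +
		`failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden`
	fmt.Println(classifyPullFailure(msg)) // prints "auth-denied"
}
```

A bucket of `auth-denied` on a private-image test would suggest checking who can read the hosting project or bucket before digging into the workload controllers themselves.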