kubernetes: [Failing Test] Release branch tests are failing to extract latest Kubernetes CI builds

Which jobs are failing: Release branch tests that require extracting a Kubernetes CI build. Examples include:

Which test(s) are failing: Overall / Extract

Since when has it been failing: 12-11 12:56 PST

Testgrid link: https://k8s-testgrid.appspot.com/sig-release-master-blocking#skew-cluster-latest-kubectl-stable1-gce

Reason for failure:

W1211 20:56:41.135] 2019/12/11 20:56:41 main.go:319: Something went wrong: failed to acquire k8s binaries: U=https://storage.googleapis.com/kubernetes-release-dev/ci R=v1.16.4-1+8aa1d9dd63e9a4 get-kube.sh failed: error during ./get-kube.sh: exit status 1
W1211 20:56:41.137] Traceback (most recent call last):
W1211 20:56:41.137]   File "/workspace/./test-infra/jenkins/../scenarios/kubernetes_e2e.py", line 778, in <module>
W1211 20:56:41.137]     main(parse_args())
W1211 20:56:41.137]   File "/workspace/./test-infra/jenkins/../scenarios/kubernetes_e2e.py", line 626, in main
W1211 20:56:41.138]     mode.start(runner_args)
W1211 20:56:41.138]   File "/workspace/./test-infra/jenkins/../scenarios/kubernetes_e2e.py", line 262, in start
W1211 20:56:41.138]     check_env(env, self.command, *args)
W1211 20:56:41.138]   File "/workspace/./test-infra/jenkins/../scenarios/kubernetes_e2e.py", line 111, in check_env
W1211 20:56:41.138]     subprocess.check_call(cmd, env=env)
W1211 20:56:41.138]   File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
W1211 20:56:41.138]     raise CalledProcessError(retcode, cmd)
W1211 20:56:41.139] subprocess.CalledProcessError: Command '('kubetest', '--dump=/workspace/_artifacts', '--gcp-service-account=/etc/service-account/service-account.json', '--up', '--down', '--test', '--provider=gce', '--cluster=bootstrap-e2e', '--gcp-network=bootstrap-e2e', '--check-leaked-resources', '--check-version-skew=false', '--extract=ci/k8s-stable1', '--extract=ci/latest', '--gcp-node-image=gci', '--gcp-zone=us-west1-b', '--ginkgo-parallel=25', '--skew', '--test_args=--ginkgo.focus=Kubectl --ginkgo.skip=\\[Serial\\] --minStartupPods=8', '--timeout=120m')' returned non-zero exit status 1
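
The build that failed to extract is identified by `R=v1.16.4-1+8aa1d9dd63e9a4`. A minimal sketch for splitting such a string into its parts can help when comparing builds. The pattern is an assumption inferred only from the version strings quoted in this thread (base tag, commit count since that tag, abbreviated SHA); `parse_ci_version` and `CIVersion` are hypothetical names, not part of kubetest or the release tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape, inferred only from strings quoted in this thread:
#   v<major>.<minor>.<patch>[-<pre>.<n>]<sep><commits>+<short sha>
CI_VERSION = re.compile(
    r"^v(\d+)\.(\d+)\.(\d+)"          # release triple
    r"(?:-(alpha|beta|rc)\.(\d+))?"   # optional pre-release suffix of the base tag
    r"[.-](\d+)"                      # commits since the base tag
    r"\+([0-9a-f]+)$"                 # abbreviated commit SHA
)


class CIVersion(NamedTuple):
    base_tag: str
    prerelease: Optional[str]
    commits_since_tag: int
    sha: str


def parse_ci_version(version: str) -> CIVersion:
    """Split a CI build version into base tag, commit count, and SHA."""
    m = CI_VERSION.match(version)
    if m is None:
        raise ValueError(f"unrecognised CI version: {version!r}")
    major, minor, patch, pre, pre_n, count, sha = m.groups()
    base = f"v{major}.{minor}.{patch}"
    prerelease = f"{pre}.{pre_n}" if pre else None
    if prerelease:
        base = f"{base}-{prerelease}"
    return CIVersion(base, prerelease, int(count), sha)


failing = parse_ci_version("v1.16.4-1+8aa1d9dd63e9a4")
print(failing.base_tag, failing.prerelease, failing.commits_since_tag)
```

A `prerelease` of `None` on a release-branch CI build means the version was derived from a release tag rather than from a `-beta.N` tag.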

Anything else we need to know:

/cc @kubernetes/ci-signal
/priority critical-urgent
/milestone v1.18

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 39 (38 by maintainers)

Most upvoted comments

Reporting back on the out-of-order releases. It seems the fixes I added in https://github.com/kubernetes/release/pull/1015 allow the releases to run in the right order.

I’ve successfully run mock stage/release steps for 1.16.5 and 1.15.8, as well as kicked off the official prod stages.

Below are links to the GCB builds, for those with access to view them:

1.16.5

| Step | Command | Link | Start | Duration | Succeeded? |
| --- | --- | --- | --- | --- | --- |
| Mock stage | ./gcbmgr stage release-1.16 --buildversion=v1.16.5-beta.1.49+e7f962ba86f4ce | https://console.cloud.google.com/cloud-build/builds/d0599968-479c-4a87-8e5c-08bfb289f68d?project=kubernetes-release-test | January 14, 2020 at 7:16:37 PM UTC-5 | 45 min 50 sec | yes, but was run without the --official flag |
| Mock release | ./gcbmgr release release-1.16 --buildversion=v1.16.5-beta.1.49+e7f962ba86f4ce | https://console.cloud.google.com/cloud-build/builds/99e69fb6-4c5a-41ac-b2e5-a123c92841f3?project=kubernetes-release-test | January 14, 2020 at 8:04:28 PM UTC-5 | 17 min 31 sec | yes, but was run without the --official flag |
| Mock stage (with https://github.com/kubernetes/release/pull/1015) | RELEASE_TOOL_REPO="https://github.com/justaugustus/release" RELEASE_TOOL_BRANCH="out-of-order-releases" ./gcbmgr stage release-1.16 --official --build-at-head (ref: https://github.com/kubernetes/kubernetes/issues/86182) | https://console.cloud.google.com/cloud-build/builds/013eb30a-86a2-4d0c-a8d4-9042992398d7?project=kubernetes-release-test | January 15, 2020 at 1:28:34 AM UTC-5 | 1 hr 29 min | yes |
| Mock release (with https://github.com/kubernetes/release/pull/1015) | RELEASE_TOOL_REPO="https://github.com/justaugustus/release" RELEASE_TOOL_BRANCH="out-of-order-releases" ./gcbmgr release --official release-1.16 --buildversion=v1.16.5-beta.1.51+e7f962ba86f4ce (ref: https://github.com/kubernetes/kubernetes/issues/86182) | https://console.cloud.google.com/gcr/builds/f1143dc6-dc95-4e05-ba25-c2abbd98b5fe?project=648026197307 | January 15, 2020 at 3:05:19 AM UTC-5 | 25 min 3 sec | yes |
| Stage (with https://github.com/kubernetes/release/pull/1015) | RELEASE_TOOL_REPO="https://github.com/justaugustus/release" RELEASE_TOOL_BRANCH="out-of-order-releases" ./gcbmgr stage release-1.16 --official --build-at-head --nomock | https://console.cloud.google.com/gcr/builds/baa439f5-bc01-47b6-bde2-cb6b8eb9ba01?project=648026197307 | January 15, 2020 at 3:14:31 AM UTC-5 | 1 hr 29 min | yes |

1.15.8

| Step | Command | Link | Start | Duration | Succeeded? |
| --- | --- | --- | --- | --- | --- |
| Mock stage | ./gcbmgr stage release-1.15 --official --build-at-head | https://console.cloud.google.com/gcr/builds/af57732e-764c-4a5f-a1bc-c7825109c0dd?project=648026197307 | January 12, 2020 at 4:37:16 PM UTC-5 | 1 hr 40 min | yes |
| Mock release | ./gcbmgr release --official release-1.15 --buildversion=v1.15.8-beta.1.30+14ede42c4fe699 | https://console.cloud.google.com/gcr/builds/8f8cab38-4f5e-45ec-a359-6c70c8bca65c?project=648026197307 | January 12, 2020 at 6:18:40 PM UTC-5 | 26 min 31 sec | yes |
| Mock stage (with https://github.com/kubernetes/release/pull/1015) | RELEASE_TOOL_REPO="https://github.com/justaugustus/release" RELEASE_TOOL_BRANCH="out-of-order-releases" ./gcbmgr stage release-1.15 --official --build-at-head | https://console.cloud.google.com/gcr/builds/d14ae132-c0ea-4ee9-b608-3b010b6f4e2f?project=648026197307 | January 15, 2020 at 3:20:34 AM UTC-5 | 1 hr 38 min | yes |
| Mock release (with https://github.com/kubernetes/release/pull/1015) | RELEASE_TOOL_REPO="https://github.com/justaugustus/release" RELEASE_TOOL_BRANCH="out-of-order-releases" ./gcbmgr release --official release-1.15 --buildversion=v1.15.8-beta.1.30+14ede42c4fe699 | https://console.cloud.google.com/gcr/builds/1c3eb36b-cdda-4f69-b242-e4e31504c87b?project=648026197307 | January 15, 2020 at 6:03:06 AM UTC-5 | | |
| Stage (with https://github.com/kubernetes/release/pull/1015) | RELEASE_TOOL_REPO="https://github.com/justaugustus/release" RELEASE_TOOL_BRANCH="out-of-order-releases" ./gcbmgr stage release-1.15 --official --build-at-head --nomock | https://console.cloud.google.com/gcr/builds/be5ed7ba-bdf6-42a3-babd-93b4a6b451f8?project=648026197307 | January 15, 2020 at 6:03:17 AM UTC-5 | IN PROGRESS | IN PROGRESS |

Once that PR merges, we’ll proceed with the 1.16.5 and 1.15.8 releases.

@lachie83 opened https://github.com/kubernetes/release/issues/1020 to track the misversioned releases.

The Patch Release Team will be releasing new versions of Kubernetes (1.17.2, 1.16.6, 1.15.9) next week to fix this. We’re tentatively planning for Tuesday, January 21st, by EOD US PT, but given this week’s debugging of our build tools, this date may shift.

As of today, the build versions that we’re targeting are as follows:

Those commits represent essentially no-op PRs we used to trigger a new build version in CI:

The new releases will be functionally equivalent to this week’s releases, save the few commits we’ve added to fix the versioning issue:

When 1.17.2, 1.16.6, and 1.15.9 are released, please IMMEDIATELY use them instead of 1.17.1, 1.16.5, and 1.15.8.

Also sent to kubernetes-dev mailing list: https://groups.google.com/d/topic/kubernetes-dev/Mhpx-loSBns/discussion

Thanks, Stephen.

The commits after the tag are not a problem. This is expected, and typically happens between releases.

The issue seems to be that the release tag and the next -beta.0 tag were both applied to the same commit this time around, and git describe is using the release tag as its base rather than the pre-release -beta.0 tag.

Previously, these were separated by some number of commits, with the -beta.0 tag being more recent.

Compare:

8aa1d9dd63e9a489da916580b42612a3ea286c37 (upstream/release-1.16, release-1.16) Add/Update CHANGELOG-1.16.md for v1.16.4.
224be7bdce5a9dd0c2fd0d46b83865648e2fe0ba (tag: v1.16.5-beta.0, tag: v1.16.4) Kubernetes version v1.16.5-beta.0 openapi-spec file updates
d9a25890317058606c241e87601865905e0ecb6e Merge pull request #85460 from misterikkit/automated-cherry-pick-of-#82152-upstream-release-1.16
bfafae8f1c2fdf3c3cfef04674db028531a7c098 Merge pull request #85239 from misterikkit/automated-cherry-pick-of-#84211-upstream-release-1.16
...

vs

...
d70a3ca08fe72ad8dd0b2d72cf032474ab2ce2a9 Add/Update CHANGELOG-1.16.md for v1.16.3.
e7eb19c958bec4bbe13c30b186affbe2e6fa0f11 (tag: v1.16.4-beta.0) Kubernetes version v1.16.4-beta.0 openapi-spec file updates
b6cc31d26684c2d550542b00f9dc362397e9ba13 Ensure health probes are created for local traffic policy UDP services on Azure
df0330614d57ab837b3d52aff4a5070b11983f3b fix vmss dirty cache issue
5f3d155fb7fd29bdc29836f295e8e82d2c911a91 fix race condition when attach/delete disk
843ceabdd59bbf4d1d5ae889a11dc0e0323fbaee Added new test, fixed existing tests.
3ec03b8910487240c269c82fa5d049aed5525588 Create ILB firewall name with prefix "k8s-fw".
7998a2101e78ab13fed58e4e8af6eba15acf9a85 skip deployment update if migration fails
db0bfe8b7aa17e0c929bc00d6f8a18df1a4cc200 retain corefile when migration fails
b3cbbae08ec52a7fc73d334838e18d17e8512749 (tag: v1.16.3) Merge pull request #85025 from neolit123/automated-cherry-pick-of-#85024-origin-release-1.16
c3f2a6524ed89bd9b3ded6f491f0d823b11523ae Merge pull request #84319 from wawa0210/automated-cherry-pick-of-#84156-upstream-release-1.16
...

Are we tagging the wrong commit for the release now, somehow?
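
The difference between the two histories above can be checked mechanically. The following is a minimal sketch, not part of the release tooling: it flags any commit that carries both a release tag and a pre-release tag. The function name `co_tagged_commits` and the hand-built input mapping are hypothetical; in practice the mapping might be assembled from `git tag --points-at <commit>` output.

```python
import re
from typing import Dict, List

RELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+$")
PRERELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+-(?:alpha|beta|rc)\.\d+$")


def co_tagged_commits(tags_by_commit: Dict[str, List[str]]) -> List[str]:
    """Return commits that carry both a release tag and a pre-release tag."""
    flagged = []
    for commit, tags in tags_by_commit.items():
        has_release = any(RELEASE_TAG.match(t) for t in tags)
        has_prerelease = any(PRERELEASE_TAG.match(t) for t in tags)
        if has_release and has_prerelease:
            flagged.append(commit)
    return flagged


# Tag placement copied by hand from the two log excerpts above.
release_1_16 = {
    "224be7bdce5a9dd0c2fd0d46b83865648e2fe0ba": ["v1.16.5-beta.0", "v1.16.4"],
    "e7eb19c958bec4bbe13c30b186affbe2e6fa0f11": ["v1.16.4-beta.0"],
    "b3cbbae08ec52a7fc73d334838e18d17e8512749": ["v1.16.3"],
}

print(co_tagged_commits(release_1_16))
```

With the data above, only the v1.16.4 / v1.16.5-beta.0 commit is flagged; the earlier v1.16.3 and v1.16.4-beta.0 tags sit on separate commits.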