kops: Nodeup can't find container-selinux-2.68-1.el7.noarch.rpm when trying to bootstrap a new node to a cluster
1. What kops version are you running? The command kops version will display
this information.
Version 1.13.0
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
Version 1.13.0
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
Adding a node to the cluster causes nodeup to download "http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm", which no longer exists following the CentOS 7.7 release.
5. What happened after the commands executed?
kops tries to bootstrap the node, but nodeup fails because it points to a nonexistent package.
6. What did you expect to happen?
New node bootstrapped and joined to the cluster.
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
Sep 17 19:59:07 nodeup: I0917 19:59:07.667801 3560 executor.go:103] Tasks: 40 done / 48 total; 1 can run
Sep 17 19:59:07 nodeup: I0917 19:59:07.667844 3560 executor.go:178] Executing task "Package/docker-ce": Package: docker-ce
Sep 17 19:59:07 nodeup: I0917 19:59:07.667883 3560 package.go:206] Listing installed packages: /usr/bin/rpm -q docker-ce --queryformat %{NAME} %{VERSION}
Sep 17 19:59:07 nodeup: I0917 19:59:07.693153 3560 package.go:267] Installing package "docker-ce" (dependencies: [Package: container-selinux])
Sep 17 19:59:07 nodeup: I0917 19:59:07.747296 3560 files.go:100] Hash matched for "/var/cache/nodeup/packages/docker-ce": sha1:5369602f88406d4fb9159dc1d3fd44e76fb4cab8
Sep 17 19:59:07 nodeup: I0917 19:59:07.747368 3560 files.go:103] Hash did not match for "/var/cache/nodeup/packages/container-selinux": actual=sha1:93fdc15d22645b17bb1b2cc652f5bf51924d00a7 vs expected=sha1:d9f87f7f4f2e8e611f556d873a17b8c0c580fec0
Sep 17 19:59:07 nodeup: I0917 19:59:07.747458 3560 http.go:77] Downloading "http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm"
Sep 17 19:59:07 nodeup: I0917 19:59:07.891339 3560 files.go:103] Hash did not match for "/var/cache/nodeup/packages/container-selinux": actual=sha1:93fdc15d22645b17bb1b2cc652f5bf51924d00a7 vs expected=sha1:d9f87f7f4f2e8e611f556d873a17b8c0c580fec0
Sep 17 19:59:07 nodeup: W0917 19:59:07.891385 3560 executor.go:130] error running task "Package/docker-ce" (2m20s remaining to succeed): downloaded from "http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm"
but hash did not match expected "sha1:d9f87f7f4f2e8e611f556d873a17b8c0c580fec0"
About this issue
- State: closed
- Created 5 years ago
- Reactions: 14
- Comments: 24 (7 by maintainers)
Commits related to this issue
- Check the HTTP response code when downloading URLs I noticed that the recent container-selinux issue on centos was reporting a hash mismatch rather than a 404. See the error message here: https://gi... — committed to rifelpet/kops by rifelpet 5 years ago
- Check the HTTP response code when downloading URLs I noticed that the recent container-selinux issue on centos was reporting a hash mismatch rather than a 404. See the error message here: https://gi... — committed to mikesplain/kops by rifelpet 5 years ago
Below is an improved workaround, inspired by previous comments and pull requests. Kops supports arbitrary userdata. The snippet needs to be added to each instance group spec.
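A minimal sketch of what such a snippet could look like, assuming an additionalUserData entry in the InstanceGroup spec that pre-installs container-selinux from the CentOS vault mirror; the script name and RPM URL here are illustrative, not taken from the original comment:

```yaml
# Hypothetical sketch: pre-install container-selinux so nodeup's docker-ce
# dependency is already satisfied when it runs. Script name and RPM URL are
# placeholders; adjust them for your CentOS point release.
spec:
  additionalUserData:
    - name: install-container-selinux.sh
      type: text/x-shellscript
      content: |
        #!/bin/bash
        set -euo pipefail
        # vault.centos.org keeps packages for superseded point releases after
        # mirror.centos.org drops them.
        yum install -y http://vault.centos.org/7.6.1810/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm
```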
We are seeing this issue as well.
Looks like this package was removed from the CentOS repo, returning a 404:
This causes a major issue with autoscaling (cluster-autoscaler): it takes down nodes, and the replacement nodes never join the cluster.
Ideally, for resiliency, kops should not resolve artifacts required for nodeup/bootstrapping from public repos at node runtime. I'm not sure whether this is the right approach, but consider placing such critical RPMs/binaries in the state store at cluster init and fetching them from there at runtime. Also, if a package is already installed (some may choose to bake it into their AMI), nodeup should skip trying to fetch it (not sure whether this is already the current behavior).
Can the packages be externalised into a YAML/JSON file that nodeup reads in, instead of being compiled into the binary? That would enable people to source the RPM and store it locally (S3, cloud storage, etc.).
I’ve opted to save the rpm in S3 and then add it into kops with this in the instance groups:
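A rough sketch of that instance group addition, assuming the RPM has already been copied to an S3 bucket and the AWS CLI is available on the image; the bucket name and paths are placeholders:

```yaml
# Hypothetical sketch: install the RPM from your own S3 bucket instead of the
# public CentOS mirror. Bucket name and paths are placeholders; assumes the
# AWS CLI is present on the AMI and the instance role can read the bucket.
spec:
  additionalUserData:
    - name: install-container-selinux.sh
      type: text/x-shellscript
      content: |
        #!/bin/bash
        set -euo pipefail
        aws s3 cp s3://my-kops-artifacts/container-selinux-2.68-1.el7.noarch.rpm /tmp/container-selinux.rpm
        yum install -y /tmp/container-selinux.rpm
```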
Then you just need to sort out the bucket policy and IAM privileges for kops to read from the bucket. This is in an AWS environment, obviously; I'm sure there are similar approaches for the other cloud platforms.
OK so looks like we’ll be doing 1.13.2 this morning. I’d also really prefer to get away from the OS packaging (towards “tar.gz” installation) as it seems to be introducing more problems than it solves.
For 2.68-1 -> 2.107-3: we try not to make potentially breaking changes once we have released the 1.x.0 of kops, but we do so for security fixes etc. So we can look at getting it into 1.14.0 (which hasn't quite been released yet). But is it a security fix (in which case we would get it into 1.13.0)?
We’re working on getting a 1.13/1.14 cut with these fixes asap.
You'll either need to build and deploy your own version of kops (including protokube and nodeup), apply a workaround as suggested above (you can probably use a hook to automate it: https://github.com/kubernetes/kops/blob/master/docs/cluster_spec.md#hooks), or wait for a release, which we're actively working on getting out ASAP!
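For reference, a hook of the kind mentioned there might look roughly like the sketch below (unit name and RPM URL are illustrative); note that a later comment in this thread reports that hooks did not end up helping for this particular failure:

```yaml
# Hypothetical sketch of a kops hook that installs container-selinux before
# docker.service starts. Unit name and RPM URL are placeholders.
spec:
  hooks:
    - name: install-container-selinux.service
      before:
        - docker.service
      manifest: |
        Type=oneshot
        ExecStart=/usr/bin/yum install -y http://vault.centos.org/7.6.1810/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm
```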
Here's the changelog. It looks like there's no strict distinction between security fixes and features, so we probably shouldn't introduce the new version in kops 1.13:
This workaround no longer works. As of today http://mirror.centos.org/centos/7.6.1810/ has been deprecated. This also breaks the fix that went into kops 1.13.1: https://github.com/kubernetes/kops/pull/7609
As a workaround you can use http://vault.centos.org/7.6.1810/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm
But really container-selinux needs to be updated to http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.107-3.el7.noarch.rpm along with associated dependencies.
Indeed, hooks won't work. We figured that out at the exact same time as @alexinthesky 😂
We then switched to the Debian AMI (kope.io/k8s-1.14-debian-stretch-amd64-hvm-ebs-2019-08-16) to avoid further damage from dying spot instances.
Now that #7609 is merged, how would I be able to leverage this change? Do I have to wait for a new kops release, or how is nodeup released?
@rdjy Thanks for the answer, it did the trick for us.