kops: Can't access cluster after upgrading to K8s version 1.18.6 from 1.17.6
1. What kops version are you running? The command `kops version` will display this information.
1.18.0
2. What Kubernetes version are you running? `kubectl version` will print the version if a cluster is running, or provide the Kubernetes version specified as a kops flag.
1.18.6
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
kops upgrade cluster
kops update cluster --name "${DNS_NAME}" --yes
kops rolling-update cluster --yes
5. What happened after the commands executed? I repeatedly got the error: Cluster did not validate, will retry in "30s": error listing nodes: Unauthorized.
6. What did you expect to happen? The cluster to validate after successfully upgrading.
7. Please provide your cluster manifest. Execute `kops get --name my.example.com -o yaml` to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2020-05-14T08:11:14Z"
  generation: 16
  name: <REDACTED>
spec:
  additionalPolicies:
    node: |
      [
        {
          "Action": [
            "sts:AssumeRole"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ]
  api:
    loadBalancer:
      sslCertificate: <REDACTED>
      type: Internal
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: <REDACTED>
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-<REDACTED>
      name: a
    memoryRequest: 100Mi
    name: main
    version: 3.2.24
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-<REDACTED>
      name: a
    memoryRequest: 100Mi
    name: events
    version: 3.2.24
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeDNS:
    provider: CoreDNS
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
  kubernetesApiAccess:
  - <REDACTED>
  kubernetesVersion: 1.17.6
  masterInternalName: <REDACTED>
  masterPublicName: <REDACTED>
  networkCIDR: <REDACTED>
  networking:
    calico:
      majorVersion: v3
  nonMasqueradeCIDR: <REDACTED>
  sshAccess:
  - <REDACTED>
  subnets:
  - cidr: <REDACTED>
    name: <REDACTED>
    type: Private
    zone: <REDACTED>
  - cidr: <REDACTED>
    name: <REDACTED>
    type: Utility
    zone: <REDACTED>
  topology:
    bastion:
      bastionPublicName: <REDACTED>
    dns:
      type: Public
    masters: private
    nodes: private
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-05-14T08:11:14Z"
  generation: 3
  labels:
    kops.k8s.io/cluster: <REDACTED>
  name: bastions
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20200716
  machineType: t2.micro
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: bastions
  role: Bastion
  subnets:
  - <REDACTED>
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-05-14T08:11:14Z"
  generation: 5
  labels:
    kops.k8s.io/cluster: <REDACTED>
  name: master-<REDACTED>
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20200716
  machineType: t3a.large
  maxPrice: <REDACTED>
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-<REDACTED>
    on-demand: "false"
  role: Master
  subnets:
  - <REDACTED>
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-05-14T08:11:14Z"
  generation: 20
  labels:
    kops.k8s.io/cluster: <REDACTED>
  name: nodes
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20200716
  machineType: t3a.xlarge
  maxPrice: <REDACTED>
  maxSize: 1
  minSize: 1
  nodeLabels:
    autoscaler-enabled/<REDACTED>: "true"
    kops.k8s.io/instancegroup: nodes
  role: Node
  subnets:
  - <REDACTED>
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 19 (9 by maintainers)
I see you're using an ACM cert on your API ELB. I think you might be running into the same issue I'm trying to work around in https://github.com/kubernetes/kops/pull/9732. For Kubernetes 1.18 you can work around the problem by explicitly re-enabling basic auth, adding this to your cluster spec:
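The spec snippet itself did not survive in this excerpt. A minimal sketch of what re-enabling basic auth looks like, assuming the comment refers to the `disableBasicAuth` field under `kubeAPIServer` in the kops cluster spec:

```yaml
# Sketch only: explicitly re-enable basic auth on the API server for K8s 1.18.
# Assumes kops' spec.kubeAPIServer.disableBasicAuth field is the knob meant here.
spec:
  kubeAPIServer:
    disableBasicAuth: false
```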
But this won't work for K8s 1.19, which removes basic auth support, so I'm hoping we'll have a better solution for API ELBs with ACM certs by then.
Yeah, it's something I want to move away from, and I was hoping kops would help with that, so it would be great if it were added in the next version of kops.
Looks like the fix is coming. Just wanted to share this here: https://github.com/kubernetes/kops/blob/master/permalinks/acm_nlb.md