rancher: Not able to create ingresses on amazon ec2 node driver clusters in an HA rancher on 1.20 cluster
Information about the Cluster Rancher Server Setup
- Rancher version: v2.5.11
- Installation option (Docker install/Helm Chart): Helm Chart
- If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE1, v1.20.12-rancher1-2, 1.2.14
- Proxy/Cert Details: Self-signed
Information about the Cluster
- Kubernetes version: v1.20.12
- Cluster Type (Local/Downstream): Downstream node driver, RKE1 (1 worker, 1 etcd, 1 cp)
Describe the bug
Creating an ingress on an AWS node driver cluster that was provisioned from a Rancher HA server fails with a "failed calling webhook" error. More details are in the Result and Additional context sections.
To Reproduce
- Create a rancher HA server on v2.5.11
- Create a downstream RKE1 node driver Amazon EC2 cluster with any node count
- From any project create a workload
- Create an ingress pointing to this workload
Result
The ingress creation does not go through and errors out:
baseType: "error"
code: "InternalError"
message: "Internal error occurred: failed calling webhook \"validate.nginx.ingress.kubernetes.io\": Post \"https://ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s\": context deadline exceeded"
status: 500
type: "error"
Expected Result
The ingress creation should go through without any error.
Additional context
- Not seen on RKE DO node driver clusters on HA, or on custom clusters.
- Not seen on Docker install with DO, Amazon EC2, or custom clusters.
- Only seen on Amazon EC2 node driver clusters on HA.
- Also not seen on clusters on k8s v1.19.16.
Errors seen in the rancher logs:
7 warnings.go:80] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W1204 02:19:48.667464 7 warnings.go:80] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W1204 02:20:04.308015 7 warnings.go:80] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W1204 02:22:02.682439 7 warnings.go:80] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W1204 02:26:15.307942 7 warnings.go:80] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
2021/12/04 02:04:45 [ERROR] error syncing 'ingress-ip-domain': handler copy-settings: the server could not find the requested resource, requeuing
2021/12/04 02:04:45 [ERROR] error syncing 'install-uuid': handler copy-settings: the server could not find the requested resource, requeuing
2021/12/04 02:04:45 [ERROR] error syncing 'ingress-ip-domain': handler copy-settings: the server could not find the requested resource, requeuing
2021/12/04 02:04:45 [ERROR] error syncing 'install-uuid': handler copy-settings: the server could not find the requested resource, requeuing
2021/12/04 02:04:45 [ERROR] error syncing 'ingress-ip-domain': handler copy-settings: the server could not find the requested resource, requeuing
Errors seen in ingress-controller:
E1204 02:26:15.592042 6 server.go:77] "Failed to decode request body" err="couldn't get version/kind; json parse error: unexpected end of JSON input"
2021/12/04 02:26:16 http: TLS handshake error from :38364: read tcp 172.31.13.232:8443->:38364: read: connection reset by peer
E1204 02:26:17.685127 6 server.go:77] "Failed to decode request body" err="couldn't get version/kind; json parse error: unexpected end of JSON input"
E1204 02:26:17.758252 6 server.go:77] "Failed to decode request body" err="couldn't get version/kind; json parse error: unexpected end of JSON input"
- - [04/Dec/2021:02:26:45 +0000] "GET /v3/connect/config HTTP/2.0" 200 19691 "-" "Go-http-client/2.0" 2902 0.010 [cattle-system-rancher-80] [] 10.42.0.6:80 19704 0.012 200 69048a76b18c5e7762850db3ecb19c5d
- - [04/Dec/2021:02:27:40 +0000] "GET /v3/connect/config HTTP/2.0" 200 19708 "-" "Go-http-client/2.0" 2900 0.007 [cattle-system-rancher-80] [] 10.42.2.6:80 19721 0.008 200 73e8898aea23c402278561bdbec31eaa
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 16 (12 by maintainers)
Pretty sure this is because the fix we implemented in RKE for this issue (https://github.com/rancher/rke/pull/2626) was scoped to k8s 1.21 and up. The NGINX ingress version that was initially only used in k8s 1.21 and up was later backported to older k8s versions, but the fix was not backported along with it.
The fix here is to change the version scope from k8s to NGINX ingress, scope it to `>=0.48.0` (based on the templates), and backport it to Rancher 2.5/RKE 1.2.
Workaround is to manually set the mode to `hostPort`.
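A minimal sketch of this workaround, assuming it is applied through the `ingress` section of the RKE `cluster.yml` and that the RKE version in use accepts the `network_mode` key. Treat it as an illustration, not as the exact snippet from the comment:

```yaml
# Sketch only: assumes the RKE version in use supports ingress.network_mode.
ingress:
  provider: nginx
  # Workaround from the comment above: set the ingress network mode to hostPort.
  network_mode: hostPort
```

The change would only take effect once the cluster is reconciled, for example by running `rke up` or by saving the edited cluster configuration in Rancher.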
I was able to reproduce this in rancher. The trick seems to be using a cluster with more than 2 nodes.
I was also able to reproduce this without rancher, with rke alone. So I think this issue should be transferred to the rke team.
Repro steps for RKE alone:
Also happens with `extensions/v1beta1` and `networking.k8s.io/v1beta`.
Extending the timeout in the `ingress-nginx-admission` validatingwebhookconfiguration doesn't help, nor does changing the `apiGroups` or `apiVersions` setting.
This issue seems to be known and unsolved in the upstream ingress-nginx controller: https://github.com/kubernetes/ingress-nginx/issues/5401
Workaround summary for release notes:
On custom clusters with two or more nodes provisioned using the Amazon EC2 infrastructure provider using default settings, a configuration problem causes creation of Ingress resources to fail. Options to work around this are either: