aws-load-balancer-controller: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/validate-networking-v1beta1-ingress?timeout=10s": context deadline exceeded

Describe the bug

After installing the aws-load-balancer-controller via Helm by following the instructions here, I am getting the following error:

Error from server (InternalError): error when creating "<REDACTED | Link to minimal ingress is below in reproduction steps>": Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/validate-networking-v1beta1-ingress?timeout=10s": context deadline exceeded

Steps to reproduce

  1. Provision an EKS cluster with the AWS EKS Terraform module
  2. Install aws-load-balancer-controller via helm as described here
  3. Attempt to create any ingress resource, such as this one

It should be noted that if I remove the aws-load-balancer-controller, I can create the minimal ingress just fine.

Expected outcome

To be able to create an ALB ingress resource that has its state reflected in AWS.

Environment

  • AWS Load Balancer controller version - 2.3.1
  • Kubernetes version - 1.21
  • Using EKS (yes/no), if so version? - yes, eks.4

Additional Context:

  • The permissions for the controller are granted via IAM Roles on the instances themselves, rather than via a service account.
  • I have other clusters that are working. I started having trouble once I upgraded the AWS EKS Terraform module from version 11 to version 18.

Any help would be greatly appreciated! I’ve been digging into this for some time and cannot seem to figure out why I am having this problem.

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 16

Most upvoted comments

@manoelhc You’re a godsend. Thank you @kishorj as well! I did not have multiple node groups, but after checking the security groups on the worker nodes I saw that the nodes could not talk to one another. Going to fix the automation now and move on to the next thing. Thanks again!

Hi, I have been stuck on exactly this point for a few days now. Can you please share how to modify the security groups via Terraform?

Never mind, got it, thanks. I added a couple of `aws_security_group_rule` resources to allow communication on port 9443 between the node group and cluster security groups. It works well now.
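For anyone who lands here later, a minimal sketch of what such rules might look like. The security group references are assumptions (they use the `node_security_group_id` and `cluster_security_group_id` outputs of the terraform-aws-eks module); substitute whatever your own configuration exposes:

```hcl
# Sketch only: open 9443/tcp between the cluster and node security groups so
# the EKS API server can reach the controller's admission webhook on the nodes.
# Security group references below are illustrative, not from the original issue.

resource "aws_security_group_rule" "node_ingress_webhook" {
  description              = "Cluster API to node webhook (aws-load-balancer-controller)"
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = 9443
  to_port                  = 9443
  security_group_id        = module.eks.node_security_group_id    # SG attached to the worker nodes
  source_security_group_id = module.eks.cluster_security_group_id # EKS cluster SG
}

resource "aws_security_group_rule" "cluster_egress_webhook" {
  description              = "Cluster API egress to node webhook"
  type                     = "egress"
  protocol                 = "tcp"
  from_port                = 9443
  to_port                  = 9443
  security_group_id        = module.eks.cluster_security_group_id
  source_security_group_id = module.eks.node_security_group_id
}
```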

Hey @ashish3dwe, sorry for the delay. I added an entry to node_security_group_additional_rules, which is one of the input variables of the EKS Terraform module.
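A sketch of what that entry could look like, assuming terraform-aws-modules/eks v18 and its `node_security_group_additional_rules` input; the rule key name (`ingress_cluster_webhook`) is arbitrary and the rest of the module arguments are elided:

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.0"

  # ... cluster_name, cluster_version, vpc_id, subnet_ids, node groups, etc.

  # Allow the control plane to reach the aws-load-balancer-controller
  # webhook (9443/tcp) on the worker nodes.
  node_security_group_additional_rules = {
    ingress_cluster_webhook = {
      description                   = "Cluster API to node 9443/tcp (aws-load-balancer-controller webhook)"
      protocol                      = "tcp"
      from_port                     = 9443
      to_port                       = 9443
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }
}
```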


I found the issue in my case. I enabled VPC Flow Logs and noticed that the security groups created by the terraform-aws-modules module block connections between pods on different node groups (which is my setup).
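If cross-node-group traffic is what is being blocked, a hedged sketch of a self-referencing rule that opens node-to-node traffic, using the same `node_security_group_additional_rules` input shown above (the rule key and the all-ports range are illustrative choices, not taken from the original comment):

```hcl
  # Added inside the same module "eks" block, alongside the webhook rule above.
  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node-to-node traffic on all ports across node groups"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true # source is the shared node security group itself
    }
  }
```

A broad all-ports rule is the simplest fix; if you prefer tighter rules, limiting it to 9443/tcp (the controller webhook) plus whatever ports your pods actually use between node groups should also work.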