amazon-vpc-cni-k8s: Add a better error message indicating why CNI ipamD is not starting

Today, in the CNI daemonSet (aws-node), whenever ipamD restarts, it queries the Kubernetes API server about Pods already running on the node. If it cannot reach the Kubernetes API server, ipamD exits and you will see the following logs in /var/log/aws-routed-cni/ipamd.log.xxx:

2018-07-02T15:00:33Z [INFO] Starting L-IPAMD 1.0.0 ...
2018-07-02T15:00:33Z [INFO] Testing communication with server

..
2018-07-02T15:00:33Z [INFO] Starting L-IPAMD 1.0.0 ...
2018-07-02T15:00:33Z [INFO] Testing communication with server

ipamD needs to print an explicit error stating that it failed because it can NOT communicate with the API server.

To verify that security groups are configured correctly between the worker node and the Kubernetes API server, you can run the following commands:

# find out kubernetes service IP
kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   24d

# verify the worker node can reach port 443 of the master
telnet 10.100.0.1 443
Trying 10.100.0.1...
Connected to 10.100.0.1.
Escape character is '^]'.
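
If telnet is not installed on the worker node, the same reachability check can be sketched with bash alone. This is only an illustration, not something the issue or the CNI ships: the function name check_tcp is made up here, and it assumes a bash built with /dev/tcp support and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT can be
# opened within the timeout, non-zero otherwise.
# Assumes bash was built with /dev/tcp support and that `timeout` is installed.
check_tcp() {
  local host="$1" port="$2" secs="${3:-5}"
  # Inside `bash -c`, the first extra argument is $0 and the second is $1.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (10.100.0.1 is the CLUSTER-IP from the kubectl output above;
# substitute your own):
#   check_tcp 10.100.0.1 443 && echo "reachable" || echo "NOT reachable"
```

A non-zero return covers both a refused connection and a timeout, which is the case you would expect when a security group silently drops the traffic.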

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@kuroneko25 the specific security group is the one used when creating the EKS cluster, which is also returned by:

aws eks describe-cluster --name <your cluster>

It needs to have port 443 open in its inbound rules.
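
As a rough sketch of how to inspect those inbound rules from the command line, the two AWS CLI calls can be chained as below. The function name eks_cluster_sg_rules is hypothetical, and the sketch assumes the AWS CLI is installed and configured with credentials that can describe the cluster and its security groups.

```shell
#!/usr/bin/env bash
# eks_cluster_sg_rules CLUSTER_NAME
# Hypothetical helper: prints the inbound rules (IpPermissions) of the
# security groups recorded on an EKS cluster, so you can look for port 443.
eks_cluster_sg_rules() {
  local cluster="$1" sg_ids
  # Security group IDs attached to the cluster; text output separates
  # multiple IDs with whitespace.
  sg_ids=$(aws eks describe-cluster --name "$cluster" \
    --query 'cluster.resourcesVpcConfig.securityGroupIds' \
    --output text) || return 1
  if [ -z "$sg_ids" ]; then
    echo "no security groups found for cluster $cluster" >&2
    return 1
  fi
  # Left unquoted on purpose so several IDs become separate arguments.
  aws ec2 describe-security-groups --group-ids $sg_ids \
    --query 'SecurityGroups[].IpPermissions[]' --output json
}

# Example:
#   eks_cluster_sg_rules <your cluster>
```

In the JSON output, look for a rule whose port range includes 443 and whose source covers the worker nodes.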