amazon-vpc-cni-k8s: How to stop scheduling pods when aws-cni is not ready

What happened:

When aws-cni is not ready yet, Kubernetes often schedules pods onto the node anyway, and those pods then can’t get an IP address because aws-cni is not up correctly. I wonder if there’s a way for aws-cni to notify the node once it’s ready, for example by setting a taint on the node. Are there any plans for such a use case, or is there already something we could pick up?

It would definitely help us; otherwise we might need to implement this ourselves.
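
To make the taint idea concrete, here is a minimal Go sketch of the reconcile logic such a mechanism might use. It is not something aws-cni provides: the taint key `example.com/cni-not-ready`, the `Taint` type, and `reconcileTaints` are all hypothetical names chosen for illustration, and a real controller would read and patch the Node object through the Kubernetes API rather than work on a plain slice.

```go
package main

import "fmt"

// Taint is a simplified stand-in for a Kubernetes node taint (hypothetical type).
type Taint struct {
	Key    string
	Effect string
}

// cniNotReadyKey is a made-up taint key used only for this sketch.
const cniNotReadyKey = "example.com/cni-not-ready"

// reconcileTaints returns the taints a node should carry: the not-ready
// taint is present exactly when the CNI is not ready, and all other
// taints are left untouched.
func reconcileTaints(taints []Taint, cniReady bool) []Taint {
	out := make([]Taint, 0, len(taints)+1)
	for _, t := range taints {
		if t.Key != cniNotReadyKey {
			out = append(out, t)
		}
	}
	if !cniReady {
		// NoSchedule keeps new pods off the node without evicting running ones.
		out = append(out, Taint{Key: cniNotReadyKey, Effect: "NoSchedule"})
	}
	return out
}

func main() {
	taints := reconcileTaints(nil, false) // node just joined, CNI still starting
	fmt.Println(len(taints), taints[0].Key)

	taints = reconcileTaints(taints, true) // CNI reported ready
	fmt.Println(len(taints))
}
```

With a scheme like this, the node would register with the taint already set, and whatever component observes CNI readiness would remove it.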

Environment:

  • Kubernetes version (use kubectl version): 1.19.x
  • CNI Version: 1.8.0
  • OS (e.g: cat /etc/os-release): flatcar
  • Kernel (e.g. uname -a): 5.x

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

@njuettner - Sorry for the delay.

I was discussing this with @achevuru. One improvement that can be made here: typically, when no IP is available, the create fails and kubelet sends a delete. With the existing code, this delete returns an error because the pod is not found in the IPAMD store (https://github.com/aws/amazon-vpc-cni-k8s/blob/146b9b910a89accf2a7e6bc1199dff10a8806e6b/pkg/ipamd/datastore/data_store.go#L1179), so kubelet keeps retrying the delete. Since the IP was never allocated in the first place, the IPAMD delete should instead reply with success. That way, at least, the pod “might” be scheduled on a new node which might have IPs available.

@achevuru want to add anything here?