kops: kops rolling-update doesn't de-register instances from ELB network load balancer gracefully
1. What kops version are you running? The command kops version will display
this information.
1.20.0
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
1.20.5
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
kops rolling-update cluster <cluster-name> --instance-group <node-group-name> --yes --force
5. What happened after the commands executed?
Before running kops rolling-update, I ran a script in a loop to make HTTP requests to an NLB endpoint sitting in front of an echo server. At the point when kops rolling-update reported that it was stopping an instance, the HTTP requests started hanging and then recovered after ~90 seconds.
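For anyone trying to reproduce this, a probe loop along the following lines should be enough to surface the gap. This is only a sketch, not the script used in the report: the function names, the one-second interval, and the five-second timeout are all assumptions, and the URL stands in for your own NLB endpoint.

```python
import time
import urllib.request


def fetch_status(url, timeout):
    """Return the HTTP status code, or None if the request fails or times out."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


def probe(url, attempts, interval=1.0, timeout=5.0, fetch=fetch_status,
          clock=time.monotonic, sleep=time.sleep):
    """Poll url and return a (start, end) pair for each run of failed requests."""
    outages = []
    outage_start = None
    for _ in range(attempts):
        started = clock()
        ok = fetch(url, timeout) == 200
        now = clock()
        if not ok and outage_start is None:
            # First failure of a new outage: remember when the request began.
            outage_start = started
        elif ok and outage_start is not None:
            # First success after an outage: close the window.
            outages.append((outage_start, now))
            outage_start = None
        sleep(interval)
    if outage_start is not None:
        # Still failing when the loop ended.
        outages.append((outage_start, clock()))
    return outages
```

Running something like probe("http://<nlb-dns-name>/", attempts=600) while the rolling update is in progress would return the outage windows; a window of roughly 90 seconds would match the behaviour described above.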
6. What did you expect to happen?
I expected the HTTP requests to continue to be handled successfully.
7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else we need to know?
I believe this issue occurs because kops rolling-update detaches instances from their Auto Scaling groups without first de-registering them from their NLB target groups. My target group has a de-registration delay of 90 seconds, which might explain the 90-second recovery time.
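To make the suggested ordering concrete, here is a rough sketch of "de-register, wait for draining, then detach". It is not how kops implements rolling updates. The function name and parameters are hypothetical, it assumes target groups of target type instance (so the target ID is the EC2 instance ID), and it expects boto3-style elbv2 and autoscaling clients to be passed in.

```python
def drain_then_detach(elbv2, autoscaling, instance_id, target_group_arns, asg_name):
    """De-register an instance from its target groups, wait, then detach it.

    elbv2 and autoscaling are assumed to be boto3 clients (or objects with
    the same methods); every other name here is illustrative.
    """
    targets = [{"Id": instance_id}]

    # Step 1: ask each target group to start draining the instance.
    for arn in target_group_arns:
        elbv2.deregister_targets(TargetGroupArn=arn, Targets=targets)

    # Step 2: block until the target is fully de-registered, which is where
    # the de-registration delay is spent.
    waiter = elbv2.get_waiter("target_deregistered")
    for arn in target_group_arns:
        waiter.wait(TargetGroupArn=arn, Targets=targets)

    # Step 3: only now detach the instance from its Auto Scaling group.
    autoscaling.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ShouldDecrementDesiredCapacity=False,
    )
```

With boto3 installed, the clients would come from boto3.client("elbv2") and boto3.client("autoscaling"). The point of the sketch is only the ordering: the load balancer stops sending new connections to the instance before the instance goes away.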
About this issue
- State: closed
- Created 3 years ago
- Comments: 27 (15 by maintainers)
Partial fix in #11273
Filed kubernetes/website#27639
Found in Kubernetes 1.19 release notes: