kubernetes: IPVS Graceful Termination can break NFS failover logic

What happened:

Kube-proxy is keeping RS until all connections have expired.

NFS client is trying to use this RS over and over and as fact they stuck by trying to reach the IP-address of old pod instead just using new one.

What you expected to happen:

After one pod is dying and new one created all the connections will be redirected to new one.

How to reproduce it (as minimally and precisely as possible):

  • Install nfs-server-provisioner
  • Create PVC
  • Run pod with this PVC or mount share on the node
  • Delete nfs-server-provisioner pod
  • Wait for new one will be created
  • Try to write something into share

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:36:53Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:09:21Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: bare metal
  • OS (e.g: cat /etc/os-release):
NAME="Ubuntu"
VERSION="18.04.2 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.2 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
  • Kernel (e.g. uname -a):
Linux m10c45 4.15.18-18-pve #1 SMP PVE 4.15.18-44 (Wed, 03 Jul 2019 11:19:13 +0200) x86_64 x86_64 x86_64 GNU/Li
  • Install tools: kubeadm
  • Network plugin and version: bridget (cni-bridge)
  • Others:

Not sure if it is connected: https://www.spinics.net/lists/linux-nfs/msg56520.html

https://github.com/kubernetes/kubernetes/blob/8c7244ac12ee1aaf13e2db9a52a505decccc5f70/pkg/proxy/ipvs/graceful_termination.go#L167-L173

/area ipvs /sig network /assign @m1093782566

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

Gonna go ahead and close this since this is a kernel issue and we added flags to configure the IPVS timeouts in https://github.com/kubernetes/kubernetes/pull/85517

Feel free to reopen if you think this doesn’t address the original issue.