kubernetes: kube-proxy generates the wrong iptables DNAT rule

What happened?

kube-proxy generates the wrong iptables DNAT rule on the worker node, as shown below:

[root@controller-node-1 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1      <none>        443/TCP   3h18m
my-dep       ClusterIP   10.233.60.220   <none>        80/TCP    26m
my-svc       ClusterIP   10.233.48.3     <none>        80/TCP    26m

[root@worker1 ~]# iptables-save -t nat | grep '10.233.0.1'
-A KUBE-SERVICES -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.233.64.0/18 -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
[root@worker1 ~]# iptables-save -t nat | grep KUBE-SVC-NPX46M4PTMTKRN6Y
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
-A KUBE-SERVICES -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.233.64.0/18 -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB
[root@worker1 ~]# iptables-save -t nat | grep KUBE-SEP-CC3HXZSKU6BR4DDB
:KUBE-SEP-CC3HXZSKU6BR4DDB - [0:0]
-A KUBE-SEP-CC3HXZSKU6BR4DDB -s 10.6.214.21/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB

Because of this incorrect iptables rule, the worker node cannot reach the apiserver.
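To spot rules like the one above without reading the whole dump, something along these lines could scan saved `iptables-save -t nat` output for DNAT rules whose printed target is not a single IPv4 `address:port`. This is only a sketch: the function name `suspicious_dnat_rules` and the "exactly one address:port" criterion are my own assumptions, not anything kube-proxy or iptables defines, and it only inspects what iptables-save prints.

```python
import re
import shlex

# Assumption: a well-formed target looks like IPv4 "address:port", e.g. 10.6.214.21:6443.
VALID_TARGET = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}:\d{1,5}$")


def suspicious_dnat_rules(iptables_save_output):
    """Return (chain, targets) pairs for DNAT rules whose printed target looks malformed.

    A rule is flagged when it carries no --to-destination value, more than
    one, or a value that is not of the form IPv4:port.
    """
    flagged = []
    for line in iptables_save_output.splitlines():
        tokens = shlex.split(line)  # keeps quoted --comment values as one token
        if tokens[:1] != ["-A"] or "DNAT" not in tokens:
            continue
        chain = tokens[1]
        targets = []
        for i, tok in enumerate(tokens):
            if tok == "--to-destination":
                # The value is missing when the flag is last or followed by another flag.
                nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
                targets.append("" if nxt.startswith("--") else nxt)
        if len(targets) != 1 or not VALID_TARGET.match(targets[0]):
            flagged.append((chain, targets))
    return flagged
```

Fed the worker1 rule above, this reports the chain KUBE-SEP-CC3HXZSKU6BR4DDB with the three printed targets `:0`, `:0` and `0.0.0.0`.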

What did you expect to happen?

kube-proxy should generate the correct iptables rule.

How can we reproduce it (as minimally and precisely as possible)?

  • Create a cluster via kubespray. Everything works.
[root@controller-node-1 ~]# kubectl get nodes -o wide
NAME                STATUS     ROLES           AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
controller-node-1   Ready      control-plane   129m   v1.25.3   10.6.214.12   <none>        CentOS Linux 7 (Core)   5.19.10-1.el7.elrepo.x86_64   containerd://1.6.8
  • Join a node
[root@controller-node-1 ~]# kubectl get nodes -o wide
NAME                STATUS     ROLES           AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
controller-node-1   Ready      control-plane   129m   v1.25.3   10.6.214.12   <none>        CentOS Linux 7 (Core)   5.19.10-1.el7.elrepo.x86_64   containerd://1.6.8
worker1             NotReady   <none>          85m    v1.25.3   10.6.214.13   <none>        CentOS Linux 7 (Core)   5.4.197-1.el7.elrepo.x86_64   containerd://1.6.8

Some hostNetwork Pods on worker1 failed to start because they could not reach the apiserver. I checked that the firewall is disabled.

I found that kube-proxy generates an incorrect iptables rule:

[root@worker1 ~]# iptables-save -t nat | grep KUBE-SEP-CC3HXZSKU6BR4DDB
:KUBE-SEP-CC3HXZSKU6BR4DDB - [0:0]
-A KUBE-SEP-CC3HXZSKU6BR4DDB -s 10.6.214.21/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB

Why is the DNAT target "--to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent"?

On controller-node-1, the same rule is correct:

[root@controller-node-1 ~]# iptables-save -t nat | grep KUBE-SEP-CC3HXZSKU6BR4DDB
:KUBE-SEP-CC3HXZSKU6BR4DDB - [0:0]
-A KUBE-SEP-CC3HXZSKU6BR4DDB -s 10.6.214.21/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 10.6.214.21:6443
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB
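The difference between the two nodes can also be checked mechanically. The KUBE-SVC rule comment names the endpoint (`-> 10.6.214.21:6443`), so a rough check is to compare that against the `--to-destination` value printed in the KUBE-SEP chain it jumps to. The sketch below assumes the comment format seen in this output; `endpoint_mismatches` is a made-up helper name, not part of any tool.

```python
import re
import shlex

# Matches the "-> ip:port" suffix seen in the KUBE-SVC rule comments in this issue.
ENDPOINT_IN_COMMENT = re.compile(r"->\s*(\S+)$")


def endpoint_mismatches(iptables_save_output):
    """Map each KUBE-SEP chain to (expected endpoint, printed DNAT targets) when they differ."""
    expected = {}  # KUBE-SEP chain -> endpoint named in the KUBE-SVC comment
    printed = {}   # KUBE-SEP chain -> --to-destination values printed for its DNAT rule
    for line in iptables_save_output.splitlines():
        tokens = shlex.split(line)
        if tokens[:1] != ["-A"] or "-j" not in tokens[:-1]:
            continue
        chain = tokens[1]
        target = tokens[tokens.index("-j") + 1]
        if target.startswith("KUBE-SEP-") and "--comment" in tokens[:-1]:
            comment = tokens[tokens.index("--comment") + 1]
            match = ENDPOINT_IN_COMMENT.search(comment)
            if match:
                expected[target] = match.group(1)
        elif target == "DNAT":
            values = []
            for i, tok in enumerate(tokens):
                if tok == "--to-destination":
                    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
                    values.append("" if nxt.startswith("--") else nxt)
            printed[chain] = values
    return {
        chain: (endpoint, printed.get(chain, []))
        for chain, endpoint in expected.items()
        if printed.get(chain, []) != [endpoint]
    }
```

With the controller-node-1 output this returns an empty dict. With the worker1 output it returns the KUBE-SEP-CC3HXZSKU6BR4DDB chain with the expected endpoint `10.6.214.21:6443` next to the targets that were actually printed.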

I tried to create a new service, but the iptables rules generated by kube-proxy have the same problem.

[root@controller-node-1 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1      <none>        443/TCP   3h18m
my-dep       ClusterIP   10.233.60.220   <none>        80/TCP    26m
my-svc       ClusterIP   10.233.48.3     <none>        80/TCP    26m
[root@worker1 ~]# iptables-save -t nat | grep  10.233.60.220
-A KUBE-SERVICES -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-SVC-YIDRKHK4K7YFNT5I
-A KUBE-SVC-YIDRKHK4K7YFNT5I ! -s 10.233.64.0/18 -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
[root@worker1 ~]# iptables-save -t nat | grep   KUBE-SVC-YIDRKHK4K7YFNT5I
:KUBE-SVC-YIDRKHK4K7YFNT5I - [0:0]
-A KUBE-SERVICES -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-SVC-YIDRKHK4K7YFNT5I
-A KUBE-SVC-YIDRKHK4K7YFNT5I ! -s 10.233.64.0/18 -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
-A KUBE-SVC-YIDRKHK4K7YFNT5I -m comment --comment "default/my-dep -> 10.233.74.77:80" -j KUBE-SEP-CKROCXU3WMRQYCUN
[root@worker1 ~]# iptables-save -t nat | grep   KUBE-SEP-CKROCXU3WMRQYCUN
:KUBE-SEP-CKROCXU3WMRQYCUN - [0:0]
-A KUBE-SEP-CKROCXU3WMRQYCUN -s 10.233.74.77/32 -m comment --comment "default/my-dep" -j KUBE-MARK-MASQ
-A KUBE-SEP-CKROCXU3WMRQYCUN -p tcp -m comment --comment "default/my-dep" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination
-A KUBE-SVC-YIDRKHK4K7YFNT5I -m comment --comment "default/my-dep -> 10.233.74.77:80" -j KUBE-SEP-CKROCXU3WMRQYCUN
[root@worker1 ~]# curl 10.233.60.220
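The grep sequence above (cluster IP, then the KUBE-SVC chain, then the KUBE-SEP chain) can be done in one pass over a saved dump. A minimal sketch, assuming IPv4 cluster IPs written with a /32 mask as in this output; `trace_cluster_ip` is a hypothetical helper name:

```python
import shlex


def trace_cluster_ip(iptables_save_output, cluster_ip):
    """Return {KUBE-SVC chain: [KUBE-SEP chains]} reachable from KUBE-SERVICES for cluster_ip."""
    rules = []  # (chain, jump target, tokens) for every appended rule that has a -j target
    for line in iptables_save_output.splitlines():
        tokens = shlex.split(line)
        if tokens[:1] == ["-A"] and "-j" in tokens[:-1]:
            rules.append((tokens[1], tokens[tokens.index("-j") + 1], tokens))

    trace = {}
    for chain, target, tokens in rules:
        if chain != "KUBE-SERVICES" or not target.startswith("KUBE-SVC-"):
            continue
        # Assumption: the destination is written as "<cluster IP>/32", as in the output above.
        if "-d" not in tokens[:-1] or tokens[tokens.index("-d") + 1] != cluster_ip + "/32":
            continue
        # Collect every endpoint chain the service chain jumps to.
        trace[target] = [t for c, t, _ in rules if c == target and t.startswith("KUBE-SEP-")]
    return trace
```

For 10.233.60.220 and the worker1 rules above, this gives KUBE-SVC-YIDRKHK4K7YFNT5I mapped to KUBE-SEP-CKROCXU3WMRQYCUN, the same chain the manual greps arrive at.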

Anything else we need to know?

No response

Kubernetes version

[root@controller-node-1 ~]# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:57:26Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:49:09Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version

[root@worker1 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@controller-node-1 ~]# uname -a
Linux controller-node-1 5.19.10-1.el7.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 17 11:34:40 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

[root@worker1 ~]# uname -a
Linux worker1 5.4.197-1.el7.elrepo.x86_64 #1 SMP Sat Jun 4 08:43:19 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

Install tools

kubespray

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, …) and versions (if applicable)

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 27 (27 by maintainers)

Most upvoted comments

Kernel guys suggest that this is purely an iptables-1.4 display problem; they believe the kernel has the correct representation of the rule (as seen by the fact that iptables 1.8 can consistently show it correctly), it’s just that iptables 1.4 isn’t displaying it correctly.

So in that case, whatever bug you’re hitting is somewhere else, and is unrelated to this particular iptables rule…
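To tell whether a node falls into the case described here, checking the iptables userspace version is probably the quickest step. The sketch below parses the text printed by `iptables --version`. Treating only 1.4.x as matching is my reading of the comment, which contrasts 1.4 with 1.8 and says nothing about versions in between; the function names are made up for illustration.

```python
import re


def parse_iptables_version(version_output):
    """Parse text such as 'iptables v1.4.21' into a tuple of ints, e.g. (1, 4, 21)."""
    match = re.search(r"v(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    return tuple(int(part) for part in match.groups() if part is not None)


def matches_reported_display_issue(version_output):
    """True for iptables 1.4.x, the only series the comment above names as misprinting the rule.

    Assumption: other series are not flagged here because the issue does not discuss them.
    """
    return parse_iptables_version(version_output)[:2] == (1, 4)
```

On a node, the input could come from `subprocess.run(["iptables", "--version"], capture_output=True, text=True).stdout`. A True result only means the printed rule may not be trustworthy; it does not say anything about why the Pods cannot reach the apiserver.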