kubernetes: kube-proxy and kubelet setting up incorrect iptables rules when proxy-mode=iptables
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No.
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): kube-proxy. Issue #36652 is somewhat related but in essence is NOT the same.
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:13:23Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:06:06Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration: x86_64 (VirtualBox virtual machines)
- OS (e.g. from /etc/os-release): NAME="Ubuntu" VERSION="16.04.1 LTS (Xenial Xerus)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 16.04.1 LTS" VERSION_ID="16.04" HOME_URL="http://www.ubuntu.com/" SUPPORT_URL="http://help.ubuntu.com/" BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/" UBUNTU_CODENAME=xenial
- Kernel (e.g. uname -a): Linux k8s-node-0 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Others: using flannel v0.6.1 as the overlay network. kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, and kube-proxy all run directly on the host operating system (no containerization) and are all managed by systemd. The same goes for etcd, docker, and flannel.
What happened:
- When I set up all the kube-proxy instances on my minions to use proxy-mode=iptables and then use curl to issue an HTTP GET to my web server (an nginx service with 3 replicas) via the service IP address, the connection hangs. However, when I set proxy-mode=userspace, it works as expected.
- When I inspected the iptables rules that are set up when proxy-mode=iptables, I noticed some incorrect configurations: the chains KUBE-MARK-MASQ and KUBE-MARK-DROP in the nat table have the rules '-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000' and '-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000'. This is incorrect because the MARK target only works in the mangle table, NOT the nat table. See the iptables MARK target documentation, e.g. https://www.frozentux.net/iptables-tutorial/chunkyhtml/x4389.html (a bit dated, but it covers this).
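For what it's worth, whether the MARK target is accepted outside the mangle table can be checked directly on a node with a throwaway rule; this is just a sketch, and the chain name TEST-MARK is arbitrary:
~$ # create a scratch chain in the nat table and try to attach a MARK rule to it
~$ sudo iptables -t nat -N TEST-MARK
~$ sudo iptables -t nat -A TEST-MARK -j MARK --set-xmark 0x4000/0x4000
~$ sudo iptables -t nat -S TEST-MARK   # if the rule is listed, the kernel accepted MARK in nat
~$ # clean up the scratch chain
~$ sudo iptables -t nat -F TEST-MARK
~$ sudo iptables -t nat -X TEST-MARK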
- The following is my iptables-save dump from one of the nodes:
~$ sudo iptables-save -t nat
# Generated by iptables-save v1.6.0 on Tue Nov 15 11:57:54 2016
*nat
:PREROUTING ACCEPT [9:490]
:INPUT ACCEPT [9:490]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-7WFUHPTNYMSU2JFN - [0:0]
:KUBE-SEP-GXTJGQCWK4PZV3S3 - [0:0]
:KUBE-SEP-HU4VO5AJOOQZNVVQ - [0:0]
:KUBE-SEP-QZMLUAKRU3OISTT5 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-52I4NKETLMIO5ZQJ - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.16.97.0/24 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-7WFUHPTNYMSU2JFN -s 172.16.86.2/32 -m comment --comment "default/webservice:" -j KUBE-MARK-MASQ
-A KUBE-SEP-7WFUHPTNYMSU2JFN -p tcp -m comment --comment "default/webservice:" -m tcp -j DNAT --to-destination 172.16.86.2:80
-A KUBE-SEP-GXTJGQCWK4PZV3S3 -s 172.16.97.2/32 -m comment --comment "default/webservice:" -j KUBE-MARK-MASQ
-A KUBE-SEP-GXTJGQCWK4PZV3S3 -p tcp -m comment --comment "default/webservice:" -m tcp -j DNAT --to-destination 172.16.97.2:80
-A KUBE-SEP-HU4VO5AJOOQZNVVQ -s 10.200.3.13/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-HU4VO5AJOOQZNVVQ -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-HU4VO5AJOOQZNVVQ --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.200.3.13:6443
-A KUBE-SEP-QZMLUAKRU3OISTT5 -s 172.16.49.2/32 -m comment --comment "default/webservice:" -j KUBE-MARK-MASQ
-A KUBE-SEP-QZMLUAKRU3OISTT5 -p tcp -m comment --comment "default/webservice:" -m tcp -j DNAT --to-destination 172.16.49.2:80
-A KUBE-SERVICES -d 192.168.33.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 192.168.33.28/32 -p tcp -m comment --comment "default/webservice: cluster IP" -m tcp --dport 80 -j KUBE-SVC-52I4NKETLMIO5ZQJ
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-52I4NKETLMIO5ZQJ -m comment --comment "default/webservice:" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-QZMLUAKRU3OISTT5
-A KUBE-SVC-52I4NKETLMIO5ZQJ -m comment --comment "default/webservice:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-7WFUHPTNYMSU2JFN
-A KUBE-SVC-52I4NKETLMIO5ZQJ -m comment --comment "default/webservice:" -j KUBE-SEP-GXTJGQCWK4PZV3S3
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 180 --reap --name KUBE-SEP-HU4VO5AJOOQZNVVQ --mask 255.255.255.255 --rsource -j KUBE-SEP-HU4VO5AJOOQZNVVQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-HU4VO5AJOOQZNVVQ
COMMIT
# Completed on Tue Nov 15 11:57:54 2016
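To see whether traffic to the service IP is actually entering these chains, the per-rule packet counters can be inspected while curl is hanging; the KUBE-SVC chain name here is the one for default/webservice taken from the dump above:
~$ sudo iptables -t nat -L KUBE-SERVICES -n -v --line-numbers
~$ sudo iptables -t nat -L KUBE-SVC-52I4NKETLMIO5ZQJ -n -v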
What you expected to happen: The service IP address should work the same regardless of proxy-mode=iptables or proxy-mode=userspace.
So if I do 'curl http://mywebserviceip', it should work regardless of the kind of plumbing used by kube-proxy.
How to reproduce it (as minimally and precisely as possible):
- Install Ubuntu 16.04 server in 3 VMs.
- Set up the etcd cluster, the flanneld overlay network, and the docker container runtime.
- Set up kube-apiserver, kube-controller-manager, and kube-scheduler on the master node.
- Set up kubelet and kube-proxy on all the nodes. Configure kube-proxy to use --proxy-mode=iptables.
- Deploy a replication controller YAML config file using kubectl, set up to run an nginx container.
- Deploy a service YAML config file using kubectl, set up to use the nginx pods as endpoints (a concrete sketch of these deployment steps follows this list).
- From the command line of any one of the nodes, run 'curl http://<service-ip-address>'. Notice that it hangs.
- Change to --proxy-mode=userspace on one of the nodes and restart kube-proxy there. Try invoking curl again multiple times; notice that curl sometimes successfully returns a web page and sometimes hangs.
- Change to --proxy-mode=userspace on all the nodes and restart kube-proxy on all of them. Try invoking curl again multiple times; notice that curl always succeeds.
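For concreteness, the deployment and curl steps might look roughly like this. This is a sketch, not the exact config I used: the name webservice matches the chain comments in the dump above, 192.168.33.28 is the cluster IP assigned there, and --generator=run/v1 is what makes kubectl run create a replication controller rather than a deployment on v1.3:
~$ kubectl run webservice --image=nginx --replicas=3 --generator=run/v1
~$ kubectl expose rc webservice --port=80
~$ kubectl get svc webservice      # note the assigned cluster IP, e.g. 192.168.33.28
~$ curl http://192.168.33.28/      # hangs when kube-proxy runs with --proxy-mode=iptables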
Anything else do we need to know: I made the following manual changes to my iptables configuration (adding KUBE-MARK-MASQ to the mangle table), and I can now successfully query my service when kube-proxy runs with --proxy-mode=iptables:
# Generated by iptables-save v1.6.0 on Tue Nov 15 11:23:19 2016
*mangle
:PREROUTING ACCEPT [21850:2940486]
:INPUT ACCEPT [21850:2940486]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [21274:1877657]
:POSTROUTING ACCEPT [21274:1877657]
:KUBE-MARK-MASQ - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-52I4NKETLMIO5ZQJ - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
-A PREROUTING -j KUBE-SERVICES
-A OUTPUT -j KUBE-SERVICES
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-SERVICES -d 192.168.33.1/32 -p tcp -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 192.168.33.28/32 -p tcp -m tcp --dport 80 -j KUBE-SVC-52I4NKETLMIO5ZQJ
-A KUBE-SVC-52I4NKETLMIO5ZQJ -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -j KUBE-MARK-MASQ
COMMIT
# Completed on Tue Nov 15 11:23:19 2016
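A minimal way to apply that fix on a node, assuming the *mangle block above has been saved to a file named kube-mangle-fix.txt (a file name I chose for illustration), is to load it without flushing the existing tables:
~$ sudo iptables-restore --noflush < kube-mangle-fix.txt
~$ sudo iptables -t mangle -S KUBE-MARK-MASQ   # verify the chain and its MARK rule now exist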
About this issue
- State: closed
- Created 8 years ago
- Comments: 21 (11 by maintainers)
Closing this issue.
When --proxy-mode=userspace, there is some off-cluster bridging happening. When --proxy-mode=iptables, there is no off-cluster bridging. To enable off-cluster bridging when --proxy-mode=iptables, also set --cluster-cidr.
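For example, a kube-proxy invocation along those lines might look like this; the apiserver address is the 10.200.3.13:6443 endpoint visible in the nat dump above, and 172.16.0.0/16 is an assumed flannel pod network covering the 172.16.x.x pod subnets seen there:
~$ kube-proxy --master=https://10.200.3.13:6443 \
    --proxy-mode=iptables \
    --cluster-cidr=172.16.0.0/16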
Well, the userspace configuration also writes its own iptables chains and rules to redirect traffic destined for service VIPs to the userspace daemon, so the claim that 'the name says it all' can be contested.
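For instance, the userspace proxier's own redirect chains can be listed directly; KUBE-PORTALS-CONTAINER and KUBE-PORTALS-HOST are the chain names the userspace mode used around v1.3 (named here from memory, so treat them as an assumption):
~$ sudo iptables -t nat -L KUBE-PORTALS-HOST -n
~$ sudo iptables -t nat -L KUBE-PORTALS-CONTAINER -n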
From a new user's point of view (like mine): if they set up a cluster with --proxy-mode=userspace and are able to access their services from apps running outside the cluster, then change to --proxy-mode=iptables and lose that previously working functionality, they will automatically assume that the 'iptables' configuration is broken.