kubernetes: kube-router does not work with iptables 1.8.8 (nf_tables) on host

What happened?

Running kubelet on a host with iptables 1.8.8 (nf_tables mode) does not work because the kube-proxy image uses iptables 1.8.7. kube-proxy ends up replacing the rule

-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP

with

-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP

This causes the network to stop working.

What did you expect to happen?

The network continues to work regardless of which iptables version is installed on the host.

How can we reproduce it (as minimally and precisely as possible)?

Try to join a worker with iptables 1.8.8 in nf_tables mode on the host.
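To confirm that a host is in the affected configuration before joining it, the version and backend mode can be read from `iptables --version` (which prints a string like "iptables v1.8.8 (nf_tables)"). The following is a sketch; the helper name classify_iptables is made up for illustration, and the version-string format is assumed from common iptables builds:

```shell
# Hypothetical helper: classify an `iptables --version` string into the
# configurations relevant to this bug.
classify_iptables() {
  case "$1" in
    *1.8.8*"(nf_tables)"*) echo "affected" ;;              # the broken combination
    *"(nf_tables)"*)       echo "nf_tables, other version" ;;
    *"(legacy)"*)          echo "legacy mode" ;;            # not affected
    *)                     echo "unknown" ;;
  esac
}

# Sample string taken from the host described in this report:
classify_iptables "iptables v1.8.8 (nf_tables)"   # prints: affected
```

On a real host you would call it as classify_iptables "$(iptables --version)".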

Anything else we need to know?

The problem is that iptables-save from iptables 1.8.7 cannot correctly read rules created with iptables 1.8.8 (nf_tables).

If, on the host, I manually run (using iptables 1.8.8):

iptables-save | grep -E '(Generated by|mytest)'
# Generated by iptables-save v1.8.8 (nf_tables) on Thu Sep 15 14:34:46 2022
# Generated by iptables-save v1.8.8 (nf_tables) on Thu Sep 15 14:34:46 2022
-A KUBE-FIREWALL -m comment --comment mytest -m mark --mark 0x8000/0x8000 -j DROP
# Generated by iptables-save v1.8.8 (nf_tables) on Thu Sep 15 14:34:46 2022

It shows the -m mark --mark 0x8000/0x8000 match.

If I then use nsenter to enter the kube-proxy pod's mount namespace and run the same command, I get:

/ # nsenter -t $(pidof kube-proxy) -m iptables-save | grep -E '(Generated by|mytest)'
# Generated by iptables-save v1.8.7 on Thu Sep 15 14:36:38 2022
# Generated by iptables-save v1.8.7 on Thu Sep 15 14:36:38 2022
-A KUBE-FIREWALL -m comment --comment mytest -j DROP
# Generated by iptables-save v1.8.7 on Thu Sep 15 14:36:38 2022

As you can see, the -m mark --mark 0x8000/0x8000 match is lost, so all packets are dropped, not only the marked ones.
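To make the consequence concrete, here is a small illustrative sketch (not real iptables code; should_drop is a made-up helper) of the semantic difference between the two rule forms: "-m mark --mark 0x8000/0x8000 -j DROP" drops a packet only when (fwmark & 0x8000) == 0x8000, while the rewritten bare "-j DROP" matches every packet.

```shell
# Simulate the two rule forms. $1 = packet fwmark, $2 = "with_mark" or "bare".
should_drop() {
  if [ "$2" = "with_mark" ]; then
    # Correct rule: drop only if the masked fwmark bits match.
    [ $(( $1 & 0x8000 )) -eq $(( 0x8000 )) ] && echo DROP || echo PASS
  else
    # The rewritten rule: unconditional drop.
    echo DROP
  fi
}

should_drop 0x8000 with_mark   # marked packet: DROP (intended behavior)
should_drop 0x0000 with_mark   # unmarked packet: PASS (intended behavior)
should_drop 0x0000 bare        # unmarked packet: DROP (the bug)
```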

Possible workarounds:

  • use iptables in legacy mode on the host
  • downgrade iptables to 1.8.7 on the host

Possible fixes:

  • upgrade iptables in the kube-proxy image to 1.8.8 (and make sure it always tracks the latest version).
  • change the logic in kube-proxy so that it does not touch or re-inject previously created iptables rules (e.g. does not run iptables-save | ... | iptables-restore)

Kubernetes version

$ kubectl version

W0915 14:46:03.488791    2464 loader.go:223] Config not found: /var/lib/k0s/pki/admin.conf

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.0", GitCommit:"a866cbe2e5bbaa01cfd5e969aa3e033f3282a8a2", GitTreeState:"clean", BuildDate:"2022-08-23T17:44:59Z", GoVersion:"go1.19", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Unable to connect to the server: dial tcp 127.0.0.1:8080: i/o timeout
I think the `Unable to connect` message is due to the firewall being broken...

Cloud provider

n/a

OS version

# On Linux:
$ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.16.2
PRETTY_NAME="Alpine Linux v3.16"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

$ uname -a
Linux worker0 5.15.67-0-lts #1-Alpine SMP Fri, 09 Sep 2022 06:15:47 +0000 x86_64 GNU/Linux

Install tools

k0s

Container runtime (CRI) and version (if applicable)

n/a

Related plugins (CNI, CSI, …) and versions (if applicable)

n/a

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 29 (28 by maintainers)

Most upvoted comments

OK, I commented more on the kube-router bug (https://github.com/cloudnativelabs/kube-router/issues/1370) suggesting how to fix the problem. I don’t think there’s anything more we need to be tracking here…

FWIW, as a workaround until they fix it, you could try running kubelet with --feature-gates=IPTablesOwnershipCleanup=true, which will cause it to not create the problematic “-j DROP” rule. Of course, this is an alpha feature and you may run into problems with components that assume kubelet still creates those rules, but if you do then you can report them and help that KEP move forward 🙂

/close

That’s the mode: iptables proxy source, and you’re using mode: ipvs. The ipvs proxier doesn’t ever call iptables-save in 1.25.

But ignoring that, even in the iptables proxier, no rules get copied from the iptables-save output to the iptables-restore input; the only part of the iptables-save output that it looks at is the chain declarations, which it uses to figure out if there are stale service/endpoint chains lying around that need to be deleted now.

But I may be wrong; it could be something else, like kube-router, that does it.

ah, I know nothing about kube-router, but yes, maybe it might be some component other than kube-proxy that is breaking things…

This seems to be a bug in iptables and I don’t think we can plausibly work around it. (Changing the version of iptables in the kube-proxy image would just introduce the bug in the opposite scenario, where kube-proxy has the newer version and kubelet has the older version.) The answer for now seems to be “don’t use iptables 1.8.8, it’s broken”.