k3s on RHEL 8: network/DNS problems and metrics not working

Hello, I am trying to get k3s working on Red Hat 8.4, but I am running into network/DNS problems. I have checked the loaded kernel modules (modprobe) as well as the sysctl settings, but nothing changes. Maybe it is a flannel problem?

firewalld and SELinux are disabled, nm-cloud-setup.service and nm-cloud-setup.timer are not present, and k3s was installed with the script from https://get.k3s.io.

The same setup works fine on RHEL 7.9.
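
For reference, these prerequisites can be double-checked with standard tooling (a sketch; the modules and sysctls listed are the usual flannel/kube-proxy dependencies):

# firewalld should be inactive and SELinux not enforcing
systemctl is-active firewalld
getenforce

# the overlay/bridge/vxlan modules flannel relies on should be loaded
lsmod | grep -E '^(overlay|br_netfilter|vxlan)'

# bridged traffic must be handed to iptables and IP forwarding enabled
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward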

Environmental Info:

K3s Version: v1.22.5+k3s1 (405bf79d), go version go1.16.10

Node(s) CPU architecture, OS, and Version: Linux vldsocfg01 4.18.0-305.25.1.el8_4.x86_64 #1 SMP Mon Oct 18 14:34:11 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: 2 masters, 3 nodes, 2 front-only nodes (traefik / metallb / haproxy)

Describe the bug:

Pods crash with DNS resolution problems. coredns:

  [ERROR] plugin/errors: 2 7635134873774865456.7522827499224113179. HINFO: read udp 10.200.3.11:45684->XXXXXXX:53: i/o timeout

longhorn:

  time="2022-01-24T19:50:55Z" level=info msg="CSI Driver: driver.longhorn.io version: v1.2.2, manager URL http://longhorn-backend:9500/v1"
2022/01/24 19:50:03 [emerg] 1#1: host not found in upstream "longhorn-backend" in /etc/nginx/nginx.conf:32
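
Both of these look like in-cluster DNS failures. For debugging, cluster DNS can be queried directly (a sketch; 10.43.0.10 is the default k3s cluster-DNS service IP and busybox:1.28 is just a convenient test image):

# query CoreDNS from a node
dig @10.43.0.10 kubernetes.default.svc.cluster.local

# query from inside a throwaway pod
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup kubernetes.default.svc.cluster.local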

metrics:

E0124 20:17:27.096421       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg03:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg03"
E0124 20:17:27.100536       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg01:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg01"
E0124 20:18:27.049233       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg01:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg01"
E0124 20:18:27.056477       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg02:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg02"
E0124 20:18:27.068495       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg03:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg03"
E0124 20:18:27.076854       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg01:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg01"
E0124 20:18:27.084260       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg01:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg01"
E0124 20:18:27.090960       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg02:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg02"
E0124 20:18:27.104001       1 scraper.go:139] "Failed to scrape node" err="Get \"https://vldsocfg02:10250/stats/summary?only_cpu_and_memory=true\": dial tcp: i/o timeout" node="vldsocfg02"

In the k3s logs I see metrics errors:

Jan 24 20:53:03 vldsocfg01 k3s[36279]: E0124 20:53:03.842079   36279 available_controller.go:524] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
Jan 24 20:53:06 vldsocfg01 k3s[36279]: E0124 20:53:06.068125   36279 cri_stats_provider.go:372] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs"
Jan 24 20:53:06 vldsocfg01 k3s[36279]: E0124 20:53:06.068150   36279 kubelet.go:1343] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Jan 24 20:53:06 vldsocfg01 k3s[36279]: E0124 20:53:06.097788   36279 kubelet.go:1991] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Jan 24 20:51:45 vldsocfg01 k3s[33811]: E0124 20:51:45.975122   33811 available_controller.go:524] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.201.36.96:443/apis/metrics.k8s.io/v1beta1: Get "https://10.201.36.96:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Jan 24 20:51:46 vldsocfg01 k3s[33811]: E0124 20:51:46.976471   33811 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
Jan 24 20:51:50 vldsocfg01 k3s[33811]: E0124 20:51:50.983597   33811 available_controller.go:524] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.201.36.96:443/apis/metrics.k8s.io/v1beta1: Get "https://10.201.36.96:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.201.36.96:443: i/o timeout
Jan 24 20:51:51 vldsocfg01 k3s[33811]: E0124 20:51:51.984292   33811 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable

lsmod:

Module  Size  Used by
xt_state  16384  0
veth  28672  0
nf_conntrack_netlink  49152  0
xt_recent  20480  6
xt_statistic  16384  21
xt_nat  16384  44
ip6t_MASQUERADE  16384  1
ip_vs_sh  16384  0
ip_vs_wrr  16384  0
ip_vs_rr  16384  0
ip_vs  172032  6  ip_vs_rr,ip_vs_sh,ip_vs_wrr
nft_chain_nat  16384  8
ipt_MASQUERADE  16384  5
vxlan  65536  0
ip6_udp_tunnel  16384  1  vxlan
udp_tunnel  20480  1  vxlan
nfnetlink_log  20480  1
nft_limit  16384  1
ipt_REJECT  16384  5
nf_reject_ipv4  16384  1  ipt_REJECT
xt_limit  16384  0
xt_NFLOG  16384  1
xt_physdev  16384  2
xt_conntrack  16384  21
xt_mark  16384  25
xt_multiport  16384  4
xt_addrtype  16384  7
nft_counter  16384  329
xt_comment  16384  296
nft_compat  20480  550
nf_tables  172032  884  nft_compat,nft_counter,nft_chain_nat,nft_limit
ip_set  49152  0
nfnetlink  16384  5  nft_compat,nf_conntrack_netlink,nf_tables,ip_set,nfnetlink_log
iptable_nat  16384  0
nf_nat  45056  5  ip6t_MASQUERADE,ipt_MASQUERADE,xt_nat,nft_chain_nat,iptable_nat
nf_conntrack  172032  8  xt_conntrack,nf_nat,ip6t_MASQUERADE,xt_state,ipt_MASQUERADE,xt_nat,nf_conntrack_netlink,ip_vs
nf_defrag_ipv6  20480  2  nf_conntrack,ip_vs
nf_defrag_ipv4  16384  1  nf_conntrack
cfg80211  835584  0
rfkill  28672  2  cfg80211
vsock_loopback  16384  0
vmw_vsock_virtio_transport_common  32768  1  vsock_loopback
vmw_vsock_vmci_transport  32768  1
vsock  45056  5  vmw_vsock_virtio_transport_common,vsock_loopback,vmw_vsock_vmci_transport
sunrpc  540672  1
intel_rapl_msr  16384  0
intel_rapl_common  24576  1  intel_rapl_msr
isst_if_mbox_msr  16384  0
isst_if_common  16384  1  isst_if_mbox_msr
nfit  65536  0
libnvdimm  192512  1  nfit
crct10dif_pclmul  16384  1
crc32_pclmul  16384  0
ghash_clmulni_intel  16384  0
rapl  20480  0
vmw_balloon  24576  0
joydev  24576  0
pcspkr  16384  0
vmw_vmci  86016  2  vmw_balloon,vmw_vsock_vmci_transport
i2c_piix4  24576  0
br_netfilter  24576  0
bridge  192512  1  br_netfilter
stp  16384  1  bridge
llc  16384  2  bridge,stp
overlay  135168  4
ip_tables  28672  1  iptable_nat
xfs  1515520  7
libcrc32c  16384  5  nf_conntrack,nf_nat,nf_tables,xfs,ip_vs
sr_mod  28672  0
cdrom  65536  1  sr_mod
sd_mod  53248  4
t10_pi  16384  1  sd_mod
sg  40960  0
ata_generic  16384  0
vmwgfx  368640  1
crc32c_intel  24576  1
drm_kms_helper  233472  1  vmwgfx
syscopyarea  16384  1  drm_kms_helper
sysfillrect  16384  1  drm_kms_helper
sysimgblt  16384  1  drm_kms_helper
fb_sys_fops  16384  1  drm_kms_helper
ata_piix  36864  0
ttm  114688  1  vmwgfx
serio_raw  16384  0
libata  270336  2  ata_piix,ata_generic
drm  569344  4  vmwgfx,drm_kms_helper,ttm
vmxnet3  65536  0
vmw_pvscsi  28672  8
dm_mod  151552  21
fuse  151552  1

iptables 1.8.4

sysctl:

net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1

Steps To Reproduce:

  • Installed K3s on RHEL 8.4
  • Multi-node cluster with multiple masters

I tried removing the RHEL iptables package so that k3s would use its bundled iptables, but got the same result.
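
For context, RHEL 8 ships iptables 1.8.x with the nf_tables backend; the active backend is visible in the version string (a sketch):

# the suffix shows the backend in use, e.g. "iptables v1.8.4 (nf_tables)" or "(legacy)"
iptables --version
ip6tables --version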

UPDATE:

With the parameter --flannel-backend=host-gw it works, but is that a good fix? Ingress does not work with host-gw because the front nodes are not on the same network as the workers.

Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.655113  103276 route_network.go:54] Watching for new subnet leases
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.655271  103276 route_network.go:93] Subnet added: 10.42.4.0/24 via x.y.6.8
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.655414  103276 route_network.go:93] Subnet added: 10.42.0.0/24 via x.y.6.3
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: E0125 14:07:43.655508  103276 route_network.go:168] Error adding route to {Ifindex: 2 Dst: 10.42.0.0/24 Src: <nil> Gw: x.y.6.3 Flags: [] Table: 0}
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.655532  103276 route_network.go:93] Subnet added: 10.42.1.0/24 via x.y.6.15
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: E0125 14:07:43.655599  103276 route_network.go:168] Error adding route to {Ifindex: 2 Dst: 10.42.1.0/24 Src: <nil> Gw: x.y.6.15 Flags: [] Table: 0}
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.655607  103276 route_network.go:93] Subnet added: 10.42.2.0/24 via x.y.6.13
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: E0125 14:07:43.655662  103276 route_network.go:168] Error adding route to {Ifindex: 2 Dst: 10.42.2.0/24 Src: <nil> Gw: x.y.6.13 Flags: [] Table: 0}
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.655673  103276 route_network.go:93] Subnet added: 10.42.3.0/24 via x.y.6.8
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: E0125 14:07:43.655730  103276 route_network.go:168] Error adding route to {Ifindex: 2 Dst: 10.42.3.0/24 Src: <nil> Gw: x.y.6.8 Flags: [] Table: 0}
Jan 25 14:07:43 vldsocfg02-front k3s[103276]: I0125 14:07:43.661130  103276 iptables.go:216] Some iptables rules are missing; deleting and recreating rules
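
For reference, switching the backend can be done roughly like this (a sketch using the install script's INSTALL_K3S_EXEC variable; the same flag can also go into /etc/rancher/k3s/config.yaml on the server nodes):

# re-run the install script on the servers with the flannel backend overridden
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--flannel-backend=host-gw" sh -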


Most upvoted comments

Oh ==> bad udp cksum 0xf1b1 -> 0x521c!

You might be hitting a kernel bug that affects UDP + vxlan when the kernel's checksum-offloading feature is in use. We saw it in Ubuntu but thought it was fixed in RHEL ==> https://github.com/rancher/rke2/issues/1541

Could you please try disabling the offloading on all nodes? Execute this command and try again: sudo ethtool -K flannel.1 tx-checksum-ip-generic off
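
To confirm the change took effect, the offload state can be read back (a sketch):

# should report "tx-checksum-ip-generic: off" after running the command above
ethtool -k flannel.1 | grep tx-checksum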

Same issue still happening with RHEL 8.6. Pod communication is entirely broken.

I added the above fix to a crontab entry as a band-aid so it survives reboots.

@reboot ethtool -K flannel.1 tx-checksum-ip-generic off

This fix should be posted in the readme to avoid headaches. It took a bit of digging to find this issue.

We encountered an issue where the flannel.1 interface was not accessible immediately after a reboot. To resolve this, we developed a bash script and established a systemd service as a workaround.

  1. sudo vi /usr/local/bin/flannel-fix.sh
#!/usr/bin/env bash

# Maximum wait time in seconds (e.g., 300 seconds = 5 minutes)
MAX_WAIT=300
WAIT_INTERVAL=10
ELAPSED_TIME=0

while ! ip link show flannel.1 &> /dev/null; do
  sleep $WAIT_INTERVAL
  ELAPSED_TIME=$((ELAPSED_TIME + WAIT_INTERVAL))
  if [ $ELAPSED_TIME -ge $MAX_WAIT ]; then
    echo "Timed out waiting for flannel.1 interface to become ready."
    exit 1
  fi
done

# Now that flannel.1 is up, run the ethtool command
ethtool -K flannel.1 tx-checksum-ip-generic off

  2. sudo chmod +x /usr/local/bin/flannel-fix.sh

  3. sudo vi /etc/systemd/system/flannel-fix.service

IMPORTANT: Change k3s.service to k3s-agent.service on agent nodes

[Unit]
Description=Run command to fix flannel (vxlan + UDP) once after reboot and K3s is up
Requires=k3s.service
After=k3s.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/flannel-fix.sh

[Install]
WantedBy=default.target

  4. Execute the following commands one by one (a verification sketch follows):
sudo systemctl daemon-reload
sudo systemctl enable flannel-fix.service
sudo systemctl start flannel-fix.service
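
A quick way to verify the workaround after a reboot (a sketch):

systemctl status flannel-fix.service
ethtool -k flannel.1 | grep tx-checksum-ip-generic   # expect: off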

Note that there are known issues with RHEL 8 and VMware. There is one related to vxlan which may be the root cause of our issue ==> https://docs.vmware.com/en/VMware-vSphere/6.7/rn/esxi670-202111001.html#esxi670-202111401-bg-resolved

It worked!

[root@vldsocfg01-node ~]# dig @10.43.0.10 kubernetes.default.svc.cluster.local

; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @10.43.0.10 kubernetes.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: f6c047e0da67c246 (echoed)
;; QUESTION SECTION:
;kubernetes.default.svc.cluster.local. IN A

;; ANSWER SECTION:
kubernetes.default.svc.cluster.local. 5	IN A	10.43.0.1

;; Query time: 0 msec
;; SERVER: 10.43.0.10#53(10.43.0.10)
;; WHEN: Mon Jan 31 19:27:18 CET 2022
;; MSG SIZE  rcvd: 129

[root@vldsocfg01-node ~]

@manuelbuil Thanks for helping me debug. I had seen and tried this fix before, but I must have made a mistake in the command.

Thank you!

Thanks for helping and for your quick response! This is something we need to fix in flannel upstream.