k3s: Can't reach internet from pod / container
Environmental Info: K3s Version:
k3s -v
k3s version v1.22.7+k3s1 (8432d7f2)
go version go1.16.10
Host OS Version:
cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
IP Forwarding:
# sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
Node(s) CPU architecture, OS, and Version:
Linux ansible-awx 5.4.0-105-generic #119-Ubuntu SMP Mon Mar 7 18:49:24 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration: Single node.
# k3s kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ansible-awx Ready control-plane,master 5d10h v1.22.7+k3s1 10.164.12.6 <none> Ubuntu 20.04.4 LTS 5.4.0-105-generic containerd://1.5.9-k3s1
Describe the bug: I cannot connect to the internet from within the pod / container:
# time curl https://www.google.de
curl: (7) Failed to connect to www.google.de port 443: Connection timed out
real 2m11.892s
user 0m0.005s
sys 0m0.005s
Steps To Reproduce:
Install a single-node k3s cluster with curl -sfL https://get.k3s.io | sh on an Ubuntu 20.04 VM.
Set up a simple workload (in my case AWX - https://github.com/ansible/awx-operator#basic-install-on-existing-cluster).
Enter a container and try to access the internet (for example with curl against a public address); see the sketch below.
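For the last step, one way to get a shell with curl inside the cluster is a throwaway pod; the pod name and image below are only examples for illustration, not the AWX workload from this report:
# start a temporary pod that ships curl and open a shell in it
k3s kubectl run nettest --rm -it --image=curlimages/curl --restart=Never -- sh
# inside the pod, repeat the test against a public address
curl -v --max-time 10 https://www.google.de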
Expected behavior: Accessing the internet from the pod should work the same way it does from the host.
Actual behavior: No connectivity to the internet from the pod / container at all.
Additional context / logs:
# cat /etc/resolv.conf
search awx.svc.cluster.local svc.cluster.local cluster.local mydomain.com
nameserver 10.43.0.10
options ndots:5
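One quick way to tell a DNS problem apart from a general connectivity problem is to repeat the test from inside the pod against a plain IP address; if that also times out (as the timeout on port 443 above suggests), the cluster DNS at 10.43.0.10 is not the culprit. A rough sketch, assuming curl and nslookup are available in the container:
# bypass DNS entirely by targeting a well-known public IP
curl -v --max-time 10 https://1.1.1.1
# separately check name resolution against the cluster DNS service
nslookup www.google.de 10.43.0.10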
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 68 (22 by maintainers)
What you described turned out to be the reason for my issues as well! My provider (Hetzner) had a rule for incoming TCP ports ranging from 32768 to 65535 with the flag ack in the default template, which was applied to my server. After changing the start of that port range from 32768 to 0, the connection tests worked reliably. This perfectly explains why some attempts worked before while others didn't: if the randomly selected port was in the upper (not blocked) range, it worked; for ports below 32768 it did not. This has never been an issue for me before. It still seems strange to me why the VM running k3s, the host and all other VMs use the upper half of the available ports while k3s seems to use the full range of possible ports. However, thank you all for your help and support!
I’m really happy that this issue is solved.
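For anyone who wants to verify whether a provider firewall drops return traffic to low source ports, curl can be pinned to a specific local port. The port numbers below are arbitrary examples; run this from the affected VM, since traffic leaving a pod gets SNAT'd to a different port anyway:
# source port below 32768: expected to time out if only 32768-65535 is allowed back in
curl --local-port 20000 --max-time 10 -o /dev/null -sv https://www.google.de
# source port inside the usual ephemeral range: expected to succeed
curl --local-port 40000 --max-time 10 -o /dev/null -sv https://www.google.de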
Hi @apiening, were you able to fix this issue? Please let us know if you were. I'm also facing the same issue.
Thanks
This is really interesting. The Linux kernel by default uses the range 32768 to 60999 for client TCP connections (check cat /proc/sys/net/ipv4/ip_local_port_range). However, iptables, when using the flag --random or --random-fully, replaces the source TCP port when doing the SNAT with whatever unassigned port, and it doesn't have to be in that range. Flannel uses that flag to do SNAT. I wonder how other CNI plugins do it… but at least we should document this to avoid more users having this issue.
I also have a "perhaps-similar-issue" with my k3s worker node in a VM at Contabo. This happens when doing a POST to gitlab.com, but based on the issue, this will happen with any outgoing network access:
using k3s v1.22.7.
This bug only happens with pods on that node in the Contabo VM. However, if I reschedule the pod to the other node, which runs on a Hetzner instance, all networking works fine.
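Both halves of the explanation above can be seen on an affected node by comparing the kernel's ephemeral port range with the SNAT rule flannel installs; the exact rule text varies between versions, so treat the grep pattern as an approximation:
# ephemeral port range used for locally originated connections (typically 32768 60999)
cat /proc/sys/net/ipv4/ip_local_port_range
# flannel's MASQUERADE rule(s); --random-fully lets SNAT pick source ports outside that range
iptables -t nat -S POSTROUTING | grep -i masquerade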
I ran into the same issue. Just to clarify: should ports 0-65535 be open on the firewall?
Disabling firewalld works for me.
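If firewalld is the suspect (the k3s requirements documentation recommends turning it off), a quick check and stop on a systemd-based host could look like this:
# see whether firewalld is running at all
systemctl status firewalld --no-pager
# stop it now and keep it from starting on boot
sudo systemctl disable --now firewalld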
In my case, after rebooting my machine, pods have Internet again.
@apiening I sidestepped the problem
Current configuration works. So something is wrong with flannel if it's joining a cluster in a different cloud.
Thanks again @manuelbuil, that’s a good plan.
I tried to bring up a pod with hostNetwork: true with the following command:
I entered the pod/container and verified that I do in fact have host networking. Then I did the same wget test with 100 tries:
So with hostNetwork: true all requests pass without issues, the same way they do from the VM and the host. So this issue must somehow be related to flannel one way or the other. Maybe a configuration issue or a bug that happens only under specific circumstances.
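The exact command and output did not survive in this extract; a hypothetical equivalent of such a hostNetwork: true test, using kubectl run with an inline override (pod name and image are only examples), might look like this:
# throwaway pod attached to the host network instead of the flannel overlay
k3s kubectl run hostnet-test --rm -it --image=busybox --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"hostNetwork":true}}' -- sh
# inside the pod: repeat the download 100 times and report failures
for i in $(seq 1 100); do wget -q -T 10 -O /dev/null https://www.google.de && echo "$i ok" || echo "$i failed"; done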