gluetun: Bug: Kubernetes services cannot be resolved due to DNS overrides and routing conflicts
TLDR: Kubernetes services cannot be resolved anymore because the DNS configuration is being overwritten
-
Is this urgent?
- Yes
- No
-
What VPN service provider are you using?
- PIA
- Mullvad
- Windscribe
- Surfshark
- Cyberghost
-
What’s the version of the program?
Running version latest built on 2020-07-09T11:57:17Z (commit dc1c7ea)
-
What are you using to run the container?
- Docker run
- Docker Compose
- Kubernetes
- Docker stack
- Docker swarm
- Podman
- Other:
-
Extra information
Logs:
DNS over TLS settings:
|--DNS over TLS provider:
|--cloudflare
|--Caching: disabled
|--Block malicious: disabled
|--Block surveillance: disabled
|--Block ads: disabled
...
2020-07-09T13:22:42.163Z INFO firewall configurator: accepting any input traffic on port 8888
2020-07-09T13:22:42.163Z INFO http server: listening on 0.0.0.0:8000
2020-07-09T13:22:42.163Z INFO dns configurator: using DNS address 1.1.1.1 internally
2020-07-09T13:22:42.163Z INFO dns configurator: using DNS address 1.1.1.1 system wide
2020-07-09T13:22:42.163Z INFO openvpn configurator: writing auth file /etc/openvpn/auth.conf
2020-07-09T13:22:42.164Z INFO openvpn configurator: starting openvpn
2020-07-09T13:22:42.166Z INFO openvpn: OpenVPN 2.4.9 x86_64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Apr 20 2020
2020-07-09T13:22:42.166Z INFO openvpn: library versions: OpenSSL 1.1.1g 21 Apr 2020, LZO 2.10
2020-07-09T13:22:42.167Z INFO tinyproxy configurator: starting tinyproxy server
2020-07-09T13:22:42.168Z INFO openvpn: WARNING: you are using user/group/chroot/setcon without persist-tun -- this may cause restarts to fail
2020-07-09T13:22:42.170Z INFO openvpn: TCP/UDP: Preserving recently used remote address: [AF_INET]81.19.209.124:1194
2020-07-09T13:22:42.170Z INFO openvpn: UDP link local: (not bound)
2020-07-09T13:22:42.170Z INFO openvpn: UDP link remote: [AF_INET]81.19.209.124:1194
2020-07-09T13:22:42.170Z INFO openvpn: NOTE: UID/GID downgrade will be delayed because of --client, --pull, or --up-delay
2020-07-09T13:22:42.182Z INFO openvpn: WARNING: 'link-mtu' is used inconsistently, local='link-mtu 1633', remote='link-mtu 1581'
2020-07-09T13:22:42.182Z INFO openvpn: WARNING: 'cipher' is used inconsistently, local='cipher AES-256-CBC', remote='cipher AES-256-GCM'
2020-07-09T13:22:42.182Z INFO openvpn: WARNING: 'auth' is used inconsistently, local='auth SHA512', remote='auth [null-digest]'
2020-07-09T13:22:42.182Z INFO openvpn: [nl-ams-v024.prod.surfshark.com] Peer Connection Initiated with [AF_INET]81.19.209.124:1194
Configuration file:
apiVersion: v1
kind: Pod
metadata:
  name: vpn-test
  namespace: default
spec:
  containers:
    - name: shell
      image: ubuntu
      command: ['bash']
      stdin: true
      tty: true
    - name: proxy
      env:
        - name: USER
          value: ...
        - name: PASSWORD
          value: ...
        - name: VPNSP
          value: 'surfshark'
        - name: FIREWALL
          value: 'off'
        - name: EXTRA_SUBNETS
          value: '10.192.0.0/9'
        - name: SHADOWSOCKS
          value: 'on'
        - name: TINYPROXY
          value: 'on'
        - name: DOT
          value: 'on'
        - name: DOT_CACHING
          value: 'off'
        - name: BLOCK_MALICIOUS
          value: 'off'
        - name: DNS_UPDATE_PERIOD
          value: '0'
      image: qmcgaw/private-internet-access
      imagePullPolicy: Always
      ports:
        - containerPort: 8888
        - containerPort: 8388
        - containerPort: 8388
          protocol: UDP
        - containerPort: 8000
      securityContext:
        privileged: true
        capabilities:
          add:
            - NET_ADMIN
Host OS: DigitalOcean Kubernetes cluster
I believe that svc.cluster.local should be added to the search parameter in /etc/resolv.conf, and that Unbound needs to use the internal k8s DNS server to resolve those local domain names.
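For illustration, a pod on this cluster would normally get an /etc/resolv.conf along these lines (the nameserver matches the cluster DNS shown in the nslookup output below; the search list is the usual Kubernetes default for the default namespace, so treat it as illustrative):

nameserver 10.245.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

Once gluetun rewrites the file to point at 127.0.0.1 (or 1.1.1.1) without those search domains, short names such as kube-dns.kube-system can no longer be expanded and resolved against the cluster DNS.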
Running in a normal pod:
root@shell:/# nslookup kube-dns.kube-system
Server: 10.245.0.10
Address: 10.245.0.10#53
Name: kube-dns.kube-system.svc.cluster.local
Address: 10.245.0.10
Running with the VPN sidecar:
root@shell:/# nslookup kube-dns.kube-system
;; Got SERVFAIL reply from 127.0.0.1, trying next server
;; Got SERVFAIL reply from 127.0.0.1, trying next server
Server: 127.0.0.1
Address: 127.0.0.1#53
** server can't find kube-dns.kube-system.default.svc.cluster.local: SERVFAIL
Running from the VPN sidecar:
/ # nslookup kube-dns.kube-system
Server: 127.0.0.1
Address: 127.0.0.1:53
** server can't find kube-dns.kube-system: SERVFAIL
** server can't find kube-dns.kube-system: SERVFAIL
About this issue
- Original URL
- State: open
- Created 4 years ago
- Reactions: 8
- Comments: 52 (41 by maintainers)
Commits related to this issue
- DNS_KEEP_NAMESERVER variable, refers to #188 — committed to qdm12/gluetun by qdm12 4 years ago
- DNS_KEEP_NAMESERVER variable, refers to #188 — committed to qdm12/gluetun by qdm12 4 years ago
- Fix routing reading issues - Detect VPN gateway properly - Fix local subnet detection, refers to #188 - Split LocalSubnet from DefaultRoute (2 different routes actually) — committed to qdm12/gluetun by qdm12 4 years ago
- Fixing extra subnets firewall rules - Fix #194 - Fix #190 - Refers to #188 — committed to qdm12/gluetun by qdm12 4 years ago
The correct solution for k8s, imho, is to simply offer the option not to have ANY nameserver overwritten.
We really want to move to GlueTun for our TrueCharts VPN addon, but this DNS overriding behavior is going to cause a boatload of unexpected behavior for our users.
Luckily for us, multiple nameservers in /etc/resolv.conf are tried in order: the resolver uses the first one, then the second one, and so on upon failure.
I’ll therefore add the new DNS entry before the already existing nameserver instead of overwriting it. That should do it. I’ll do that tonight and we can then test.
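A sketch of what the resulting /etc/resolv.conf could then look like (addresses are placeholders, reusing the cluster DNS seen earlier):

nameserver 127.0.0.1      # Unbound (DNS over TLS) added by gluetun
nameserver 10.245.0.10    # original cluster DNS, kept as a fallback
search default.svc.cluster.local svc.cluster.local cluster.local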
TLDR: For optimal use on Kubernetes, sidecars should not be messing with DNS at all.
This is the same as #281, I believe. I’m working on a sort of UDP/DNS proxy to redirect DNS requests to either Unbound (DNS over TLS) or the native DNS (from the Docker bridge or K8s), depending on the request: if the name has no dot it’s sent to the native DNS, otherwise it’s sent to Unbound, so that the Unbound block lists cannot be bypassed through the native Docker DNS.
Not sure that will solve it for K8s, as its internal names might contain dots, but I’ll see what I can do. Doing a simple check that the resulting IP is private could do the trick. Anyway, that’ll take several days to finish, but I’ll keep you updated. Thanks for your patience!
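As a rough sketch of that routing idea (not gluetun’s actual implementation; the package name and the use of net/netip are assumptions), the decision could look like this in Go:

package dnssplit

import (
	"net/netip"
	"strings"
)

// PickUpstream decides where to forward a DNS query: single-label names
// (no dot) go to the native resolver (Docker bridge or K8s cluster DNS),
// everything else goes to the local Unbound listener doing DNS over TLS.
// This mirrors the "no dot => native DNS" rule described above.
func PickUpstream(qname string, native, unbound netip.AddrPort) netip.AddrPort {
	name := strings.TrimSuffix(qname, ".") // drop the wire-format root dot
	if !strings.Contains(name, ".") {
		return native
	}
	return unbound
}

// LooksInternal is the extra check mentioned above: if a response coming
// back from the public path resolves to a private address, the name was
// likely internal and could be retried against the native resolver.
func LooksInternal(ip netip.Addr) bool {
	return ip.IsPrivate() || ip.IsLoopback() || ip.IsLinkLocalUnicast()
}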
Hello,
I’m curious to see if there is a workaround/fix for this. I’m having the exact same issue as @toniopelo. I have DNS_KEEP_NAMESERVER switched to on and DOT switched off. I have also set DNS_ADDRESS to my k8s DNS server, to no avail. I tested using the standard local domains for k8s in both containers, with no luck. My current setup uses nordvpn as the provider, with gluetun set up as a sidecar.
@qdm12
I am stuck on this as well. I have to admit I probably only understood half of the above discussion. What I can report is that playing around with the relevant env vars didn’t help.
I switched everything on and off and didn’t use UNBLOCK in the end. Using DNS_KEEP_NAMESERVER didn’t help either, and pointing DNS_ADDRESS at the original nameserver IP (the k8s DNS server) gives the same result. Even manually resetting /etc/resolv.conf to its original value, to check whether k8s internal DNS names were then resolved, didn’t help, so I guess there is other stuff to tweak that I don’t understand.
I tested all of this by executing nslookup someinternalsubdomain.default.svc.cluster.local on the gluetun container and on the main container as well (the container alongside which gluetun runs as a sidecar). In my situation I’m stuck with the sidecar option because my pods (which are jobs) scale in such a way that it would be a mess to run gluetun as a separate deployment.
I’m all ears if you have any tip, and I can debug further if you can guide me through it @qdm12 😃.
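For reference, the combination of env vars described above would look something like this in the sidecar spec (the DNS_ADDRESS value is a placeholder for the cluster DNS IP):

- name: DOT
  value: 'off'
- name: DNS_KEEP_NAMESERVER
  value: 'on'
- name: DNS_ADDRESS
  value: '10.245.0.10'   # placeholder: the cluster DNS service IP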
After looking around a bit, I think the only “good” solution is to leave /etc/resolv.conf alone in Kubernetes and fix DNS by modifying the spec.dnsConfig settings.
I don’t think that’s possible right now though: gluetun either writes the local DNS server (when Unbound is enabled) or 1.1.1.1 (when it isn’t). Setting DNS_PLAINTEXT_ADDRESS='' doesn’t work either.
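For what it’s worth, a sketch of what the Kubernetes side of that idea could look like, assuming gluetun left the file untouched (the nameserver and search values are illustrative for this cluster):

spec:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 10.245.0.10                # cluster DNS (illustrative)
    searches:
      - default.svc.cluster.local
      - svc.cluster.local
      - cluster.local
    options:
      - name: ndots
        value: "5"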