gluetun: Bug: Connectivity is lost once gluetun container is restarted

Is this urgent?: No (well, kind of, since it causes complete connection loss whenever this “bug” happens)

Host OS: Tested on both Fedora 34 and (up-to-date) Arch Linux ARM (32bit/RPi 4B)

CPU arch or device name: amd64 & armv7

What VPN provider are you using: NordVPN

What are you using to run your container?: Docker Compose

What is the version of the program?:

x64 & armv7: Running version latest built on 2021-09-23T17:23:28Z (commit 985cf7b)

Steps to reproduce issue:

  1. Using the recommended docker-compose.yml, configure gluetun and another container (in my case xyz, though it can be something like qbittorrent or whatever you want) to use gluetun’s network stack. Publish xyz’s ports through gluetun’s network stack.
  2. Either: a) restart gluetun using good ol’ docker restart gluetun, or b) manually cause a temporary network problem in such a way that the gluetun container dies/exits. Then restart gluetun.
  3. Now try to use xyz through its published ports: you’ll receive a connection refused error unless you restart the xyz service again. You can also docker exec into the container and run curl/wget/ping/etc. (a quick sketch of that check follows).
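
For concreteness, here is a minimal sketch of that check, assuming the navidrome service name and published port 4533 from the compose file further down (substitute your own container and port):

# From the host, after gluetun has been restarted:
curl http://localhost:4533              # published navidrome port -> connection refused
# From inside the affected container:
docker exec -it navidrome /bin/sh
/app # ip ro sh                         # routing table comes back empty (see Terminal example below)
/app # ping -c 1 1.1.1.1                # no data in, no data out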

Expected behavior: xyz should have internet connectivity through gluetun’s network stack and be accessible through gluetun’s published/exposed ports, even if gluetun is restarted. This is, unfortunately, not the case: xyz’s network stack just dies, no data in, no data out.

Additional notes:

  1. I did try FIREWALL_OUTBOUND_SUBNETS; it didn’t make a difference.
  2. I noticed a few interesting things once gluetun is restarted: a) routing entries from containers using network_mode: service:gluetun completely disappear; b) restarting gluetun doesn’t bring back the original routing tables; c) NetworkMode seems to be okay.

Terminal example

# At this point, gluetun has been manually restarted. Then I exec -it'd into an affected container that was using gluetun's network stack:
/app # ip ro sh 
/app # 
[root@fedora pepe]# docker restart xyz
[root@fedora pepe]# docker exec -it xyz /bin/sh 
/app # ip ro sh
0.0.0.0/1 via 10.8.1.1 dev tun0 
default via 172.17.0.1 dev eth0 
10.8.1.0/24 dev tun0 scope link  src 10.8.1.4 
37.120.209.219 via 172.17.0.1 dev eth0 
128.0.0.0/1 via 10.8.1.1 dev tun0 
172.17.0.0/16 dev eth0 scope link  src 172.17.0.2 

Brief docker inspect output from the affected container

# snip
            "NetworkMode": "container:f77af999d9de92af66094dd9db0f854f1a2da9ceabddc47239bc5b89f577247f",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "unless-stopped",
                "MaximumRetryCount": 0
            },

f77[…] is gluetun’s container ID.
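
If it helps to verify note (c) above, the NetworkMode value can be compared against gluetun’s current container ID directly (container names as in the compose file below):

docker inspect -f '{{.HostConfig.NetworkMode}}' navidrome    # container:f77af999d9de…
docker inspect -f '{{.Id}}' gluetun                          # f77af999d9de…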

Full gluetun logs:

2021/09/24 16:39:47 INFO Alpine version: 3.14.2
2021/09/24 16:39:47 INFO OpenVPN 2.4 version: 2.4.11
2021/09/24 16:39:47 INFO OpenVPN 2.5 version: 2.5.2
2021/09/24 16:39:47 INFO Unbound version: 1.13.2
2021/09/24 16:39:47 INFO IPtables version: v1.8.7
2021/09/24 16:39:47 INFO Settings summary below:
|--VPN:
   |--Type: openvpn
   |--OpenVPN:
      |--Version: 2.5
      |--Verbosity level: 1
      |--Network interface: tun0
      |--Run as root: enabled
   |--Nordvpn settings:
      |--Regions: mexico, sweden
      |--OpenVPN selection:
         |--Protocol: udp
|--DNS:
   |--Plaintext address: 1.1.1.1
   |--DNS over TLS:
      |--Unbound:
          |--DNS over TLS providers:
              |--Cloudflare
          |--Listening port: 53
          |--Access control:
              |--Allowed:
                  |--0.0.0.0/0
                  |--::/0
          |--Caching: enabled
          |--IPv4 resolution: enabled
          |--IPv6 resolution: disabled
          |--Verbosity level: 1/5
          |--Verbosity details level: 0/4
          |--Validation log level: 0/2
          |--Username: 
      |--Blacklist:
         |--Blocked categories: malicious
         |--Additional IP networks blocked: 13
      |--Update: every 24h0m0s
|--Firewall:
   |--Outbound subnets: 192.168.0.0/24
|--Log:
   |--Level: INFO
|--System:
   |--Process user ID: 1000
   |--Process group ID: 1000
   |--Timezone: REDACTED
|--Health:
   |--Server address: 127.0.0.1:9999
   |--Address to ping: github.com
   |--VPN:
      |--Initial duration: 6s
      |--Addition duration: 5s
|--HTTP control server:
   |--Listening port: 8000
   |--Logging: enabled
|--Public IP getter:
   |--Fetch period: 12h0m0s
   |--IP file: /tmp/gluetun/ip
|--Github version information: enabled
2021/09/24 16:39:47 INFO routing: default route found: interface eth0, gateway 172.17.0.1
2021/09/24 16:39:47 INFO routing: local ethernet link found: eth0
2021/09/24 16:39:47 INFO routing: local ipnet found: 172.17.0.0/16
2021/09/24 16:39:47 INFO routing: default route found: interface eth0, gateway 172.17.0.1
2021/09/24 16:39:47 INFO routing: adding route for 0.0.0.0/0
2021/09/24 16:39:47 INFO firewall: firewall disabled, only updating allowed subnets internal list
2021/09/24 16:39:47 INFO routing: default route found: interface eth0, gateway 172.17.0.1
2021/09/24 16:39:47 INFO routing: adding route for 192.168.0.0/24
2021/09/24 16:39:47 INFO TUN device is not available: open /dev/net/tun: no such file or directory; creating it...
2021/09/24 16:39:47 INFO firewall: enabling...
2021/09/24 16:39:47 INFO firewall: enabled successfully
2021/09/24 16:39:47 INFO dns over tls: using plaintext DNS at address 1.1.1.1
2021/09/24 16:39:47 INFO healthcheck: listening on 127.0.0.1:9999
2021/09/24 16:39:47 INFO http server: listening on :8000
2021/09/24 16:39:47 INFO firewall: setting VPN connection through firewall...
2021/09/24 16:39:47 INFO openvpn: OpenVPN 2.5.2 armv7-alpine-linux-musleabihf [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on May  4 2021
2021/09/24 16:39:47 INFO openvpn: library versions: OpenSSL 1.1.1l  24 Aug 2021, LZO 2.10
2021/09/24 16:39:47 INFO openvpn: TCP/UDP: Preserving recently used remote address: [AF_INET]86.106.103.27:1194
2021/09/24 16:39:47 INFO openvpn: UDP link local: (not bound)
2021/09/24 16:39:47 INFO openvpn: UDP link remote: [AF_INET]86.106.103.27:1194
2021/09/24 16:39:48 WARN openvpn: 'link-mtu' is used inconsistently, local='link-mtu 1633', remote='link-mtu 1634'
2021/09/24 16:39:48 WARN openvpn: 'comp-lzo' is present in remote config but missing in local config, remote='comp-lzo'
2021/09/24 16:39:48 INFO openvpn: [se-nl8.nordvpn.com] Peer Connection Initiated with [AF_INET]86.106.103.27:1194
2021/09/24 16:39:49 INFO openvpn: TUN/TAP device tun0 opened
2021/09/24 16:39:49 INFO openvpn: /sbin/ip link set dev tun0 up mtu 1500
2021/09/24 16:39:49 INFO openvpn: /sbin/ip link set dev tun0 up
2021/09/24 16:39:49 INFO openvpn: /sbin/ip addr add dev tun0 10.8.8.14/24
2021/09/24 16:39:49 INFO openvpn: Initialization Sequence Completed
2021/09/24 16:39:49 INFO dns over tls: downloading DNS over TLS cryptographic files
2021/09/24 16:39:50 INFO healthcheck: healthy!
2021/09/24 16:39:53 INFO dns over tls: downloading hostnames and IP block lists
2021/09/24 16:40:11 INFO dns over tls: init module 0: validator
2021/09/24 16:40:11 INFO dns over tls: init module 1: iterator
2021/09/24 16:40:11 INFO dns over tls: start of service (unbound 1.13.2).
2021/09/24 16:40:12 INFO dns over tls: generate keytag query _ta-4a5c-4f66. NULL IN
2021/09/24 16:40:13 INFO dns over tls: generate keytag query _ta-4a5c-4f66. NULL IN
2021/09/24 16:40:16 INFO dns over tls: ready
2021/09/24 16:40:18 INFO vpn: You are running on the bleeding edge of latest!
2021/09/24 16:40:19 INFO ip getter: Public IP address is 213.232.87.176 (Netherlands, North Holland, Amsterdam)

docker-compose.yml:

  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    restart: unless-stopped
    cap_add:
      - NET_ADMIN
    ports:
      - 4533:4533 #navidrome
    environment:
      - OPENVPN_USER=REDACTED
      - OPENVPN_PASSWORD=REDACTED
      - VPNSP=nordvpn
      - VPN_TYPE=openvpn
      - REGION=REDACTED
      - TZ=REDACTED
      - FIREWALL_OUTBOUND_SUBNETS=192.168.0.0/24

# navidrome (can be literally anything else)
  navidrome:
    image: deluan/navidrome:develop
    container_name: navidrome
    restart: unless-stopped
    environment:
      - PGID=1000
      - PUID=1000
    volumes:
      - dockervolume:/music:ro
    network_mode: service:gluetun
    depends_on:
      - gluetun

Nonetheless, I’d like to thank you for creating gluetun. I’d be more than happy to help you fix this issue if it is a gluetun bug. Hopefully it’s just a misconfiguration on my side.

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 14
  • Comments: 31 (4 by maintainers)

Most upvoted comments

I have bits and pieces for it, but I am moving country + visiting family + starting a new job right now, so it might take at least 2 weeks for me to finish it up, sorry about that. But it’s at the top of my OSS things-to-do list, so it won’t be forgotten 😉

For the time being, if anyone wants a dirty, cheap solution, here’s my current setup:

  autoheal:
   ... snip ...
  literallyanything:
    image: blahblah
    container_name: blahblah
    network_mode: service:gluetun
    restart: unless-stopped
    healthcheck:
      test: "curl -sf https://example.com  || exit 1"
      interval: 1m
      timeout: 10s
      retries: 1

This will only work with containers where curl is already preinstalled. Some Docker images include wget but not curl, in which case you can replace the test command with wget --no-verbose --tries=1 --spider https://example.com/ || exit 1. You can also use qdm12’s deunhealth instead of autoheal.
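
For images that only ship wget, the healthcheck block would look roughly like this (same placeholder service as above):

    healthcheck:
      test: "wget --no-verbose --tries=1 --spider https://example.com/ || exit 1"
      interval: 1m
      timeout: 10s
      retries: 1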

If anybody wants to give it a try, I have written cascandaliato/docker-restarter. Right now it covers only one scenario: if A depends on B and B restarts then restart A.

Hey there! Thanks for the detailed issue!

It is a well-known Docker problem I need to work around. Let’s keep this open for now, although there is at least one duplicate issue about this problem somewhere in the issues.

Note this only happens if gluetun is updated and uses a different image (afaik).

For now, you might want to have gluetun and all its connected containers in a single docker-compose.yml and docker-compose down && docker-compose up -d them (that’s what I do).

I’m developing https://github.com/qdm12/deunhealth and should add a feature tailored to this problem soon (give it 1-5 days); feel free to subscribe to releases on that side repo. That way it would watch your containers and restart the connected ones if gluetun gets updated & restarted.

I’d also like to thank you for creating gluetun and to say this is a very good project. Any progress on this?

No sorry, but I’ll get to it soon.

Ideally, there is a way to re-attach the disconnected containers to gluetun without restarting them (I guess with Docker’s Go API, since I doubt the docker cli supports such a thing). That would work by marking each connected container with a label to indicate this network re-attachment.

If there isn’t, I’ll set up something to cascade the restart from gluetun to connected containers, probably using labels to avoid any surprise (mark gluetun as a parent container with a unique id, and mark all connected containers as child containers with that same id).
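
For illustration only, that cascade fallback could be approximated today with plain shell and the docker CLI; the gluetun-id label below is made up, and this is not how gluetun or deunhealth actually implement anything:

# Mark each connected container with a label in docker-compose.yml, e.g.:
#   labels:
#     - gluetun-id=main
# Then restart all of them whenever the gluetun container restarts:
docker events --filter container=gluetun --filter event=restart --format '{{.Status}}' \
| while read -r _; do
    docker restart $(docker ps -q --filter "label=gluetun-id=main")
  done
# Note: this only catches plain restarts, not a re-created container after an image update.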