docker-ipv6nat: Exits with "unable to detect hairpin mode (is the docker daemon running?)"

docker-ipv6nat version: v0.4.3
Docker version: 20.10.1 and 20.10.2
OS: CentOS Linux release 8.3.2011 (Core)

After a system update, I get this error on launch:

$ docker logs ipv6nat
2021/01/09 17:26:57 unable to detect hairpin mode (is the docker daemon running?)

After which the container exits and restarts.

Thinking it might be a permissions issue, I removed all --cap-add flags (leaving only --cap-drop ALL) to test, but that broke things even further:

2021/01/09 18:07:38 running [/sbin/iptables -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER --wait]: exit status 3: addrtype: Could not determine whether revision 1 is supported, assuming it is.
addrtype: Could not determine whether revision 1 is supported, assuming it is.
iptables v1.8.4 (legacy): can't initialize iptables table `nat': Permission denied (you must be root)
Perhaps iptables or your kernel needs to be upgraded.

I then tried to give it --cap-add ALL, but that did not fix it.

Since part of the system update was docker-ce, I thought maybe it had changed the backend rules, but:

# /sbin/iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 13:09:03 2021
*nat
...
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
...

Clearly the right rule still exists. And checking manually:

# /sbin/iptables -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER --wait; echo "$?"
iptables: Bad rule (does a matching rule exist in that chain?).
1
# /sbin/iptables -t nat -C OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER --wait; echo "$?"
0

The actual check commands return the expected results. I am using this code section as the reference: https://github.com/robbertkl/docker-ipv6nat/blob/v0.4.3/manager.go#L79-L86
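
For reference, here is a rough sketch of what that detection amounts to (not the verbatim upstream code; I'm going from the referenced lines and assuming the coreos/go-iptables API the project uses, so names are illustrative): it probes the nat OUTPUT chain for the two variants of Docker's rule and fails with exactly this error if neither one is found.

package main

import (
	"fmt"
	"log"

	"github.com/coreos/go-iptables/iptables"
)

// detectHairpinMode is an illustrative approximation of the referenced
// manager.go logic. Docker installs the OUTPUT rule with the
// "! -d 127.0.0.0/8" exclusion when the userland proxy is in use, and
// without it when hairpin NAT is enabled; whichever variant exists tells
// us the mode. If neither is found, we give up with the error from the logs.
func detectHairpinMode(ipt *iptables.IPTables) (bool, error) {
	hairpinOn := []string{"-m", "addrtype", "--dst-type", "LOCAL", "-j", "DOCKER"}
	hairpinOff := append([]string{"!", "-d", "127.0.0.0/8"}, hairpinOn...)

	if found, err := ipt.Exists("nat", "OUTPUT", hairpinOn...); err != nil {
		return false, err
	} else if found {
		return true, nil
	}
	if found, err := ipt.Exists("nat", "OUTPUT", hairpinOff...); err != nil {
		return false, err
	} else if found {
		return false, nil
	}
	return false, fmt.Errorf("unable to detect hairpin mode (is the docker daemon running?)")
}

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatal(err)
	}
	hairpin, err := detectHairpinMode(ipt)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("hairpin mode:", hairpin)
}

On the host both existence checks behave as expected (as the manual commands above show), so the question is why neither rule variant is visible to the iptables binary inside the container.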

At this point I downgraded dockerd back to 20.10.1, but I got the same error.

What is strange is that when I first did the system upgrade, dockerd restarted itself as usual, and all my containers came back online with IPv6 working. It was after an OS restart that this error started.

I tried to do a system rollback, but the old package versions couldn’t be found, so I’m stuck.

Full package list that I upgraded:

Package New Version Old Version
NetworkManager 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
NetworkManager-libnm 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
NetworkManager-team 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
NetworkManager-tui 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
gnutls 3.6.14-7.el8_3.x86_64 3.6.14-6.el8.x86_64
iptables 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iptables-ebtables 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iptables-libs 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iptables-services 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iwl100-firmware 39.31.5.1-101.el8_3.1.noarch 39.31.5.1-99.el8.1.noarch
iwl1000-firmware 1:39.31.5.1-101.el8_3.1.noarch 1:39.31.5.1-99.el8.1.noarch
iwl105-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl135-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl2000-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl2030-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl3160-firmware 1:25.30.13.0-101.el8_3.1.noarch 1:25.30.13.0-99.el8.1.noarch
iwl3945-firmware 15.32.2.9-101.el8_3.1.noarch 15.32.2.9-99.el8.1.noarch
iwl4965-firmware 228.61.2.24-101.el8_3.1.noarch 228.61.2.24-99.el8.1.noarch
iwl5000-firmware 8.83.5.1_1-101.el8_3.1.noarch 8.83.5.1_1-99.el8.1.noarch
iwl5150-firmware 8.24.2.2-101.el8_3.1.noarch 8.24.2.2-99.el8.1.noarch
iwl6000-firmware 9.221.4.1-101.el8_3.1.noarch 9.221.4.1-99.el8.1.noarch
iwl6000g2a-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl6050-firmware 41.28.5.1-101.el8_3.1.noarch 41.28.5.1-99.el8.1.noarch
iwl7260-firmware 1:25.30.13.0-101.el8_3.1.noarch 1:25.30.13.0-99.el8.1.noarch
kexec-tools 2.0.20-34.el8_3.1.x86_64 2.0.20-34.el8.x86_64
linux-firmware 20200619-101.git3890db36.el8_3.noarch 20200619-99.git3890db36.el8.noarch
microcode_ctl 4:20200609-2.20201112.1.el8_3.x86_64 4:20200609-2.20201027.1.el8_3.x86_64
systemd 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
systemd-libs 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
systemd-pam 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
systemd-udev 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
tuned 2.14.0-3.el8_3.1.noarch 2.14.0-3.el8.noarch
tzdata 2020f-1.el8.noarch 2020d-1.el8.noarch
docker-ce 3:20.10.2-3.el8.x86_64 3:20.10.1-3.el8.x86_64
docker-ce-cli 1:20.10.2-3.el8.x86_64 1:20.10.1-3.el8.x86_64
docker-ce-rootless-extras 20.10.2-3.el8.x86_64 20.10.1-3.el8.x86_64

Seems like coreos/go-iptables/issues/79 could be related.
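
In case it helps connect the exit codes above: an existence check like this ultimately runs iptables -C and interprets the exit status. Exit 0 means the rule exists, exit 1 means no matching rule (the "Bad rule" output shown earlier), and anything else, such as the exit status 3 / permission denied error when the capabilities were dropped, is a hard failure rather than "rule absent". A simplified, illustrative sketch of that distinction (not go-iptables' actual code):

package main

import (
	"fmt"
	"log"
	"os/exec"
)

// ruleExists runs iptables -C and maps the exit status onto
// exists / not-found / error. This is a simplified stand-in for what a
// library-level existence check does.
func ruleExists(table, chain string, rulespec ...string) (bool, error) {
	args := append([]string{"-t", table, "-C", chain}, rulespec...)
	args = append(args, "--wait")
	out, err := exec.Command("/sbin/iptables", args...).CombinedOutput()
	if err == nil {
		return true, nil // exit 0: rule exists
	}
	if exitErr, ok := err.(*exec.ExitError); ok && exitErr.ExitCode() == 1 {
		return false, nil // exit 1: no matching rule in that chain
	}
	// any other failure (e.g. exit 3, permission denied) is a real error
	return false, fmt.Errorf("running iptables %v: %v: %s", args, err, out)
}

func main() {
	exists, err := ruleExists("nat", "OUTPUT",
		"!", "-d", "127.0.0.0/8", "-m", "addrtype", "--dst-type", "LOCAL", "-j", "DOCKER")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("rule exists:", exists)
}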

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 42 (19 by maintainers)

Most upvoted comments

Hi all,

With Docker 20.10.6 the ipv6nat function is fully integrated (experimental). You can add the following flags to your daemon.json:

{
  "ipv6": true,
  "fixed-cidr-v6": "fd00::/80",
  "experimental": true,
  "ip6tables": true
}

So, to be 100% clear, this “backport” in the RedHat packages happened between versions 1.8.4-15.el8 and 1.8.4-15.el8_3.3?

Correct, see the changelog here: https://centos.pkgs.org/8/centos-baseos-x86_64/iptables-services-1.8.4-15.el8_3.3.x86_64.rpm.html

I spoke to Phil Sutter from RedHat, who authored both the upstream patch and its backport into RHEL 8.3.

The commit in question is here. To quote Phil:

Sadly it is not part of an official release yet, ETA is v1.8.7.

About the issue we’re seeing in the Docker container:

Basically it’s a problem with data representation inside the container. The iptables binary in there doesn’t respect the reduced payload expression length and, due to the absence of the (not needed) bitwise expression, assumes the full address is being matched.

So aside from the workaround (downgrading as detailed here) I guess the only solution would be to either wait for 1.8.7 (and its Alpine edge packages) or build a patched version and ship that in the container image.

Well that fixed it. 🤕

Wow, that was quite the journey. Great you figured it out! And thanks for the detailed fix.

Let’s leave it at this. I’ll keep an eye out for more reports of this issue.

Makes me wonder if I could have mounted the host xtables-nft-multi binary in the container to fix it. Probably only if it was statically linked, since the container runs on Alpine (MUSL based IIRC).

Yeah, that’s usually a no-go, for that exact reason.

Just pushed out a new release v0.4.4 which contains the fix for this issue! Docker images for all architectures are on Docker Hub as :0.4.4 and :latest. Thanks everyone!

If this is the issue, couldn’t it be fixed by upgrading the ipv6nat Docker container to use a newer version of iptables? Maybe as an opt-in (e.g. a new tag)?

That seems to be the plan, once iptables is updated in Alpine. We all seem to agree that we should wait until it is “stable” before doing that.

I can confirm that this error occurs with a fresh install of CentOS. The downgrade solution worked for me.

Well that fixed it. 🤕

Feel free to close this issue if you want, since it seems to be a problem with an external package. Though if you think it is a compatibility issue, I would be happy to help you continue to debug it (though not on my home prod system).

Full fix detailed, in case anyone else has the exact same stuck package scenario:

# Get all "old" packages
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-1.8.4-15.el8.x86_64.rpm
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-ebtables-1.8.4-15.el8.x86_64.rpm
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-libs-1.8.4-15.el8.x86_64.rpm
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-services-1.8.4-15.el8.x86_64.rpm

$ sudo yum downgrade ./iptables-*

# Destroy the container, just in case any loaded kernel modules stick around
$ docker rm -f ipv6nat

# Rebooting was the only thing that fixed it for me
$ sudo reboot

# Recreate the container
$ docker run -d --name ipv6nat --cap-drop ALL --cap-add NET_ADMIN --cap-add NET_RAW --network host --restart unless-stopped -v /var/run/docker.sock:/var/run/docker.sock:ro robbertkl/ipv6nat
# or whatever your command is

Thanks for all your help tracking down what was causing the problem!

Your system uses nftables, and I think it’s the iptables-nft binary inside the container that talks directly to nftables. I don’t think it’s even using the iptables installed on the host.

You’re right, I think. I guess the new package version somehow has a bug that interacts with the kernel incorrectly and then saves (and later prints) the rules incorrectly? Yeah, it doesn’t make sense to me either. I wouldn’t even know how to go about reporting this as a bug to the package maintainer; I guess I would need to prove that the rule actually got saved wrong somehow.

Also wondering how iptables on the host could have an effect in the first place: I don’t think it’s even used?

Makes me wonder if I could have mounted the host xtables-nft-multi binary in the container to fix it. Probably only if it was statically linked, since the container runs on Alpine (MUSL based IIRC).
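
For what it’s worth: a quick way to check whether a given binary is statically linked (before trying a bind-mount like that) is to look for a PT_INTERP program header, which names the dynamic loader. A small sketch; the default path below is only a guess at where the host binary might live:

package main

import (
	"debug/elf"
	"fmt"
	"log"
	"os"
)

func main() {
	// Default path is an assumption; pass the real one as the first argument.
	path := "/usr/sbin/xtables-nft-multi"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	f, err := elf.Open(path)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// A dynamically linked executable carries a PT_INTERP segment naming its
	// loader (glibc's ld-linux on CentOS, musl's on Alpine); a fully static
	// binary has none.
	dynamic := false
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			dynamic = true
		}
	}
	if dynamic {
		fmt.Println(path, "is dynamically linked; it would not run against Alpine's musl")
	} else {
		fmt.Println(path, "is statically linked")
	}
}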