moby: kernel crash after "unregister_netdevice: waiting for lo to become free. Usage count = 3"
This happens when I log in to the container, and I can't quit it with Ctrl-C.
My system is Ubuntu 12.04, kernel 3.8.0-25-generic.
docker version:
root@wutq-docker:~# docker version
Client version: 0.10.0
Client API version: 1.10
Go version (client): go1.2.1
Git commit (client): dc9c28f
Server version: 0.10.0
Server API version: 1.10
Git commit (server): dc9c28f
Go version (server): go1.2.1
Last stable version: 0.10.0
I have used the script https://raw.githubusercontent.com/dotcloud/docker/master/contrib/check-config.sh to check, and everything is all right.
I watched the syslog and found this message:
May 6 11:30:33 wutq-docker kernel: [62365.889369] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:30:44 wutq-docker kernel: [62376.108277] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:30:54 wutq-docker kernel: [62386.327156] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:02 wutq-docker kernel: [62394.423920] INFO: task docker:1024 blocked for more than 120 seconds.
May 6 11:31:02 wutq-docker kernel: [62394.424175] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 6 11:31:02 wutq-docker kernel: [62394.424505] docker D 0000000000000001 0 1024 1 0x00000004
May 6 11:31:02 wutq-docker kernel: [62394.424511] ffff880077793cb0 0000000000000082 ffffffffffffff04 ffffffff816df509
May 6 11:31:02 wutq-docker kernel: [62394.424517] ffff880077793fd8 ffff880077793fd8 ffff880077793fd8 0000000000013f40
May 6 11:31:02 wutq-docker kernel: [62394.424521] ffff88007c461740 ffff880076b1dd00 000080d081f06880 ffffffff81cbbda0
May 6 11:31:02 wutq-docker kernel: [62394.424526] Call Trace:
May 6 11:31:02 wutq-docker kernel: [62394.424668] [<ffffffff816df509>] ? __slab_alloc+0x28a/0x2b2
May 6 11:31:02 wutq-docker kernel: [62394.424700] [<ffffffff816f1849>] schedule+0x29/0x70
May 6 11:31:02 wutq-docker kernel: [62394.424705] [<ffffffff816f1afe>] schedule_preempt_disabled+0xe/0x10
May 6 11:31:02 wutq-docker kernel: [62394.424710] [<ffffffff816f0777>] __mutex_lock_slowpath+0xd7/0x150
May 6 11:31:02 wutq-docker kernel: [62394.424715] [<ffffffff815dc809>] ? copy_net_ns+0x69/0x130
May 6 11:31:02 wutq-docker kernel: [62394.424719] [<ffffffff815dc0b1>] ? net_alloc_generic+0x21/0x30
May 6 11:31:02 wutq-docker kernel: [62394.424724] [<ffffffff816f038a>] mutex_lock+0x2a/0x50
May 6 11:31:02 wutq-docker kernel: [62394.424727] [<ffffffff815dc82c>] copy_net_ns+0x8c/0x130
May 6 11:31:02 wutq-docker kernel: [62394.424733] [<ffffffff81084851>] create_new_namespaces+0x101/0x1b0
May 6 11:31:02 wutq-docker kernel: [62394.424737] [<ffffffff81084a33>] copy_namespaces+0xa3/0xe0
May 6 11:31:02 wutq-docker kernel: [62394.424742] [<ffffffff81057a60>] ? dup_mm+0x140/0x240
May 6 11:31:02 wutq-docker kernel: [62394.424746] [<ffffffff81058294>] copy_process.part.22+0x6f4/0xe60
May 6 11:31:02 wutq-docker kernel: [62394.424752] [<ffffffff812da406>] ? security_file_alloc+0x16/0x20
May 6 11:31:02 wutq-docker kernel: [62394.424758] [<ffffffff8119d118>] ? get_empty_filp+0x88/0x180
May 6 11:31:02 wutq-docker kernel: [62394.424762] [<ffffffff81058a80>] copy_process+0x80/0x90
May 6 11:31:02 wutq-docker kernel: [62394.424766] [<ffffffff81058b7c>] do_fork+0x9c/0x230
May 6 11:31:02 wutq-docker kernel: [62394.424769] [<ffffffff816f277e>] ? _raw_spin_lock+0xe/0x20
May 6 11:31:02 wutq-docker kernel: [62394.424774] [<ffffffff811b9185>] ? __fd_install+0x55/0x70
May 6 11:31:02 wutq-docker kernel: [62394.424777] [<ffffffff81058d96>] sys_clone+0x16/0x20
May 6 11:31:02 wutq-docker kernel: [62394.424782] [<ffffffff816fb939>] stub_clone+0x69/0x90
May 6 11:31:02 wutq-docker kernel: [62394.424786] [<ffffffff816fb5dd>] ? system_call_fastpath+0x1a/0x1f
May 6 11:31:04 wutq-docker kernel: [62396.466223] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:14 wutq-docker kernel: [62406.689132] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:25 wutq-docker kernel: [62416.908036] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:35 wutq-docker kernel: [62427.126927] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:45 wutq-docker kernel: [62437.345860] unregister_netdevice: waiting for lo to become free. Usage count = 3
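Since the message has a fixed shape, hosts can be monitored for this state by grepping the kernel log. A small sketch (the helper name is my own, not from the issue):

```shell
# Extract the device name and usage count from kernel log lines like the ones
# above (helper name and usage are illustrative, not from the issue).
extract_netdev_wait() {
  sed -n 's/.*unregister_netdevice: waiting for \([^ ]*\) to become free\. Usage count = \([0-9]*\).*/\1 \2/p'
}
printf '%s\n' 'May 6 11:30:33 wutq-docker kernel: [62365.889369] unregister_netdevice: waiting for lo to become free. Usage count = 3' \
  | extract_netdev_wait
# prints: lo 3
```

In practice you would feed it `dmesg` or `/var/log/syslog` instead of a literal string.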
After this happened, I opened another terminal, killed the process, and then restarted docker, but it hung.
I rebooted the host, and it still displayed those messages for some minutes during shutdown:
About this issue
- Original URL
- State: closed
- Created 10 years ago
- Reactions: 281
- Comments: 539 (93 by maintainers)
Links to this issue
Commits related to this issue
- Use cpuguy83' patch and add a global cleanup lock I got stuck in #5618 and #17443 for a week without finding a good solution, except for the great cpuguy83's PR #23178 that unlocked at least the read... — committed to etamponi/docker by etamponi 8 years ago
- Use bridged networking instead of an overlay network provided by Flannel. Workaround for https://github.com/docker/docker/issues/5618, but also simplifies architecture. — committed to edevil/kubernetes-deployment by edevil 8 years ago
- Add docker/docker#5618: hang up with `unregister_netdevice: waiting for lo to become free` — committed to AkihiroSuda/issues-docker by AkihiroSuda 8 years ago
- update dind to 1.12.5 hopefully will stop freezing up docker and will save us from LRT, link to relevant comment: https://github.com/docker/docker/issues/5618#issuecomment-266564257 — committed to iron-io/runner by deleted user 7 years ago
- update dind to 1.12.5 (#27) hopefully will stop freezing up docker and will save us from LRT, link to relevant comment: https://github.com/docker/docker/issues/5618#issuecomment-266564257 — committed to iron-io/runner by deleted user 7 years ago
- host: Set bridge to promiscuous to avoid kernel bug This is a workaround for the unregister_netdevice kernel hang that can occur when starting containers. See these issues for more details: https:/... — committed to flynn/flynn by titanous 7 years ago
- host: Set bridge to promiscuous to avoid kernel bug This is a workaround for the unregister_netdevice kernel hang that can occur when starting containers. See these issues for more details: https:/... — committed to flynn/flynn by titanous 7 years ago
- Adds container stat logging. Adds container stat logging before termininating container to help aid debugging docker issue related to "unregister_netdevice: waiting for ...". See https://github.com/m... — committed to spotify/helios by deleted user 7 years ago
- Adds container stat logging. Adds container stat logging before termininating container to help aid debugging docker issue related to "unregister_netdevice: waiting for ...". See https://github.com/m... — committed to spotify/helios by deleted user 7 years ago
- kernel: Update to 4.14.16/4.9.79/4.4.114 The 4.14 and 4.9 kernels have a significant number of fixes to eBPF and also a fix for kernel level sockets and namespace removals, ie fixes some aspects of h... — committed to rn/linuxkit by rn 6 years ago
- kernel: Update to 4.14.16/4.9.79/4.4.114 The 4.14 and 4.9 kernels have a significant number of fixes to eBPF and also a fix for kernel level sockets and namespace removals, ie fixes some aspects of h... — committed to rn/linuxkit by rn 6 years ago
(repeating this https://github.com/moby/moby/issues/5618#issuecomment-351942943 here again, because GitHub is hiding old comments)
If you are arriving here
The issue being discussed here is a kernel bug and has not yet been fully fixed. Some patches went in the kernel that fix some occurrences of this issue, but others are not yet resolved.
There are a number of options that may help for some situations, but not for all (again; it’s most likely a combination of issues that trigger the same error)
The “unregister_netdevice: waiting for lo to become free” error itself is not the bug
It’s the kernel crash afterwards that is the bug (see below)
Do not leave “I have this too” comments
“I have this too” does not help resolve the bug. Only leave a comment if you have information that may help resolve the issue (in which case, providing a patch to the kernel upstream may be the best step).
If you want to let us know you have this issue too, use the “thumbs up” button on the top description.
If you want to stay informed on updates, use the subscribe button.
Every comment here sends an e-mail / notification to over 3000 people. I don’t want to lock the conversation on this issue, because it’s not resolved yet, but I may be forced to if you ignore this.
I will be removing comments that don’t add useful information in order to (slightly) shorten the thread
If you want to help resolve this issue
Thanks!
Hi all, I tried to debug the kernel issue and have been having an email chain on the “netdev” mailing list, so I just wanted to post some findings here.
https://www.spinics.net/lists/netdev/msg416310.html
The issue that we are seeing is the `unregister_netdevice: waiting for lo to become free` message during container shut down. When I inspect the container network namespace, it seems the `eth0` device has already been deleted, but the `lo` device is left there, and another structure is holding the reference for that device.

After some digging, it turns out the “thing” holding the reference is one of the “routing cache” entries (`struct dst_entry`). And something is preventing that particular `dst_entry` from being gc’ed (the reference count for the `dst_entry` is larger than 0). So I logged every `dst_hold()` (increments the `dst_entry` reference count by 1) and `dst_release()` (decrements it by 1), and there are indeed more `dst_hold()` calls than `dst_release()` calls.

Here are the logs attached: kern.log.zip
Summary:

- The `lo` interface was renamed to `lodebug` for ease of grepping
- The reference count for the `dst_entry` starts at 1
- The reference count for the `dst_entry` (which is holding the reference for `lo`) at the end is 19
- In total there are 258041 `dst_hold()` calls (= 88034 + 152536 + 17471 from the breakdown below) and 258023 total `dst_release()` calls
- Among the `dst_hold()` calls, there are 88034 from `udp_sk_rx_dst_set()` (which in turn calls `dst_hold()`), 152536 from `inet_sk_rx_dst_set()`, and 17471 from `__sk_add_backlog()`
- Among the `dst_release()` calls, there are 240551 from `inet_sock_destruct()` and 17472 from `refdst_drop()`

There are more `udp_sk_rx_dst_set()` and `inet_sk_rx_dst_set()` calls in total than `inet_sock_destruct()` calls, so I suspect some sockets are in a “limbo” state, with something preventing them from being destroyed.

UPDATE: It turns out sockets (`struct sock`) are created and destroyed correctly, but for some of the TCP sockets `inet_sk_rx_dst_set()` is being called multiple times on the same `dst`, while there is only one corresponding `inet_sock_destruct()` to release the reference to the `dst`.
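As a quick sanity check, the per-callsite counts above can be tallied in shell:

```shell
# Tally the per-callsite counts reported above: holds vs releases.
holds=$((88034 + 152536 + 17471))   # udp_sk_rx_dst_set + inet_sk_rx_dst_set + __sk_add_backlog
releases=$((240551 + 17472))        # inet_sock_destruct + refdst_drop
echo "holds=$holds releases=$releases leaked=$((holds - releases))"
# prints: holds=258041 releases=258023 leaked=18
```

The difference of 18 is consistent with the summary: the `dst_entry` reference count starts at 1 and ends at 19.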
kernel bug =)
To be clear, the message itself is benign, it’s the kernel crash after the messages reported by the OP which is not.
The comment in the code where this message comes from explains what’s happening. Basically, every user (such as the IP stack) of a network device (such as the end of a `veth` pair inside a container) increments a reference count in the network device structure when it is using the device. When the device is removed (e.g. when the container is removed), each user is notified so that it can do some cleanup (e.g. closing open sockets) before decrementing the reference count. Because this cleanup can take some time, especially under heavy load (lots of interfaces, a lot of connections, etc.), the kernel may print the message here once in a while.

If a user of a network device never decrements the reference count, some other part of the kernel will determine that the task waiting for the cleanup is stuck and it will crash. It is only this crash which indicates a kernel bug (some user, via some code path, did not decrement the reference count). There have been several such bugs and they have been fixed in modern kernels (and possibly back-ported to older ones). I have written quite a few stress tests (and continue writing them) to trigger such crashes, but have not been able to reproduce on modern kernels (I do, however, see the above message).
Please only report on this issue if your kernel actually crashes, and then we would be very interested in:

- the kernel version (output of `uname -r`)

Thanks

[@thaJeztah could you change the title to something like `kernel crash after "unregister_netdevice: waiting for lo to become free. Usage count = 3"` to make it more explicit]

This might be relevant: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1065434#yui_3_10_3_1_1401948176063_2050
Sure enough, one of the times this happened for me was right after `apt-get`-ing a package with a ton of dependencies.

I believe I’ve fixed this issue, at least when caused by a kernel TCP socket connection. Test kernels for Ubuntu are available and I would love feedback if they help/fix this for anyone here. The patch is submitted upstream; more details are in the LP bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/46
@piec, many thanks for the details. I have a few more questions for you at the end of this very long comment.
Using the SMB setup I was able to produce a number of things with different kernels. I’ve tried this with the NFS setup as well but no dice.
All tests are run with docker 17.06.1-ce on HyperKit with a VM configured with 2 vCPUs and 2GB of memory (via Docker for Mac, but that should not matter). I’m using LinuxKit kernels, because I can easily swap them out.
I modified your `Dockerfile` in that I added a call to `date` as the first command executed, and also added a call to `date` before and after the `docker run` for the client.

Experiment 1 (4.9.39 kernel)
With 4.9.39 (latest 4.9.x stable kernel) I get a kernel crash:
and in `dmesg`:

Sometimes I see several iterations of what the 4.11.12 kernel does, including the `unregister_netdevice` messages (see below), and then get the kernel crash above. Sometimes I see slight variations of the crash, like:

The crashes are in the unix domain socket code and similar/identical to what is reported here, though with this new test case it is much easier to reproduce.
Experiment 2 (4.11.12 kernel)
With 4.11.12 (which is the latest stable in the 4.11 series) I see no crashes, but it is really slow (annotations inline with `--->`):

I had this running for an hour or so with the same pattern repeating, but no kernel crash.
In the kernel logs I see lots of:
That is a message every ten seconds.
Since this does not cause the hung task detection to kick in even after an hour, I suspect that with 4.11.12 the reference count eventually gets decremented and the device gets freed, but, judging by the intervals at which I can run containers, it might take up to 4 minutes!
Experiment 3 (4.11.12 kernel)
The kernel crash in the OP indicated that the kernel crashed because a hung task was detected. I have not seen this crash in my testing, so I changed the `sysctl` settings related to hung task detection:

This reduces the timeout to 60 seconds and panics the kernel if a hung task is detected. Since it takes around 2 minutes before `dockerd` complains that `containerd` did not start, reducing the hung task detection to 60s ought to trigger a kernel panic if a single task were hung. Alas, there was no crash in the logs.

Experiment 4 (4.11.12 kernel)
Next, I increased the `sleep` after each `docker run` to 5 minutes to see if the messages are continuous. In this case all `docker run`s seem to work, which is kind of expected since from the previous experiments a `docker run` would work every 4 minutes or so.

It looks like we are getting around 200 seconds worth of `unregister_netdevice` messages on almost every `docker run` (except for the second one). I suspect during that time we can’t start new containers (as indicated by Experiment 2). It’s curious that the hung task detection is not kicking in, presumably because no task is hung.

Experiment 5 (4.11.12/4.9.39 with extra debugging enabled in the kernel)
This is reverting back to a 1s sleep in between `docker run`s.

We have another kernel which enables a bunch of additional debug options, such as `LOCKDEP`, `RCU_TRACE`, `LOCKUP_DETECTOR`, and a few more.

Running the repro 4.11.12 kernels with these debug options enabled did not trigger anything.
Ditto for the 4.9.39 kernel, where the normal kernel crashes. The debug options change the timing slightly, so this may be an additional clue that the crash in the unix domain socket code is due to a race.
Digging a bit deeper
`strace` on the various `containerd` processes is not helpful (it usually isn’t, because they are written in Go). Lots of long stalls in `futex(...FUTEX_WAIT...)` without any information on where or why.

Some poking around with `sysrq`:

Increase verbosity:
Stack trace from all CPUs:
Nothing here, CPU1 is idle, CPU0 is handling the sysrq.
Show blocked tasks (twice)
This shows that both the `netns` and `cleanup_net` work queues are busy. I found a somewhat related issue quite a while back here, but this time the `cleanup_net` workqueue is in a different state.

Summary
- The `unregister_netdev` messages seem unrelated to the recent fix (which is in both 4.9.39 and 4.11.12). This may be because the `cleanup_net` work queue is not progressing and thus the message is printed.
- The repro currently goes through `runc`; maybe I should try `containerd`.

I will dig a bit more and then send a summary to `netdev`.

@piec do you have console access and can you see if there is anything in terms of a crash dump, or do you also just see the huge delays I see? If you have a crash dump, I’d be very interested in seeing it. Also, are you running on bare metal or in a VM? What’s your configuration in terms of CPUs and memory?
Hi guys,
There is a potential patch for the kernel bug (or at least one of the bugs) in the Linux net-dev mailing list:
https://www.spinics.net/lists/netdev/msg442211.html
It’s merged into the net tree and queued for the stable tree.
Smyte will pay $5000 USD for the resolution of this issue. Sounds like I need to talk to someone who works on the kernel?
same here! 😢
same problem.
According to https://github.com/torvalds/linux/commit/d747a7a51b00984127a88113cdbbc26f91e9d815 - it is in 4.12 (which was released yesterday)!
The issue was assigned 10 days ago and it is work in progress, you can see more insights of what’s going on here https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Hopefully Dan Streetman finds out how to fix it
kernel: unregister_netdevice: waiting for eth0 to become free. Usage count = 3
Kernel: 4.4.146-1.el7.elrepo.x86_64
Linux version: CentOS Linux release 7.4.1708 (Core)
Bridge mode

I had the same issue. What can I do?
@samvignoli your comments are not constructive. Please stop posting.
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.114
commit edaafa805e0f9d09560a4892790b8e19cab8bf09
Author: Dan Streetman <ddstreet@ieee.org>
Date:   Thu Jan 18 16:14:26 2018 -0500
While I’m not the one who is fixing this issue, not being much into Linux Kernel dev, I think I am right in saying that the “me too” comments aren’t that helpful. By this I mean, just saying “I have this problem too, with Kernel vx.x and Docker 1.x” does not bring anything new to the discussion.
However, I would suggest that “me too” comments which describe more the environment and method to reproduce would be of great value.
When reading all the comments, it is clear that there are a few problems - as I posted earlier, some with vethXYZ, some with eth0 and others with lo0. This suggests that they could be caused by different problems. So just saying “me too” without a full description of the error and environment may mislead people.
Also, when describing the environment, giving the kernel and Docker version is not sufficient. Per the thread, there seem to be a few factors, such as ipv6 enabled or not, or NodeJS not responding to SIGINT (or other containers; not bashing on NodeJS here).
So describing the workload on the environment would be useful. Also, this occurs when a container is being shut down, therefore I would also suggest that people experiencing this issue pay attention to which container is being stopped when the problem rears its ugly head.
While it seems the problem is in the kernel having a race condition - identifying the trigger will be of tremendous help to those who will fix the issue. And it can even give the affected users an immediate solution, such as implementing a signal handler in a NodeJS application (I don’t know myself that this prevents the issue from triggering, but it seems so per earlier comments of others).
Thank you to coolljt0725 and co (and everybody in this thread). Since many people will be unable to update to a kernel with the ipv6 patch for some time (everyone, currently), I’ve managed to squash this bug after trying many of the suggestions from this thread. I want to make a full post to follow up on what did and did not work, so that nobody else has to go through the trouble I did.
TL;DR: disable ipv6 in the linux boot params and reboot. On coreos this means `/usr/share/oem/grub.cfg` has the contents `set linux_append="ipv6.disable=1"`, and then a reboot. A more general purpose suggestion that should work on centos/ubuntu/debian/$linuxes may be found here.

I tried several `dockerd` flags individually and in certain combinations (since none of them seemed to work, I wasn’t too scientific about trying any and all combinations):

Interestingly, `--ipv6=false` doesn’t really seem to do anything - this was quite perplexing; containers still received inet6 addresses with this flag.

`--userland-proxy=false` sets hairpin mode and wasn’t really expected to work. In conjunction with setting docker0 to promisc mode I had some hope, but this did not resolve the issue either. There is a mention of a fix to `--userland-proxy=false` here, which may land upstream soon and is worth another shot; it would be nice to turn this off for performance regardless of the bug noted in this issue, but unfortunately it has yet another bug at this time.

too long; did read: disable ipv6 in your grub settings. reboot. profit.
Is there some official word from Docker 🐳 on when this might be looked at? This is the second most commented open issue; it is very severe (necessitating a host restart), is reproducible, and I don’t see any real progress toward pinning down the root cause or fixing it 😞.
This seems most likely to be a kernel issue, but the ticket on Bugzilla has been stagnant for months. Would it be helpful to post our test cases there?
Everyone who’s seeing this error on their system is running a package of the Linux kernel on their distribution that’s far too old and lacks the fixes for this particular problem.
If you run into this problem, make sure you run `apt-get update && apt-get dist-upgrade -y` and reboot your system. If you’re on Digital Ocean, you also need to select the kernel version which was just installed during the update, because they don’t use the latest kernel automatically (see https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/2814988-give-option-to-use-the-droplet-s-own-bootloader).

CentOS/RHEL/Fedora/Scientific Linux users need to keep their systems updated using `yum update` and reboot after installing the updates.

When reporting this problem, please make sure your system is fully patched and up to date with the latest stable updates (no manually installed experimental/testing/alpha/beta/rc packages) provided by your distribution’s vendor.
Same issue:
CentOS Linux release 7.5.1804 (Core) Docker version 18.06.1-ce, build e68fc7a Kernel Version: 3.10.0-693.el7.x86_64
Everyone please understand this is a common SYMPTOM that has many causes. What has worked for you to avoid this may not work for someone else.
Red Hat claims to have an instance of this bug fixed as of kernel-3.10.0-514.21.1.el7 release. I suppose they will upstream the fix as soon as possible and rebase to 4.12. This package is already available on CentOS 7 as well.
Documentation related to the fix (RHN access needed): https://access.redhat.com/articles/3034221 https://bugzilla.redhat.com/show_bug.cgi?id=1436588
From the article: “In case of a duplicate IPv6 address or an issue with setting an address, a race condition occurred. This race condition sometimes caused address reference counting leak. Consequently, attempts to unregister a network device failed with the following error message: “unregister_netdevice: waiting for to become free. Usage count = 1”. With this update, the underlying source code has been fixed, and network devices now unregister as expected in the described situation.”
I have already deployed this fix on all systems in our PaaS pool, and it’s been 2 days without the bug being hit. Earlier, we had at least one system freezing per day. I will report here if we hit the bug again.
The last notice (July 2018) advising users to 👍 / subscribe if they don’t have helpful information to contribute regarding crashes, is no longer visible.
I think this comment I’ve put together summarizes the issue well enough for anyone that wants to dig through it. I also have the impression that the fixes discussed in the past have resolved the issue for the majority, and that the shared reproductions no longer produce the failure or log message.
I would suggest closing / locking this issue. 9 years to this day since it was opened. It would be better to create a new issue for anyone else still affected to follow, and where more recent / relevant information can be tracked?
System details
This is not a kernel crash report, but from an attempt to go through over 600 items of this issue, looking for any useful information and reproductions (especially reproductions confirmed by multiple users). Reproduction was not possible.
Info requested:
- `6.2.0`
- `enp1s0`
- `enp1s0` is created and configured via `cloud-init` => netplan => systemd-networkd; each veth Docker creates on container start/stop triggers a `cloud-init` udev rule that needlessly recreates `enp1s0`, which can reset/undo kernel settings like `proxy_ndp`, breaking the IPv6 GUA address on other containers. This may have affected some users that reported.
- `daemon.json`
- `docker info`
Reproductions shared

None of these were reproducible for me.
- `userland-proxy: false` and IPv6 docker bridge enabled. Presumably the issue has been resolved since (as some hint at, like the `docker-stress` comments, and kernel commits cited in early 2019).
- `userland-proxy: false` + busybox container loop starting/stopping hundreds of containers
- `docker-stress` with `userland-proxy: false` + IPv6 enabled via the kernel
- `docker-samba-loop` (git clone)

Cherry-picked comments
- The `unregister_netdevice` message is normal; the problem is when it repeats frequently: https://github.com/moby/moby/issues/5618#issuecomment-305007556
- A kernel `>= 5.4.226` should work. From the commit message, it may have been related to reproductions that previously caused the problem, given they relied on creating hundreds of containers, and other mentions of memory pressure playing a role in reproducing the condition.

Many other comments cited CentOS or similar systems with very dated kernels most of the time, or did not provide much helpful information. Another bulk appeared to be related to IPv6, and some to UDP / conntrack.
Various fixes related to networking (for IPv6 and UDP) have been made in both Docker and the kernel over this duration. Activity within the issue has also decreased significantly, implying the main causes have been resolved.
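For readers wanting to check a host against the `>= 5.4.226` guidance mentioned above, a version-aware comparison can be sketched with `sort -V` (the helper name is my own):

```shell
# True if kernel version $1 is at least $2 (version-aware comparison via sort -V).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

version_at_least "$(uname -r | cut -d- -f1)" 5.4.226 \
  && echo "kernel at or past the cited 5.4.x fix" \
  || echo "kernel predates 5.4.226"
```

Note this only checks the version number; distro kernels (RHEL/CentOS in particular) backport fixes, so a lower version string does not necessarily mean the fix is absent.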
This issue has been solved by this commit: https://github.com/torvalds/linux/commit/ee60ad219f5c7c4fb2f047f88037770063ef785f (applied using kpatch).
This issue is still happening 😦 No updates/ideas on how to fix it?
@FrankYu that's not helpful. To participate usefully in this thread, please provide an exact way to reproduce this issue, and please test on a modern kernel. 3.10 was released four years ago; we are discussing whether it is fixed, fully or partially, in a release from four days ago.
I’m running 4.12 (from http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/) and I still hit this, so torvalds/linux@d747a7a must not be the complete fix.
The fix is in 4.4.22 stable https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22
@tankywoo @drpancake @egasimus @csabahenk @spiffytech @ibuildthecloud @sbward @jbalonso @rsampaio @MrMMorris @rsampaio @unclejack @chrisjstevenson @popsikle @fxposter @scher200 @victorgp @jstangroome @Xuexiang825 @dElogics @Nowaker @pmoust @marckamerbeek @Beatlor @warmchang @Jovons @247687009 @jwongz @tao12345666333 @clkao Please look at this https://pingcap.com/blog/try-to-fix-two-linux-kernel-bugs-while-testing-tidb-operator-in-k8s/
Since upgrading from kernel 4.4.0 to 4.15.0 and docker 1.11.2 to 18.09 the issue disappeared.
In a sizeable fleet of VMs acting as docker hosts we had this issue appearing multiple times a day (with our Docker use-case). 45 days in and we are no longer seeing this.
For posterity, a stack-trace of a hung Docker 1.11.2 with printk’s showing `unregister_netdevice: waiting for vethXXXXX` (similar to what we were always seeing in our fleet, in hundreds of VMs) can be found at http://paste.ubuntu.com/p/6RgkpX352J/ (the interesting container ref is `0xc820001980`).

From that we can observe that it hanged in https://github.com/moby/moby/blob/v1.11.2/daemon/container_operations.go#L732

which points us to https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/docker/libnetwork/sandbox.go#L175

and https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/docker/libnetwork/endpoint.go#L760

which goes into the libnetwork bridge driver (check the awesome description): https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/docker/libnetwork/drivers/bridge/bridge.go#L1057-L1061

moving to netlink: https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/vishvananda/netlink/link_linux.go#L601-L617 https://github.com/moby/moby/blob/v1.11.2//vendor/src/github.com/vishvananda/netlink/nl/nl_linux.go#L215

and ultimately, in that netlink socket, calls https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/vishvananda/netlink/nl/nl_linux.go#L333
We feel that the bug generally happens when stopping a container: because SKBs are still referenced in the netns, the veth is not released, and Docker then issues a Kill to that container after 15s. The Docker daemon does not handle this situation gracefully, but ultimately the bug is in the kernel. We believe that https://github.com/torvalds/linux/commit/4ee806d51176ba7b8ff1efd81f271d7252e03a1d (accepted in 4.15 upstream) and the commits linked to it (there are several) act as a mitigation.
In general, that part of the kernel is not a pretty place.
I can confirm it also happens on:
Linux 3.10.0-693.17.1.el7.x86_64 Red Hat Enterprise Linux Server release 7.4 (Maipo)
I can reproduce it by doing “service docker restart” while having a certain amount of load.
Same on CentOS7 kernel 4.16.0-1.el7.elrepo.x86_64 and docker 18.03.0-ce
It worked for weeks before the crash, and when I tried to bring it back up, it was completely stuck.
The problem also happened with kernel 3.10.0-693.21.1.el7
That's all well and good, but what exactly are the options that help? This problem is causing us issues in production, so I'd like to apply whatever workarounds are necessary to avoid the kernel bug.
For any of those interested, we (Travis CI) are rolling out an upgrade to
v4.8.7
on Ubuntu 14.04. Our load tests showed no occurrences of the error described here. Previously, we were running linux-image-generic-lts-xenial on Ubuntu 14.04. I'm planning to get a blog post published in the near future describing more of the details.
UPDATE: I should have mentioned that we are running this docker stack:
UPDATE: We are still seeing this error in production on Ubuntu Trusty + kernel v4.8.7. We don’t yet know why these errors disappeared in staging load tests that previously reproduced the error, yet the error rate in production is effectively the same. Onward and upward. We have disabled “automatic implosion” based on this error given the high rate of instance turnover.
We’ve reproduced the same bug using a diagnostic kernel that had delays artificially inserted to make PMTU discovery exception routes hit this window.
ref_leak_test_begin.sh:
ref_leak_test_end.sh:
The test process: run
ref_leak_test_begin.sh
, then
ref_leak_test_end.sh
.
After some testing, torvalds/linux@ee60ad2 can indeed fix this bug.
Here’s an attempt to summarise this issue, from the comments of this issue, https://github.com/kubernetes/kubernetes/issues/70427, https://github.com/kubernetes/kubernetes/issues/64743, and https://access.redhat.com/solutions/3659011
Observed kernel versions with this issue
Kernel versions claimed not triggering this issue
Related kernel commits
Everyone, @etlweather is spot-on. Only post a “me too” if you have a reliable way of reproducing the issue. In that case, detail your procedure. A docker and kernel version isn’t enough and we get lots of notifications about it. The simpler your reproduction procedure, the better.
@rneugeba @redbaron Unfortunately the current “repro” I have is very hardware specific (all but proving this is a race condition). I haven’t tried getting a QEMU repro but that’s definitely the next step so multiple people can actually work on this and get the expected result (ideally in 1 CPU core setup). If someone already has one, please shoot me an email (it’s on my profile). I’ll thoroughly test it and post it here.
FYI: we’ve been running Linux 4.8.8 in conjunction with Docker v1.12.3 on a single production host. Uptime is presently at 5.5 days and the machine remains stable.
We occasionally see a handful of
unregister_netdevice: waiting for lo to become free. Usage count = 1
messages in syslog, but unlike before, the kernel does not crash and the message goes away. I suspect that one of the other changes introduced, either in the kernel or in Docker, detects this condition and recovers from it. For us, this makes the message annoying, but no longer a critical bug.
I'm hoping some other folks can confirm the above on their production fleets.
@gtirloni - can you clarify if your 4.8.8/1.12.3 machine crashed or if you just saw the message?
Thank you, in advance, to everyone who has been working on reproducing/providing useful information to triangulate this thing.
@pumba-lt We had this issue about 1.5yrs ago, about 1yr ago I disabled ipv6 at the kernel level (not sysctl) and haven’t had the issue once. Running a cluster of 48 blades.
Normally in:
/etc/default/grub
GRUB_CMDLINE_LINUX="xxxxx ipv6.disable=1"
However, I use PXE boot, so my PXE config has:
I assure you, you will not see this issue again.
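A minimal sketch of applying that grub change (assumptions not from the original comment: a Debian/Ubuntu-style /etc/default/grub, and that update-grub plus a reboot are run afterwards to take effect):

```shell
# Append ipv6.disable=1 to GRUB_CMDLINE_LINUX (sketch, not the poster's
# exact commands). Written as a pure text transformation on stdin so the
# result can be previewed before touching /etc/default/grub.
add_ipv6_disable() {
  sed 's/^\(GRUB_CMDLINE_LINUX=".*\)"$/\1 ipv6.disable=1"/'
}

# Preview the change:
echo 'GRUB_CMDLINE_LINUX="quiet splash"' | add_ipv6_disable
# prints: GRUB_CMDLINE_LINUX="quiet splash ipv6.disable=1"

# To apply for real (requires root):
#   add_ipv6_disable < /etc/default/grub > /tmp/grub.new
#   sudo cp /tmp/grub.new /etc/default/grub && sudo update-grub && sudo reboot
```

You can verify it took effect after the reboot by checking that `ipv6.disable=1` appears in /proc/cmdline.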
This information might be relevant.
We are able to reproduce the problem with unregister_netdevice: waiting for lo to become free. Usage count = 1 with a 4.14.0-rc3 kernel with CONFIG_PREEMPT_NONE=y, running on only one CPU with the following boot kernel options:
Once we hit this state, it stays in this state and a reboot is needed. No more containers can be spawned. We reproduce it by running images doing ipsec/openvpn connections plus downloading a small file inside the tunnels. Then the instances exit (usually they run < 10s). We run tens of such containers a minute on one machine. With the above-mentioned settings (only 1 CPU), the machine hits it in ~2 hours.
Another reproducer with the same kernel, but without limiting the number of CPUs, is to just run iperf in UDP mode for 3 seconds inside the container (so there is no TCP communication at all). If we run 10 such containers in parallel, wait for all of them to finish, and do it again, we hit the trouble in less than 10 minutes (on a 40-core machine).
In both of our reproducers, we added "ip route flush table all; ifconfig <iface> down; sleep 10" before exiting from the containers. It does not seem to have any effect.
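A hedged sketch of that second reproducer (the image name iperf-udp and the target address 192.0.2.1 are placeholders, not from the original comment; set DRY_RUN=1 to only print the docker commands instead of running them):

```shell
# One round of the UDP-iperf reproducer: 10 parallel short-lived
# containers running iperf in UDP mode for 3 seconds, then wait for
# all of them to exit. Repeat rounds until the host hits the bug.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi; }

repro_round() {
  for i in $(seq 1 10); do
    run docker run --rm "iperf-udp" iperf -c 192.0.2.1 -u -t 3 &
  done
  wait  # all containers must finish before the next round starts
}
```

Per the comment above, looping this on a 40-core machine reportedly triggered the problem in under 10 minutes.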
I’ve got same issues. Docker version 1.13.1, build 092cba3 Linux debian 4.8.6-x86_64-linode78
AFAICT, this is a locking issue in the network namespaces subsystem of Linux kernel. This bug has been reported over a year ago, with no reply: https://bugzilla.kernel.org/show_bug.cgi?id=97811 There has been some work on this (see here: http://www.spinics.net/lists/netdev/msg351337.html) but it seems it’s not a complete fix.
I’ve tried pinging the network subsystem maintainer directly, with no response. FWIW, I can reproduce the issue in a matter of minutes.
We got a huge pain with this issue on our CoreOS cluster. Could anyone tell when it will be finally fixed? We dream about this moment when we can sleep at night.
Just wait for a few months and someone will come up complaining about the 4.19 kernel too. Just history repeating itself.
@thaJeztah, perhaps you should add your comment to the top of the original post, as people are still ignoring it.
Using SCTP in netns could also trigger this, fixes in 4.16-rc1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=957d761cf91cdbb175ad7d8f5472336a4d54dbf2
Hi,
Just to add to the fire we are also seeing this problem, as requested here are the following…
Kernel Version: Linux exe-v3-worker 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 GNU/Linux
Linux distribution/version: Debian 9.1 (with all packages up to date)
Are you on the latest kernel version of your Linux vendor? Yes
Network setup (bridge, overlay, IPv4, IPv6, etc): IPv4 only, NATed as per default Docker setup
Description of the workload (what type of containers, what type of network load, etc): Very short lived containers (from a few seconds to a few minutes) running scripts before exiting.
And ideally a simple reproduction:
kernel:[617624.412100] unregister_netdevice: waiting for lo to become free. Usage count = 1
Couldn't kill old containers or start new ones on the affected nodes; had to reboot to restore functionality.
Hopefully we find a root cause / patch soon.
Best Regards,
robputt796
The netdev@vger thread is here https://www.mail-archive.com/netdev@vger.kernel.org/msg179703.html if anyone wants to follow progress.
@piec yes, that’s expected.
I also hit this on centos 7.3 with host kernel 3.10.0-514.6.1.el7.x86_64 and docker-ce-17.06.0.ce-1.el7.centos.x86_64.
I spent some time stripping the container down and it turns out that the web service had nothing to do with the bug. What seems to trigger this in my case is mounting an NFS share inside a container (running with
--privileged
).
On my desktop, I can reliably reproduce it simply by running the following a few times:
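The exact command was lost in this thread's formatting; a hypothetical equivalent, based on the description above (the image name and the NFS export are illustrative placeholders, not the originals):

```shell
# Sketch of the described trigger: mount an NFS share inside a
# --privileged container, then let the container exit after using the
# mount. "nfs-server:/export" and "ubuntu:16.04" are placeholders.
repro_nfs_mount() {
  docker run --rm --privileged ubuntu:16.04 sh -c '
    mkdir -p /mnt/share &&
    mount -t nfs nfs-server:/export /mnt/share &&
    ls /mnt/share &&
    umount /mnt/share
  '
}
```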
@rn I managed to narrow this down to a specific container in our test suite, and was able to reproduce with the following steps:
After 3 or 4 iterations of this I end up getting
waiting for lo to become free
and on the next iteration
docker run
fails with
docker: Error response from daemon: containerd: container did not start before the specified timeout.
A pretty small amount. In the steps mentioned above, the HTTP request is a small amount of JSON, and the response is a binary blob that's around 10MB.
This is on a 4-core desktop machine (no VM)
No, everything is done serially.
They’re stopped with
docker stop
I’ve seen this issue within hours on high load systems with and without
--userland-proxy=false
You can try to upgrade to the latest 4.19+ kernel.
We used kernel version 4.14.62 for a few months, and this issue disappeared.
@rn
We are having a slightly different issue in our environment that I am hoping to get some clarification on (kernel 3.16.0-77-generic, Ubuntu 14.04, docker 1.12.3-0~trusty). We have thousands of hosts running docker, 2-3 containers per host, and we are seeing this on < 1% of total hosts running docker.
We actually never see the kernel crash, but instead (like the original reporters as far as I can tell) the
dockerd
process is defunct. Upstart (using the
/etc/init/docker.conf
job from the upstream package) will not start a new
dockerd
process because it thinks it is already running (
start: Job is already running: docker
), and attempting to stop the upstart job also fails (
docker start/killed, process <pid of defunct process>
).
Since we mostly run with bridge networking (on a custom bridge device), in
dmesg
we see a slightly different message referring to the virtual interface:
Because upstart seems to refuse to restart dockerd or recognize that the previously running process is a zombie, the only solution we have found is to reboot the host.
While our outcome seems different (the kernel does not crash) the root cause sounds the same or similar. Is this not the same issue then? Is there any known workaround or way to have the
docker
upstart job become runnable again when this occurs?
Does anyone have a suggested workaround for this? I tried enabling
--userland-proxy=true
and docker still hangs after a while. It appears Kubernetes has a solution from what @thockin wrote above, but it's not clear what
--hairpin-mode=promiscuous-bridge
exactly does and how to configure it on a plain-jane Ubuntu 16.x docker install.
I think I know why some dockerized Node.js apps could cause this issue. Node uses keep-alive connections by default. When
server.close()
is used, the server doesn't accept new connections, but currently active connections like websockets or HTTP keep-alive connections are still maintained. When the dockerized app is also scaled to n, this could result in
waiting for lo to become free
because when it is forced to terminate, lo was never freed. When docker redistributes this app to another node, or the app is scaled down, docker sends a signal to the app that it should shut down. The app listens to this signal and can react. When the app isn't shut down after some seconds, docker terminates it without hesitation. I added signal handlers and found out that when using
server.close()
the server isn't perfectly terminated but "only" stops accepting new connections (see https://github.com/nodejs/node/issues/2642). So we need to make sure that open connections like websockets or HTTP keep-alive are also closed.
How to handle websockets: the Node.js app emits to all websockets
closeSockets
when a shutdown signal is received. The client listens for this
closeSockets
event and calls
sockets.disconnect()
and shortly after
sockets.connect()
. Remember that
server.close()
was called, so this instance doesn't accept new requests. When other instances of this dockerized app are running, the load balancer inside docker will eventually pick an instance which isn't shutting down, and a successful connection is established. The instance which should shut down won't have open websocket connections.
How to handle keep-alive HTTP connections: currently I don't know how this can be done perfectly. The easiest way is to disable keep-alive.
Another possibility is to set the keep-alive timeout to a very low number. For example 0.5 seconds.
Hope this could help others 😃
I have a bugzilla ticket open with Redhat about this.
Some developments: Red Hat put the IPV6 refcount leak patches from mainline on QA, looks like they’re queued up for RHEL 7.4 and may be backported to 7.3. Should be on CentOS-plus soon too. Note: This patch only fixes issues in SOME cases. If you have a 4.x kernel it’s a moot point since they’re already there.
This is definitely a race condition in the kernel from what I can tell, which makes it really annoying to find. I’ve taken a snapshot of the current mainline kernel and am working on instrumenting the various calls starting with the IPV6 subsystem. The issue is definitely reproducible now: looks like all you have to do is create a bunch of containers, push a ton of network traffic from them, crash the program inside the containers, and remove them. Doing this over and over triggers the issue in minutes, tops on a physical 4-core workstation.
Unfortunately, I don’t have a lot of time to work on this: if there are kernel developers here who are willing to collaborate on instrumenting the necessary pieces I think we can set up a fork and start work on hunting this down step by step.
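A rough sketch of that churn loop, under assumptions not in the original comment (the image name traffic-img and the curl target are placeholders); it starts containers that generate outbound traffic, kills them abruptly to simulate the program crashing, and removes them. Set DRY_RUN=1 to only print the docker commands:

```shell
# Churn loop matching the described trigger: create containers, push
# network traffic from them, crash the program inside, remove them,
# and repeat until the issue shows up.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi; }

churn_once() {
  for i in $(seq 1 5); do
    run docker run -d --name "churn-$i" traffic-img \
      sh -c 'while true; do curl -s http://192.0.2.1/blob >/dev/null; done'
  done
  run sleep 5                   # let the containers push some traffic
  for i in $(seq 1 5); do
    run docker kill "churn-$i"  # simulate the program crashing
    run docker rm -f "churn-$i"
  done
}
```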
I have 9 docker hosts all nearly identical, and only experience this on some of them. It may be coincidence, but one thing in common I’ve noticed is that I only seem to have this problem when running containers that do not handle
SIGINT
. When I
docker stop
these containers, it hangs for 10s and then kills the container ungracefully.
It takes several days before the issue presents itself, and it seems to show up randomly, not just immediately after running
docker stop
. This is mostly anecdotal, but maybe it will help someone.
I think my colleague has fixed this recently: http://www.spinics.net/lists/netdev/msg393441.html. We encountered this problem in our environment, and with this fix we never encounter it any more. Anyone who has encountered this problem, could you try this patch and see if it happens again? From our analysis it is related to IPv6, so you can also try disabling IPv6 in docker with
--ipv6=false
when starting the docker daemon.
😮
Same as the above comments, also running on EC2 happens to be via elastic beanstalk using
64bit Amazon Linux 2016.03 v2.1.0 running Docker 1.9.1
This is still happening for me on kernel Ubuntu 5.8.0-41.46-generic 5.8.18
This patch has fixed this problem:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ee60ad219f5c7c4fb2f047f88037770063ef785f
We have analyzed in the following link, and this problem could also be reproduced: https://github.com/w-simon/kernel_debug_notes/blob/master/reproduce_a_leaked_dst_entry
That message itself is not the bug; it’s the kernel crashing afterwards; https://github.com/moby/moby/issues/5618#issuecomment-407751991
centos 7.2 still has this problem: kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Same on CentOS7.5 kernel 3.10.0-693.el7.x86_64 and docker 1.13.1
Same on Centos7 kernel 3.10.0-514.21.1.el7.x86_64 and docker 18.03.0-ce
I can confirm our issues were solved after disabling IPv6 at boot (from grub's config file). We had numerous issues in a 7-node cluster; it runs smoothly now.
I don’t remember where I found the solution, or did I find it myself, anyway, thanks @qrpike for suggesting this to others 😃 !!
Sorry to spoil the celebration, but we were able to reproduce the issue. We are now working with @ddstreet on it at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/ .
There is actually a much simpler repro of this issue (which, BTW, is not the original issue).
This script just starts an SMB server on the host, then creates a network namespace with a
veth
pair, executes
mount; ls; umount
in the network namespace, and then removes the network namespace.
Note: adding a simple
sleep 1
after the umount, either when executing in the namespace or before deleting the network namespace, avoids the stalling entirely when creating the new namespace. A sleep after the old namespace is deleted does not reduce the stalling.
@piec I also tested this with your repro and a
sleep 1
in the Dockerfile after the umount, and everything works as expected: no stalling, no
unregister_netdev
messages.
I'll write this up now and send to
netdev@vger
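The script itself didn't survive this thread's formatting; a reconstruction of what it does, per the description above (the addresses and the share name //192.0.2.10/share are assumptions, and it requires root):

```shell
# Reconstructed sketch of the repro: create a netns with a veth pair,
# mount/ls/umount an SMB share inside it, then delete the netns.
# 192.0.2.10 stands in for the host-side SMB server address.
repro_netns_smb() {
  ip netns add test
  ip link add veth0 type veth peer name veth1
  ip link set veth1 netns test
  ip addr add 192.0.2.10/24 dev veth0
  ip link set veth0 up
  ip netns exec test ip addr add 192.0.2.11/24 dev veth1
  ip netns exec test ip link set veth1 up
  ip netns exec test sh -c \
    'mount -t cifs //192.0.2.10/share /mnt -o guest && ls /mnt && umount /mnt'
  ip netns del test   # per the comment, the stall shows up around here
  # a "sleep 1" after the umount reportedly avoids the stall entirely
}
```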
Should be fixed in kernel 4.12 or later. Please check. https://access.redhat.com/solutions/3105941 and the link to the patch: https://github.com/torvalds/linux/commit/d747a7a51b00984127a88113cdbbc26f91e9d815
Here is the CentOS 7.3 workaround that fixed it for me:
Here is the patch that solves it: https://bugs.centos.org/view.php?id=12711&nbn=1
UPDATE: This turned out not to solve the problem permanently. It showed up again several hours later with the following wall message:
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Hey all, just to be clear, all the “kubernetes workaround” does is enable promiscuous mode on the underlying bridge. You can achieve the same thing with
ip link set <bridgename> promisc on
using iproute2. It decreases the probability of running into the bug but may not eliminate it altogether.
Now, in theory this shouldn't work… but for some reason promiscuous mode seems to make the device teardown just slow enough that you don't get a race to decrement the ref counter. Perhaps one of the Kubernetes contributors can chime in here if they're on this thread.
I can verify the workaround (NOT FIX) works using my environment-specific repro. I can’t really verify it helps if you’re using the IPVLAN or MACVLAN drivers (we use macvlan in prod) because it seems very difficult to get those setups to produce this bug. Can anyone else with a repro attempt to verify the workaround?
I saw the same
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
message on my CentOS 7 machine running a docker image of Jenkins. The CentOS 7 machine I was using was current with all the latest CentOS 7 patches as of approximately 20 Dec 2016.
Since the most recent references here seem to be CentOS based, I'll switch my execution host to a Ubuntu or a Debian machine.
I am running
Docker version 1.12.5, build 7392c3b
on that CentOS 7 machine. Docker did not hang, but the Jenkins process I was running in Docker was killed when that message appeared.
Thanks so much for Docker! I use it all the time, and am deeply grateful for your work on it!
I am actually seeing this on Amazon Linux in an ECS cluster - the message occasionally throws but it doesn’t lock up, like reshen’s seeing now. Docker 1.11.2. Uname reports “4.4.14-24.50.amzn1.x86_64” as the version.
It would be really helpful if those of you who can reproduce the issue reliably in an environment where a crashdump is possible (aka not EC2) could in fact share this crashdump file. More information about how to enable kdump in Ubuntu Trusty can be found here, and these are the crash options you need to enable when kdump is ready to generate a crashdump:
The crashdump can really help kernel developers find out more about what is causing the reference leak, but keep in mind that a crashdump also includes a memory dump of your host and may contain sensitive information.
Yes, this issue happens on bare metal for any kernel >= 4.3. Have seen this on a lot of different machines and hardware configurations. Only solution for us was to use kernel 4.2.
I have a “one liner” that will eventually reproduce this issue for me on an EC2 (m4.large) running CoreOS 1068.3.0 with the 4.6.3 kernel (so very recent). For me, it takes about 300 iterations but YMMV.
Linux ip-172-31-58-11.ec2.internal 4.6.3-coreos #2 SMP Sat Jun 25 00:59:14 UTC 2016 x86_64 Intel® Xeon® CPU E5-2676 v3 @ 2.40GHz GenuineIntel GNU/Linux CoreOS beta (1068.3.0) Docker version 1.10.3, build 3cd164c
A few hundred iterations of the loop here will eventually hang dockerd and the kernel will be emitting error messages like
The reproducer loop is
EDITS
userland-proxy=false
I'm getting this on Ubuntu 14.10 running a 3.18.1 kernel. The kernel log shows
I’ll send
docker version/info
once the system isn’t frozen anymore 😃I finally found out how to suppress these messages btw. From this question on StackExchange, I commented out this line in
/etc/rsyslog.conf
:
Very nuclear option, but at least now my system is usable again!
https://github.com/kubernetes/kubernetes/issues/64743#issuecomment-451351435 https://github.com/kubernetes/kubernetes/issues/64743#issuecomment-461772385
This information may be helpful to you.
Yes, I also did not have this problem after upgrading the kernel to 4.17.0-1.el7.elrepo.x86_64. I tried this before (4.4.x, 4.8, 4.14…) and it has failed. It seems that the problem will not occur again in the 4.17+ kernel.
For what it's worth… we upgraded the RHEL Linux kernel from 3.10.0 to 4.17.11 (running a Kubernetes cluster on it). Before upgrading, this bug was occurring several times a day on different servers. We have been running with the upgrade for three weeks now, and the bug has occurred only once. Roughly speaking, it's reduced by 99%.
Ok, this bug still occurs, but probability has reduced.
I think if the containers are stopped gracefully (PID 1 exit()s), then this bug will not bother us.
fixed it.
same issue on CentOS 7
Centos 7.5 with 4.17.3-1 kernel, still got the issue.
Env : kubernetes 1.10.4 Docker 13.1 with Flannel network plugin.
Log :
[ 89.790907] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 89.798523] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 89.799623] cni0: port 8(vethb8a93c6f) entered blocking state
[ 89.800547] cni0: port 8(vethb8a93c6f) entered disabled state
[ 89.801471] device vethb8a93c6f entered promiscuous mode
[ 89.802323] cni0: port 8(vethb8a93c6f) entered blocking state
[ 89.803200] cni0: port 8(vethb8a93c6f) entered forwarding state
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1.
Now: the node IP is reachable, but no network services (like ssh) can be used…
@bcdonadio I can reproduce the bug on my system by running this test script once per hour from cron:
This script is just producing a package list using an image in the Docker registry, and another using one that’s built locally so I can compare them. The Dockerfile is just this:
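The script and Dockerfile themselves were lost in this thread's formatting; a rough reconstruction of the comparison script, based on the description above (registry-img and local-img are illustrative names, not the originals):

```shell
# Produce a package list from a registry image and from a locally built
# image, then diff them. Both image names are placeholders.
compare_pkg_lists() {
  docker run --rm registry-img dpkg -l > /tmp/packages-registry.txt
  docker run --rm local-img    dpkg -l > /tmp/packages-local.txt
  diff /tmp/packages-registry.txt /tmp/packages-local.txt
}
```

Run hourly from cron with --rm, this matches the pattern described: short-lived containers that are deleted right after exiting.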
2-4 minutes later syslog gets this message:
The last occurrence happened a few minutes after I ran the script manually. My guess is that the error condition is raised after some timeout elapses following the attempted container delete.
I'm certain the error condition is intermittent, because the script above runs as a cron job at :00 past each hour. Here is a sample of the error output that syslog recorded:
So it happens somewhere in the range of 2 to 4 minutes after the containers run and exit and are deleted by docker because of the --rm flag. Also notice from the log above that there is not an error for every container that’s run/deleted, but it’s pretty consistent.
This is reproducible on
Linux containerhost1 4.9.0-0.bpo.2-amd64 #1 SMP Debian 4.9.18-1~bpo8+1 (2017-04-10) x86_64 GNU/Linux
with
Docker version 17.04.0-ce, build 4845c56
running in privileged mode when we have cifs mounts open. When the container stops with mounts open, Docker gets unresponsive and we get the
kernel:[ 1129.675495] unregister_netdevice: waiting for lo to become free. Usage count = 1
error.
@stayclassychicago before I tried the
3.10.0-514.10.2.el7.centos.plus.x86_64
kernel I was getting the
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
very regularly, nearly every time I ran a container with
docker run --rm ...
when the container exited. After the kernel upgrade and reboot, it completely stopped for many hours, and then came back again. Now half the time I delete containers it works properly, where it used to error every time. I don't know for sure if the new kernel is helping, but it doesn't hurt.
If someone from Docker has time to try the Kubernetes workaround, please let me know and we can point you at it. I am unable to extract the changes and patch them into Docker myself right now.
On Thu, Mar 9, 2017 at 7:46 AM, Matthew Newhook notifications@github.com wrote:
We’re getting this in GCE pretty frequently. Docker freezes and the machine hangs on reboot.
The container is running a go application, and has hairpin nat configured.
Docker:
Ubuntu 16.04 LTS,
Linux backup 4.6.0-040600-generic #201606100558 SMP Fri Jun 10 10:01:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Server Version: 1.13.1
Same issue. I'm using mount in a privileged container. After 4-5 runs it freezes. Also, I have the same issue with the latest standard kernel for 16.04.
@r-BenDoan if you try to stop a container but it doesn’t respond to SIGINT, docker will wait 10 seconds and then kill the container ungracefully. I encountered that behavior in my nodejs containers until I added signal handling. If you see a container taking 10s to stop, it likely isn’t handling signals and is more likely to trigger this issue.
Make sure your containers can stop gracefully.
This might not be a purely docker issue - I’m getting it on a Proxmox server where I’m only running vanilla LXC containers (ubuntu 16.04).
Still happens with 4.9.0-0.bpo.1-amd64 on debian jessie with docker 1.13.1. Is there any kernel - os combination which is stable?
Just an observation - there seem to be different problems at play (that has been said before).
Some have noted logs alternating between the messages above, while others see only one of them.
There is also a similar bug logged on Ubuntu. On this one, they seem to find that NFS is the problem.
I have this on two CentOS systems, latest updates on at least one of them.
@LK4D4 Oh, totally, just want to see those timeouts 😉
@RRAlex Do you know if anyone has reached out to the Ubuntu kernel team regarding a backport? We have a large production Docker deployment on an Ubuntu cluster that’s affected by the bug.
Ubuntu kernel team mailing list: https://lists.ubuntu.com/archives/kernel-team/2016-September/thread.html
Patch for the stable kernel: https://github.com/torvalds/linux/commit/751eb6b6042a596b0080967c1a529a9fe98dac1d
Ubuntu kernel commit log: http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/log/?h=master-next (Patch is not there yet)
@coolljt0725's colleague's patch has been queued for stable, so hopefully it'll be backported into distros soon enough. (David Miller's post: http://www.spinics.net/lists/netdev/msg393688.html )
Not sure where that commit is though and if we should send it to Ubuntu, RH, etc. to help them track & backport it?
Going to show up here at some point I guess: http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/tree/net/ipv6/addrconf.c
EDIT: seems to be present here: https://github.com/torvalds/linux/blob/master/net/ipv6/addrconf.c
@RRAlex This is not specific to any docker version. If you are using
--userland-proxy=false
on the daemon options… OR (from what I understand) you are using kubernetes, you will likely hit this issue.
The reason is that the
--userland-proxy=false
option enables hairpin NAT on the bridge interface… this is something that kubernetes also sets when it sets up the networking for its containers.
Just hopping on board. I'm seeing the same behavior on the latest Amazon EC2 instance. After some period of time, the container just tips over and becomes unresponsive.
$ docker info
Containers: 2
Images: 31
Server Version: 1.9.1
Storage Driver: devicemapper
 Pool Name: docker-202:1-263705-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem:
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 1.199 GB
 Data Space Total: 107.4 GB
 Data Space Available: 5.754 GB
 Metadata Space Used: 2.335 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.145 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.4.10-22.54.amzn1.x86_64
Operating System: Amazon Linux AMI 2016.03
CPUs: 1
Total Memory: 995.4 MiB
Name: [redacted]
ID: OB7A:Q6RX:ZRMK:4R5H:ZUQY:BBNK:BJNN:OWKS:FNU4:7NI2:AKRT:5SEP
$ docker version
Client:
 Version: 1.9.1
 API version: 1.21
 Go version: go1.4.2
 Git commit: a34a1d5/1.9.1
 Built:
 OS/Arch: linux/amd64

Server:
 Version: 1.9.1
 API version: 1.21
 Go version: go1.4.2
 Git commit: a34a1d5/1.9.1
 Built:
 OS/Arch: linux/amd64
Thanks for the link, Justin! I’ll troll Linus =)
kind regards. =* ❤️
@LK4D4 I believe it is just a high amount of created/destroyed containers, especially containers doing a lot of outbound traffic. We also use LXC instead of docker, but the bug is exactly the same as the one described here. I can try to reproduce it using your method if it is easy to describe and/or does not involve production load; the idea is to get a crashdump and maybe find more hints about what exactly triggers this bug.
happy birthday, bloody issue =) 6 May 2014
same thing here. Just rebooting. Latest docker version. Ubuntu 14.04.
Sorry to hear. This is a real problem. I guess that many scenarios lead to this issue. On previous versions, the way we managed that was by preventing the namespace deletion.
See the commits authored by Eric Dumazet: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v5.4.121
It was specifically about ip6_vti interfaces: 5.4.120 added https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.4.121&id=98ebeb87b2cf26663270e8e656fe599a32e4c96d which introduced the regression.
(If I remember right, same issue was seen in some other stable/LTS kernel versions as well.)
Hi, this regression was introduced in 5.4.120, and is fixed in 5.4.121.
Found out my Docker logs are also being spammed. Kernel 5.4.0, Docker 19.03.8:
This must be in 4.19.30 onwards.
Friends, I have been using the 4.19 kernel for stable operation for half a year. I hope that you can enjoy stability as well.
You may upgrade to 4.19. It’s in the backports.
BTW it’s been a year for us here. 😉
Anyone saw this bug with 4.19?
Are these vanilla kernels or heavily patched by distros with backported fixes?
I’m seeing this issue with one of my machines running Docker on Debian 9 Stretch (
4.9.0-8-amd64
). I experience this issue with a tunnel created within the Docker container via Docker Gen, and it generates a kernel panic:
Here's our Docker information:
Does anybody know if there’s a temporary fix to this without restarting the entire machine? We’d really prefer not having to restart the entire machine when we experience this issue.
Somewhat off-topic: we also cannot suppress the kernel panic messages within the terminal. I've tried
dmesg -D
and
dmesg -n 1
. However, no luck. Is there a way to suppress this type of kernel panic message from within the terminal? It's annoying trying to type commands and having that message pop up every 10 seconds or so.
Thanks.
No, it means --network=host
@victorgp We still experience the issue with the 4.15 kernel. We will report here when we have tested with 4.16 kernel (hopefully in a few weeks).
Did someone have this issue in a kernel 4.15 or newer?
This Dan Streetman fix (https://github.com/torvalds/linux/commit/4ee806d51176ba7b8ff1efd81f271d7252e03a1d) is first included in 4.15 kernel version and it seems that at least for someone it is not happening anymore since they upgraded to 4.16 (https://github.com/kubernetes/kubernetes/issues/64743#issuecomment-436839647)
Did someone try it out?
@dElogics Thanks for letting us know. Could you please show us what commands you ran to set these systemd limits to unlimited? I’d like to try that too.
@hzbd You mean delete the user-defined bridge network? Have you tried digging further to find out why? Please let me know if you did; I’d really appreciate it.
The docker daemon. The systemd as in Debian 9 (232-25).
Not sure about RHEL, but I’ve personally seen this issue on RHEL too. I’d set LimitNOFILE=1048576, LimitNPROC=infinity, LimitCORE=infinity, TasksMax=infinity
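For reference, limits like those can be set on the docker unit with a systemd drop-in; a sketch (the drop-in path and file name are illustrative, and you’d run `systemctl daemon-reload && systemctl restart docker` afterwards):

```ini
# /etc/systemd/system/docker.service.d/limits.conf (hypothetical path)
[Service]
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
```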
@dElogics What version of systemd is considered “newer”? Is this default limit enabled in the CentOS 7.5 systemd?
Also, when you ask if we’re running docker under any limits, do you mean the docker daemon, or the individual containers?
We have been puzzled by this issue for weeks. Linux 3.10.0-693.17.1.el7.x86_64, CentOS Linux release 7.4.1708 (Core)
Experienced the same issue on CentOS. My kernel is 3.10.0-693.17.1.el7.x86_64, but I didn’t get a similar stack trace in syslog.
@fho thanks. You actually don’t need docker at all to repro, just running the samba client in a network namespace will do the trick as per https://github.com/moby/moby/issues/5618#issuecomment-318681443
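For anyone wanting to try that repro without Docker, here is a rough sketch of the namespace setup (requires root and a reachable SMB server; the server address `192.0.2.10` is a placeholder, and the exact `smbclient` invocation that triggers the leak may differ from the one in the linked comment):

```shell
#!/bin/sh
# Sketch: run an SMB client inside a network namespace, then delete the
# namespace. On affected kernels, tearing down a namespace that carried
# CIFS/SMB traffic can leave "waiting for lo to become free" behind.
repro() {
  ip netns add smbtest
  ip netns exec smbtest ip link set lo up
  # Placeholder server; real repros used an actual SMB mount/listing.
  ip netns exec smbtest smbclient -L //192.0.2.10 -N || true
  ip netns del smbtest
}

# Guarded so the sketch is safe to source; flip RUN_REPRO to execute.
if [ "${RUN_REPRO:-0}" = "1" ]; then
  repro
else
  echo "set RUN_REPRO=1 and run as root to attempt the repro"
fi
```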
@rn thanks for the investigations!
I’m running on a baremetal desktop PC so I have access to everything. It’s an i7-4790K + 32 GiB. Currently I’m running on an up-to-date Arch Linux + kernel from the testing repo (4.12.3-1-ARCH)
In my case everything behaves as you describe in your Experiment 2 (4.11.12 kernel):
unregister_netdevice: waiting for lo to become free. Usage count = 1
message appears repeatedly if I run any new container within the 4+ minute window after the client-smb container has exited; starting a new container after those 4 minutes behaves normally. So I suppose there’s an issue somewhere in the cleanup process of the smb-client container related to network interfaces.
Hi,
I’ve just created 2 repos https://github.com/piec/docker-samba-loop and https://github.com/piec/docker-nfs-loop that contain the necessary setup in order to reproduce this bug
My results:

- `docker-samba-loop`: reproduced in a few iterations (<10). I can’t reproduce it with `docker-nfs-loop`.
- `docker-samba-loop`: reproduced as well; didn’t try `docker-nfs-loop`.

Hope this helps. Cheers
@drweber you will also find this patch in upcoming stable releases (for now 4.11.12, 4.9.39, 4.4.78, 3.18.62)
I have kernel version 3.10.0-514.21.1.el7.x86_64, and I still have the same symptom.
I can make this happen reliably when running Proxmox and using containers. Specifically, if I have moved a considerable amount of data or moved really any amount of data very recently, shutting down or hard stopping the container will produce this error. I’ve seen it most often when I am using containers that mount my NAS within, but that might be a coincidence.
And from within Proxmox:
It’s worth noting that Docker is not installed on this system and never has been. I’m happy to provide any data the community needs to troubleshoot this issue, just tell me what commands to run.
FWIW kubernetes has correlated this completely to veth “hairpin mode” and has stopped using that feature completely. We have not experienced this problem at all, across tens of thousands of production machines and vastly more test runs, since changing.
Until this is fixed, abandon ship. Find a different solution 😦
Yes upgrading kernel on Proxmox to latest 4.9.9 didn’t resolve the error. Strange as it’s just appeared after a year without issues.
There might be something to an earlier statement in this thread linking it to mounted NFS or CIFS shares.
@jsoler right, I was doing that too, and it still happened. Only once I did it at the PXE level did it stop.
I have/had been dealing with this bug for almost a year now. I use CoreOS with PXE boot, I disabled ipv6 in the pxeboot config and I haven’t seen this issue once since then.
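For non-PXE setups, the equivalent knob would be the kernel command line; a hypothetical `/etc/default/grub` fragment (merge with your existing value, run `update-grub`, and reboot; whether `GRUB_CMDLINE_LINUX` or `GRUB_CMDLINE_LINUX_DEFAULT` is used varies by distro):

```shell
# /etc/default/grub fragment (illustrative only)
GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1"
```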
Hey, for everyone affected by this issue on RHEL or CentOS, I’ve backported the commit from the mainline kernels (torvalds/linux@751eb6b6042a596b0080967c1a529a9fe98dac1d) that fixes the race condition in the IPV6 IFP refcount to 3.10.x kernels used in enterprise distributions. This should fix this issue.
You can find the bug report with working patch here: If you are interested in testing it and have a RHEL 7 or CentOS 7 system, I have already compiled the latest CentOS 7.3 3.10.0-514.6.1.el7.x86_64 kernel with the patch. Reply to the CentOS bugtracker thread and I can send you a link to the build.
Note: there may be another issue causing a refcount leak but this should fix the error message for many of you.
@stefanlasiewski @henryiii @jsoler
I’ll be trying out a build also adding this fix: http://www.spinics.net/lists/netdev/msg351337.html later tonight.
Did anyone reach a fix procedure for Rancher OS?
@cpuguy83 Docker doesn’t hang for me when this error occurs, but the containers get killed, which in my situation breaks my Jenkins/CI jobs.
Getting this issue on CentOS 7:
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Linux foo 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
docker-engine-1.12.5-1.el7.centos.x86_64
This is affecting my CI builds, which run inside Docker containers and appear to die suddenly while this console message appears. Is there a fix or a workaround? Thanks!
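As a stopgap for CI, one can at least detect the symptom and take the node out of rotation; a minimal sketch (the log path will differ per distro, e.g. `/var/log/messages` on CentOS, and the function name here is made up):

```shell
#!/bin/sh
# Sketch: check a kernel log file for the known symptom line.
has_netdev_leak() {
  grep -q "unregister_netdevice: waiting for" "$1" 2>/dev/null
}

# Demo against a synthetic log line rather than a live /var/log path:
printf 'kernel: unregister_netdevice: waiting for lo to become free. Usage count = 3\n' > /tmp/kern.sample
if has_netdev_leak /tmp/kern.sample; then
  echo "symptom present"
fi
```

A CI job could run a check like this against the node’s kernel log before scheduling builds and drain or reboot the node when it reports the symptom.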
1.12.4 and 1.13 should, in theory, not freeze up when this kernel issue is hit. The reason the freeze in the docker daemon occurs is because the daemon is waiting for a netlink message back from the kernel (which will never come) while holding the lock on the container object.
1.12.4 and 1.13 set a timeout on this netlink request to at least release the container lock. This does not fix the issue, but at least (hopefully) does not freeze the whole daemon. You will likely not be able to spin up new containers, and similarly probably will not be able to tear them down since it seems like all interactions with netlink stall once this issue is hit.
@reshen I’m going to build 4.8.8 this weekend on my laptop and see if that fixes it for me!
Ubuntu 16.04 with 4.4.0-47 was still affected… trying 4.4.0-49 now, will report later.

edit 2016-11-28: -49 is still showing that log line in dmesg.
@reshen Excellent research. So far I’ve also not been able to reproduce the problem using Linux 4.8.8 on Xubuntu 16.04.
I’ve been using the Ubuntu mainline kernel builds. I do not have a well defined test case, but I could consistently reproduce the problem before by starting and stopping the set of docker containers I work with.
To test Linux 4.8.8, the easiest route for me was to switch from aufs to overlay2 as the storage driver, since the mainline kernel builds do not include aufs. I don’t think it will influence the test, but it should be noted.
In the past I’ve tested Linux 4.4.4 with 751eb6b6 backported by Dan Streetman; this did not seem to reduce the problem for me. It will be interesting to see whether backporting the two patches you noted (5086cadf and 6fff1319) can give the same result as 4.8.8.
@reshen I’ll get us updated to 4.8.8 and report back 👍 Thanks much for your research!
I’ve been testing 4.8.8 in a tight loop (see [2] from my earlier comment for the test case) non-stop for the last 4 days. So far, so good.
Facts
Suppositions: @meatballhat pointed out that their production servers experienced the problem while running 4.8.7. This leaves us with two possibilities:
Can we get a few folks to try 4.8.8 to see if they are able to reproduce this problem?
CentOS 7, kernel 4.4.30, again~~~~ How can this bug be fixed? I hit it every day.
Same here… Fedora 24; it happens randomly. It can be fine for a week, then I get one every 10 hours:
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
@thaJeztah ah, I see the question you were directing me at now.
So the patch is in the upstream 4.4 stable queue, for 16.04 it’s likely to be included in not the next kernel SRU (which is already in progress) but the one after that, about 5-6 weeks from now. If it is needed in 14.04 too please let me know so that it can be backported.
@rdallman Deactivating ipv6 via grub does not prevent `unregister_netdevice` for me on either Ubuntu 16.04 (kernel 4.4.0-36-generic) or 14.04 (kernel 3.13.0-95-generic), regardless of the `--userland-proxy` setting (either true or false).

@dadux Thank you for your help. On Kubernetes 1.3.4, CoreOS 1068 Stable, Docker 10.3, with Flannel as the networking layer, I have fixed the problem by making the following changes in my CoreOS units:
Added the following to `kubelet.service`:

If your docker daemon uses the `docker0` bridge, setting `--hairpin-mode=promiscuous-bridge` will have no effect, as the kubelet will try to configure the non-existent bridge `cbr0`.

For CoreOS, my workaround to mirror the Kubernetes behaviour while still using Flannel:

- Set the `docker0` interface to promiscuous mode. (Surely there’s a more elegant way to do this?)
- `kubelet --hairpin-mode=none`

You can check whether hairpin is enabled for your interfaces with `brctl showstp docker0` or `for f in /sys/devices/virtual/net/*/brport/hairpin_mode; do cat $f; done`
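One way to make the promiscuous-mode step on `docker0` persistent is a oneshot systemd unit; a sketch (unit name, path, and ordering are assumptions, not from this thread):

```ini
# /etc/systemd/system/docker0-promisc.service (hypothetical unit)
[Unit]
Description=Set docker0 to promiscuous mode (hairpin workaround)
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
ExecStart=/sbin/ip link set docker0 promisc on

[Install]
WantedBy=multi-user.target
```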
Happens on CoreOS alpha 1097.0.0 also.
Kernel: 4.6.3 Docker: 1.11.2
I can confirm that after moving away from hairpin-veth and using the cbr0 bridge, I can no longer reproduce the problem.
@edevil I use terraform to create routes. You can find it at this repo. I created this configuration quickly and tested it only once; I hope it is enough to convey the basic logic behind it.
We have encountered this bug while using Flannel with hairpin-veth enabled on a Kubernetes cluster (using the iptables proxy). The bug only occurred when we started and stopped too many containers. We switched to the cbr0 bridge network with promiscuous-bridge hairpin mode and have never seen it again. Actually, it is easy to reproduce this bug if you are using hairpin-veth: just start this job with 100 containers on Kubernetes.
This problem does not correlate with the filesystem in any way. I have seen it with zfs, overlayfs, devicemapper, btrfs, and aufs, and also with or without swap. It is not even limited to Docker; I hit the same bug with LXC too. The only workaround I currently see is not to stop containers concurrently.
@sirlatrom not fixed. Seeing this again 😭 Required multiple reboots to resolve.
Currently running 3.19.0-18-generic. Will try upgrading to latest
I encountered this again on CoreOS 1010.1.0 with kernel 4.5.0 yesterday, it had been after several containers were started and killed in rapid succession.
@joshrendek it’s a kernel bug. Looks like even newly released kernel 4.4 does not fix it, so there is at least one more race condition somewhere 😃