weave: Memory leak/OOM with "Received update for IP range I own" messages in log
What you expected to happen?
Memory usage of the weave process is expected to be stable and not grow unbounded over time.
What happened?
I had a stable 2.5.0 weave network in my Kubernetes 1.9 cluster of about 100 nodes. Weave was initially installed by kops and had a memory limit of 200MB set. There were no occurrences of “Received update for IP range I own” in the log files, and memory usage for the weave pods in the cluster had been very stable over time for weeks.
As part of refactoring some services, about 30 nodes were removed from the cluster (bringing the cluster size down to 71 nodes). After this, the memory usage of the weave pods started growing until it exceeded the memory limit, at which point the pod was OOM killed and restarted. These restarts cause brief disruption for the node on which the restart occurs. At this time the “Received update for IP range I own” message started appearing in the logs (although not from all pods; this nuance was not discovered until later).
After looking at some related tickets and such here (#3650, #3600, #2797), the following actions were taken:
- The “status ipam” output was checked and seen to have a lot of “unreachable” peers listed in it
- The unreachable nodes listed by “status ipam” were removed with rmpeer on one node. This did not fix all the unreachables on all the nodes, so the process of listing and removing unreachables was repeated on a couple of other systems before all systems showed all 71 nodes in the list, all reachable (a sketch of these commands follows this list).
- Updated to 2.5.2, as there were some related-looking tickets mentioned in that release
- Increased the memory limit (from 200MB to 1GB) so that OOM kills might happen less frequently
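For reference, the listing/removal above was done with commands along these lines (a sketch, not the exact invocations; it assumes the stock weave-net DaemonSet in kube-system, whose pods are labelled name=weave-net and whose weave container ships the /home/weave/weave helper script):

```sh
# Pick one weave pod to run the commands through (any node will do for listing).
POD=$(kubectl get pods -n kube-system -l name=weave-net -o name | head -n 1)
POD=${POD#pod/}

# Show the IPAM ring as this peer sees it; unreachable peers are listed here.
kubectl exec -n kube-system "$POD" -c weave -- /home/weave/weave --local status ipam

# Reclaim the ranges owned by a peer that is gone for good.
# <peer-name> is a placeholder for one of the unreachable names from the output above.
kubectl exec -n kube-system "$POD" -c weave -- /home/weave/weave --local rmpeer <peer-name>
```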
Weave pods continue to grow in memory usage; the new 2.5.2 pods have not hit their 1GB limit yet, but look to be heading that way. The “update for IP range I own” messages are still being seen; on closer inspection, however, these messages are coming from only 3 of the 71 pods.
How to reproduce it?
Take a working Kubernetes cluster and delete some nodes from it.
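For example, something like the following per node (a sketch only; the node name is a placeholder, and the underlying instance should also be terminated so the peer never comes back):

```sh
# Evict workloads, remove the node object, then terminate the instance itself.
kubectl drain <node-name> --ignore-daemonsets --delete-local-data
kubectl delete node <node-name>
```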
Anything else we need to know?
Versions:
Version: 2.5.2 (up to date; next check at 2019/07/12 18:43:12)
Service: router
Protocol: weave 1..2
Name: ea:38:6f:58:7b:81(ip-10-32-124-236.us-west-2.compute.internal)
Encryption: disabled
PeerDiscovery: enabled
Targets: 71
Connections: 71 (70 established, 1 failed)
Peers: 71 (with 4966 established, 4 pending connections)
TrustedSubnets: none
Service: ipam
Status: ready
Range: 100.96.0.0/11
DefaultSubnet: 100.96.0.0/11
admin@ip-10-32-92-49:~$ docker version
Client:
Version: 17.03.2-ce
API version: 1.27
Go version: go1.7.5
Git commit: f5ec1e2
Built: Tue Jun 27 02:09:56 2017
OS/Arch: linux/amd64
Server:
Version: 17.03.2-ce
API version: 1.27 (minimum version 1.12)
Go version: go1.7.5
Git commit: f5ec1e2
Built: Tue Jun 27 02:09:56 2017
OS/Arch: linux/amd64
Experimental: false
Linux ip-10-32-92-49 4.4.121-k8s #1 SMP Sun Mar 11 19:39:47 UTC 2018 x86_64 GNU/Linux
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.8", GitCommit:"c138b85178156011dc934c2c9f4837476876fb07", GitTreeState:"clean", BuildDate:"2018-05-21T18:53:18Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Logs:
Attachments from one of the weave pods showing the “Received update for IP range I own” messages:
- Logs: weave-net-q56hl.log
- pprof/heap output: weave-net-q56hl.heap.gz
- status ipam output: weave-net-q56hl.ipam.txt
- status peers output: weave-net-q56hl.peers.txt
Attachments from one of the weave pods not showing that message:
- Logs: weave-net-9t7d8.log
- pprof/heap output: weave-net-9t7d8.heap.gz
And here’s a picture showing the history of memory usage from these pods.
About this issue
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 16 (4 by maintainers)
OK, so since the weave nodes became consistent, things have been stable with no memory growth or OOM issues. I have done a minor amount of scaling (perhaps up/down by 10 nodes or so) and things have remained consistent throughout with 2.5.2.
I wrote a quick script verify-weave.sh which will go through all weave pods in the cluster, compute a checksum of the ‘status ipam’ output, and tell you whether any weave pods disagree on their peer list. We used this to identify the different groups of pods in the cluster and decide which ones we wanted to preserve and which ones we wanted to reset/restart. Another quick script bump-weave.sh was then used to remove the db file and restart the weave pods we wished to reset.
Both of those scripts were written quickly to address a particular condition here and so are not intended as portable or good examples of coding, but may be useful to someone so here they are.
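In outline, the verify check looks something like this (a rough sketch of the idea rather than the actual verify-weave.sh; the label selector, container name, and checksum command here are illustrative):

```sh
#!/bin/sh
# Print a checksum of each weave pod's 'status ipam' peer list so that pods
# holding a divergent view of the ring stand out at a glance.
for pod in $(kubectl get pods -n kube-system -l name=weave-net -o jsonpath='{.items[*].metadata.name}'); do
  sum=$(kubectl exec -n kube-system "$pod" -c weave -- /home/weave/weave --local status ipam \
          | awk '{print $1}' | sort | sha1sum | awk '{print $1}')
  printf '%s %s\n' "$pod" "$sum"
done
# Any pod that prints a different checksum disagrees about the peer list.
```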
@itskingori, @murali-reddy - thank you both for your help on this.
@itskingori - I will keep an eye on it for sure. We’re not doing a huge amount of scaling at the moment; however, we did just do some refactoring of instance types, hence a fairly large number of hosts were created and deleted, which is what set the current condition into motion. There are some more similar adjustments still outstanding, so I will check and let you know how things look as that happens.
Also, thanks very much for your feedback and detail - sounds like your process to recover normal operation was very similar to mine. The only difference seems to be that you rebooted/terminated the nodes whereas I just recycled the weave pod on the node and left the other pods and the node itself alone. We’ll be doing this same process in our prd cluster today so if I cook up a noteworthy script I will share it here. Cheers -
Yes, there are no ramifications. For now, these manual steps will reconcile the state where there were IPAM conflicts. In the 2.6 release, IPAM conflicts are resolved automatically (https://github.com/weaveworks/weave/pull/3637).
Yes, this is more or less the position we were in. I’ll borrow your commands, as they seem similar to (if not better than) what I have for looking through weave 👇
I use something like
./script.sh <cluster> "<status ipam>"
to loop through all the weave pods. Weave works by sharing state by consensus; the fact that there are two states breaks weave, and you need to bite the bullet and get rid of a group, i.e. whichever weave pods have the inconsistent state. There are two ways to do this:
I ran with no. 2 because this was production, so recovery was critical. I didn’t have time to write a script for no. 2. I pretty much figured out which group to terminate and terminated them all at the same time. I’m guessing it’s better to terminate weave while deleting the database on the host, so that the new pod starts from a clean slate and gets its state from the other, correct weave pods.
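A minimal sketch of that clean-slate reset for a single pod, assuming the standard weave-net manifest (where the weave container mounts the host's /var/lib/weave at /weavedb and the persisted state lives in weave-netdata.db; the pod name is a placeholder):

```sh
# Pod whose state should be thrown away (placeholder name).
POD=weave-net-xxxxx

# Remove the persisted IPAM/peer state, then delete the pod; the DaemonSet
# recreates it and the fresh pod learns the ring from the remaining healthy peers.
kubectl exec -n kube-system "$POD" -c weave -- rm -f /weavedb/weave-netdata.db
kubectl delete pod -n kube-system "$POD"
```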
I didn’t do anything to the remaining group as their state was similar and correct. Once you get rid of the bad weave pods, everything goes back to normal and the state is now consistent among those remaining … the problem goes away, and the cluster heals.
I didn’t do anything to the ones remaining. The fact that they had similar state was all I needed.
I terminated them all at the same time because I could not risk them sharing state any longer. I figured doing it one by one might not work, because a new pod might get its state from another bad one. I wanted all the ‘bad’ ones gone and only the ‘good’ ones left to share state.
It didn’t cause more breakage for me, other than the time spent waiting for the autoscaling group to replace the nodes I had just terminated, and the disruption to the apps on those nodes as new pods came up.
Here are the log files from the other two weave pods producing the “Received update for IP range I own” log messages - only 3 of the 71 pods are producing this message (the other is in the original comment)
weave-net-67hrw.log weave-net-cqfc9.log