zellij: Memory leak

Thank you for taking the time to file this issue! Please follow the instructions and fill in the missing parts below the instructions, if it is meaningful. Try to be brief and concise.

In Case of Graphical or Performance Issues

  1. Delete the contents of /tmp/zellij-1000/zellij-log, i.e. with cd /tmp/zellij-1000/ and rm -fr zellij-log/
  2. Run zellij --debug
  3. Recreate your issue.
  4. Quit Zellij immediately with ctrl-q (your bug should ideally still be visible on screen)

Please attach the files that were created in /tmp/zellij-1000/zellij-log/ to the extent you are comfortable with.

Basic information

  • zellij --version: zellij 0.30.0
  • stty size: 46 197
  • uname -av or ver(Windows): Linux ip-### Wed Dec 16 22:44:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

List of programs you interact with as, PROGRAM --version: output cropped meaningful, for example: nvim --version: NVIM v0.5.0-dev+1299-g1c2e504d5 (used the appimage release) alacritty --version: alacritty 0.7.2 (5ac8060b)

Further information

Reproduction steps, noticeable behavior, related issues, etc.

I have a long-running zellij instance to which I connect and disconnect often (more than once per day). After weeks of using the same session, I notice that more than 2GB of RAM is being used by the zellij server. See the attached screenshot.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 37 (19 by maintainers)

Most upvoted comments

When I have btop running in its own tab, zellij will gobble up 3–5 GB RAM in a matter of a couple of days.

  • zellij: 0.36.0
  • btop: 1.2.13
  • O/S: Linux 6.3 (x86_64)

Thank you for reporting this!

Sorry we took so long to reply. I tried to reproduce this with the latest zellij version locally and I can confirm this behavior. That clearly shouldn’t happen!

I have a suspicion where this may originate. I’ll investigate and get back to you once I found something.

So, it’s been a while now since I looked at this, but I ran an experiment and gathered some data on this (thanks for being patient @imsnif).

At least for me the results were quite eye-opening, and challenged what I thought I knew about memory (probably not a lot to start with 😄).

I started a couple of different sessions on a VPS, and left them running for some days, all using this layout:

layout {
    tab_template name="default-tab" {
        pane size=1 borderless=true {
            plugin location="zellij:compact-bar"
        }
        children
    }
    default-tab name="Btop" {
        pane command="btop"
    }
    default-tab name="Empty" {
        pane
    }
}

A couple of times I logged back in, reattached to each, and then left them detached again.

This chart shows the memory usage over time (logged using @kseistrup’s script) of these sessions, where:

  • polite-mouse: release binary for v0.38.1 from GitHub, using the musl allocator
  • awesome-tomato: main branch (697723ddd30715e2997ad12352ea3c89ebdb7e17) compiled with jemalloc as allocator
  • verdant-lake: main branch (697723ddd30715e2997ad12352ea3c89ebdb7e17) compiled with mimalloc as allocator

[Image: Memory Usage Chart]

Note: the peak for polite-mouse right at the end was me reattaching.
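For anyone who wants to collect comparable data, here is a minimal sketch of a memory logger. It is not @kseistrup’s script (which isn’t reproduced in this thread); it assumes a Linux host where /proc/<pid>/status exposes a VmRSS line, and that you already know the PID of the zellij server process. The function names are made up for illustration.

```python
import re
import time
from pathlib import Path

# Matches e.g. "VmRSS:\t   20480 kB" in /proc/<pid>/status (Linux only).
_VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB$", re.MULTILINE)


def parse_vmrss_kib(status_text: str) -> int:
    """Extract the resident set size in KiB from /proc/<pid>/status text."""
    match = _VMRSS.search(status_text)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kib(pid: int) -> int:
    """Read the current resident set size of a process, in KiB."""
    return parse_vmrss_kib(Path(f"/proc/{pid}/status").read_text())


def log_rss(pid: int, samples: int, interval_s: float, out) -> None:
    """Write `samples` lines of "<unix_seconds>\t<rss_kib>" to `out`."""
    for _ in range(samples):
        out.write(f"{int(time.time())}\t{read_rss_kib(pid)}\n")
        out.flush()  # keep the log usable even if the logger is killed
        time.sleep(interval_s)
```

For example, `log_rss(pid, samples=24 * 60, interval_s=60.0, out=open("zellij-rss.tsv", "a"))` would record one sample per minute for a day. RSS is what ps reports too, so it includes memory the allocator is holding on to but the program is no longer using.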

What I found surprising in all cases is that the memory usage takes quite a long time to go down from the initial peak, but does not seem to grow in a significant way afterwards (although the jemalloc build was trending upwards, unlike the other two). So it really looks like this isn’t a memory leak with the zellij + btop combo but rather an allocator-related issue (unless my scenario is missing some important interaction).

Another thing to note is that, depending on how a user builds zellij, they could end up with a program that uses one of several allocators, each with different algorithms and behaviors:

  • musl if using the official Linux binaries from GitHub
  • glibc (probably?) if building from source on Linux (e.g. via cargo install)
  • the macOS allocator on Mac

I’m hoping that someone more knowledgeable about allocators can provide some insight into this, but for now my take is that one can’t look at ps output on its own and conclude that there is or isn’t a memory leak. There could be other factors at play, like heap fragmentation and other allocator internals.
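One rough way to look past a single ps reading is to fit a trend to many samples and ignore the initial warm-up period. The sketch below is only a heuristic under stated assumptions: the input is a list of (unix_seconds, rss_kib) pairs gathered by some external logger, and `growth_per_hour` and `warmup_s` are hypothetical names, not anything from zellij. A flat slope does not prove there is no leak, and a positive slope can still be fragmentation rather than leaked memory.

```python
def growth_per_hour(samples, warmup_s=0.0):
    """Least-squares slope of RSS over time, in KiB per hour.

    `samples` is an iterable of (unix_seconds, rss_kib) pairs. Samples taken
    less than `warmup_s` seconds after the first one are ignored, so a slow
    decay from the initial peak does not dominate the fit.
    """
    points = sorted(samples)
    if not points:
        raise ValueError("no samples")
    start = points[0][0] + warmup_s
    points = [(t, rss) for t, rss in points if t >= start]
    if len(points) < 2:
        raise ValueError("need at least two samples after the warm-up")

    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_rss = sum(rss for _, rss in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (rss - mean_rss) for t, rss in points)

    # Slope is in KiB per second; convert to KiB per hour.
    return cov / var_t * 3600.0
```

Comparing the slope with and without a warm-up window gives a feel for how much of the apparent trend is just the allocator slowly returning memory after the initial peak.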

There have also been some substantial changes to the code since I ran this, so it might be worth giving another experiment like this a try.

Thanks @kseistrup - @tlinford showed me graphs that show the memory being released after a few days. I know he’s very busy, so I’m going to give him some time to write this up here.

@imsnif, @tgross35

I still see a minor leak when I run btop in a zellij tab. It’s in the ballpark of 100-200 bytes per hour, so it’s nothing near what I have seen or reported earlier, and I frankly didn’t have the time or energy to re-report it, so I had decided to just live with it and restart zellij every now and then. But now that someone is commenting on it, so will I. I haven’t saved any data, though, so this comment is all I can offer for now.

Computer programs are never creative, so there must be something that triggers the leak.

This is what made me suspect the allocator, which in Rust’s case doesn’t immediately free memory when the relevant data is dropped, and so can be particular to the whole state of its memory space (the screen thread in this case). But it’s really just guessing. We’re chipping away at this (I think @tlinford found a really interesting leak in the output buffer) and I trust him to find whatever can be found here.

We might eventually “explain” this away, but I think it’s a good idea to at least try to track down every stray byte we cannot explain. Thank you very much for helping us out @kseistrup!

I’m not sure how a btop tab can be an issue here and not there. Does btop show colours when you run it? I always imagined that zellij blew up because of the gazillions of ANSI colour codes that btop is spewing out. But perhaps I lack imagination…

Honestly I’m also baffled. I’ve been around this area with a very fine comb and the only thing I found to help is the linked fix in the recent version. We keep the colors in a stack allocated structure that’s essentially an elaborate “terminal character configuration”. It’s the same size for each character whether you have styles or not (and it’s really not that big).

I thought this was somehow us not dropping (eventually deallocating) the terminal lines, but that also wasn’t it. The particularities of this issue across machines and setups are the reason I currently suspect the allocator. Let’s see.

Alright - so I issued a fix for this in #2675 - it will be released in the next version. Thank you very much @kseistrup for the reproduction.

This issue has become a bit of a grab bag for memory issues, some of them actual issues, others symptoms of other issues (e.g. things we need to give an error about instead of crashing). So I’m going to close it now after this fix. If anyone is still experiencing memory issues (starting from the next release), please open a separate issue and be sure to provide a detailed and minimal reproduction. Thanks everyone!

Hey, just to confirm: I’m reproducing this with the btop tab. Thanks for the detailed reproduction. I’ll poke around and see what’s up hopefully some time this month.

this layout causes zellij to rapidly consume all memory on the system

moved to https://github.com/zellij-org/zellij/issues/2407

This issue is more about memory usage in sessions that are long running and/or had a lot of tabs / panes / scrollback lines.

layout {
    pane size=1 borderless=true {
        plugin location="zellij:tab-bar"
    }
    
    pane split_direction="vertical" {
        pane split_direction="vertical" size="80%"
        pane split_direction="vertical"
    }
    pane split_direction="horizontal" {
        pane split_direction="horizontal" size="80%"
        pane split_direction="vertical"
    }
    
    pane size=2 borderless=true {
        plugin location="zellij:status-bar"
    }
}

this layout causes zellij to rapidly consume all memory on the system, a recoverable condition only because I have 64 gigs of RAM and it takes it a bit to fill that much up. It gets to 20 gigs in seconds.

It is unresponsive to SIGINT and SIGQUIT and has to be killed manually, and quickly.

$ stty size
45 167

Unfortunately I don’t have the logs from the session in the attached screenshot. If this blocks investigating the issue, then next time I will copy the logs first and attach them along with another screenshot.