compiler-explorer: GHC 8.6.x fails with lack of memory in prod

ghc: failed to create OS thread: Cannot allocate memory

Probably due to the 1.25GB virtual RAM limit we now enforce for compilations. Older GHCs seem ok.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 18 (7 by maintainers)

Most upvoted comments

So after looking at the bug and talking with some people on IRC, it doesn’t seem like there is a good fix on the GHC side. What is going on: GHC reserves a large amount of virtual memory for its runtime system (this is reserved address space, not committed memory). GHC does try to respect the ulimit, and I did try to leave some extra room for pthread stack space, but that doesn’t seem to work reliably. Sometimes one value crashes it, but a lower value won’t. Figuring out how to make this work seems more difficult than anticipated.
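To make the reserved vs. committed distinction concrete, here is a minimal Python sketch (Linux only). It is not how GHC's runtime reserves memory; it just maps an anonymous region without touching it and compares the growth of VmSize (virtual size) and VmRSS (resident size) from /proc/self/status. The function names are made up for illustration.

```python
import mmap
import re


def vm_stats_kb(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    stats = {}
    for key in ("VmSize", "VmRSS"):
        match = re.search(rf"^{key}:\s+(\d+)\s+kB", status_text, re.MULTILINE)
        if match:
            stats[key] = int(match.group(1))
    return stats


def current_vm_stats_kb():
    """Read the stats for the current process (Linux only)."""
    with open("/proc/self/status") as f:
        return vm_stats_kb(f.read())


def reserve_and_measure(size_bytes):
    """Map size_bytes of anonymous memory, never touch it, and report growth in kB."""
    before = current_vm_stats_kb()
    region = mmap.mmap(-1, size_bytes, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after = current_vm_stats_kb()
    region.close()
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    # VmSize should grow by roughly the mapped size; VmRSS should barely move,
    # because no page of the region was ever written.
    print(reserve_and_measure(256 * 1024 * 1024))
```

A limit on virtual address space counts the whole mapping, even though almost none of it is resident.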

Apparently other VMs/runtime systems do something similar. Running javac -version with a ulimit of 1GB gives me an error as well:

Error occurred during initialization of VM
Could not reserve enough space for 524288KB object heap

(I also could not get it to work with a vmem limit of 1GB using -Xmx and -XX:MaxMetaspaceSize; it would crash at different points.)
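For anyone who wants to reproduce this kind of experiment without a shell ulimit, here is a rough Python sketch that runs a command with RLIMIT_AS capped via setrlimit. The helper name run_with_vmem_limit is hypothetical, and the probe command just prints the limit the child sees; swapping in something like ["javac", "-version"] or a GHC invocation is up to you.

```python
import resource
import subprocess
import sys


def run_with_vmem_limit(cmd, limit_bytes):
    """Run cmd with its virtual address space (RLIMIT_AS) capped at limit_bytes."""

    def apply_limit():
        # Runs in the child between fork and exec, so only the child is limited.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limit, capture_output=True, text=True)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # Probe command: a child Python that reports the soft limit it was given.
    probe = [
        sys.executable,
        "-c",
        "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])",
    ]
    result = run_with_vmem_limit(probe, one_gib)
    print(result.returncode, result.stdout.strip(), result.stderr.strip())
```

Note that setrlimit will refuse a value above the hard limit already in force for the calling process, so this only works for lowering the limit.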

This probably hasn’t been an issue so far because GCC, Clang, rustc, and CPython don’t reserve a large virtual address range up front for a garbage-collected heap or runtime system. But limiting virtual memory doesn’t seem like a good solution anyway when what you actually want to limit is resident set size (RSS). Unfortunately, RLIMIT_RSS no longer has any effect on Linux kernels from 2.4.30 onward.

I’m not very familiar with Compiler Explorer’s architecture, but I think you use firejail as the sandboxing mechanism here, with its rlimit-as option, which uses setrlimit. I haven’t seen any other way in firejail to limit RSS, but there is a cgroup option, and cgroups are now the preferred mechanism for resource limits. You could create a cgroup with the given memory limit and add new tasks to it. I’ve tested that with GHC and it works.

Now I don’t know if that solution is workable for you, since it would require a setup step where this cgroup would have to be created and that requires root privileges.
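As a rough idea of what that setup step could look like, here is a Python sketch. It assumes the cgroup v2 layout (memory.max and cgroup.procs under /sys/fs/cgroup, with the memory controller enabled for the parent group); on cgroup v1 the corresponding files are memory.limit_in_bytes and tasks. The function names and the group name are hypothetical, and this is not necessarily the setup I tested with.

```python
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")  # assumed cgroup v2 mount point


def create_memory_cgroup(name, limit_bytes, root=CGROUP_ROOT):
    """Create a cgroup directory and cap its memory (needs root on a real system)."""
    group = Path(root) / name
    group.mkdir(parents=True, exist_ok=True)
    # On cgroup v2 the kernel reads the byte limit from memory.max.
    (group / "memory.max").write_text(f"{limit_bytes}\n")
    return group


def add_process(group, pid):
    """Move an existing process into the cgroup by writing its PID."""
    (Path(group) / "cgroup.procs").write_text(f"{pid}\n")


if __name__ == "__main__":
    import os

    # Hypothetical group name; 1.25 GiB mirrors the limit mentioned in this issue.
    group = create_memory_cgroup("ce-compile", int(1.25 * 1024 ** 3))
    add_process(group, os.getpid())
```

The root parameter is only there so the functions can be pointed at a scratch directory for a dry run; creating the group under the real hierarchy still needs root, while moving compiler processes into it could be delegated afterwards.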

An alternative would be to add support for RLIMIT_DATA to firejail. This limits the size of the process’s data segment, so the total process size would be roughly data + code, but I’m not sure what else you would have to add to the calculation (stack sizes?).

@siedentop good idea, will do