packwerk: [Bug Report] Max RSS is Significantly Higher in Version 2.0
Description
Memory usage in packwerk v2.0 is significantly higher than it was in v1.4, high enough that we will need to increase the resource class of our CI nodes in order to run packwerk v2.0. The max RSS for the whole process tree has increased from ~850 MB to about ~2.5 GB for our use case.
I ran GNU `time -v` on `bin/packwerk check` in our application working directory using both versions of packwerk and saw the following results:
(Table comparing GNU time output for Version 1.4 and Version 2.0.)
To Reproduce
Run packwerk while watching memory usage and compare between v1.4 and v2.0.
Expected Behaviour
I expected memory usage not to increase so dramatically between v1.4 and v2.0.
Screenshots
The GNU time output only captures the max RSS of the root process, so I grabbed a couple of screenshots from Activity Monitor to demonstrate that the memory usage is much higher in each of the child processes as well.
Packwerk v1.4 Memory Usage
Packwerk v2.0 Memory Usage
Version Information
- Packwerk: v2.0
- Ruby: v3.0.2
Additional Context
I haven’t looked into the changes yet, but I assume this is because packwerk now loads our rails application in each of its processes? I still wouldn’t expect each child process’s max RSS to balloon up so much.
Do you all have any idea why this might be? And secondly, any idea if it would be possible and reasonable to bring this number back down?
Thanks!
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 16 (15 by maintainers)
Rails is very good with copy-on-write; that is how Puma and Unicorn work in production without each process holding the entire Rails application in resident memory. So if we boot the application in the master process before we fork the other processes, the resident memory of each of those processes would only be the newly allocated or modified memory.
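For illustration, a minimal sketch of that boot-then-fork pattern (the file list and `check_file` helper are placeholders, not Packwerk's actual code):

```ruby
# Boot the Rails app once in the parent so forked children share its memory
# copy-on-write; each child only pays for pages it newly allocates or modifies.
require_relative "config/environment"

files = Dir.glob("app/**/*.rb")
slices = files.each_slice([files.size / 4, 1].max).to_a

pids = slices.map do |slice|
  Process.fork do
    # Rails constants are already loaded here; nothing is re-booted per child.
    slice.each { |path| check_file(path) } # check_file is a hypothetical per-file check
  end
end

pids.each { |pid| Process.wait(pid) }
```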
I haven’t had a chance to look at the proposals yet, but I can say that we were able to work around the issue by reducing parallelism for now, though I would still like to see this improved.
I think booting rails in a side process and sending the load paths back to the main process before the fork sounds like a good implementation to me, but I need to look into it more closely. From a high level, though, I would support pursuing a more optimal solution over thrashing through a short-term fix.
Hey all – Thanks for this report and for the great discussion around it. I’ve been looking into this issue and seeing what it would look like to implement @ngan’s suggestion of extracting what we need from Rails in a separate isolated process. The results are here: https://github.com/Shopify/packwerk/pull/166
I’m pairing with @ngan tomorrow to chat through this and get his feedback, but I’d love your feedback as well.
I expect that if you pull this version and use it in your CI environment, it should resolve the memory issue. I also expect that `bin/packwerk check` and `bin/packwerk update-deprecations` should have the same behavior as the version on main. Here are the results before and after this change on our codebase.
While it does resolve the memory issue, it does slow down `bin/packwerk check` on a single file quite a bit, presumably because we are no longer using Spring to load the bundle in the main process.

Memory Usage Results
Before

`ps aux | grep packwerk`

After

`ps aux | grep packwerk`
Speed Results
Before
After
Thanks for making that distinction clearer, @exterm.
I think that makes sense. To reiterate what we chatted about over video call, I think that moving to a faster parser should give us a lot of wiggle room to move to a thread-based approach to address this memory issue more simply.
Thanks for the comments @exterm and @jordanstephens.
For now, based on the feedback I’ve received which all makes sense, I think I’ll close the proposal PR in favor of a better longer term solution.
Once everyone is back from the holidays, I’d love to (A) align on which tradeoffs feel like the right ones (complexity/maintainability, memory, performance, etc.) and (B) align on an implementation for delivering on those tradeoffs.
Hm, that’s a good question… when I ran it with threads without eager loading, I got some errors. One of them, iirc, was “SyntaxError: dynamic constant assignment error” in one of the files. There were many other errors as well. But looking through packwerk, I can’t see anything that actually references the app’s constants causing it to autoload.
@mclark autoload paths are derived during framework boot. A lot of other things can mutate the paths on boot as well, including gems and the app itself. It may not seem like it, but it’s actually very predictable for a Rails app to have its paths ready to go; it’s actually a public API. I really like booting up the app to find these things out, as it allows apps to do more custom things with autoload paths. And as long as Packwerk reads them from the public API, we should be good.
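To make that concrete, here is a hedged sketch of reading the derived paths through Rails' public configuration accessors once the app has booted (illustrative only, not necessarily the exact accessors Packwerk uses):

```ruby
# Assumes the app is already booted, e.g. via `require_relative "config/environment"`.
# `config.autoload_paths` and `config.eager_load_paths` are public Rails configuration.
load_paths = (
  Rails.application.config.autoload_paths +
  Rails.application.config.eager_load_paths
).uniq.map(&:to_s)

# Express them relative to the app root, the way a checked-in config file would list them.
load_paths.map { |path| Pathname.new(path).relative_path_from(Rails.root).to_s }
```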
I want to solve this problem as well because, right now, it’s forking 10x with 10x the memory usage:
Hmm this is an interesting problem. Great analysis @jordanstephens and @ngan!
I like the automated determination of load paths for Packwerk, but I don’t like that we need to load the entire Rails app just to get them. It’s like having a dependency on an abstract library: very unpredictable. I don’t know much about Rails internals, but I wonder if there is anything we can do to pull the load path logic into a (Ruby) file with minimal dependencies that Packwerk could run. This is obviously a longer-term solution, but it feels like it could be a good architectural change for Rails as well. The same might work for inflections.
In the near term, I wonder if it is worth introducing an interface inside Packwerk to provide the load paths and inflections, and have the default implementation just read from the Rails app as we do now. An alternative implementation might read from the config file as we did before. A future implementation might take advantage of future interfaces in Rails that allow us to get this information without loading the entire app.
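A hedged sketch of what that interface could look like (all names here are hypothetical, not Packwerk's API):

```ruby
module Packwerk
  # Hypothetical seam: anything that can supply load paths and inflections.
  class ApplicationContext
    def load_paths
      raise NotImplementedError
    end

    def inflections
      raise NotImplementedError
    end
  end

  # Default implementation: derive the values from a booted Rails application.
  class RailsContext < ApplicationContext
    def load_paths
      Rails.application.config.autoload_paths.map(&:to_s)
    end

    def inflections
      ActiveSupport::Inflector.inflections
    end
  end

  # Alternative implementation: read the values from a static config hash,
  # the way packwerk v1 read them from a checked-in file.
  class StaticContext < ApplicationContext
    def initialize(config)
      @config = config
    end

    def load_paths
      @config.fetch("load_paths", [])
    end

    def inflections
      @config.fetch("inflections", {})
    end
  end
end
```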
Sorry for the “Why not both?” proposal 😆
I’m guessing this makes sense and is sort of expected.
https://github.com/Shopify/packwerk/blob/4da4215cb9d15aeb8020a6324edc014c6094c50a/lib/packwerk/parse_run.rb#L74
`Parallel` will default to forking by the CPU count. And since we’re in the context of the Rails application, it’ll multiply the memory footprint dramatically (depending on the number of CPUs you have).

Off the top of my head, I can see two ways of resolving this issue: (1) switch to threads, or (2) load the Rails environment only to extract the information we need. I’ll outline the choices below:
Switch to threads
Switching to threads is a little tricky because we’ll have to deal with threading issues. It’s not too bad since we’re not actually serving requests or doing anything complicated (other than loading the app); the only threading issue we’ll need to deal with is autoloading. We’re at the mercy of the application as to what it does in the `test` environment (whether or not it eager loads). A solution would be to eager load the app (`Rails.application.eager_load!`) right before we do the `Parallel.flat_map` call. I tried this and it works. However, the performance is not as good as forking. For some data: our monolith (37,946 files scanned) does the check in 37.57 seconds using forks, and 222.46 seconds using threads (6x slower!). It solves the memory issue, and we might be able to investigate the performance difference separately.
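For concreteness, a rough sketch of that variant, assuming the `parallel` gem's `in_threads:` option (the `ruby_files` list and per-file `run_checks` helper are placeholders, not the shipped code):

```ruby
require "etc"
require "parallel"

# Eager load up front so no thread triggers autoloading mid-scan.
Rails.application.eager_load!

offenses = Parallel.flat_map(ruby_files, in_threads: Etc.nprocessors) do |file|
  run_checks(file) # hypothetical per-file check returning an array of offenses
end
```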
Load Rails and extract what we need

We could boot the Rails app and dump (or however we want to communicate between the Rails app and the check process) what we need, e.g. load paths, inflections, etc. This allows us to stay with forks, still utilize Spring to boot the application, and keep the packwerk scanner process memory low.
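A rough sketch of that shape, assuming the dump happens as a JSON line over stdout (the script body and keys are illustrative, not what the eventual PR does):

```ruby
require "json"
require "open3"

# Run from the Rails app root: boot the app in a throwaway child process and
# print the pieces Packwerk needs as JSON on stdout.
loader_script = <<~RUBY
  require "json"
  require File.expand_path("config/environment")
  puts JSON.dump(
    "load_paths" => Rails.application.config.autoload_paths.map(&:to_s),
    "acronyms"   => ActiveSupport::Inflector.inflections.acronyms
  )
RUBY

stdout, status = Open3.capture2("ruby", stdin_data: loader_script)
raise "could not load the Rails environment" unless status.success?

context = JSON.parse(stdout.lines.last)
context.fetch("load_paths") # hand these to the (fork-based) scanner processes
```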