puma: Second Request Very Slow in Clustered Mode on 5.0.3, 5.0.4 and master.

Describe the bug
I don’t know all the details yet, so I’ll explain it as best I can. I noticed that the second (and subsequent) requests after launching the Puma server were very slow compared to the first, e.g. 30 seconds instead of 0.3 seconds. I also noticed that it takes much longer to quit/exit the server: around 15 seconds instead of 0.5.

This was appearing with any kind of cluster, even a cluster of 1. Commenting out the workers directive entirely fixed the issue.

I recently upgraded from 4.2.1 to 5.0.4, so I rolled back and upgraded one version at a time to find where the issue was introduced.

I was able to run things just fine on 5.0.2, but on 5.0.3 and then on 5.0.4, the issue starts appearing.

Puma config:

bind "tcp://0.0.0.0:3000" # Bind to any IP address so we can connect over wifi from CNTRAL Mobile devices to the development machine's local IP address.
workers 1

To Reproduce
I tried reproducing this with the hello.ru app and my puma.rb file, but I couldn’t reproduce it there; the page requests were always fast. So it may be application- or Rails-specific, but I can definitely reproduce it, and the issue first appears in 5.0.3.
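For context, the hello.ru used in the reproduction attempt is presumably a minimal Rack app along these lines (a sketch; the exact file contents aren’t shown in the thread):

```ruby
# hello.ru -- minimal Rack app (sketch; the actual file used may differ)
run ->(env) { [200, { "Content-Type" => "text/plain" }, ["Hello World\n"]] }
```

Launched with the config above via something like `puma -C puma.rb hello.ru`, then timed with repeated requests.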

5.0.2 is completely fine.

Expected behavior
Quick requests, as on 5.0.2.

Desktop (please complete the following information):

  • OS: Mac
  • Puma version: 5.0.3
  • Rails: 5.2.4.4
  • Ruby: 2.6.5

My apologies for the lack of information and reproducibility but I’ve narrowed it down the best I could to a single patch release, so hopefully we can dig deeper from here.

About this issue

  • Original URL
  • State: open
  • Created 4 years ago
  • Reactions: 3
  • Comments: 46 (24 by maintainers)

Most upvoted comments

@dentarg Definitely will.

My 2 cents: I’m also having this problem (I’m using Ruby 2.7.2 and Rails 6.0.3.4). I ended up downgrading to Puma 5.0.2 for now.

We seem to be seeing the same issue with this configuration (Ruby 2.7.2 and Rails 6.0.x). Pinning Puma to 5.0.2 fixes it and makes Puma “just run” in our development environment, but only when we downgrade Puma below 5.x do all our issues go away.

Edit:

Just to note that we have been facing this issue for a while, but we assumed it was something strange or wrong with our own setup, since it was sporadic. Noting this because many teams might be facing it but assuming the fault is in their own setup and not reporting it, as Puma reports no errors.

@joshuapinter just to clarify, the issue reproduces on 2.6.5 only with Postman/Net::HTTP (can you share the exact code used to reproduce with Net::HTTP?), but still can’t be reproduced with curl at all?

Also, is it a specific request path/route that reproduces the issue and not any other routes? Even if the app is proprietary and you can’t narrow down the code into something you can share publicly, is there anything else about the particular route that triggers the issue that is unique or that you’re able to share?

Finally, what’s the specific Mac OS version you’re seeing the issue on?

@lucascaton it would be helpful if you could share more specific detail on the problem you’re experiencing:

  • OS (including specific OS version)
  • Server code used (if public; if not, you could try making a minimal test app that reproduces the issue and sharing that instead)
  • How the requests were made (curl, browser, Net::HTTP Ruby code, end-user reports from a web app, etc.)
  • Whether the issue reproduces consistently 100% of the time or is intermittent/random, and whether it’s always on the second request after the server boots
  • Any Puma/Rails logs, browser network-request logs, etc. that you can capture and share

Hey @wjordan,

your issue sounds very different from the issue discussed in this thread

Well, I agree it’s different (I wasn’t sure when I wrote my first comment here) but it’s affected by the very same versions of Puma.

I can only assume the memory leak is coming from a bug in your application/configuration and not Puma itself

I don’t think so. If I upgrade Puma from version 5.0.2 to 5.0.3+ and nothing else, the issue starts happening. When downgrading it back to 5.0.2, everything goes back to normal. I’ve done this a few times now.

until you can reproduce an issue in code you can make public

I wish I could be more helpful here. I’ve been trying to reproduce it but it’s not very easy since it takes 1-2 days (and/or several requests) for the issue to start happening. I was just reporting my case in case it helps somehow because it’s affected by the same versions described in this PR 🙂

I’ll keep trying to reproduce the issue. In the meantime, please let me know if I can help with anything else.

@joshuapinter please share that code using net-http if you can, I think it would be helpful

Hey @dentarg - I’ve just tested here, ~all good when using version 5.2~!

UPDATE: After a few days running, the issue started again 😞

Speaking of logs, adding log_requests (or quiet false) to the config file will add request logging that shows ‘app’ time. Since this seems related to the Puma version used, one would think the app isn’t causing the issue, but logging requests would confirm that.
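Concretely, that would be a one-line addition to the puma.rb shown earlier (a sketch; log_requests and quiet are options in the Puma 5 config DSL):

```ruby
# puma.rb -- same config as above, plus per-request logging
bind "tcp://0.0.0.0:3000"
workers 1

# Log each request with timing (including 'app' time) to stdout.
# `quiet false` has the equivalent effect.
log_requests true
```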

Today I ran Puma as a separate process (not a subprocess as in CI), and made three requests per client/socket. No real difference in times, and I was running loops of 800. In other words, I’m trying to repro, but I can’t…

Thanks for the feedback @MSP-Greg.

Some new updates:

  • Changed to workers 4.
  • Can still reproduce issue.
  • Could not reproduce it with cURL. Seems odd.
  • So I used Net::HTTP and was able to reproduce it.
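The exact Net::HTTP code used above wasn’t posted in the thread; a hypothetical timing harness along these lines would exercise the same path (sequential GETs, a fresh connection per request), and its per-request timings would make the slow second request visible:

```ruby
require "net/http"
require "benchmark"

# Hypothetical timing harness (not the reporter's actual script):
# issues `count` sequential GET requests, each on a fresh connection,
# prints each wall-clock time, and returns the times in seconds.
def time_requests(host, port, count, path = "/")
  Array.new(count) do |i|
    elapsed = Benchmark.realtime do
      Net::HTTP.start(host, port) { |http| http.get(path) }
    end
    puts format("request %d: %.3fs", i + 1, elapsed)
    elapsed
  end
end

# Usage, assuming Puma is bound to tcp://0.0.0.0:3000 as in the config above:
# time_requests("127.0.0.1", 3000, 3)
```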

Here’s the kicker…

  • We upgraded from Ruby 2.6.5 to 2.7.2 during a Friday evening Maintenance Window and I can’t reproduce the issue on Ruby 2.7.2. On Postman, cURL or Net::HTTP.
  • Switching back to 2.6.5, the issue returns.

So, not sure what to tell you guys from here. I’m happy to close this now and see if anybody else encounters it, since it’s not an issue on Ruby 2.7.2. But I’m also happy to test anything else for you if you want to drill deeper.

Let me know and thanks for the consideration. 🙏

Yes, I can definitely do that. FYI, the requests are happening via Postman hitting the API, not in a browser. Not sure what it uses underneath, but I’ll confirm the same results occur with cURL.

On Nov 14, 2020, 12:12 PM -0700, Will Jordan wrote:

This will be hard to track down without narrowing down the issue to something we can reproduce locally. Any chance you can repro the actual requests to the rails app with curl at least, to rule out anything unusual going on with the browser-based requests?
