puma: Slow restarts
We love puma and have been using it for years… baffled by this one though! What are we missing?
Steps to reproduce
- puma configs
# web
tag "#{ENV['RAILS_ENV']}-app-puma"
environment ENV['RAILS_ENV']
pidfile "#{ENV['SHARED_PATH']}/puma.pid"
stdout_redirect "#{ENV['SHARED_PATH']}/log/#{ENV['RAILS_ENV']}_app_puma.stdout.log",
"#{ENV['SHARED_PATH']}/log/#{ENV['RAILS_ENV']}_app_puma.stderr.log",
false # don't append
bind "unix://#{ENV['SHARED_PATH']}/#{ENV['RAILS_ENV']}-app-puma.sock"
worker_timeout 30
threads 0,4
workers 1
preload_app!
on_worker_boot do
ActiveSupport.on_load(:active_record) do
ActiveRecord::Base.establish_connection
end
Que.mode = :off
end
# action cable
tag "#{ENV['RAILS_ENV']}-app-puma-actioncable"
environment ENV['RAILS_ENV']
pidfile "#{ENV['SHARED_PATH']}/puma-actioncable.pid"
stdout_redirect "#{ENV['SHARED_PATH']}/log/#{ENV['RAILS_ENV']}_app_puma_actioncable.stdout.log",
"#{ENV['SHARED_PATH']}/log/#{ENV['RAILS_ENV']}_app_puma_actioncable.stderr.log",
false # don't append
bind "unix://#{ENV['SHARED_PATH']}/#{ENV['RAILS_ENV']}-app-puma-actioncable.sock"
worker_timeout 30
threads 1,4
workers 1
- Action Cable Configs
All defaults except for:
config.action_cable.allowed_request_origins = ['https://www.app.com']
config.action_cable.logger = ActiveSupport::Logger.new(
File.join(Rails.root, 'log', "#{Rails.env}-cable.log")
)
- Systemd
# web
[Unit]
Description=Rails Web Server
PartOf=staging-app.target
[Service]
EnvironmentFile=/etc/default/staging-app
User=staging-app
WorkingDirectory=/home/staging-app/app/current
Type=simple
Restart=always
KillMode=process
ExecStart=/bin/bash -lc 'bundle exec --keep-file-descriptors puma -C config/deploy/staging/puma.rb'
# cable
[Unit]
Description=Rails Cable Server
PartOf=staging-app.target
[Service]
EnvironmentFile=/etc/default/staging-app
User=staging-app
WorkingDirectory=/home/staging-app/app/current
Type=simple
Restart=always
KillMode=process
ExecStart=/bin/bash -lc 'bundle exec --keep-file-descriptors puma -C config/deploy/staging/puma-action-cable.rb'
- Restart
time sudo systemctl restart staging-app-puma.service
real 1m30.106s
user 0m0.011s
sys 0m0.000s
Expected behavior
Normally, we get instant restarts with puma.
Actual behavior
Restarts take 1m30s usually.
I have never seen it take this long, but we weren’t the original developer of the application.
This application uses ActionCable running in a separate process with nginx sending all /cable requests to another process (also on puma). Both processes take the 1m30s to restart, the ActionCable config file is the same minus preload_app!
and the on_worker_boot
block.
The ActiveJob process restarts instantly.
Questions
- Is there a way to tell what is happening in puma after the restart is initiated?
- Is it expected Action Cable processes take a long time to stop restart? But with no Action Cable requests going to the puma web process, why is that slow to restart as well?
- Is this a known problem with 5.1? The client will be updating eventually.
- What are we missing?
System configuration
Ruby version: 2.6.0 Rails version: 5.1.6.1 Puma version: 3.12.0
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 15 (7 by maintainers)
Commits related to this issue
- Note the Ruby 2.6 SIGTERM issue in the changelog Linking them all with this commit. https://github.com/puma/puma/pull/1741 https://github.com/puma/puma/issues/1755 https://github.com/puma/puma/issue... — committed to dentarg/puma by dentarg 5 years ago
- Note the Ruby 2.6 SIGTERM issue in the changelog Linking them all with this commit. https://github.com/puma/puma/pull/1741 https://github.com/puma/puma/issues/1755 https://github.com/puma/puma/issue... — committed to dentarg/puma by dentarg 5 years ago
- Note the Ruby 2.6 SIGTERM issue in the changelog (#1928) Linking them all with this commit. https://github.com/puma/puma/pull/1741 https://github.com/puma/puma/issues/1755 https://github.com/pum... — committed to puma/puma by dentarg 5 years ago
We noticed similar issues on a few of our servers (rails >=5.2, ruby >= 2.6.0).
Seems to be fixed with Puma 4.0.x 😃
Restart times went up to (exactly) 90 seconds for each app that i updated from ruby 2.5 to 2.6. The issue persists when updating to 2.6.1.
I have the same issue with ruby 2.6.2 90 seconds is a default timeout for systemd restart command, after that process is just killed by the systemd and started again
This is almost certainly the ruby 2.6.0 wait bug. Why are you doing
workers 1
? Might as well avoid the overhead of using the clustering in that case and just comment it out and use the single process model.I just switched to the alternative forked
systemd
in order to utilize phased-restarts withpumactl
https://github.com/puma/puma/blob/master/docs/systemd.md#alternative-forking-configuration It was a huge pain, and somewhat fragile compared to the systemd simple/socket I’m going to do a quick sanity test with phased restarts and see how it handles.hot restarts were dragggggging causing racktimeouts and gateway timeouts from the loadbalancer with two workers 16 threads,
preload_app
. My guess is with preload, and perhaps bad actors preventing connections from being drained, workers cant reboot before timeout threshholds are hit. In fact i wouldn’t be surprised if my timeouts were the only thing allowing the restart to completeI’ll post all me new/old configs in a bit, it’s a painstaking process to debug this in production env.