rails: ActionCable's SubscriptionAdapter for Redis does not reconnect after a connection loss.
Steps to reproduce
- Start rails server and let ActionCable establish a connection to Redis.
- Restart/Stop the Redis instance.
Expected behavior
If the connection between ActionCable and Redis is interrupted or lost, the server process tries to re-establish the connection, informs WebSocket connections about the disconnection (or re-establishes subscriptions), and does not abort.
Actual behavior
If the connection between ActionCable and Redis is interrupted or lost, the server process aborts without trying to re-establish the connection.
RoomChannel is transmitting the subscription confirmation
RoomChannel is streaming from room:123
Exiting
2.3.3/gems/redis-3.3.2/lib/redis/client.rb:257:in `rescue in io': Connection lost (ECONNRESET) (Redis::ConnectionError)
gems/redis-3.3.2/lib/redis/client.rb:250:in `io'
gems/redis-3.3.2/lib/redis/client.rb:261:in `read'
gems/redis-3.3.2/lib/redis/client.rb:136:in `block (3 levels) in call_loop'
gems/redis-3.3.2/lib/redis/client.rb:135:in `loop'
gems/redis-3.3.2/lib/redis/client.rb:135:in `block (2 levels) in call_loop'
gems/redis-3.3.2/lib/redis/client.rb:231:in `block (2 levels) in process'
gems/redis-3.3.2/lib/redis/client.rb:367:in `ensure_connected'
gems/redis-3.3.2/lib/redis/client.rb:221:in `block in process'
gems/redis-3.3.2/lib/redis/client.rb:306:in `logging'
gems/redis-3.3.2/lib/redis/client.rb:220:in `process'
gems/redis-3.3.2/lib/redis/client.rb:134:in `block in call_loop'
gems/redis-3.3.2/lib/redis/client.rb:280:in `with_socket_timeout'
gems/redis-3.3.2/lib/redis/client.rb:133:in `call_loop'
gems/redis-3.3.2/lib/redis/subscribe.rb:43:in `subscription'
gems/redis-3.3.2/lib/redis/subscribe.rb:12:in `subscribe'
gems/redis-3.3.2/lib/redis.rb:2765:in `_subscription'
gems/redis-3.3.2/lib/redis.rb:2143:in `block in subscribe'
gems/redis-3.3.2/lib/redis.rb:58:in `block in synchronize'
from .rubies/ruby-2.3.3/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
gems/redis-3.3.2/lib/redis.rb:58:in `synchronize'
gems/redis-3.3.2/lib/redis.rb:2142:in `subscribe'
gems/actioncable-5.0.1/lib/action_cable/subscription_adapter/redis.rb:75:in `block in listen'
gems/redis-3.3.2/lib/redis/client.rb:293:in `with_reconnect'
gems/redis-3.3.2/lib/redis.rb:64:in `block in with_reconnect'
gems/redis-3.3.2/lib/redis.rb:58:in `block in synchronize'
from .rubies/ruby-2.3.3/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
gems/redis-3.3.2/lib/redis.rb:58:in `synchronize'
gems/redis-3.3.2/lib/redis.rb:63:in `with_reconnect'
gems/redis-3.3.2/lib/redis.rb:70:in `without_reconnect'
gems/actioncable-5.0.1/lib/action_cable/subscription_adapter/redis.rb:72:in `listen'
gems/actioncable-5.0.1/lib/action_cable/subscription_adapter/redis.rb:146:in `block in ensure_listener_running'
Additional information
As soon as the Redis instance is unavailable and redis-rb receives EOF while reading from the socket, the Listener aborts (as intended?).
The redis-rb client offers a reconnect_attempts option at initialisation, but when a connection that was established successfully is later lost, the client doesn’t attempt to reconnect. Even if it did, the retries might be too aggressive to allow re-establishment, since there is no delay between them.
A work-around is something like:
def listen_with_retry(conn)
  listen conn
rescue ::Redis::ConnectionError, ::Redis::CannotConnectError
  # Drop the current WebSocket connections, then wait before retrying.
  ActionCable.server.connections.each(&:close)
  sleep 1
  retry
end
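The work-around above retries forever with a fixed one-second pause. A minimal sketch of the same idea with a capped exponential backoff and an attempt limit could look like the following. It is plain Ruby and does not touch ActionCable or redis-rb: ConnectionLost is a stand-in for ::Redis::ConnectionError, and with_backoff, max_attempts, base_delay, max_delay and sleeper are hypothetical names chosen for illustration.

```ruby
# Stand-in for ::Redis::ConnectionError so the sketch runs without the redis gem.
class ConnectionLost < StandardError; end

# Runs the block and retries it on ConnectionLost.
# The pause doubles after each failure, up to max_delay seconds.
# After max_attempts failed tries the error is re-raised to the caller.
# sleeper is injectable so the pause can be observed or skipped in tests.
def with_backoff(max_attempts: 5, base_delay: 1.0, max_delay: 30.0,
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue ConnectionLost
    raise if attempt >= max_attempts

    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    sleeper.call(delay)
    retry
  end
end
```

In the adapter, the block would wrap the listen call, and the rescue clause would name the real redis-rb error classes instead of the stand-in.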
System configuration
Rails version: Rails 5.0.1
Ruby version: ruby 2.3.3p222
About this issue
- State: closed
- Created 7 years ago
- Reactions: 23
- Comments: 22 (5 by maintainers)
Commits related to this issue
- feat: restore Action Cable Redis pub/sub listener on connection failure Closes #27659 — committed to palkan/rails by palkan 2 years ago
The fix is coming in Rails 7.1. For older versions (5+), there is a backport gem: https://github.com/anycable/action-cable-redis-backport
This is such a big vulnerability for us that we decided to monkey patch ActionCable::SubscriptionAdapter::Redis::Listener. It works by catching the disconnect exception and restarting Puma using a hot restart. This recovers full functionality, provided that Redis is available. Since our primary use case is restoring the connection after a Redis primary/replica failover, this is a safe assumption. Perhaps this could be of use to anyone. Note: I’m not proud of this, but it works 😉
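The patch itself is not reproduced in this thread, so the following is only a rough sketch of the approach described, not the commenter's code. It assumes the Puma process ID is known (for example from a pidfile) and relies on Puma performing a hot restart when it receives SIGUSR2. ListenerDisconnected stands in for the Redis disconnect exception, and listen_or_hot_restart, puma_pid and signaller are hypothetical names.

```ruby
# Stand-in for the Redis disconnect error the real patch would rescue.
class ListenerDisconnected < StandardError; end

# Runs the listener block. If it raises ListenerDisconnected, asks Puma
# for a hot restart by sending SIGUSR2 to puma_pid.
# signaller defaults to Process.kill and is injectable for testing.
def listen_or_hot_restart(puma_pid:, signaller: Process.method(:kill))
  yield
rescue ListenerDisconnected
  signaller.call("USR2", puma_pid)
  :restart_requested
end
```

A restart only helps if Redis is reachable again by the time the new process boots, which matches the failover assumption stated in the comment.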
It doesn’t happen often, but when it does happen, it’s catastrophic. A Redis::ConnectionError raised while sending a message in Action Cable to an end user will cause an entire worker (on Heroku) to go down, and the worker won’t restart itself. Does anyone recommend the patch suggested by @rhomeister for this issue, or some other strategy?
We have an API for this in ActiveJob that would be good to consider extracting into something generic for timeouts/retries. See https://github.com/rails/rails/blob/master/activejob/lib/active_job/exceptions.rb#L116
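To make the suggestion concrete, here is a rough sketch of what a generic, declarative retry helper shaped like ActiveJob's retry_on might look like outside of jobs. Nothing here exists in Rails: the Retryable module, with_retries, the sleeper parameter and ListenerSketch are all hypothetical. The defaults of 3 seconds and 5 attempts mirror retry_on's defaults.

```ruby
# Hypothetical mixin, not part of Rails. Mirrors the shape of ActiveJob's retry_on.
module Retryable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declares that `exception` is retried up to `attempts` times in total,
    # waiting `wait` seconds between tries.
    def retry_on(exception, wait: 3, attempts: 5)
      retry_rules[exception] = { wait: wait, attempts: attempts }
    end

    def retry_rules
      @retry_rules ||= {}
    end
  end

  # Runs the block, applying the first declared rule that matches a raised error.
  # Errors without a matching rule, or past their attempt limit, are re-raised.
  def with_retries(sleeper: ->(seconds) { sleep(seconds) })
    tries = 0
    begin
      tries += 1
      yield
    rescue StandardError => e
      _, rule = self.class.retry_rules.find { |klass, _| e.is_a?(klass) }
      raise if rule.nil? || tries >= rule[:attempts]

      sleeper.call(rule[:wait])
      retry
    end
  end
end

# Example host class; IOError stands in for a Redis connection error.
class ListenerSketch
  include Retryable
  retry_on IOError, wait: 1, attempts: 3
end
```

A listener built this way would wrap its subscribe loop in with_retries, keeping the retry policy declared next to the class rather than buried in rescue clauses.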