relay: Interrupted system call (RELAY_ERR_IO)
Description
I am using Relay as a replacement for php-redis/predis in a Laravel 10 project.
Now with Relay, I see poll(2): Interrupted system call (RELAY_ERR_IO; cache.c:551).
This only happens on one server (a queue worker) in a remote location with higher latency to the Redis server (around 100-200 ms at most). The Redis server is located in Germany and the worker is in the USA (Miami).
Expected Behavior
Queue should not throw exceptions.
Actual Behavior
Throws the described exception at random times. The queue still works, but the exception is logged occasionally.
Possible Fix
Steps to Reproduce
Almost impossible, but:
- Run a Redis server in location A and a worker that connects to it from location B, on the other side of the world
- Wait a couple of hours; it should start throwing errors.
Additional context
Before, I noticed different errors with php-redis that were common but couldn’t be solved.
Apologies in advance for the lack of information. I’ll update this issue as I discover more details.
Relay diagnostics
$ php --ri relay
relay
Relay Support => enabled
Relay Version => 0.6.4
Available cache => 352
Available serializers => php, json, igbinary, msgpack
Available compression => lzf, zstd, lz4
Binary UUID => c773e2de-231e-484c-86d4-e51114e00975
Git SHA =>
Allocator => relay
License state => unknown
License memory cap => 0
License request id =>
relay.enabled => false
relay.key =>
relay.maxmemory => 0
relay.maxmemory_pct => 95
relay.eviction_policy => noeviction
relay.eviction_sample_keys => 128
relay.initial_readers => 128
relay.invalidation_poll_usec => 5
relay.pconnect_default => 1
relay.max_endpoint_dbs => 32
relay.loglevel => none
relay.logfile =>
Environment
- Operating System and version: Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-72-generic x86_64)
- PHP version: 8.2.6
- Laravel v10.11.0
About this issue
- State: closed
- Created a year ago
- Comments: 21 (16 by maintainers)
We got this merged, but are fighting some macOS changes that are blocking a release. Tag should go out Monday.
Fantastic!
@tillkruss Since the release, I haven’t experienced this issue anymore. Thanks a lot!
I’ve got what I need for now but will let you know if I have any questions.
Honestly, it should be possible to mitigate this on Relay’s side regardless of the specific underlying problem.
What hiredis does when it connects to a redis-server is switch the socket to non-blocking mode and then use poll to wait for success/failure. At some point during this polling process, poll is returning -1 and errno is set to EINTR. This isn’t a fatal error, it’s just the kernel saying that the syscall was interrupted.
In other places in the hiredis code for example, we test for EINTR and then retry (e.g. read and write operations).
On Relay’s side I can just detect that error in the underlying context and retry the connection myself. The trick is that we don’t want to do it in a way that would stop the process from being properly interrupted (e.g. if SIGINT were sent to shut down the queue process).
Once I trigger the condition it may even be worth fixing in hiredis, but I’m not sure about that yet.
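For readers following along, here is a rough sketch of the retry idea described above. It is not the actual Relay or hiredis code: the names wait_connected, now_ms and stop_requested are made up for illustration, and it assumes the application's own signal handler sets stop_requested when a shutdown signal such as SIGINT arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical flag: assumed to be set by the application's
 * SIGINT/SIGTERM handler when the process should shut down. */
static volatile sig_atomic_t stop_requested = 0;

/* Monotonic clock in milliseconds, used to keep one overall deadline. */
static long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Wait for a non-blocking connect() on fd to finish.
 * Returns 0 on success, -1 on failure with errno set. */
static int wait_connected(int fd, int timeout_ms) {
    assert(timeout_ms >= 0);
    long deadline = now_ms() + timeout_ms;
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };

    for (;;) {
        /* Recompute the remaining time so retries never extend the timeout. */
        long remaining = deadline - now_ms();
        if (remaining < 0) remaining = 0;

        int rc = poll(&pfd, 1, (int)remaining);
        if (rc > 0) break;                       /* socket became ready */
        if (rc == 0) { errno = ETIMEDOUT; return -1; }
        if (errno != EINTR) return -1;           /* genuine poll failure */

        /* EINTR: a signal interrupted poll. Retry, unless that signal
         * was a shutdown request, in which case report the interruption. */
        if (stop_requested) { errno = EINTR; return -1; }
    }

    /* poll only reports readiness; SO_ERROR holds the connect() result. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) return -1;
    if (err != 0) { errno = err; return -1; }
    return 0;
}
```

The two points that matter are recomputing the remaining timeout on every retry, so repeated signals cannot stretch the connect timeout indefinitely, and checking the shutdown flag before retrying, so SIGINT can still stop the worker.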
Thank you for the report, there’s a lot of good info to go on here.
I will attempt to replicate and get back to you.