roadrunner: [BUG] RR2 stops passing requests to workers

We are upgrading from RR1 to RR2.

After deployment everything works, but after some time (a few hours) RoadRunner stops responding.

I expected to see this happen: RR keeps processing requests indefinitely

Instead, this happened: RR stops passing requests to workers

The port is still open and accepts connections, but requests time out without a response:

# curl --max-time 10 -vvv 127.0.0.1:8080
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.74.0
> Accept: */*
> 
* Operation timed out after 10000 milliseconds with 0 bytes received
* Closing connection 0
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
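
The curl output above shows a specific failure mode: the TCP connection succeeds, but no bytes ever come back. A minimal Python sketch of a probe that tells this apart from a closed port could look like the following. The function name `probe_http` and its return labels are illustrative assumptions, not part of RoadRunner.

```python
import socket


def probe_http(host: str, port: int, timeout: float = 10.0) -> str:
    """Classify an endpoint as 'refused', 'hung', 'closed' or 'responding'.

    Hypothetical helper: the labels are made up for this sketch.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on the port

    with sock:
        sock.settimeout(timeout)
        request = (
            f"GET / HTTP/1.1\r\nHost: {host}:{port}\r\n"
            "Connection: close\r\n\r\n"
        )
        sock.sendall(request.encode("ascii"))
        try:
            first_byte = sock.recv(1)
        except socket.timeout:
            # Connected, request sent, but no response within the timeout:
            # the same symptom as curl's error 28 in the report.
            return "hung"

    # An empty read means the peer closed the connection without answering.
    return "responding" if first_byte else "closed"
```

For the situation in this report, `probe_http("127.0.0.1", 8080)` would be expected to return `"hung"` rather than `"refused"`, since the port is still listening.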

Workers are sitting idle with no execs, and I can list them:

# /usr/local/bin/rr -c '/etc/roadrunner/.rr.yaml' workers
Workers of [http]:
+---------+-----------+---------+---------+---------+--------------------+
|   PID   |  STATUS   |  EXECS  | MEMORY  |  CPU%   |      CREATED       |
+---------+-----------+---------+---------+---------+--------------------+
|    2379 | ready     |       0 | 28 MB   |    0.01 | 4 minutes ago      |
|    2380 | ready     |       0 | 28 MB   |    0.01 | 4 minutes ago      |
+---------+-----------+---------+---------+---------+--------------------+
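
To check for this state automatically, the table printed by the workers command can be parsed. The sketch below is a rough illustration that assumes the column layout shown above stays the same; `parse_workers` and `all_ready_without_execs` are hypothetical helper names. Freshly started workers also show zero execs, so this check only means something when combined with requests timing out.

```python
SAMPLE = """\
Workers of [http]:
+---------+-----------+---------+---------+---------+--------------------+
|   PID   |  STATUS   |  EXECS  | MEMORY  |  CPU%   |      CREATED       |
+---------+-----------+---------+---------+---------+--------------------+
|    2379 | ready     |       0 | 28 MB   |    0.01 | 4 minutes ago      |
|    2380 | ready     |       0 | 28 MB   |    0.01 | 4 minutes ago      |
+---------+-----------+---------+---------+---------+--------------------+
"""


def parse_workers(table: str) -> list:
    """Turn the ASCII table into a list of dicts keyed by lower-cased header."""
    header = None
    rows = []
    for line in table.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the title line and the +----+ borders
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = [cell.lower() for cell in cells]
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def all_ready_without_execs(rows: list) -> bool:
    """True when there are workers and every one is 'ready' with 0 execs."""
    return bool(rows) and all(
        row["status"] == "ready" and row["execs"] == "0" for row in rows
    )
```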

They are rotated as they reach the TTL, but they live past the idle TTL. Calling reset just hangs indefinitely; I can only see the rotating dots:

Resetting plugin: [http]  ●∙∙
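
Because the reset call never returns, any script that invokes it needs its own deadline. A minimal sketch using Python's `subprocess` timeout follows. `run_with_deadline` is a hypothetical helper, and the `rr ... reset` invocation in the comment is an assumption modeled on the workers command shown earlier.

```python
import subprocess


def run_with_deadline(cmd: list, deadline_s: float):
    """Run cmd and return its exit code, or None if it was killed at the deadline."""
    try:
        completed = subprocess.run(
            cmd,
            timeout=deadline_s,  # subprocess.run kills the child on expiry
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Assumed invocation, mirroring the workers command from the report:
# run_with_deadline(
#     ["/usr/local/bin/rr", "-c", "/etc/roadrunner/.rr.yaml", "reset"], 30
# )
```

A `None` result here would mean the reset hung, which a supervisor script could treat as a signal to restart the whole RoadRunner process.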

Also, while restarting I am not even getting the "HTTP plugin got restart request. Restarting..." info log.

The version of RR used: 2.5.3. I used the binary from the Docker image spiralscout/roadrunner:2.5.3 in php:8.0.12-cli.

My .rr.yaml configuration is:

rpc:
  enable: true
  listen: unix:///etc/roadrunner/rr.sock

server:
  command: "php -d opcache.enable_cli=1 /var/www/html/app/worker.php"
  relay: "pipes"

http:
  address: :8080

  pool:
    num_workers: 2

    supervisor:
      watch_tick: 1s # check every 1s

      # Maximal worker memory usage in megabytes (soft limit). Zero means no limit.
      max_worker_memory: 150 # 150MB
      ttl: 300s # maximum time to live for the worker (soft)
      idle_ttl: 60s # maximum allowed amount of time worker can spend in idle before being removed (for weak db connections, soft)
      exec_ttl: 360s # max_execution_time (brutal)

endure:
  log_level: info

logs:
  mode: production
  level: info

metrics:
  address: :8081
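
Reading the supervisor settings in this config against the worker listing makes the "live past idle TTL" observation concrete: the workers were created 4 minutes ago (240s) with zero execs, which is below `ttl: 300s` but well above `idle_ttl: 60s`. The sketch below only models the limits as described in the config comments. It is not RoadRunner's implementation, the strict `>` comparison is an assumption, and treating zero as "no limit" for the TTLs is extrapolated from the memory comment. `exec_ttl` is left out because it applies to a request in flight.

```python
from dataclasses import dataclass


@dataclass
class SupervisorLimits:
    # Defaults copied from the .rr.yaml in this report.
    ttl_s: float = 300
    idle_ttl_s: float = 60
    max_worker_memory_mb: float = 150


def exceeded_limits(age_s: float, idle_s: float, memory_mb: float,
                    limits: SupervisorLimits) -> list:
    """Return the names of the soft limits a worker is currently over."""
    exceeded = []
    if limits.ttl_s and age_s > limits.ttl_s:
        exceeded.append("ttl")
    if limits.idle_ttl_s and idle_s > limits.idle_ttl_s:
        exceeded.append("idle_ttl")
    if limits.max_worker_memory_mb and memory_mb > limits.max_worker_memory_mb:
        exceeded.append("max_worker_memory")
    return exceeded


# Worker from the listing: 240s old, idle the whole time, 28 MB.
print(exceeded_limits(240, 240, 28, SupervisorLimits()))
```

Under these assumptions the only limit reported is `idle_ttl`, so a supervisor working as configured would have replaced those workers roughly three minutes earlier.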

Errortrace, Backtrace or Panictrace: there are no errors in the logs.

About this issue

State: closed
Created 3 years ago
Comments: 21 (15 by maintainers)

Most upvoted comments

@dstrop Fixed in 2.5.4: https://github.com/spiral/roadrunner-binary/releases/tag/v2.5.4

rustatian on Nov 7, 2021