nanomq: Broker Restart - aio error

Describe the bug

After the broker has been running for a few hours or days, it restarts because of errors (see the log file below).

First it starts logging the error 'open /proc/stat failed!'; I am using the API to get statistics data from the broker. After a short while it starts logging 'send aio error Out of files', and after a few more minutes (typically 5 minutes) the broker restarts.

Expected behavior

Actual Behavior

This seems to happen every 2~3 days of operation.

Log File, when the errors start happening

2024-03-07 18:43:27 [1831111] INFO  /home/runner/work/nanomq/nanomq/nanomq/pub_handler.c:1172: acl allow
2024-03-07 18:43:27 [1831130] INFO  /home/runner/work/nanomq/nanomq/nanomq/pub_handler.c:1172: acl allow
2024-03-07 18:43:28 [1831118] INFO  /home/runner/work/nanomq/nanomq/nanomq/pub_handler.c:1172: acl allow
2024-03-07 18:43:28 [1831122] ERROR /home/runner/work/nanomq/nanomq/nanomq/rest_api.c:1387: open /proc/stat failed!
2024-03-07 18:43:28 [1831125] INFO  /home/runner/work/nanomq/nanomq/nanomq/pub_handler.c:1172: acl allow
2024-03-07 18:43:28 [1831127] INFO  /home/runner/work/nanomq/nanomq/nanomq/unsub_handler.c:189: UnSub topic [maquina/WHDNG23099-VNT/ACM] in client [NodeRed1]. pid [63266]
2024-03-07 18:43:28 [1831120] INFO  /home/runner/work/nanomq/nanomq/nanomq/unsub_handler.c:189: UnSub topic [maquina/WHDNG23099-VNT/ACM/cmd] in client [NodeRed1]. pid [63267]
2024-03-07 18:43:28 [1831122] INFO  /home/runner/work/nanomq/nanomq/nanomq/unsub_handler.c:189: UnSub topic [maquina/WHDNG23099-VNT/Software/Events] in client [NodeRed1]. pid [63268]
2024-03-07 18:43:28 [1831109] INFO  /home/runner/work/nanomq/nanomq/nanomq/unsub_handler.c:189: UnSub topic [maquina/WHDNG23099-VNT/VNT] in client [NodeRed1]. pid [63269]
2024-03-07 18:43:28 [1831129] INFO  /home/runner/work/nanomq/nanomq/nanomq/unsub_handler.c:189: UnSub topic [$SYS/#] in client [NodeRed1]. pid [63270]
2024-03-07 18:43:28 [1831122] INFO  /home/runner/work/nanomq/nanomq/nanomq/unsub_handler.c:189: UnSub topic [maquina/WHDNG23099-VNT/VNT/cmd] in client [NodeRed1]. pid [63271]
2024-03-07 18:43:28 [1831114] INFO  /home/runner/work/nanomq/nanomq/nanomq/pub_handler.c:1172: acl allow
2024-03-07 18:43:28 [1831116] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831121] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831114] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831115] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831130] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831130] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831116] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831123] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831113] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831128] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831125] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831117] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831125] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files
2024-03-07 18:43:28 [1831116] WARN  /home/runner/work/nanomq/nanomq/nng/src/sp/transport/mqtt/broker_tcp.c:1769:  send aio error Out of files

To Reproduce

Environment Details

  • NanoMQ version v0.21.6-6
  • Operating system and version Ubuntu Server 22.04
  • Compiler and language used, installed from file

Client SDK

Additional context

About this issue

  • State: open
  • Created 4 months ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

Hi @arturv2000, thanks for your feedback. I built a demo locally to reproduce the errors described in this issue:

$ ulimit -n 30 && ./nanomq/nanomq start --conf ../etc/nanomq.conf --log_level warn

$ emqtt_bench sub -p 2883 -c 18 -t topic1 -q 0

$ emqtt_bench sub -p 2883 -c 1 -t topic1 -q 0
This is where the above 'send aio error Out of files' occurs.

$ curl -i --basic -u admin:public -X GET "http://localhost:8082/api/v4/prometheus"
This is where the above 'open /proc/stat failed!' occurs.

Surprisingly, my local nanomq did not quit after I went through the above steps and waited for some time.

In addition, these errors are caused by the limit on the number of file descriptors (FDs) the process may open. You can try using ulimit to raise the number of FDs available to the program to work around this problem.
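
Whether the broker is really running into this limit can be checked from /proc. The following is a minimal sketch, assuming a Linux host; the helper names fd_count and fd_soft_limit are made up for illustration and are not part of NanoMQ.

```shell
# Count the file descriptors currently open by process $1 (Linux /proc).
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit of process $1.
# In /proc/<pid>/limits the soft limit is the 4th whitespace-separated field.
fd_soft_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Example (assumes a single process named "nanomq"; adjust as needed):
#   pid=$(pgrep -o -x nanomq)
#   echo "open: $(fd_count "$pid") / limit: $(fd_soft_limit "$pid")"
```

If the open count climbs towards the limit over the 2~3 days before the restart, that would suggest descriptors are accumulating, rather than the limit simply being too low for the normal load. Reading /proc/<pid>/fd for a process owned by another user usually requires root.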

Investigating why nanomq exits may require more information, such as reproduction steps, configuration files, etc.
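
One piece of information that might help here is what kind of descriptors the broker is holding when the errors begin. A rough sketch, again assuming Linux /proc; fd_breakdown is a hypothetical helper name, not an existing tool:

```shell
# Summarise what the open descriptors of process $1 point to.
# Sockets, pipes and anonymous inodes are grouped by type
# (e.g. "socket:[12345]" becomes "socket"); regular files keep their path.
fd_breakdown() {
    for fd in /proc/"$1"/fd/*; do
        readlink "$fd" 2>/dev/null
    done | sed -e 's/:\[.*$//' | sort | uniq -c | sort -rn
}

# Example (assumes a single process named "nanomq"; adjust as needed):
#   fd_breakdown "$(pgrep -o -x nanomq)" | head
```

A large number of entries of one kind (for example sockets, or one particular file) would narrow down where the descriptors are going, although it does not by itself show which code path leaves them open.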