aiomqtt: High CPU usage
Hi,
I'm trying to get some help figuring out a problem I'm having, and have landed here. I'm running a script (called ecowitt2mqtt) on an RPi 4 with Bullseye that dumps data to my MQTT broker (data it obtains from my local weather station), and then HA discovers it. The script host (RPi 4), the broker (Ubuntu 22.04), and HA are 3 different instances on the same network. After hours or days, one of the cores on the RPi 4 that is running the script chokes and runs at 100% (or ~30% CPU total).
I ran py-spy on the instance and caught 700 errors, but nothing too conclusive. Running:
$ strace -p <pid> -f -s 4096
on the stuck process, yields this:
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 76, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683a1a0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 76, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 76, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683a1a0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 76, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 75, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683a1a0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 75, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 75, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683a1a0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 74, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 74, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683a1a0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 74, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 74, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683a1a0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 73, NULL, 8) = 1
[pid 121259] recvfrom(14, 0x7f8683aaa0, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 121259] epoll_pwait(3, [{EPOLLIN, {u32=13, u64=13}}], 1024, 73, NULL, 8) = 1
[pid 121259] recvfrom(14, ^Cstrace: Process 121259 detached
and then
$ lsof -p <pid> -n
yields these descriptors:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ecowitt2m 121259 root cwd DIR 179,2 4096 2 /
ecowitt2m 121259 root rtd DIR 179,2 4096 2 /
ecowitt2m 121259 root txt REG 179,2 5280744 1882 /usr/bin/python3.9
ecowitt2m 121259 root mem REG 179,2 15688 12246 /usr/lib/python3.9/lib-dynload/_multiprocessing.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 192112 5267 /usr/lib/aarch64-linux-gnu/libmpdec.so.2.5.1
ecowitt2m 121259 root mem REG 179,2 163840 12240 /usr/lib/python3.9/lib-dynload/_decimal.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 63376 12241 /usr/lib/python3.9/lib-dynload/_hashlib.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 44568 12242 /usr/lib/python3.9/lib-dynload/_json.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 350640 650642 /usr/local/lib/python3.9/dist-packages/Levenshtein/_levenshtein.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 2127000 7468 /usr/local/lib/python3.9/dist-packages/_ruamel_yaml.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 15304 12249 /usr/lib/python3.9/lib-dynload/_queue.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 31592 2153 /usr/lib/aarch64-linux-gnu/librt-2.31.so
ecowitt2m 121259 root mem REG 179,2 11791880 650852 /usr/local/lib/python3.9/dist-packages/uvloop/loop.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 30712 5387 /usr/lib/aarch64-linux-gnu/libuuid.so.1.3.0
ecowitt2m 121259 root mem REG 179,2 6240 12257 /usr/lib/python3.9/lib-dynload/_uuid.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 154232 2100 /usr/lib/aarch64-linux-gnu/liblzma.so.5.2.5
ecowitt2m 121259 root mem REG 179,2 33144 12244 /usr/lib/python3.9/lib-dynload/_lzma.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 70504 5115 /usr/lib/aarch64-linux-gnu/libbz2.so.1.0.4
ecowitt2m 121259 root mem REG 179,2 20032 12226 /usr/lib/python3.9/lib-dynload/_bz2.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 62336 12225 /usr/lib/python3.9/lib-dynload/_asyncio.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 2739952 8506 /usr/lib/aarch64-linux-gnu/libcrypto.so.1.1
ecowitt2m 121259 root mem REG 179,2 577176 8510 /usr/lib/aarch64-linux-gnu/libssl.so.1.1
ecowitt2m 121259 root mem REG 179,2 181184 12251 /usr/lib/python3.9/lib-dynload/_ssl.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 51640 2147 /usr/lib/aarch64-linux-gnu/libnss_files-2.31.so
ecowitt2m 121259 root mem REG 179,2 6080 12233 /usr/lib/python3.9/lib-dynload/_contextvars.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 10320 12247 /usr/lib/python3.9/lib-dynload/_opcode.cpython-39-aarch64-linux-gnu.so
ecowitt2m 121259 root mem REG 179,2 3041504 2580 /usr/lib/locale/locale-archive
ecowitt2m 121259 root mem REG 179,2 1458480 2140 /usr/lib/aarch64-linux-gnu/libc-2.31.so
ecowitt2m 121259 root mem REG 179,2 104824 5407 /usr/lib/aarch64-linux-gnu/libz.so.1.2.11
ecowitt2m 121259 root mem REG 179,2 161856 5161 /usr/lib/aarch64-linux-gnu/libexpat.so.1.6.12
ecowitt2m 121259 root mem REG 179,2 633000 2142 /usr/lib/aarch64-linux-gnu/libm-2.31.so
ecowitt2m 121259 root mem REG 179,2 14672 2155 /usr/lib/aarch64-linux-gnu/libutil-2.31.so
ecowitt2m 121259 root mem REG 179,2 14560 2141 /usr/lib/aarch64-linux-gnu/libdl-2.31.so
ecowitt2m 121259 root mem REG 179,2 160200 2151 /usr/lib/aarch64-linux-gnu/libpthread-2.31.so
ecowitt2m 121259 root mem REG 179,2 145352 2136 /usr/lib/aarch64-linux-gnu/ld-2.31.so
ecowitt2m 121259 root mem REG 179,2 27004 2448 /usr/lib/aarch64-linux-gnu/gconv/gconv-modules.cache
ecowitt2m 121259 root 0r CHR 1,3 0t0 5 /dev/null
ecowitt2m 121259 root 1u unix 0x00000000c8dfa60c 0t0 625905 type=STREAM
ecowitt2m 121259 root 2u unix 0x00000000c8dfa60c 0t0 625905 type=STREAM
ecowitt2m 121259 root 3u a_inode 0,13 0 7590 [eventpoll]
ecowitt2m 121259 root 4r FIFO 0,12 0t0 625101 pipe
ecowitt2m 121259 root 5w FIFO 0,12 0t0 625101 pipe
ecowitt2m 121259 root 6r FIFO 0,12 0t0 625102 pipe
ecowitt2m 121259 root 7w FIFO 0,12 0t0 625102 pipe
ecowitt2m 121259 root 8u a_inode 0,13 0 7590 [eventfd]
ecowitt2m 121259 root 9u unix 0x000000003d8f318e 0t0 625103 type=STREAM
ecowitt2m 121259 root 10u unix 0x000000007373546a 0t0 625104 type=STREAM
ecowitt2m 121259 root 11u IPv4 625114 0t0 TCP *:http-alt (LISTEN)
ecowitt2m 121259 root 12r CHR 1,3 0t0 5 /dev/null
ecowitt2m 121259 root 13u IPv4 630945 0t0 TCP 192.168.1.130:http-alt->192.168.1.138:55339 (CLOSE_WAIT)
ecowitt2m 121259 root 14u IPv4 633219 0t0 TCP 192.168.1.130:55201->192.168.1.139:1883 (ESTABLISHED)
FYI: 192.168.1.130:55201 is the RPi running the script, and 192.168.1.139:1883 is my MQTT broker.
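One possible reading of the two dumps above, offered as an assumption rather than a diagnosis: epoll_pwait keeps returning immediately with an event for descriptor 13 (the HTTP connection that lsof shows in CLOSE_WAIT), while the process keeps calling recvfrom on descriptor 14 (the MQTT connection), which has nothing to read and so returns EAGAIN. The sketch below reproduces only that syscall pattern with the Python standard library. The two socket pairs and their names are hypothetical stand-ins for fds 13 and 14; this is not code from ecowitt2mqtt, asyncio-mqtt, or uvloop.

```python
import errno
import selectors
import socket

# Hypothetical stand-ins for the two descriptors in the trace.
http_ours, http_peer = socket.socketpair()  # plays the role of fd 13
mqtt_ours, mqtt_peer = socket.socketpair()  # plays the role of fd 14
http_ours.setblocking(False)
mqtt_ours.setblocking(False)

sel = selectors.DefaultSelector()
sel.register(http_ours, selectors.EVENT_READ, data="http")

# The remote side hangs up. Our end now has an EOF pending, and a
# level-triggered selector reports it as readable until we read or close it.
http_peer.close()


def spin_once():
    """One iteration of the loop seen in strace: poll, then read the wrong socket."""
    ready = [key.data for key, _ in sel.select(timeout=0.1)]
    try:
        mqtt_ours.recv(1)  # nothing was sent on this socket
        err = None
    except BlockingIOError as exc:
        err = exc.errno  # EAGAIN: "Resource temporarily unavailable"
    return ready, err


# Every iteration wakes up at once for "http" and fails on the other socket.
results = [spin_once() for _ in range(3)]

sel.unregister(http_ours)
sel.close()
for s in (http_ours, mqtt_ours, mqtt_peer):
    s.close()
```

Because the half-closed socket is never read or closed, the selector never blocks for its timeout, which is the kind of loop that would pin one core. Whether that is what actually happens inside the stuck process is not established by the output above.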
It has been suggested by someone much more knowledgeable than me that the issue could be here:
I am running the latest version of asyncio-mqtt on the RPi.
Does anybody know why this resource becomes unavailable and chokes my RPi?
thanks!
About this issue
- State: closed
- Created 2 years ago
- Comments: 17 (13 by maintainers)
FYI, digging in and keeping a single connection open won’t work with ecowitt2mqtt’s architecture: since we’re a Uvicorn + FastAPI application, there’s no feasible way to have a reconnect “loop” (similar to the docs) because we publish when we receive REST API calls via FastAPI. Happy to go into more detail if interested, but more importantly, we’ll need to connect/disconnect with each payload. If that’s always going to spike CPU, I’m not certain we can do anything…
EDIT: I lied. 😂 https://github.com/bachya/ecowitt2mqtt/pull/236
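For readers wondering how a request-driven app can still share one long-lived connection, a common pattern is to have request handlers put messages on a queue that a single background task drains. The sketch below shows only that pattern with the standard library. FakeConnection, publisher_worker, and handle_request are hypothetical names, no real MQTT client is used, and this is not necessarily what the linked PR does.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for an already-connected MQTT client."""

    def __init__(self):
        self.sent = []

    async def publish(self, topic, payload):
        self.sent.append((topic, payload))


async def publisher_worker(queue, conn):
    """Drain the queue over one connection until a None sentinel arrives."""
    while True:
        item = await queue.get()
        try:
            if item is None:
                return
            topic, payload = item
            await conn.publish(topic, payload)
        finally:
            queue.task_done()


async def handle_request(queue, topic, payload):
    """What a REST handler would do: enqueue the message and return."""
    await queue.put((topic, payload))


async def main():
    queue = asyncio.Queue()
    conn = FakeConnection()
    worker = asyncio.create_task(publisher_worker(queue, conn))

    # Two simulated incoming REST calls.
    await handle_request(queue, "weather/temp", "21.5")
    await handle_request(queue, "weather/humidity", "40")

    await queue.put(None)  # ask the worker to stop
    await worker
    return conn.sent


sent = asyncio.run(main())
```

In a real application the worker would be started and stopped with the web app's lifecycle hooks, and reconnect handling would live inside the worker instead of in each request handler.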
Sorry about the silence—I was on a short vacation.
Glad that you figured it out. Feel free to open new issues/discussions/PRs if you find something in asyncio-mqtt that you would like to add/change/fix. 👍
For what it’s worth, it does “work” in that most users successfully publish multiple messages with the same client (re-entered). Whether that’s correct practice (or causing issues under the surface) is obviously a different matter.
Yep, got it.
Got it. I’m looking for less work, so I’ll check out anyio. 😂 Thanks for the recommendation!