uptime-kuma: Shrinking database / blocking database operations give false downtime
⚠️ Please verify that this bug has NOT been raised before.
- I checked and didn’t find a similar issue
🛡️ Security Policy
- I agree to have read this project Security Policy
Description
One of my monitors was reported as down with the error: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
I was deleting a monitor that probably had a lot of data (I previously had the history retention set to 365 days; insane, I know), so the deletion took a long time, which caused the monitor to be reported as down.
The issue can also be caused by a manually triggered Shrink Database operation.
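For context on where the error comes from: with SQLite, Knex keeps (by default) a single pooled connection, so any long-running statement makes every other query wait in the pool queue until the acquire timeout fires. Below is a minimal, hypothetical sketch; the pool settings, file path, and table/column names are assumptions for illustration, not Uptime Kuma's actual configuration:

```js
// Minimal sketch (assumed settings and table names, not Uptime Kuma's real config).
// With SQLite there is effectively a single writer, so a long-running DELETE or
// VACUUM keeps the one pooled connection busy; every other query queues up and
// eventually fails with "KnexTimeoutError: Timeout acquiring a connection".
const knex = require("knex")({
    client: "sqlite3",
    connection: { filename: "./data/kuma.db" }, // hypothetical path
    useNullAsDefault: true,
    pool: {
        min: 1,
        max: 1,                      // one connection, as is typical for SQLite
        acquireTimeoutMillis: 30000, // waiting queries give up after 30 s
    },
});

async function demo() {
    // Takes minutes on a table holding a year of history and holds the connection:
    const slowDelete = knex("heartbeat").where("monitor_id", 6).del();

    // A concurrent monitor check cannot get a connection in time and throws,
    // which is then reported as the monitor being down/pending:
    const check = knex("monitor").where("id", 7).first();

    await Promise.allSettled([slowDelete, check]);
}

demo().finally(() => knex.destroy());
```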
👟 Reproduction steps
- Have some monitors (any type should be fine, as long as the check interval is short, e.g. at most 20 seconds)
- Have a large database (say >512MB)
- Shrink database (Settings > Monitor History > Shrink database)
- Observe the behavior
👀 Expected behavior
The monitors continue to be reported as “up”, and the correct data is saved later (if needed).
😓 Actual Behavior
The monitors are considered “down” because a blocking database operation is happening.
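As far as I can tell, the manual shrink boils down to a SQLite VACUUM (this is an assumption on my part), which rewrites the entire database file and therefore holds the single pooled connection for the whole duration, e.g. something like:

```js
// Hedged sketch: assuming "Shrink Database" ultimately issues a SQLite VACUUM,
// the usual way to reclaim space after large deletions. VACUUM rebuilds the
// whole database file, so on a multi-hundred-MB database it can run for a long
// time while holding the only pooled connection, starving the monitor checks.
async function shrinkDatabase(knex) {
    await knex.raw("VACUUM");
}
```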
🐻 Uptime-Kuma Version
1.19.0
💻 Operating System and Arch
macOS 13.1
🌐 Browser
LibreWolf 108.0.1-1
🐋 Docker Version
No response
🟩 NodeJS Version
v16.18.1
📝 Relevant log output
Dec 25 12:36:43 laptop-server npm[1618271]: 2022-12-25T11:36:43Z [RATE-LIMIT] INFO: remaining requests: 20
Dec 25 12:37:06 laptop-server npm[1618271]: 2022-12-25T11:37:06Z [MONITOR] WARN: Monitor #6 'mastodon/mcrblgng (micro.)': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
| Max retries: 12 | Retry: 1 | Retry Interval: 60 seconds | Type: keyword
Dec 25 12:37:10 laptop-server npm[1618271]: 2022-12-25T11:37:10Z [MONITOR] WARN: Monitor #7 'peertube (videos.)': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max re
tries: 12 | Retry: 1 | Retry Interval: 30 seconds | Type: keyword
Dec 25 12:37:10 laptop-server npm[1618271]: 2022-12-25T11:37:10Z [MONITOR] WARN: Monitor #40 'conduit (conduit.hazmat.)': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
| Max retries: 15 | Retry: 1 | Retry Interval: 60 seconds | Type: keyword
Dec 25 12:37:11 laptop-server npm[1618271]: 2022-12-25T11:37:11Z [MONITOR] WARN: Monitor #36 'unbound DNS server (telemetry)': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) c
all? | Max retries: 2 | Retry: 1 | Retry Interval: 30 seconds | Type: keyword
Dec 25 12:37:12 laptop-server npm[1618271]: Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
Dec 25 12:37:12 laptop-server npm[1618271]: at Client_SQLite3.acquireConnection (/home/uptime/uptime-kuma/node_modules/knex/lib/client.js:305:26)
Dec 25 12:37:12 laptop-server npm[1618271]: at async Runner.ensureConnection (/home/uptime/uptime-kuma/node_modules/knex/lib/execution/runner.js:259:28)
Dec 25 12:37:12 laptop-server npm[1618271]: at async Runner.run (/home/uptime/uptime-kuma/node_modules/knex/lib/execution/runner.js:30:19)
Dec 25 12:37:12 laptop-server npm[1618271]: at async RedBeanNode.findOne (/home/uptime/uptime-kuma/node_modules/redbean-node/dist/redbean-node.js:515:19)
Dec 25 12:37:12 laptop-server npm[1618271]: at async Function.handleStatusPageResponse (/home/uptime/uptime-kuma/server/model/status_page.js:23:26)
Dec 25 12:37:12 laptop-server npm[1618271]: at async /home/uptime/uptime-kuma/server/routers/status-page-router.js:16:5 {
Dec 25 12:37:12 laptop-server npm[1618271]: sql: undefined,
Dec 25 12:37:12 laptop-server npm[1618271]: bindings: undefined
Dec 25 12:37:12 laptop-server npm[1618271]: }
Dec 25 12:37:12 laptop-server npm[1618271]: at process.<anonymous> (/home/uptime/uptime-kuma/server/server.js:1779:13)
Dec 25 12:37:12 laptop-server npm[1618271]: at process.emit (node:events:513:28)
Dec 25 12:37:12 laptop-server npm[1618271]: at emit (node:internal/process/promises:140:20)
Dec 25 12:37:12 laptop-server npm[1618271]: at processPromiseRejections (node:internal/process/promises:274:27)
Dec 25 12:37:12 laptop-server npm[1618271]: at processTicksAndRejections (node:internal/process/task_queues:97:32)
Dec 25 12:37:13 laptop-server npm[1618271]: If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
Dec 25 12:37:13 laptop-server npm[1618271]: 2022-12-25T11:37:13Z [MONITOR] WARN: Monitor #44 'prometheus (prometheus.)': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? |
Max retries: 12 | Retry: 1 | Retry Interval: 60 seconds | Type: keyword
Dec 25 12:37:14 laptop-server npm[1618271]: 2022-12-25T11:37:14Z [AUTH] INFO: Successfully logged in user jackson. IP=176.241.52.131
Dec 25 12:37:15 laptop-server npm[1618271]: 2022-12-25T11:37:15Z [RATE-LIMIT] INFO: remaining requests: 20
Dec 25 12:37:19 laptop-server npm[1618271]: 2022-12-25T11:37:19Z [MONITOR] WARN: Monitor #34 'ntfy localhost': Failing: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Interval:
20 seconds | Type: http | Down Count: 0 | Resend Interval: 15
Dec 25 12:37:43 laptop-server npm[1618271]: 2022-12-25T11:37:43Z [RATE-LIMIT] INFO: remaining requests: 20
About this issue
- Original URL
- State: open
- Created 2 years ago
- Comments: 19 (9 by maintainers)
@louislam if you’re considering supporting other databases, I would personally suggest weighing the pros/cons of PostgreSQL and MySQL.
I don’t have MySQL on my server because nothing uses it. Pretty much everything I run (PeerTube, Mastodon, Synapse (Matrix homeserver)) uses Postgres.
Anyway, here are some existing issues/comments:
MariaDB support is already merged, and I’m submitting Postgres support in #3748.
Maybe just this for now:
OK, in the documentation (https://www.sqlite.org/pragma.html#pragma_auto_vacuum) we have this:
IMO, we can write:
or just:
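(The quoted documentation excerpts and the exact wording suggested above are not reproduced here.) Purely as an illustration of the auto_vacuum pragma linked above, and not necessarily what was proposed, incremental auto-vacuum would let free pages be reclaimed in small steps instead of one long blocking VACUUM:

```js
// Illustrative sketch only: with auto_vacuum = INCREMENTAL, free pages can be
// reclaimed a few at a time, spreading the work out instead of doing one long
// blocking VACUUM. Note that switching auto_vacuum on an existing database
// only takes effect after a full VACUUM has been run once.
async function incrementalShrink(knex) {
    await knex.raw("PRAGMA auto_vacuum = INCREMENTAL");
    await knex.raw("PRAGMA incremental_vacuum(100)"); // reclaim up to 100 pages
}
```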
Yeah, indeed, shrinking is not the same as deleting monitors. I should have read more carefully; sorry about that.
@cypa Please see the performance changes we are making for v2.0: https://github.com/louislam/uptime-kuma/issues/4500. While optimising the indexes might also be an option, we have chosen to do aggregation instead. The relevant PR here is https://github.com/louislam/uptime-kuma/pull/2750. Deleting in smaller batches (i.e. allowing other operations to sneak in) would be the way to resolve this issue; see the sketch below.
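A hedged sketch of what batched deletion could look like (the heartbeat table and column names are assumptions for illustration, not the actual code in the linked PR):

```js
// Sketch of batched deletion (table/column names assumed for illustration).
// Deleting a bounded number of rows per statement and yielding between batches
// lets queued queries (monitor checks, status pages) acquire the connection.
async function deleteMonitorHistoryInBatches(knex, monitorID, batchSize = 1000) {
    // SQLite only supports DELETE ... LIMIT when compiled with
    // SQLITE_ENABLE_UPDATE_DELETE_LIMIT, so delete by selected row ids instead.
    for (;;) {
        const ids = await knex("heartbeat")
            .where("monitor_id", monitorID)
            .limit(batchSize)
            .pluck("id");

        if (ids.length === 0) {
            break;
        }

        await knex("heartbeat").whereIn("id", ids).del();

        // Pause briefly so other pending queries can run between batches.
        await new Promise((resolve) => setTimeout(resolve, 50));
    }
}
```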