questdb: Potential memory leak in 6.2
Describe the bug
We are running QuestDB 6.2 (container) and ingesting data at 14 kHz (via ILP, single writer) over a long period (a couple of days). During this period, we do not run a single query.
Over this time, we observed that QuestDB's memory usage rises steadily until all available memory is allocated. When limiting memory with cgroups (technically via the docker-compose mem_limit setting), we observe that the process is periodically OOM-killed and restarted. We further observed the following:
- we do not overrun the DB (all data ends up on disk up to the last commit before the OOM; checked after restart)
- when ingestion is stopped, memory usage does not shrink
- the JVM heap is limited to 1 GB (this could be mentioned in the docs)
- restarts happen very periodically (roughly every 4 hours)
Time between OOMs (restarts):
- 03:48:02
- 04:03:43
- 03:53:27
- 03:58:54
- 05:00:44
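For reference, the ingestion described above is plain ILP over a single TCP connection to port 9009. A rough sketch of the pattern (placeholder table and column names, raw socket instead of our actual writer, rate handling simplified):

```python
import socket
import time

# Minimal ILP-over-TCP writer sketch. The wire format is one line per row:
#   <table>,<symbol>=<value> <field>=<value> <timestamp-ns>\n
# "sensor_data", "device" and "value" are placeholders, not our real schema.
sock = socket.create_connection(("localhost", 9009))
try:
    while True:
        ts_ns = time.time_ns()
        line = f"sensor_data,device=dev1 value=1.23 {ts_ns}\n"
        sock.sendall(line.encode("utf-8"))
        # ~14 kHz in production; a plain sleep is only an approximation here
        time.sleep(1 / 14000)
finally:
    sock.close()
```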
docker-compose.yml:
```yaml
services:
  questdb:
    image: docker.io/questdb/questdb:6.2
    volumes:
      - /mnt/db-storage:/root/.questdb
      # Note: setting log-level via env var did not work
      - ./questdb-conf/log.conf:/root/.questdb/conf/log.conf:ro
    ports:
      - '8812:8812'
      - '9000:9000'
      - '9009:9009'
    environment:
      - QDB_LINE_TCP_MAINTENANCE_JOB_INTERVAL=10000
      - QDB_LINE_TCP_DEFAULT_PARTITION_BY=HOUR
      - QDB_CAIRO_COMMIT_LAG=10000
      - QDB_HTTP_QUERY_CACHE_ENABLED=false
      - QDB_PG_SELECT_CACHE_ENABLED=false
      - QDB_PG_INSERT_CACHE_ENABLED=false
    cpus: 4
    mem_limit: 2G
    restart: unless-stopped
```
QuestDB output after startup:
```
2022-02-08T06:36:59.104203Z A server-main Server config : /root/.questdb/conf/server.conf
2022-02-08T06:36:59.111826Z A server-main Config changes applied:
2022-02-08T06:36:59.111843Z A server-main http.enabled : true
2022-02-08T06:36:59.111870Z A server-main tcp.enabled : true
2022-02-08T06:36:59.111891Z A server-main pg.enabled : true
2022-02-08T06:36:59.111912Z A server-main open database [id=8916382354024914915.-5271762009964388491]
2022-02-08T06:36:59.111948Z A server-main platform [bit=64]
2022-02-08T06:36:59.111969Z A server-main OS/Arch: linux/amd64 [AVX2,8]
2022-02-08T06:36:59.112352Z A server-main available CPUs: 4
2022-02-08T06:36:59.112388Z A server-main db root: /root/.questdb/db
2022-02-08T06:36:59.112410Z A server-main backup root: null
2022-02-08T06:36:59.112482Z A server-main db file system magic: 0x6edc97c2 [BTRFS] SUPPORTED
2022-02-08T06:36:59.112712Z A server-main SQL JIT compiler mode: off
2022-02-08T06:36:59.298217Z A i.q.TelemetryJob instance [id=0x05d6f771771573000001f41cdc0163, enabled=true]
2022-02-08T06:36:59.307648Z A http-server listening on 0.0.0.0:9000 [fd=58 backlog=256]
2022-02-08T06:36:59.347898Z A pg-server listening on 0.0.0.0:8812 [fd=62 backlog=10]
2022-02-08T06:36:59.380170Z A tcp-line-server listening on 0.0.0.0:9009 [fd=64 backlog=256]
2022-02-08T06:36:59.408147Z A server-main enjoy
2022-02-08T06:36:59.345514Z A http-min-server listening on 0.0.0.0:9003 [fd=60 backlog=4]
```
To reproduce
No response
Expected Behavior
A fixed upper bound on allocated memory. At least for the containerized version, this limit should be read from the cgroup.
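To illustrate what we mean by reading the limit from the cgroup (a sketch only, not how QuestDB is currently configured; the file paths differ between cgroup v1 and v2):

```python
from pathlib import Path
from typing import Optional

def cgroup_memory_limit_bytes() -> Optional[int]:
    """Best-effort read of the container's memory limit from the cgroup filesystem."""
    # cgroup v2 (unified hierarchy): limit is in memory.max, "max" means unlimited
    v2 = Path("/sys/fs/cgroup/memory.max")
    # cgroup v1: memory controller exposes memory.limit_in_bytes
    # (reports a very large value when no limit is set)
    v1 = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")
    for path in (v2, v1):
        if path.exists():
            raw = path.read_text().strip()
            if raw != "max":
                return int(raw)
    return None  # no cgroup memory limit found

if __name__ == "__main__":
    limit = cgroup_memory_limit_bytes()
    print(f"cgroup memory limit: {limit} bytes" if limit else "no cgroup memory limit")
```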
Environment
- **QuestDB version**: 6.2 (container)
- **OS**: Debian Bullseye
- **container runtime**: podman + crun (rootless)
- **storage**: BTRFS (for DB), fuse-overlayfs for container rootfs
Additional context
Without memory limitation:
About this issue
- State: closed
- Created 2 years ago
- Comments: 19 (9 by maintainers)
@pswu11 Looks like that fixed the issue. We no longer observe monotonic growth of the RSS.
Right after executing the command, the DB crashed and restarted (no memory stats output). After that (now running at ~250 MB of memory):