questdb: Potential memory leak in 6.2

Describe the bug

We are running QuestDB 6.2 (container) and ingesting data at 14 kHz (via ILP, single writer) over a long period. During this period (a couple of days), we do not run a single query. (A minimal sketch of the writer follows the list below.)

We observed that QuestDB's memory usage rises over time, eventually consuming all available memory. When limiting memory via cgroups (technically via the docker-compose mem_limit setting), we observe that the process is periodically OOM-killed and restarted. Furthermore, we observed the following:

  • we do not overrun the DB (all data up to the last commit before the OOM ends up on disk, verified after restart)
  • when we stop the ingestion, memory usage does not shrink
  • the JVM heap is limited to 1 GB (this would be worth mentioning in the docs)
  • restarts happen at very regular intervals
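
For reference, the writer is conceptually similar to the sketch below: plain ILP over TCP, one line per sample. The table and column names here are made up purely for illustration; our real schema differs.

import socket
import time

def run_writer(host="localhost", port=9009, rate_hz=14_000):
    # single long-lived TCP connection to the ILP endpoint (port 9009)
    sock = socket.create_connection((host, port))
    interval = 1.0 / rate_hz
    try:
        while True:
            ts_ns = time.time_ns()
            # one ILP line per sample: table, symbol column, field columns, designated timestamp (ns)
            # table/column names "sensors", "device", "reading", "seq" are hypothetical
            line = f"sensors,device=dev1 reading=0.5,seq={ts_ns % 1000}i {ts_ns}\n"
            sock.sendall(line.encode("utf-8"))
            # note: time.sleep cannot actually sustain 14 kHz; this only illustrates the protocol
            time.sleep(interval)
    finally:
        sock.close()

if __name__ == "__main__":
    run_writer()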

time between OOMs (restarts)

  • 03:48:02
  • 04:03:43
  • 03:53:27
  • 03:58:54
  • 05:00:44
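
Back-of-the-envelope, using our numbers and purely for a sense of scale: if the 2 GiB limit were exhausted by linear growth over roughly four hours at 14 kHz, that would correspond to about 10 bytes of net growth per ingested row.

# rough per-row growth implied by the restart period (back-of-the-envelope only)
limit_bytes = 2 * 1024**3      # mem_limit: 2G
rate_hz = 14_000               # ingest rate
period_s = 4 * 3600            # approximate time between OOM restarts
print(limit_bytes / (rate_hz * period_s))   # ≈ 10.6 bytes per row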

docker-compose.yml

services:
  questdb:
    image: docker.io/questdb/questdb:6.2
    volumes:
      - /mnt/db-storage:/root/.questdb
      # Note: setting log-level via env var did not work
      - ./questdb-conf/log.conf:/root/.questdb/conf/log.conf:ro
    ports:
      - '8812:8812'
      - '9000:9000'
      - '9009:9009'
    environment:
      - QDB_LINE_TCP_MAINTENANCE_JOB_INTERVAL=10000
      - QDB_LINE_TCP_DEFAULT_PARTITION_BY=HOUR
      - QDB_CAIRO_COMMIT_LAG=10000
      - QDB_HTTP_QUERY_CACHE_ENABLED=false
      - QDB_PG_SELECT_CACHE_ENABLED=false
      - QDB_PG_INSERT_CACHE_ENABLED=false
    cpus: 4
    mem_limit: 2G
    restart: unless-stopped

QuestDB output after startup

2022-02-08T06:36:59.104203Z A server-main Server config : /root/.questdb/conf/server.conf
2022-02-08T06:36:59.111826Z A server-main Config changes applied:
2022-02-08T06:36:59.111843Z A server-main   http.enabled : true
2022-02-08T06:36:59.111870Z A server-main   tcp.enabled  : true
2022-02-08T06:36:59.111891Z A server-main   pg.enabled   : true
2022-02-08T06:36:59.111912Z A server-main open database [id=8916382354024914915.-5271762009964388491]
2022-02-08T06:36:59.111948Z A server-main platform [bit=64]
2022-02-08T06:36:59.111969Z A server-main OS/Arch: linux/amd64 [AVX2,8]
2022-02-08T06:36:59.112352Z A server-main available CPUs: 4
2022-02-08T06:36:59.112388Z A server-main db root: /root/.questdb/db
2022-02-08T06:36:59.112410Z A server-main backup root: null
2022-02-08T06:36:59.112482Z A server-main db file system magic: 0x6edc97c2 [BTRFS] SUPPORTED
2022-02-08T06:36:59.112712Z A server-main SQL JIT compiler mode: off
2022-02-08T06:36:59.298217Z A i.q.TelemetryJob instance [id=0x05d6f771771573000001f41cdc0163, enabled=true]
2022-02-08T06:36:59.307648Z A http-server listening on 0.0.0.0:9000 [fd=58 backlog=256]
2022-02-08T06:36:59.347898Z A pg-server listening on 0.0.0.0:8812 [fd=62 backlog=10]
2022-02-08T06:36:59.380170Z A tcp-line-server listening on 0.0.0.0:9009 [fd=64 backlog=256]
2022-02-08T06:36:59.408147Z A server-main enjoy
2022-02-08T06:36:59.345514Z A http-min-server listening on 0.0.0.0:9003 [fd=60 backlog=4]

To reproduce

No response

Expected Behavior

A fixed upper bound on allocated memory. At least for the containerized version, this limit should be read from the cgroup.
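
For illustration only, reading that limit from inside the container could look roughly like the sketch below (cgroup v1 and v2 expose it under different paths):

from pathlib import Path

def cgroup_mem_limit_bytes():
    # cgroup v2 (unified hierarchy) vs. cgroup v1 paths
    candidates = (
        Path("/sys/fs/cgroup/memory.max"),
        Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
    )
    for p in candidates:
        if p.exists():
            value = p.read_text().strip()
            if value != "max":
                return int(value)
    return None  # no limit configured or not running under a limited cgroup

if __name__ == "__main__":
    print(cgroup_mem_limit_bytes())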

Environment

- **QuestDB version**: 6.2 (container)
- **OS**: Debian Bullseye
- **container runtime**: podman + crun (rootless)
- **storage**: BTRFS (for DB), fuse-overlayfs for container rootfs

Additional context

Without memory limitation:

[chart: questdb-leak — memory usage over time]
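
A curve like this can be reproduced by periodically sampling the process RSS, e.g. with a small sketch like the following (reads VmRSS from /proc/<pid>/status once per minute):

import sys
import time

def rss_kib(pid):
    # VmRSS is reported in kB in /proc/<pid>/status
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return None

if __name__ == "__main__":
    pid = int(sys.argv[1])
    while True:
        print(time.strftime("%H:%M:%S"), rss_kib(pid))
        time.sleep(60)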

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@pswu11 Looks like that fixed the issue. We no longer observe monotonic growth of the RSS.

Right after executing the command, the DB crashed and restarted (no memory stats output). After that (running at ~250 MB of memory):

        TOTAL: 621618615
        MMAP_DEFAULT: 274600
        NATIVE_DEFAULT: 31734985
        MMAP_O3: 0
        NATIVE_O3: 32
        NATIVE_RECORD_CHAIN: 0
        MMAP_TABLE_WRITER: 352354304
        NATIVE_TREE_CHAIN: 0
        MMAP_TABLE_READER: 90
        NATIVE_COMPACT_MAP: 0
        NATIVE_FAST_MAP: 0
        NATIVE_FAST_MAP_LONG_LIST: 0
        NATIVE_HTTP_CONN: 102781568
        NATIVE_PGW_CONN: 134217728
        MMAP_INDEX_READER: 46172
        MMAP_INDEX_WRITER: 53488
        MMAP_INDEX_SLIDER: 0
        MMAP_BLOCK_WRITER: 0
        NATIVE_REPL: 131072
        NATIVE_SAMPLE_BY_LONG_LIST: 0
        NATIVE_LATEST_BY_LONG_LIST: 0
        NATIVE_JIT_LONG_LIST: 0
        NATIVE_LONG_LIST: 0
        NATIVE_JIT: 24576
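
Assuming these counters are byte counts (which matches the totals above), the larger entries convert to:

# convert the largest counters from the dump above to MiB (assuming byte counts)
counters = {
    "TOTAL": 621618615,
    "MMAP_TABLE_WRITER": 352354304,
    "NATIVE_PGW_CONN": 134217728,
    "NATIVE_HTTP_CONN": 102781568,
}
for name, value in counters.items():
    print(f"{name}: {value / 1024**2:.1f} MiB")
# TOTAL ≈ 592.8, MMAP_TABLE_WRITER ≈ 336.0, NATIVE_PGW_CONN = 128.0, NATIVE_HTTP_CONN ≈ 98.0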