questdb: Potential memory leak in 6.2

Describe the bug

We are running QuestDB 6.2 (container) and ingesting data at 14 kHz (via ILP, single writer) over a long period. During this period (a couple of days), we do not run a single query. (A minimal sketch of the writer follows the list below.)

We observed that QuestDB's memory usage rises over time, eventually consuming all available memory. When limiting memory via cgroups (technically via the docker-compose mem_limit setting), we observe that the process is periodically OOM-killed and restarted. Furthermore, we observed the following:

  • we do not overrun the DB (all data up to the last commit before the OOM ends up on disk, verified after restart)
  • when we stop the ingestion, memory usage does not shrink
  • the JVM heap is limited to 1 GB (this would be worth mentioning in the docs)
  • restarts happen at very regular intervals
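
For reference, the writer is conceptually similar to the sketch below: plain ILP over TCP, one line per sample. The table and column names here are made up purely for illustration; our real schema differs.

import socket
import time

def run_writer(host="localhost", port=9009, rate_hz=14_000):
    # single long-lived TCP connection to the ILP endpoint (port 9009)
    sock = socket.create_connection((host, port))
    interval = 1.0 / rate_hz
    try:
        while True:
            ts_ns = time.time_ns()
            # one ILP line per sample: table, symbol column, field columns, designated timestamp (ns)
            # table/column names "sensors", "device", "reading", "seq" are hypothetical
            line = f"sensors,device=dev1 reading=0.5,seq={ts_ns % 1000}i {ts_ns}\n"
            sock.sendall(line.encode("utf-8"))
            # note: time.sleep cannot actually sustain 14 kHz; this only illustrates the protocol
            time.sleep(interval)
    finally:
        sock.close()

if __name__ == "__main__":
    run_writer()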

time between OOMs (restarts)

  • 03:48:02
  • 04:03:43
  • 03:53:27
  • 03:58:54
  • 05:00:44
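
Back-of-the-envelope, using our numbers and purely for a sense of scale: if the 2 GiB limit were exhausted by linear growth over roughly four hours at 14 kHz, that would correspond to about 10 bytes of net growth per ingested row.

# rough per-row growth implied by the restart period (back-of-the-envelope only)
limit_bytes = 2 * 1024**3      # mem_limit: 2G
rate_hz = 14_000               # ingest rate
period_s = 4 * 3600            # approximate time between OOM restarts
print(limit_bytes / (rate_hz * period_s))   # ≈ 10.6 bytes per row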

docker-compose.yml

services:
  questdb:
    image: docker.io/questdb/questdb:6.2
    volumes:
      - /mnt/db-storage:/root/.questdb
      # Note: setting log-level via env var did not work
      - ./questdb-conf/log.conf:/root/.questdb/conf/log.conf:ro
    ports:
      - '8812:8812'
      - '9000:9000'
      - '9009:9009'
    environment:
      - QDB_LINE_TCP_MAINTENANCE_JOB_INTERVAL=10000
      - QDB_LINE_TCP_DEFAULT_PARTITION_BY=HOUR
      - QDB_CAIRO_COMMIT_LAG=10000
      - QDB_HTTP_QUERY_CACHE_ENABLED=false
      - QDB_PG_SELECT_CACHE_ENABLED=false
      - QDB_PG_INSERT_CACHE_ENABLED=false
    cpus: 4
    mem_limit: 2G
    restart: unless-stopped

QuestDB output after startup

2022-02-08T06:36:59.104203Z A server-main Server config : /root/.questdb/conf/server.conf
2022-02-08T06:36:59.111826Z A server-main Config changes applied:
2022-02-08T06:36:59.111843Z A server-main   http.enabled : true
2022-02-08T06:36:59.111870Z A server-main   tcp.enabled  : true
2022-02-08T06:36:59.111891Z A server-main   pg.enabled   : true
2022-02-08T06:36:59.111912Z A server-main open database [id=8916382354024914915.-5271762009964388491]
2022-02-08T06:36:59.111948Z A server-main platform [bit=64]
2022-02-08T06:36:59.111969Z A server-main OS/Arch: linux/amd64 [AVX2,8]
2022-02-08T06:36:59.112352Z A server-main available CPUs: 4
2022-02-08T06:36:59.112388Z A server-main db root: /root/.questdb/db
2022-02-08T06:36:59.112410Z A server-main backup root: null
2022-02-08T06:36:59.112482Z A server-main db file system magic: 0x6edc97c2 [BTRFS] SUPPORTED
2022-02-08T06:36:59.112712Z A server-main SQL JIT compiler mode: off
2022-02-08T06:36:59.298217Z A i.q.TelemetryJob instance [id=0x05d6f771771573000001f41cdc0163, enabled=true]
2022-02-08T06:36:59.307648Z A http-server listening on 0.0.0.0:9000 [fd=58 backlog=256]
2022-02-08T06:36:59.347898Z A pg-server listening on 0.0.0.0:8812 [fd=62 backlog=10]
2022-02-08T06:36:59.380170Z A tcp-line-server listening on 0.0.0.0:9009 [fd=64 backlog=256]
2022-02-08T06:36:59.408147Z A server-main enjoy
2022-02-08T06:36:59.345514Z A http-min-server listening on 0.0.0.0:9003 [fd=60 backlog=4]

To reproduce

No response

Expected Behavior

A fixed upper bound on allocated memory. At least for the containerized version, this limit should be read from the cgroup.
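
For illustration only, reading that limit from inside the container could look roughly like the sketch below (cgroup v1 and v2 expose it under different paths):

from pathlib import Path

def cgroup_mem_limit_bytes():
    # cgroup v2 (unified hierarchy) vs. cgroup v1 paths
    candidates = (
        Path("/sys/fs/cgroup/memory.max"),
        Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
    )
    for p in candidates:
        if p.exists():
            value = p.read_text().strip()
            if value != "max":
                return int(value)
    return None  # no limit configured or not running under a limited cgroup

if __name__ == "__main__":
    print(cgroup_mem_limit_bytes())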

Environment

- **QuestDB version**: 6.2 (container)
- **OS**: Debian Bullseye
- **container runtime**: podman + crun (rootless)
- **storage**: BTRFS (for DB), fuse-overlayfs for container rootfs

Additional context

Without memory limitation:

[chart: questdb-leak — memory usage over time]
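
A curve like this can be reproduced by periodically sampling the process RSS, e.g. with a small sketch like the following (reads VmRSS from /proc/<pid>/status once per minute):

import sys
import time

def rss_kib(pid):
    # VmRSS is reported in kB in /proc/<pid>/status
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return None

if __name__ == "__main__":
    pid = int(sys.argv[1])
    while True:
        print(time.strftime("%H:%M:%S"), rss_kib(pid))
        time.sleep(60)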

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@pswu11 Looks like that fixed the issue. We no longer observe monotonic growth of the RSS.

Right after executing the command, the DB crashed and restarted (no memory stats output). After that (running at ~250 MB of memory):

        TOTAL: 621618615
        MMAP_DEFAULT: 274600
        NATIVE_DEFAULT: 31734985
        MMAP_O3: 0
        NATIVE_O3: 32
        NATIVE_RECORD_CHAIN: 0
        MMAP_TABLE_WRITER: 352354304
        NATIVE_TREE_CHAIN: 0
        MMAP_TABLE_READER: 90
        NATIVE_COMPACT_MAP: 0
        NATIVE_FAST_MAP: 0
        NATIVE_FAST_MAP_LONG_LIST: 0
        NATIVE_HTTP_CONN: 102781568
        NATIVE_PGW_CONN: 134217728
        MMAP_INDEX_READER: 46172
        MMAP_INDEX_WRITER: 53488
        MMAP_INDEX_SLIDER: 0
        MMAP_BLOCK_WRITER: 0
        NATIVE_REPL: 131072
        NATIVE_SAMPLE_BY_LONG_LIST: 0
        NATIVE_LATEST_BY_LONG_LIST: 0
        NATIVE_JIT_LONG_LIST: 0
        NATIVE_LONG_LIST: 0
        NATIVE_JIT: 24576
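
Assuming these counters are byte counts (which matches the totals above), the larger entries convert to:

# convert the largest counters from the dump above to MiB (assuming byte counts)
counters = {
    "TOTAL": 621618615,
    "MMAP_TABLE_WRITER": 352354304,
    "NATIVE_PGW_CONN": 134217728,
    "NATIVE_HTTP_CONN": 102781568,
}
for name, value in counters.items():
    print(f"{name}: {value / 1024**2:.1f} MiB")
# TOTAL ≈ 592.8, MMAP_TABLE_WRITER ≈ 336.0, NATIVE_PGW_CONN = 128.0, NATIVE_HTTP_CONN ≈ 98.0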