duckdb: Out of Memory Error: Failed to allocate block of 13230080 bytes

What happens?

I am running the following query, with a window function and a QUALIFY clause, on a GCP n2-highmem-64 instance (64 vCPUs, 512 GB RAM, and plenty of free swap just in case).

-- perform query
CREATE TABLE tbl_out AS
SELECT
	A, B, C,
	row_number() OVER (PARTITION BY A, B ORDER BY C DESC) AS D
FROM tbl_in
QUALIFY row_number() OVER (PARTITION BY A, B ORDER BY C DESC) <= 5
ORDER BY A, B, D ASC;
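
For reference, the QUALIFY form above is equivalent to computing the row number once in a subquery and filtering on it in the outer query. A minimal sketch of that rewrite (same table and column names; whether it changes the memory behaviour is untested):

-- equivalent rewrite: compute the window function in a subquery, filter outside
CREATE TABLE tbl_out AS
SELECT A, B, C, D
FROM (
	SELECT
		A, B, C,
		row_number() OVER (PARTITION BY A, B ORDER BY C DESC) AS D
	FROM tbl_in
) AS t
WHERE D <= 5
ORDER BY A, B, D ASC;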

DuckDB is configured as follows:

-- only use 16 threads and 200G RAM, set a temporary directory
SET threads TO 16;
SET memory_limit = '200GB';
SET temp_directory='<some path>';

This issue looks similar to https://github.com/duckdb/duckdb/issues/8135: the failure happens at seemingly random points, and there is still about 300 GB of free RAM available when it occurs.

As advised by @hawkfish in that issue, I reduced the thread count, which had helped for previous queries but does not help for this one. I also set the memory limit to less than 50% of the available RAM and configured a temporary directory.

The same query runs fine on my macOS machine (M1 Max, 64 GB RAM, 10 cores).

To Reproduce

I cannot share the original data as it is private, but I generated synthetic data that keeps the original cardinality and reproduces the error.

On a GCP instance n2-highmem-64,

$ curl -O https://storage.googleapis.com/capp-dev/duckdb/tbl_in.parquet
$ duckdb oom

then

SET threads TO 16;
SET memory_limit = '200GB';
SET temp_directory='tmp';

CREATE TABLE tbl_in AS SELECT * FROM 'tbl_in.parquet';

CREATE TABLE tbl_out AS
SELECT
	A, B, C,
	row_number() OVER (PARTITION BY A, B ORDER BY C DESC) AS D
FROM tbl_in
QUALIFY row_number() OVER (PARTITION BY A, B ORDER BY C DESC) <= 5
ORDER BY A, B, D ASC;

You should likely see the following error:

Out of Memory Error: Failed to allocate block of XXXX bytes

even though plenty of free RAM is still available.
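
If it helps with reproduction, the effective limits can be double-checked in the same session before running the query (a sketch; current_setting() is assumed to behave this way in 0.9.2):

-- confirm the limits were applied in this session
SELECT current_setting('threads'), current_setting('memory_limit'), current_setting('temp_directory');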

OS:

Linux instance-2 5.10.0-26-cloud-amd64 #1 SMP Debian 5.10.197-1 (2023-09-29) x86_64 GNU/Linux

DuckDB Version:

0.9.2

DuckDB Client:

CLI & Go

Full Name:

Gawen Arab

Affiliation:

airbuds.fm

Have you tried this on the latest main branch?

I have tested with a release build (and could not test with a main build)

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • Yes, I have

About this issue

  • Original URL
  • State: open
  • Created 7 months ago
  • Comments: 27 (9 by maintainers)

Most upvoted comments

Good question. My objective was actually to see if there was something dumb going on in the window operator (which is my responsibility). The workaround came from my observations of that.

But why the allocator is failing is something I have little clue about; it should probably be addressed by another member of the team. I do know that not all of our memory is tracked, which can be an issue at times. It may also be that once the partitioning had ground through, we were unable to allocate a large untracked block of 1/1024 of the data for the window calculations on one of the cores.
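
For anyone digging further: DuckDB's tracked memory usage can be compared against the configured limit from the CLI, which may help tell tracked pressure apart from untracked allocator overhead (a sketch; the memory_usage and memory_limit columns are assumed to be reported as in the 0.9.x docs):

-- tracked memory vs. configured limit (untracked allocations will not appear here)
PRAGMA database_size;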

I have 512 GB RAM with 64 CPUs, and a ~300 GB dataset in gzipped JSON format raises exactly the same error; I reran the script 3 times with the same result. Reducing threads or memory_limit does not help. Any other suggestions? @beazerj, could you please explain how to disable jemalloc? Thanks

I fixed my OOM issue by disabling jemalloc on Ubuntu; just build with:

GEN=ninja BUILD_JSON=1 SKIP_EXTENSIONS=jemalloc BUILD_JEMALLOC=0 make
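
To check whether a given binary still has jemalloc compiled in, the extension list should show it (a sketch; it assumes jemalloc is reported by duckdb_extensions() on Linux builds where it is statically linked):

-- jemalloc appears as a statically loaded extension when it is compiled in
SELECT extension_name, loaded
FROM duckdb_extensions()
WHERE extension_name = 'jemalloc';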

This is entirely unhelpful to you because I can't actually provide the data, but I wanted to follow up on this. I have updated to 0.10.0 and am still getting allocator issues with jemalloc when joining large tables (~12B rows), which I have only been able to solve by disabling jemalloc. The error that keeps popping up is the same as above: Error: Out of Memory Error: Failed to allocate block of 262144 bytes, even though ostensibly there is memory still available. I'm also on a GCP n2-highmem machine with 856 GB RAM and 128 vCPUs. During execution of one particular join I limit the threads to 24 and the memory to 700 GB. These settings work with the system allocator but not with jemalloc. If it's also helpful, the tables I'm joining contain anywhere from 100 GB to 1 TB of data under whatever compression DuckDB is using.