duckdb: Out of Memory Error: Failed to allocate block of 262144 bytes

What happens?

I am running dbt-duckdb on an m6id.32xlarge EC2 instance (512 GB memory, 128 vCPUs) and will randomly get OOM errors even though the system says it is only using <100 GB of memory. The same pipeline runs fine on my 64 GB M1 Max MacBook Pro. Are there any settings I should be setting when running on Linux?
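
For readers hitting the same thing, one useful first step is to confirm which limits DuckDB actually picked up on the EC2 box. A minimal diagnostic sketch with the Python client is below; the in-memory connection and the idea of checking these three settings are assumptions for illustration, not steps from this report.

```python
import duckdb

# Read back the limits DuckDB detected on this machine.
# An in-memory connection is used here purely for inspection.
con = duckdb.connect()
rows = con.sql("""
    SELECT name, value
    FROM duckdb_settings()
    WHERE name IN ('memory_limit', 'threads', 'temp_directory')
""").fetchall()
for name, value in rows:
    print(f"{name} = {value}")
```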

To Reproduce

I can't give detailed instructions to reproduce since the queries are for internal consumption only. There is no single query that causes the issue, and it happens randomly. We cannot get pipelines to finish on EC2 instances that run fine locally.

OS:

Linux x86_64

DuckDB Version:

0.8.1

DuckDB Client:

Python

Full Name:

Thomas Boles

Affiliation:

SaaSWorks Inc.

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 2
  • Comments: 16 (5 by maintainers)

Most upvoted comments

My latest PR is #8593. There is an issue with releasing memory for data as we scan it, so right now we end up needing double the memory for intermediates, but the current code should be pretty good at paging to disk. I have a test that takes 15 minutes because it is run with a relatively small amount of memory (16 GB) while the data sets are over 100 GB, so it is definitely possible to make things finish under fairly extreme stress. Things to try (a settings sketch follows the list):

  • Set the memory limit lower, because we don't account for all of it.
  • Make sure you have a temporary directory configured so intermediates can spill to disk.
  • Reduce the thread count so that less memory is needed.
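
A minimal sketch of applying those three suggestions from the Python client is below; the database path, memory limit, temp directory, and thread count are illustrative assumptions, not values taken from this issue.

```python
import duckdb

# Hypothetical pipeline database file; substitute your own path.
con = duckdb.connect("pipeline.duckdb")

# Set the limit well below physical RAM, since not all usage is accounted for.
con.sql("SET memory_limit = '200GB'")

# Give DuckDB a temporary directory so operators can spill to disk.
con.sql("SET temp_directory = '/mnt/duckdb_tmp'")

# Fewer threads means fewer concurrent intermediates and less memory pressure.
con.sql("SET threads = 32")
```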

@sorokine thanks for this data point, and it’s great that you found a workaround.

The CSV writer is receiving a major rewrite prior to v0.10, which includes its memory handling, so we hope that most memory allocation issues will go away with the next release.