duckdb: Out of Memory Error: Failed to allocate block of 262144 bytes
What happens?
I am running dbt-duckdb on an m6id.32xlarge EC2 instance (512 GB memory, 128 vCPUs) and randomly get OOM errors even though the system reports it is using less than 100 GB of memory. The same pipeline runs fine on my 64 GB M1 Max MacBook Pro. Are there any settings I should be setting when running on Linux?
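For reference, this is the kind of configuration I am asking about, as a minimal sketch: the memory cap, thread count, and NVMe mount point are illustrative assumptions, and the same SET statements can presumably also be passed through the dbt-duckdb profile's settings.

```python
import duckdb

con = duckdb.connect("pipeline.duckdb")

# Cap DuckDB's memory use below physical RAM so the buffer manager spills
# to disk instead of failing an allocation.
con.execute("SET memory_limit='400GB'")

# Point spill files at local storage with plenty of free space; m6id
# instances ship with instance-store NVMe drives (mount point assumed here).
con.execute("SET temp_directory='/mnt/nvme/duckdb_tmp'")

# Fewer worker threads means fewer concurrent operator intermediates
# competing for the memory budget.
con.execute("SET threads=32")
```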
To Reproduce
I can't give detailed instructions to reproduce since the queries are for internal consumption only. There is no single query that causes the issue, and it happens randomly. We cannot get pipelines to finish on EC2 instances that run fine locally.
OS:
Linux x86_64
DuckDB Version:
0.8.1
DuckDB Client:
Python
Full Name:
Thomas Boles
Affiliation:
SaaSWorks Inc.
Have you tried this on the latest master branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree
About this issue
- State: closed
- Created a year ago
- Reactions: 2
- Comments: 16 (5 by maintainers)
My latest PR is #8593 . There is an issue with releasing memory for data as we scan it, so right now we end up needing double the memory for intermediates, but the current code should be pretty good at paging to disk. I have a test that takes 15 minutes because it is run with a relatively small amount of memory (16GB) while the data sets are over 100GB, so it is definitely possible to make things finish under fairly extreme stress. Things to try:
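A minimal sketch of running under a deliberately tight memory budget, in the spirit of the test described above; the sizes, paths, and limit are illustrative, not taken from that test:

```python
import duckdb

con = duckdb.connect("stress.duckdb")

# Force spilling: cap memory well below the working set and give DuckDB
# a place to page intermediates.
con.execute("SET memory_limit='1GB'")
con.execute("SET temp_directory='/tmp/duckdb_spill'")

# ~100M rows is a few GB in memory, well above the 1GB cap.
con.execute("""
    CREATE OR REPLACE TABLE big AS
    SELECT range AS id, random() AS x FROM range(100000000)
""")

# The full-table sort cannot fit in 1GB, so it must page to temp_directory
# rather than OOM.
con.execute("CREATE OR REPLACE TABLE big_sorted AS SELECT * FROM big ORDER BY x")
print(con.execute("SELECT count(*) FROM big_sorted").fetchone())
```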
@sorokine thanks for this data point, and it’s great that you found a workaround.
The CSV writer is receiving a major rewrite prior to v0.10, including its memory handling, so we hope that most memory allocation issues will go away with the next release.