duckdb: OOM when reading Parquet file
What happens?
DuckDB uses all available memory and is terminated by the OOM killer.
To Reproduce
Allocate a machine with 32 GB RAM, such as c6a.4xlarge on AWS, with Ubuntu 22.04. SSH into that machine and run the following commands:
sudo apt-get update
sudo apt-get install python3-pip
pip install duckdb
wget 'https://datasets.clickhouse.com/hits_compatible/hits.parquet'
Create the following run.py file:
#!/usr/bin/env python3
import duckdb
import timeit
con = duckdb.connect(database='my-db.duckdb', read_only=False)
print("Will load the data")
start = timeit.default_timer()  # timeit.timeit() times an empty statement, not wall-clock time
con.execute("CREATE TABLE hits AS SELECT * FROM parquet_scan('hits.parquet')")
end = timeit.default_timer()
print(end - start)
Make it executable:
chmod +x run.py
Run it:
./run.py
Wait around 10 minutes…
Will load the data
Killed
Environment (please complete the following information):
- OS: Ubuntu 22.04
- DuckDB Version: 0.4.0
- DuckDB Client: Python
Identity Disclosure:
- Full Name: Alexey Milovidov
- Affiliation: ClickHouse, Inc
Because of the OOM, DuckDB cannot qualify for the ClickHouse benchmark.
Before Submitting
- Have you tried this on the latest master branch? No.
- Python: pip install duckdb --upgrade --pre installs the same version 0.4.0.
- R: I don’t use R.
- Other Platforms: I don’t use other platforms.
- Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there? Yes.
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 22 (4 by maintainers)
Commits related to this issue
- Fix #3969: route allocations of the resizeable parquet buffers and the undo buffer through the buffer manager, and avoid unnecessary scan of table when skipping the WAL write and writing directly to t... — committed to Mytherin/duckdb by Mytherin 2 years ago
- Fix #3969: route allocations of the resizeable parquet buffers and the undo buffer through the buffer manager, and avoid unnecessary scan of table when skipping the WAL write and writing directly to t... — committed to hannes/duckdb by Mytherin 2 years ago
Thank you!
ClickHouse is using jemalloc since Aug 2018. We have found and fixed multiple bugs in jemalloc, and continue to find them, but generally, it works ok.
And glibc’s malloc is something that no reasonable software should ever use.
This issue was auto-closed, but is actually not fully resolved yet in master. In #4035 I have pushed a number of fixes for allocation tracking and for reducing the number of allocations, which alleviates some of the problems, but the core problem here is actually caused by glibc malloc.
Attached is a graph of the memory usage of DuckDB running on the AWS c6a.4xlarge box using different memory allocators and a memory limit of 26GB. glibc malloc runs out of memory during the compression phase despite DuckDB freeing more buffers than it is allocating, likely caused by memory fragmentation. Switching to jemalloc fixes the problem and lets DuckDB finish. We need to switch to jemalloc internally to fully resolve this issue and make DuckDB complete the query out-of-the-box without being killed by the OOM killer.
@hannes I’ve included DuckDB in https://github.com/ClickHouse/ClickBench. Hope it will be interesting.
Ok, I will set the memory limit to 8GB.
I’m currently preparing the benchmark; it is a work in progress: https://github.com/ClickHouse/ClickHouse/pull/38437 The idea is to make it easily reproducible (like running a single small script) and also compatible with most SQL DBMSs. So, I’ve removed all custom aggregate functions and data types.
On the topic of the ClickHouse benchmarks, please make sure to use the approx_count_distinct function instead of COUNT(DISTINCT) for better performance on approximate distinct counts 😃
Is the code that you are using to benchmark DuckDB available somewhere publicly, or could we perhaps have a look at it otherwise?
Thanks for the report! I will investigate more after the weekend.
As a work-around for now, perhaps try either (1) setting an even lower memory limit explicitly (e.g. SET memory_limit='8GB', instead of the 16GB/24GB you tried before) or (2) splitting up the file load into multiple calls using e.g. INSERT INTO tbl SELECT * FROM hits.parquet LIMIT ... OFFSET .... Neither will be particularly fast, but at least it should work.