ClickHouse: Preallocation into hash table slows down query for small table
Describe the situation Degraded performance of window functions since 22.8
How to reproduce
echo "select * from (select number, count(), dense_rank() over (order by number asc) rank from numbers(100000) group by number) where rank < 100" | clickhouse-benchmark -c 10 --max_threads=8
versions <= 22.7
localhost:9000, queries 3976, QPS: 442.505, RPS: 57972517.711, MiB/s: 442.295, result RPS: 43807.948, result MiB/s: 1.003.
...
95.000% 0.028 sec.
...
versions >= 22.8
localhost:9000, queries 202, QPS: 204.984, RPS: 26854915.343, MiB/s: 204.887, result RPS: 20293.387, result MiB/s: 0.464.
...
95.000% 0.056 sec.
...
Additional context
<= 22.7 explain
┌─explain───────────────────────────────────────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ Filter ((WHERE + (Projection + Before ORDER BY))) │
│ Window (Window step for window 'ORDER BY number ASC') │
│ MergingSorted (Merge sorted streams for window 'ORDER BY number ASC') │
│ MergeSorting (Merge sorted blocks for window 'ORDER BY number ASC') │
│ PartialSorting (Sort each block for window 'ORDER BY number ASC') │
│ Expression (Before window functions) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromStorage (SystemNumbers) │
└───────────────────────────────────────────────────────────────────────────────────────────┘
= 22.8 explain
┌─explain───────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ Filter ((WHERE + (Projection + Before ORDER BY))) │
│ Window (Window step for window 'ORDER BY number ASC') │
│ Sorting (Sorting for window 'ORDER BY number ASC') │
│ Expression (Before window functions) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ ReadFromStorage (SystemNumbers) │
└───────────────────────────────────────────────────────────┘
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 21 (10 by maintainers)
aggr over
numbersis single threaded regardless ofmax_threadsvalue. it is why we havenumbersandnumbers_mt. but you right that in multithreaded case we could also see a slowdown for small tables. I also addressed this issue in the linked pr.@nickitat I’ll help locate the issue, got a reproducible local setup
(QPS is 100 in 22.8 and 500 in 21.9 on my local)