prometheus: PromQL: Improve space complexity by streaming more; avoid series pre-lookup.
Hi đź‘‹
While changing SeriesSet interfaces, we found with @brian-brazil that our usage of SerieSet in PromQL might cause extra unnecessary allocations for certain implementations and cases. Let me describe the problem in two diagrams:
Current PromQL logic for Range Query with Metric Selection:
All function names should match the current master: https://github.com/prometheus/prometheus/tree/18d9ebf0ffc26b8bd0e136f552c8e9886d29ade4
Issues:
Problem: We are accumulating all series first, to gather all iterators.
With the current interfaces (StoreSet, Remote Read, Thanos StoreAPI) “Series by Series” means that we stream Series labels together with labels.
Current Flow Issues:
- To finish this
eval
PromQL stage, we have to buffer & receive all data. Especially for remote data & Thanos, this means literally all. (: Realistic space complexity is now O(2 * series * chunks), for the biggest result, so ~ O(2 series * chunks) B - Current space complexity is also not great further with multi arg functions, as we build a hash map. For large cardinality queries, we end up hashing millions of Series potentially.
Solution: Stream more!
The idea would be to follow really exactly the same algorithm as everywhere else (vertical dedup, merge series etc): Implement Heap / MergeSort that will iterate over “sorted” series from two SeriesSets.
- Open Question: Good point that @brian-brazil mentioned, was that we don’t know then how big matrix we should allocate. I wonder if we can workaround this somehow. To me it sounds we don’t really want to operate on matrixes at all, rather on top two series from each SeriesSets… 🤔
This sounds like a big task but really needed, especially for Thanos community. Even with the work we are doing to make PromQL more concurrent for Prometheus Ecosystem (https://github.com/prometheus/prometheus/issues/6878), this issue might be impacting resource (memory) consumption significantly, especially for concurrent runs. I am happy to do this at some point, but I won’t have time in near ~2weeks for this. so… help wanted (:
Any feedback on this to make it easier to improve this? (: cc @slrtbtfs @brian-brazil @brancz @kakkoyun @cstyan @tomwilkie @gouthamve @cstyan
About this issue
- Original URL
- State: open
- Created 4 years ago
- Comments: 16 (10 by maintainers)
Just to give status on it.
It’s still needed for Thanos and (and potentially Cortex purposes). See: https://github.com/thanos-io/thanos/issues/4780
Looks like @darshanime did a good job on this with https://github.com/prometheus/prometheus/pull/9071, so we can close https://github.com/prometheus/prometheus/pull/8009 (thanks for your hard initial work @aryan9600 !).
@bwplotka @brian-brazil , assuming the issue still exists, I would love to take a crack at this if @aryan9600 isn’t keen on progressing with #8009