thanos: store/compact: bucket fetch is broken in v0.32 vs v0.31
Thanos, Prometheus and Golang version used: v0.32.2
Object Storage Provider: GCS
What happened:
Tried to upgrade from v0.31.0 to v0.32.2 and am seeing, for both compactor and store:
- increased `Ready` time to ~20 min (from ~1 min on v0.31.0):
ts=2023-09-06T14:38:10.996926271Z caller=store.go:435 level=info msg="bucket store ready" init_duration=19m17.494860523s
ts=2023-09-06T14:36:06.345890308Z caller=fetcher.go:487 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=17m12.843785017s duration_ms=1032843 cached=11790 returned=6634 partial=4168
- constant bandwidth of 60 MBps from the `store` container towards the bucket, even after block sync is completed, the store is `Ready`, and no query requests are coming in
- constant CPU usage
Using the images available on Docker Hub, the issue was narrowed down to:
- thanosio/thanos:main-2023-07-03-9710935 - ok
- thanosio/thanos:main-2023-07-03-b60452a - bad
So https://github.com/thanos-io/thanos/pull/6474/files is to blame.
cc @fpetkovski
Also attaching a 60s pprof CPU profile, taken after the store becomes ready and while there are no queries to it:
Archive.zip
What you expected to happen: the new version to be better than the previous one 😉
How to reproduce it (as minimally and precisely as possible):
image: thanosio/thanos:main-2023-07-03-b60452a # main-2023-07-03-9710935 - works
args:
  - store
  - |
    --objstore.config=type: GCS
    config:
      bucket: <bucketname>
Full logs to relevant components: no errors in the logs, just very slow timings.
By adding a few fmt.Println calls I was able to find out the reason. Our bucket had 2515032 objects for 2020-2021 like:

`/debug/metas/%ULID%.json`

Looks like these are remains of https://github.com/thanos-io/thanos/issues/3839. After deleting them, `bucket ls` on v0.32.5 works at the same speed as v0.31, and `store` timings are back in order.
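For illustration, a minimal Go sketch of counting such leftover objects, assuming the `thanos-io/objstore` `Bucket` interface and an already constructed bucket client (bucket construction is provider specific and omitted; the `debug/metas/` prefix is taken from the paths quoted above):

```go
package bucketcheck

import (
	"context"
	"strings"

	"github.com/thanos-io/objstore"
)

// countDebugMetas counts leftover debug/metas/<ULID>.json objects under the
// debug/metas/ prefix. Adjust the prefix if your bucket layout differs.
func countDebugMetas(ctx context.Context, bkt objstore.Bucket) (int, error) {
	n := 0
	err := bkt.Iter(ctx, "debug/metas/", func(name string) error {
		if strings.HasSuffix(name, ".json") {
			n++
		}
		return nil
	})
	return n, err
}
```

Once counted, the objects themselves can be removed with the provider's own tooling (e.g. a bulk delete or lifecycle rule); per the comment above, removing them restored v0.31-level sync times.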
We see the same with our large bucket:

0.31.0:
msg="successfully synchronized block metadata" duration=21.876800148s duration_ms=21876 cached=38595 returned=28518 partial=4

0.32.2:
msg="successfully synchronized block metadata" duration=28m53.396172016s duration_ms=1733396 cached=38014 returned=28518 partial=4

We suspect the changes in #6474, as using a single-threaded `Iter` API call to list all 8.5e6 objects in the bucket is far slower than parallelizable `Exists` calls to check 38014 metadata.json objects.
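For context, a rough Go sketch contrasting the two access patterns being compared here. It is written against the `objstore.Bucket` interface; the `errgroup` fan-out and the 64-worker limit are illustrative assumptions, not the actual fetcher code:

```go
package fetchsketch

import (
	"context"
	"path"

	"github.com/thanos-io/objstore"
	"golang.org/x/sync/errgroup"
)

// listAllObjects resembles the post-#6474 approach: one sequential, recursive
// listing of the whole bucket, whose cost grows with the total object count
// (8.5e6 in the bucket above), not with the number of blocks.
func listAllObjects(ctx context.Context, bkt objstore.Bucket) ([]string, error) {
	var names []string
	err := bkt.Iter(ctx, "", func(name string) error {
		names = append(names, name)
		return nil
	}, objstore.WithRecursiveIter)
	return names, err
}

// checkMetaFiles resembles the pre-#6474 approach: one Exists call per known
// block ULID, which parallelises well and scales with the block count
// (~38k meta.json checks above).
func checkMetaFiles(ctx context.Context, bkt objstore.Bucket, ulids []string) error {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(64) // illustrative concurrency limit
	for _, id := range ulids {
		id := id
		g.Go(func() error {
			_, err := bkt.Exists(ctx, path.Join(id, "meta.json"))
			return err
		})
	}
	return g.Wait()
}
```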
We reduced our object storage bill by thousands of dollars per day by using a recursive iterator, so having no escape hatch will put us in a very bad spot. I would say the root cause is the large number of partial blocks in object storage, or data unrelated to Thanos. For the versioned-objects use case, we should probably change the library to only request the latest version. I am also fine with having both methods as options, since there are obvious trade-offs.
We still haven’t heard from @gburek-fastly whether his issue falls into one of these two categories.
With 0.32.3 we are seeing perf pretty close to 0.32.0: 24m48s for 38105 blocks.