thanos: store/compact: bucket fetch is broken in v0.32 vs v0.31
Thanos, Prometheus and Golang version used: v0.32.2
Object Storage Provider: GCS
What happened:
Tried to upgrade from v0.31.0 to v0.32.2 and am seeing, for both compactor and store:
- increased time to Ready of ~20min (from ~1min with v0.31.0):
  ts=2023-09-06T14:38:10.996926271Z caller=store.go:435 level=info msg="bucket store ready" init_duration=19m17.494860523s
  ts=2023-09-06T14:36:06.345890308Z caller=fetcher.go:487 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=17m12.843785017s duration_ms=1032843 cached=11790 returned=6634 partial=4168
- constant ~60 MB/s of bandwidth from the store container towards the bucket, even after block sync is complete, the store is Ready, and no query requests are coming in
- constant CPU usage
Using images available on Docker Hub, the issue is narrowed down to:
- thanosio/thanos:main-2023-07-03-9710935 - ok
- thanosio/thanos:main-2023-07-03-b60452a - bad

So https://github.com/thanos-io/thanos/pull/6474/files is to blame.
cc @fpetkovski
Also attaching a 60s pprof CPU profile, taken after the store becomes Ready and with no queries being sent to it:
Archive.zip
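For anyone who wants to capture an equivalent profile, here is a minimal Go sketch. It assumes the store exposes its pprof handlers on the default HTTP port (:10902); the address and output filename are placeholders, not what was used for the attachment above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// Downloads a 60s CPU profile from a running store.
// Assumes pprof is served on the store's HTTP port (default :10902);
// the address and output filename are placeholders.
func main() {
	resp, err := http.Get("http://localhost:10902/debug/pprof/profile?seconds=60")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	out, err := os.Create("store-cpu.pprof")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote store-cpu.pprof; open it with `go tool pprof store-cpu.pprof`")
}
```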
What you expected to happen: New version is better than previous 😉
How to reproduce it (as minimally and precisely as possible):
```yaml
image: thanosio/thanos:main-2023-07-03-b60452a # main-2023-07-03-9710935 - works
args:
  - store
  - |
    --objstore.config=type: GCS
    config:
      bucket: <bucketname>
```
Full logs to relevant components: no errors in the logs, just very slow timings.
About this issue
- Original URL
- State: open
- Created 10 months ago
- Reactions: 3
- Comments: 22 (16 by maintainers)
By adding a few fmt.Println calls I was able to find the reason. Our bucket had 2,515,032 objects from 2020-2021 like:
/debug/metas/%ULID%.json
These look like remains of https://github.com/thanos-io/thanos/issues/3839. After deleting them, bucket ls for v0.32.5 works at the same speed as v0.31, and store timings are back in order.
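In case it helps others check whether they are affected, here is a rough sketch (not code from this issue) that counts those leftover objects. countDebugMetas is a made-up helper written against the thanos-io/objstore Bucket interface; bkt is any already-configured bucket client.

```go
package bucketcheck

import (
	"context"

	"github.com/thanos-io/objstore"
)

// countDebugMetas counts leftover objects under debug/metas/, the prefix the
// stale %ULID%.json files mentioned above live under. A recursive listing of
// the whole bucket has to walk all of them even though the store never reads
// them. bkt can be any configured objstore.Bucket (GCS in our case).
func countDebugMetas(ctx context.Context, bkt objstore.Bucket) (int, error) {
	n := 0
	err := bkt.Iter(ctx, "debug/metas/", func(name string) error {
		n++
		return nil
	})
	return n, err
}
```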
We see the same with our large bucket:
0.31.0:
msg="successfully synchronized block metadata" duration=21.876800148s duration_ms=21876 cached=38595 returned=28518 partial=4
0.32.2:
msg="successfully synchronized block metadata" duration=28m53.396172016s duration_ms=1733396 cached=38014 returned=28518 partial=4
We suspect the changes in #6474, as using a single-threaded Iter API call to list all 8.5e6 objects in the bucket is far slower than parallelizable Exists calls to check 38014 metadata.json objects.
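To illustrate the trade-off being described, here is a sketch against the objstore.Bucket interface, not the actual fetcher code; knownIDs and the concurrency limit are made up for illustration.

```go
package bucketcheck

import (
	"context"
	"path"

	"github.com/thanos-io/objstore"
	"golang.org/x/sync/errgroup"
)

// Pre-#6474 pattern as described above: one cheap Exists call per known block,
// easy to run in parallel, so wall time scales with the number of blocks.
func checkMetasParallel(ctx context.Context, bkt objstore.Bucket, knownIDs []string) error {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(32) // illustrative concurrency limit
	for _, id := range knownIDs {
		id := id
		g.Go(func() error {
			_, err := bkt.Exists(ctx, path.Join(id, "meta.json"))
			return err
		})
	}
	return g.Wait()
}

// Post-#6474 pattern, simplified: a single Iter over the bucket root. With the
// recursive-listing IterOption enabled (passed via opts here, since its exact
// name depends on the objstore version), wall time scales with the total number
// of objects in the bucket, including objects Thanos does not care about.
func countObjects(ctx context.Context, bkt objstore.Bucket, opts ...objstore.IterOption) (int, error) {
	n := 0
	err := bkt.Iter(ctx, "", func(name string) error {
		n++
		return nil
	}, opts...)
	return n, err
}
```

With 8.5e6 objects but only ~38k blocks, the listing approach makes fewer API calls yet does far more sequential work, which matches the timings above.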
We reduced our daily object storage bill by thousands of dollars per day by using a recursive iterator, so having no escape hatch will put us in a very bad spot. I would say the root cause is the large amount of partial blocks in object storage, or data unrelated to Thanos. For the versioned-objects use case, we should probably change the library to only request the latest version. I am also fine with having both methods as options, since there are obvious trade-offs.
We still haven’t heard from @gburek-fastly whether his issue falls into one of these two categories.
With 0.32.3 we are seeing pretty close perf to 0.32.0: 24m48 for 38105 blocks.