thanos: store/compact: bucket fetch is broken in v0.32 vs v0.31

Thanos, Prometheus and Golang version used: v0.32.2

Object Storage Provider: GCS

What happened: Tried to upgrade from v0.31.0 to v0.32.2 and seeing for both compactor and store:

  • increased Ready time to 20min (from 1min for v0.31.0):
ts=2023-09-06T14:38:10.996926271Z caller=store.go:435 level=info msg="bucket store ready" init_duration=19m17.494860523s
ts=2023-09-06T14:36:06.345890308Z caller=fetcher.go:487 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=17m12.843785017s duration_ms=1032843 cached=11790 returned=6634 partial=4168
  • constant bandwidth of 60MBps of store container towards bucket, even after sync of blocks is completed, it is Ready, and there are no query requests coming
  • constant cpu usage image

Using images available on docker hub the issue is narrowed down to: thanosio/thanos:main-2023-07-03-9710935 - ok thanosio/thanos:main-2023-07-03-b60452a - bad So it is https://github.com/thanos-io/thanos/pull/6474/files to blame cc @fpetkovski

Also attaching pprof cpu profile for 60s, when store becomes ready and there is no any queries to it: Archive.zip

What you expected to happen: New version is better than previous 😉

How to reproduce it (as minimally and precisely as possible):

          image: thanosio/thanos:main-2023-07-03-b60452a # main-2023-07-03-9710935 - works
          args:
            - store
            - |
              --objstore.config=type: GCS
              config:
                bucket: <bucketname>

Full logs to relevant components: No any errors in logs, just very slow timings

About this issue

  • Original URL
  • State: open
  • Created 10 months ago
  • Reactions: 3
  • Comments: 22 (16 by maintainers)

Most upvoted comments

By adding a few fmt.Println I was able to find out the reason. Our bucket had 2515032 objects for 2020-2021 like: /debug/metas/%ULID%.json Looks like remains of: https://github.com/thanos-io/thanos/issues/3839 After deleting these, bucket ls for v0.32.5 works with the same speed as v0.31. And store timings back in order:

ts=2023-11-14T09:46:24.344376629Z caller=store.go:435 level=info msg="bucket store ready" init_duration=53.578281039s
ts=2023-11-14T09:46:24.344762123Z caller=intrumentation.go:56 level=info msg="changing probe status" status=ready
ts=2023-11-14T09:46:24.34489548Z caller=grpc.go:131 level=info service=gRPC/server component=store msg="listening for serving gRPC" address=0.0.0.0:10901
ts=2023-11-14T09:46:25.174331359Z caller=fetcher.go:487 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=829.850131ms duration_ms=829 cached=10281 returned=6976 partial=2495

We see the same with our large bucket: 0.31.0: msg="successfully synchronized block metadata" duration=21.876800148s duration_ms=21876 cached=38595 returned=28518 partial=4

0.32.2: msg="successfully synchronized block metadata" duration=28m53.396172016s duration_ms=1733396 cached=38014 returned=28518 partial=4

We suspect the changes in #6474, as using a single threaded Iter api call to list all 8.5e6 objects in the bucket is far slower than parallelizable Exists calls to check 38014 metadata.json objects.

We reduced our daily object storage bill by thousands of dollars per day by using a recursive iterator, so having no escape hatch will put us in a very bad spot. I would say the root cause is the large amount of partial blocks in object storage, or data unrelated to Thanos. For the versioned objects use case, we should probably change the library to only request the latest version. I am also fine with having both methods as options since there are obvious trade-offs.

We still haven’t heard from @gburek-fastly whether his issue falls into one of these two categories.

With 0.32.3 we are seeing pretty close perf to 0.32.0

ts=2023-09-20T21:34:41.564220246Z caller=fetcher.go:487 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=24m48.323615601s duration_ms=1488323 cached=38105 returned=28929 partial=110

24m48 for 38105 blocks