cortex: ingester_v2 reporting many "failed to delete idle TSDB" errors

Describe the bug: We are seeing error messages like

level=error ts=2020-12-15T21:33:44.395830698Z caller=ingester_v2.go:1623 msg="failed to delete idle TSDB" user=some-user-id err="unlinkat /data/tsdb/some-user-id: directory not empty"

We are using this commit

To Reproduce: I am not sure how to reproduce this.


Environment:

  • Infrastructure: EKS
  • Deployment tool: Helm

Storage Engine

  • [x] Blocks
  • [ ] Chunks

Additional Context: The error is returned by a call to os.RemoveAll(). From what I have read, RemoveAll() can return a "directory not empty" error if files are created under the directory while the removal is in progress. So I am wondering whether some background process in the ingester may be interfering with the local TSDB deletion?

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

I am on the same thought trail @pstibrany 😃

I just looked at a few example directories, and each contains a single thanos.shipper.json file. Also, the last-modified timestamp is the same (to one-second precision) as the timestamp at which the error was reported.

Shipper running concurrently?