thanos: v0.12.0 regression: Partial blocks are never deleted; spam in logs for double deletion mark.
It was hard to track down, but it looks like 4 unexpected things are happening on v0.12.0:
- Partial blocks (no meta.json) are never cleaned up.
- We get log spam from double deletion marks on partial upload delete attempts (one set of lines per partial block):
level=info ts=2020-04-17T16:50:14.305651352Z caller=clean.go:52 msg="deleted aborted partial upload" block=01E5Q7MP483QSDSX036X4TZ5SD thresholdAge=48h0m0s
level=info ts=2020-04-17T16:50:14.305659796Z caller=clean.go:47 msg="found partially uploaded block; marking for deletion" block=01E50RCWDZ5P83NRH4R0GXVGCC
level=warn ts=2020-04-17T16:50:14.312057891Z caller=block.go:140 msg="requested to mark for deletion, but file already exists; this should not happen; investigate" err="file 01E50RCWDZ5P83NRH4R0GXVGCC/deletion-mark.json already exists in bucket"
level=info ts=2020-04-17T16:50:14.312073215Z caller=clean.go:52 msg="deleted aborted partial upload" block=01E50RCWDZ5P83NRH4R0GXVGCC thresholdAge=48h0m0s
level=info ts=2020-04-17T16:50:14.312081259Z caller=clean.go:47 msg="found partially uploaded block; marking for deletion" block=01E5695C03VK6FB89G49T5MRB4
level=warn ts=2020-04-17T16:50:14.317892887Z caller=block.go:140 msg="requested to mark for deletion, but file already exists; this should not happen; investigate" err="file 01E5695C03VK6FB89G49T5MRB4/deletion-mark.json already exists in bucket"
level=info ts=2020-04-17T16:50:14.317907097Z caller=clean.go:52 msg="deleted aborted partial upload" block=01E5695C03VK6FB89G49T5MRB4 thresholdAge=48h0m0s
level=info ts=2020-04-17T16:50:14.317914967Z caller=clean.go:47 msg="found partially uploaded block; marking for deletion" block=01E5J2V84SKW9GYMH8B88M5NR9
- We also get log spam from double deletion marks on garbage collection attempts (thousands of lines as well):
level=warn ts=2020-04-16T23:01:18.533895718Z caller=block.go:140 msg="requested to mark for deletion, but file already exists; this should not happen; investigate" err="file 01E57S72A216JY3H6JXXK1GDE5/deletion-mark.json already exists in bucket"
level=info ts=2020-04-16T23:01:18.533922849Z caller=compact.go:271 msg="marking outdated block for deletion" block=01E5F2TGZ6FD6KS5WDMKBKGKP5
level=warn ts=2020-04-16T23:01:18.540796799Z caller=block.go:140 msg="requested to mark for deletion, but file already exists; this should not happen; investigate" err="file 01E5F2TGZ6FD6KS5WDMKBKGKP5/deletion-mark.json already exists in bucket"
level=info ts=2020-04-16T23:01:18.540817511Z caller=compact.go:271 msg="marking outdated block for deletion" block=01E58037KF7XF4TFMDQZYFKJCT
level=warn ts=2020-04-16T23:01:18.546292591Z caller=block.go:140 msg="requested to mark for deletion, but file already exists; this should not happen; investigate" err="file 01E58037KF7XF4TFMDQZYFKJCT/deletion-mark.json already exists in bucket"
level=info ts=2020-04-16T23:01:18.546319501Z caller=compact.go:271 msg="marking outdated block for deletion" block=01E58037KFCFY22BZZR4RJKG7G
- (Theoretically) a malformed deletion mark can cause the same problem: log spam and an inability to delete it.
We found the root cause of 1, 2 and 4, and are still looking for 3. Help wanted (: 3 is not very critical, other than the spam.
Root cause explained on diagram:
Fix in progress. Let’s make sure it is in 0.12.1.
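To get a feel for how many blocks are affected, the warnings quoted above can be counted per block ID. The following is only a rough sketch, not part of Thanos: `countDoubleMarks` and the regular expression are made up for illustration, and it assumes the `err=` field always has exactly the shape shown in the log lines above.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// alreadyMarked matches the err field of the "file already exists" warning
// quoted in this issue and captures the 26-character block ID.
// Assumption: the field always looks exactly like the lines above.
var alreadyMarked = regexp.MustCompile(`err="file ([0-9A-Z]{26})/deletion-mark\.json already exists in bucket"`)

// countDoubleMarks (hypothetical helper) returns, per block ID, how many
// times a second deletion mark was attempted in the given log text.
func countDoubleMarks(logs string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		if m := alreadyMarked.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Two lines copied from the logs above: one info line, one warning.
	logs := `level=info ts=2020-04-17T16:50:14.305659796Z caller=clean.go:47 msg="found partially uploaded block; marking for deletion" block=01E50RCWDZ5P83NRH4R0GXVGCC
level=warn ts=2020-04-17T16:50:14.312057891Z caller=block.go:140 msg="requested to mark for deletion, but file already exists; this should not happen; investigate" err="file 01E50RCWDZ5P83NRH4R0GXVGCC/deletion-mark.json already exists in bucket"`

	for id, n := range countDoubleMarks(logs) {
		fmt.Printf("%s double-marked %d time(s)\n", id, n)
	}
}
```

Feeding it a full compactor log instead of the two sample lines should show whether the same blocks are hit on every iteration.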
About this issue
- State: closed
- Created 4 years ago
- Reactions: 8
- Comments: 26 (12 by maintainers)
Commits related to this issue
- Added tests to reproduce #2459. Related to: https://github.com/thanos-io/thanos/issues/2459 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> — committed to thanos-io/thanos by bwplotka 4 years ago
- Added tests to reproduce #2459. (#2462) Related to: https://github.com/thanos-io/thanos/issues/2459 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> — committed to thanos-io/thanos by bwplotka 4 years ago
- Reverted addition of deletion mark for partial uploads. Fixes https://github.com/thanos-io/thanos/issues/2459 (quick fix). This keeps the logic from the 0.11.0 which was good enough. Some improveme... — committed to thanos-io/thanos by bwplotka 4 years ago
- Reverted addition of deletion mark for partial uploads. (#2472) Fixes https://github.com/thanos-io/thanos/issues/2459 (quick fix). This keeps the logic from the 0.11.0 which was good enough. So... — committed to thanos-io/thanos by bwplotka 4 years ago
- merge master into features/rules-proxy (#2623) * Removed dependency on Cortex fork; Moved to official one. (#2199) Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Typo corrections quick... — committed to thanos-io/thanos by s-urbaniak 4 years ago
@bwplotka This is still happening when applying optional retention. Using v0.15. The same happens for many many many more chunks - left just 1-2 for easier reading:
We are in the process of cutting it. Once done, we will let you know so you can test with us whether everything works (:
On Tue, 21 Apr 2020 at 12:53, Stefan Bühler notifications@github.com wrote: