longhorn: [BUG] Restoring volume stuck forever if the backup is already deleted.
Describe the bug
The restoring volume is stuck in a null state forever if the backup has already been deleted.
To Reproduce
Steps to reproduce the behavior:
- Create a volume and write 100 MB of data to it.
- Take a backup.
- Restore the backup and immediately delete the backup.
- The restoring volume gets stuck and never recovers or becomes faulted.
Expected behavior
The restoring volume should become faulted after retrying for some time; there should be a timeout for the retries.
Log
longhorn-support-bundle_e636dffc-08d6-4fd1-8aad-7a5fe167b4d2_2020-10-08T22-41-51Z.zip
Backup volume - pvc-f8473372-108d-4667-a54b-23b99595de66
Backup name - backup-708a7f01339f467a
Restoring volume - restore-2
Time - ~2020-10-08 22:24:00
Environment:
- Longhorn version: Longhorn Master - 10/08/2020
- Kubernetes version: 19.2
- Node OS type and version: Ubuntu 18.04
Additional context
The backupfrom of the restoring volume is referring to a non-existent backup because the backup is deleted.
About this issue
- State: closed
- Created 4 years ago
- Comments: 16 (15 by maintainers)
@chriscchien
Got it. This is different than the backup deletion, and it is a backupVolume deletion case.
The backupVolume does code
Deleting all backup CRs should succeed, but the deletions in the backupstore are rejected until the restoration process is complete. The behaviors are expected.
The restoration is successful, but the volume state is stuck in attached with frontendDisabled=true because of the error.
The root cause is the check in checkForAutoDetachment. We should ignore the IsNotFound error.
cc @weizhe0422
Yeah.
I think ignoring the IsNotFound error might not be a good solution. We probably need to check whether the backupVolume is being used for any restoration before removing it from the backupstore, which would be consistent with the backup deletion.
Hi @derekbit, I can only reproduce the situation by clicking Delete All Backups in the UI while volume restoration is in progress; from the terminal I can see the backup CR is deleted (disappears) too. Below is the support bundle:
supportbundle_4cb5bb67-7d01-4b27-91ef-775a969a3043_2023-03-07T13-54-09Z.zip
If I try to delete the backup CR by command, or try to delete a single backup in the UI during volume restoration, the restoration completes, the backup is deleted after the restoration, and the volume becomes detached.
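The alternative raised in this thread, refusing to remove a backupVolume from the backupstore while a restoration still depends on it, can be sketched as follows. This is a minimal illustration under assumed data shapes: the `Volume` struct, its fields, and `deleteBackupVolume` are hypothetical and do not mirror Longhorn's CRDs or controller code.

```go
package main

import "fmt"

// Volume is a hypothetical, simplified view of a Longhorn volume.
type Volume struct {
	Name             string
	FromBackupVolume string // backupVolume this volume restores from, if any
	Restoring        bool   // true while the restoration is in progress
}

// backupVolumeInUse reports whether any volume is still restoring from bvName.
func backupVolumeInUse(bvName string, volumes []Volume) bool {
	for _, v := range volumes {
		if v.Restoring && v.FromBackupVolume == bvName {
			return true
		}
	}
	return false
}

// deleteBackupVolume rejects the deletion while a restoration depends on the
// backupVolume; otherwise it calls remove, a hypothetical backupstore call.
func deleteBackupVolume(bvName string, volumes []Volume, remove func(string) error) error {
	if backupVolumeInUse(bvName, volumes) {
		return fmt.Errorf("backupVolume %s is in use by a restoring volume", bvName)
	}
	return remove(bvName)
}

func main() {
	volumes := []Volume{{Name: "restore-2", FromBackupVolume: "pvc-example", Restoring: true}}
	err := deleteBackupVolume("pvc-example", volumes, func(string) error { return nil })
	fmt.Println(err)
}
```

Rejecting the request up front keeps backupVolume deletion consistent with the single-backup case described in the thread, where the deletion only takes effect after the restoration completes.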
~~Update the test steps and expectation https://github.com/longhorn/longhorn/issues/5458#issuecomment-1452894901~~
cc @chriscchien @longhorn/qa
Close this ticket because in longhorn-manager master 44f425 below, actions worked as expected.
Because of the latest PR, the behavior won't be changed.
cc @longhorn/qa
@mantissahz need to backport to 1.3 and 1.2.