velero: Can't restore restic volumes
What steps did you take and what happened:
I created a backup that included some PVCs backed up with restic (using the backup.velero.io/backup-volumes annotation). The backup completed successfully, with S3 as the storage location. These PVCs are bound to normal EBS PVs, so I know I could use snapshots, but I was trying out restic.
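For reference, the annotation looks roughly like this on the pod spec (the pod, container, and volume names here are illustrative, not my exact manifests):

```
apiVersion: v1
kind: Pod
metadata:
  name: etcd-0                 # illustrative name
  annotations:
    # tells velero's restic integration to back up these named volumes
    backup.velero.io/backup-volumes: data
spec:
  containers:
  - name: etcd
    image: quay.io/coreos/etcd   # illustrative image
    volumeMounts:
    - name: data
      mountPath: /var/lib/etcd
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: etcd-pv-0
```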
I then deleted the cluster, set up a new kops cluster, installed velero using the same manifests, and created a restore job.
The restore job then gets stuck in pending.
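The restore was created roughly like this (the backup name matches the describe output below; this is a sketch of the velero 0.11 CLI, not a verified transcript):

```
velero restore create --from-backup bts3
velero restore get     # shows the restore stuck in a non-completed state
```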
What did you expect to happen:
Everything to restore.
The output of the following commands will help us better understand what’s going on:
kubectl logs deployment/velero -n velero
<snip>
time="2019-03-14T01:55:43Z" level=info msg="Restoring cluster level resource 'persistentvolumes' from: /tmp/725321603/resources/persistentvolumes/cluster" backup=bts3 logSource="pkg/restore/restore.go:696" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="Getting client for /v1, Kind=PersistentVolume" backup=bts3 logSource="pkg/restore/restore.go:754" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-50d4544c-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="Attempting to restore PersistentVolume: pvc-50d4544c-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="error restoring pvc-50d4544c-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="successfully restored persistent volume from snapshot" backup=bts3 logSource="pkg/restore/restore.go:1166" persistentVolume=pvc-510b74ed-4526-11e9-9f1f-0aa34d933732 providerSnapshotID=snap-094014b90adb832dc restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-510b74ed-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="error restoring pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="error restoring pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-54284933-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-54284933-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="error restoring pvc-54284933-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Restoring resource 'persistentvolumeclaims' into namespace 'sgs' from: /tmp/725321603/resources/persistentvolumeclaims/namespaces/sgs" backup=bts3 logSource="pkg/restore/restore.go:694" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Getting client for /v1, Kind=PersistentVolumeClaim" backup=bts3 logSource="pkg/restore/restore.go:754" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: etcd-pv-0" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: etcd-pv-1" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: etcd-pv-2" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: postgres-pv-0" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: postgres-pv-1" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:56:43Z" level=warning msg="Timeout reached waiting for persistent volume pvc-50d4544c-4526-11e9-9f1f-0aa34d933732 to become ready" backup=bts3 logSource="pkg/restore/restore.go:830" restore=velero/bts3-20190313185509
<snip>
velero backup describe <backupname>
or kubectl get backup/<backupname> -n velero -o yaml
Name:         bts3
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  <none>

Phase:  Completed

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  default

Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1

Started:    2019-03-12 18:48:59 -0700 PDT
Completed:  2019-03-12 18:51:17 -0700 PDT

Expiration:  2019-04-11 18:48:59 -0700 PDT

Validation errors:  <none>

Persistent Volumes:
  pvc-510b74ed-4526-11e9-9f1f-0aa34d933732:
    Snapshot ID:        snap-094014b90adb832dc
    Type:               gp2
    Availability Zone:  us-west-2c
    IOPS:               <N/A>

Restic Backups:
  New:
    sgs/etcd-0-79894c7d66-jv9hz: data, data, data
    sgs/etcd-1-74d4fdd498-9wdxw: data, data, data
    sgs/etcd-2-bbdb44bdd-xlb27: data, data, data
    sgs/postgres-0-fd9f5dfd4-fql6b: data, data
    sgs/postgres-0-fd9f5dfd4-h6mks: data
Anything else you would like to add:
The restore also hangs, so there are no restore logs; the above is only what I extracted from the velero pod.
Also, this is the output of kubectl get pvc:
NAMESPACE   NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE
sgs         etcd-pv-0       Lost     pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732   0                         gp2-us-west-2c   8m
sgs         etcd-pv-1       Lost     pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732   0                         gp2-us-west-2c   8m
sgs         etcd-pv-2       Lost     pvc-54284933-4526-11e9-9f1f-0aa34d933732   0                         gp2-us-west-2c   8m
sgs         postgres-pv-0   Lost     pvc-50d4544c-4526-11e9-9f1f-0aa34d933732   0                         gp2-us-west-2c   8m
sgs         postgres-pv-1   Bound    pvc-510b74ed-4526-11e9-9f1f-0aa34d933732   100Gi      RWO            gp2-us-west-2c   8m
postgres-pv-1 is the volume I forgot to annotate, so it was backed up with a normal snapshot. I set the annotation on the other PVCs.
Environment:
- Velero version (use velero version): 0.11.0
- Kubernetes version (use kubectl version): 1.11
- Kubernetes installer & version: kops 1.11
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release):
About this issue
- State: closed
- Created 5 years ago
- Comments: 28 (10 by maintainers)
I got into a similar situation where I was trying to do restores that included restic volumes. All of the restores were getting stuck in STATUS=New. The way I resolved it was to delete the velero pod, so the velero deployment would recreate it. Then, the newly created velero pod started picking up the restores and their STATUS changed to InProgress.
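For anyone hitting the same thing, that workaround amounts to the following (assuming velero is installed in the velero namespace; the pod name is a placeholder to look up first):

```
# find the velero pod, then delete it; the deployment recreates it
kubectl -n velero get pods
kubectl -n velero delete pod <velero-pod-name>

# the new pod should start picking up the queued restores
velero restore get
```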
Great. Thank you for your help. I’ll follow #1151 and wait for a resolution. The restic backup is ideal because I can restore in a different region (AWS main to AWS govcloud) but snapshots will do for now.