velero: Velero Snapshots Stored in Wrong Resource Group for Azure AKS (MC_*)
What steps did you take and what happened:
- Velero managed-disk snapshots are stored in the AKS managed resource group (MC_*) for the cluster instead of the snapshot location resource group specified at velero install time.
- This means that if the AKS cluster is deleted, backups (PV snapshots) are lost, making DR recovery impossible: the MC_* resource group is deleted along with the cluster when an AKS cluster is destroyed (especially with an infrastructure-as-code implementation).
Install and configuration:
- Uninstall Velero, then reinstall specifying the snapshot location resource group explicitly as follows:
/usr/local/bin/velero install --provider azure \
  --bucket "lolcorpaz1aksbkp" \
  --secret-file "velero-credentials" \
  --image "velero/velero:v1.1.0" \
  --backup-location-config \
    resourceGroup="rsg-lolcorp-uat-az1-aksbkp",storageAccount="stalolcorpuataz1aksbkp" \
  --snapshot-location-config \
    apiTimeout="1m",resourceGroup="rsg-lolcorp-uat-az1-aksbkp" \
  --velero-pod-cpu-limit "0" \
  --velero-pod-cpu-request "0" \
  --velero-pod-mem-limit "0" \
  --velero-pod-mem-request "0" \
  --use-restic --wait
AZURE_SUBSCRIPTION_ID="...."
AZURE_TENANT_ID="...."
AZURE_CLIENT_ID="...."
AZURE_CLIENT_SECRET="...."
AZURE_RESOURCE_GROUP="rsg-lolcorp-uat-az1-aksbkp"
AZURE_CLOUD_NAME="AzurePublicCloud"
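To confirm where the managed-disk snapshots actually land, the Azure CLI can list snapshots per resource group (a quick check using the resource group names from this issue; the MC_* name below is a placeholder for the cluster's actual managed resource group, and `az` must be logged in to the right subscription):

```shell
# Snapshots in the intended backup resource group (should contain the PV snapshots)
az snapshot list --resource-group "rsg-lolcorp-uat-az1-aksbkp" --output table

# Snapshots in the AKS managed resource group (should be empty if the
# snapshot location's resourceGroup setting is being honoured)
az snapshot list --resource-group "MC_myResourceGroup_myAKSCluster_myRegion" --output table
```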
- Despite this specification, snapshots still appear in the AKS MC_* resource group (the "managed resource group")
What did you expect to happen:
- PV snapshots for managed disks will be stored in the resource group specified in the snapshot location set at install time.
- This provides the level of DR protection suitable for enterprise organisations.
The output of the following commands will help us better understand what’s going on: (Pasting long output into a GitHub gist or other pastebin is fine.)
[345401@asgprholpans001 tests]$ kubectl logs deployment/velero -n velero --follow| egrep "pvc-24d7d033-5e91-11ea-9866-929bf458f004" ^C
[345401@asgprholpans001 tests]$ kubectl logs deployment/velero -n velero --follow| egrep "75f44b4c-5f64-11ea-9866-929bf458f004"
time="2020-03-06T05:00:31Z" level=info msg="Backing up item" backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:162" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default resource=pods
time="2020-03-06T05:00:31Z" level=info msg="Executing takePVSnapshot" backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:375" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default resource=pods
time="2020-03-06T05:00:31Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default persistentVolume=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 resource=pods
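Note the last log line: the PV snapshot was skipped because the volume had already been backed up with restic. Restic volumes are opted in via the pod annotation `backup.velero.io/backup-volumes`, so a volume covered by restic will never get a native Azure snapshot. To check whether a pod has opted a volume into restic, something like the following can be used (pod name is a placeholder; jsonpath escapes the dots in the annotation key):

```shell
kubectl -n default get pod <podname> \
  -o jsonpath='{.metadata.annotations.backup\.velero\.io/backup-volumes}'
```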
- Configuration of the velero snapshot location
[tests]$ velero get snapshot-locations -o yaml
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  creationTimestamp: 2020-03-06T04:31:16Z
  generation: 1
  labels:
    component: velero
  name: default
  namespace: velero
  resourceVersion: "169404"
  selfLink: /apis/velero.io/v1/namespaces/velero/volumesnapshotlocations/default
  uid: 517763a6-5f63-11ea-9866-929bf458f004
spec:
  config:
    apiTimeout: 1m
    resourceGroup: rsg-lolcorp-uat-az1-aksbkp
  provider: azure
status: {}
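If only the resource group needs to change, a full reinstall should not be required: the existing VolumeSnapshotLocation can be patched in place (a sketch; the new resource group takes effect for subsequent backups, not for snapshots already taken):

```shell
kubectl -n velero patch volumesnapshotlocation default --type merge \
  -p '{"spec":{"config":{"resourceGroup":"rsg-lolcorp-uat-az1-aksbkp"}}}'
```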
- kubectl logs deployment/velero -n velero
- velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
- velero backup logs <backupname>
- velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
- velero restore logs <restorename>
Anything else you would like to add:
Environment:
- Velero version (use velero version):
- Velero features (use velero client config get features):
- Kubernetes version (use kubectl version):
- Kubernetes installer & version:
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 18 (8 by maintainers)
awesome, glad you got it working 😃 I’ll close this out.
Hi @skriss - I've been rebuilding my test setup to reproduce this issue on a clean cluster. I will paste the new results tomorrow.