velero: Velero Snapshots Stored in Wrong Resource Group for Azure AKS (MC_*)

What steps did you take and what happened:

  • Velero managed-disk snapshots are stored in the AKS managed resource group (MC_*) for the cluster instead of the snapshot location resource group specified at velero install time.
  • This means that if the AKS cluster is deleted, backups (PV snapshots) are lost, making disaster recovery impossible: the MC_* resource group is deleted along with the cluster when an AKS cluster is destroyed (especially with an infrastructure-as-code implementation).

Install and configuration:

  • Uninstall Velero, then reinstall with the snapshot location resource group specified explicitly as follows:
/usr/local/bin/velero install --provider azure \
  --bucket "lolcorpaz1aksbkp" \
  --secret-file "velero-credentials" \
  --image "velero/velero:v1.1.0" \
  --backup-location-config \
  resourceGroup="rsg-lolcorp-uat-az1-aksbkp",storageAccount="stalolcorpuataz1aksbkp" \
  --snapshot-location-config \
  apiTimeout="1m",resourceGroup="rsg-lolcorp-uat-az1-aksbkp" \
  --velero-pod-cpu-limit "0" \
  --velero-pod-cpu-request "0" \
  --velero-pod-mem-limit "0" \
  --velero-pod-mem-request "0" \
  --use-restic --wait

AZURE_SUBSCRIPTION_ID="...."
AZURE_TENANT_ID="...."
AZURE_CLIENT_ID="...."
AZURE_CLIENT_SECRET="...."
AZURE_RESOURCE_GROUP="rsg-lolcorp-uat-az1-aksbkp"
AZURE_CLOUD_NAME="AzurePublicCloud"
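For reference, the `velero-credentials` file passed via `--secret-file` is a plain key=value file containing the variables above. A minimal sketch of creating it (all values below are placeholders, not real credentials):

```shell
# Sketch only: write the Azure service-principal credentials file that
# `velero install --secret-file velero-credentials` consumes.
# Every value here is a placeholder; substitute your own.
cat > velero-credentials <<'EOF'
AZURE_SUBSCRIPTION_ID=00000000-0000-0000-0000-000000000000
AZURE_TENANT_ID=00000000-0000-0000-0000-000000000000
AZURE_CLIENT_ID=00000000-0000-0000-0000-000000000000
AZURE_CLIENT_SECRET=placeholder-secret
AZURE_RESOURCE_GROUP=rsg-lolcorp-uat-az1-aksbkp
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
```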

  • Despite this specification, snapshots still appear in the AKS MC_… resource group (the “managed resource group”).
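One way to confirm where the snapshots actually land (assuming the Azure CLI is available and logged in; the MC_* group name below is illustrative, not from this cluster):

```shell
# List managed-disk snapshots in the intended snapshot resource group
az snapshot list --resource-group rsg-lolcorp-uat-az1-aksbkp --query "[].name" -o tsv

# ...and in the AKS managed resource group (illustrative name)
az snapshot list --resource-group MC_myResourceGroup_myAKSCluster_westeurope --query "[].name" -o tsv
```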

What did you expect to happen:

  • PV snapshots for managed disks will be stored in the resource group specified in the snapshot location from the install configuration.
  • This would provide the level of DR isolation suitable for enterprise organisations.

The output of the following commands will help us better understand what’s going on: (Pasting long output into a GitHub gist or other pastebin is fine.)


[345401@asgprholpans001 tests]$ kubectl logs deployment/velero -n velero --follow | egrep "pvc-24d7d033-5e91-11ea-9866-929bf458f004"
^C
[345401@asgprholpans001 tests]$ kubectl logs deployment/velero -n velero --follow | egrep "75f44b4c-5f64-11ea-9866-929bf458f004"
time="2020-03-06T05:00:31Z" level=info msg="Backing up item" backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:162" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default resource=pods
time="2020-03-06T05:00:31Z" level=info msg="Executing takePVSnapshot" backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:375" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default resource=pods
time="2020-03-06T05:00:31Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default persistentVolume=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 resource=pods
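The last log line says the PV snapshot was skipped because the volume was already backed up with restic. In Velero v1.1, restic backup is opted into per pod via the `backup.velero.io/backup-volumes` annotation, so a quick way to see which pods/volumes are routed through restic (assuming the workload is in the `default` namespace) is:

```shell
# Show each pod and the volumes it has annotated for restic backup
# (the RESTIC_VOLUMES column is empty for pods without the annotation)
kubectl -n default get pods \
  -o custom-columns='NAME:.metadata.name,RESTIC_VOLUMES:.metadata.annotations.backup\.velero\.io/backup-volumes'
```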

  • Configuration of the velero snapshot location
[tests]$ velero get snapshot-locations -o yaml

apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  creationTimestamp: 2020-03-06T04:31:16Z
  generation: 1
  labels:
    component: velero
  name: default
  namespace: velero
  resourceVersion: "169404"
  selfLink: /apis/velero.io/v1/namespaces/velero/volumesnapshotlocations/default
  uid: 517763a6-5f63-11ea-9866-929bf458f004
spec:
  config:
    apiTimeout: 1m
    resourceGroup: rsg-lolcorp-uat-az1-aksbkp
  provider: azure
status: {}

  • kubectl logs deployment/velero -n velero
  • velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
  • velero backup logs <backupname>
  • velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
  • velero restore logs <restorename>

Anything else you would like to add:

Environment:

  • Velero version (use velero version):
  • Velero features (use velero client config get features):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 18 (8 by maintainers)

Most upvoted comments

awesome, glad you got it working 😃 I’ll close this out.

Hi @skriss - I’ve been rebuilding my test set up to reproduce this issue on a clean cluster. I will paste the new results tomorrow.