velero: Velero Snapshots Stored in Wrong Resource Group for Azure AKS (MC_*)
What steps did you take and what happened:
- Velero managed-disk snapshots are stored in the AKS managed resource group (MC_*) for the cluster instead of the snapshot location resource group specified at velero install time.
- This means that if the AKS cluster is deleted, backups (PV snapshots) are lost, making DR recovery impossible: the MC_* resource group is deleted along with the cluster when an AKS cluster is destroyed (especially with an infrastructure-as-code implementation).
Install and configuration:
- Uninstall Velero, then reinstall specifying the snapshot location resource group explicitly as follows:
/usr/local/bin/velero install --provider azure \
  --bucket "lolcorpaz1aksbkp" \
  --secret-file "velero-credentials" \
  --image "velero/velero:v1.1.0" \
  --backup-location-config \
    resourceGroup="rsg-lolcorp-uat-az1-aksbkp",storageAccount="stalolcorpuataz1aksbkp" \
  --snapshot-location-config \
    apiTimeout="1m",resourceGroup="rsg-lolcorp-uat-az1-aksbkp" \
  --velero-pod-cpu-limit "0" \
  --velero-pod-cpu-request "0" \
  --velero-pod-mem-limit "0" \
  --velero-pod-mem-request "0" \
  --use-restic --wait
AZURE_SUBSCRIPTION_ID="...."
AZURE_TENANT_ID="...."
AZURE_CLIENT_ID="...."
AZURE_CLIENT_SECRET="...."
AZURE_RESOURCE_GROUP="rsg-lolcorp-uat-az1-aksbkp"
AZURE_CLOUD_NAME="AzurePublicCloud"
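To confirm where the managed-disk snapshots actually land, the Azure CLI can list snapshots per resource group (a quick check using the resource group names from this issue; the MC_* name below is a placeholder for the cluster's actual managed resource group, and `az` must be logged in to the right subscription):

```shell
# Snapshots in the intended backup resource group (should contain the PV snapshots)
az snapshot list --resource-group "rsg-lolcorp-uat-az1-aksbkp" --output table

# Snapshots in the AKS managed resource group (should be empty if the
# snapshot location's resourceGroup setting is being honoured)
az snapshot list --resource-group "MC_myResourceGroup_myAKSCluster_myRegion" --output table
```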
- Despite this specification, snapshots still appear in the AKS MC_* resource group (the "managed resource group")
What did you expect to happen:
- PV snapshots for managed disks will be stored in the resource group specified in the snapshot location set at install time.
- This provides the level of DR protection suitable for enterprise organisations.
The output of the following commands will help us better understand what’s going on: (Pasting long output into a GitHub gist or other pastebin is fine.)
[345401@asgprholpans001 tests]$ kubectl logs deployment/velero -n velero --follow| egrep "pvc-24d7d033-5e91-11ea-9866-929bf458f004" ^C
[345401@asgprholpans001 tests]$ kubectl logs deployment/velero -n velero --follow| egrep "75f44b4c-5f64-11ea-9866-929bf458f004"
time="2020-03-06T05:00:31Z" level=info msg="Backing up item" backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:162" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default resource=pods
time="2020-03-06T05:00:31Z" level=info msg="Executing takePVSnapshot" backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:375" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default resource=pods
time="2020-03-06T05:00:31Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default persistentVolume=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 resource=pods
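Note the last log line: the PV snapshot was skipped because the volume had already been backed up with restic. Restic volumes are opted in via the pod annotation `backup.velero.io/backup-volumes`, so a volume covered by restic will never get a native Azure snapshot. To check whether a pod has opted a volume into restic, something like the following can be used (pod name is a placeholder; jsonpath escapes the dots in the annotation key):

```shell
kubectl -n default get pod <podname> \
  -o jsonpath='{.metadata.annotations.backup\.velero\.io/backup-volumes}'
```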
- Configuration of the velero snapshot location
[tests]$ velero get snapshot-locations -o yaml
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  creationTimestamp: 2020-03-06T04:31:16Z
  generation: 1
  labels:
    component: velero
  name: default
  namespace: velero
  resourceVersion: "169404"
  selfLink: /apis/velero.io/v1/namespaces/velero/volumesnapshotlocations/default
  uid: 517763a6-5f63-11ea-9866-929bf458f004
spec:
  config:
    apiTimeout: 1m
    resourceGroup: rsg-lolcorp-uat-az1-aksbkp
  provider: azure
status: {}
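If only the resource group needs to change, a full reinstall should not be required: the existing VolumeSnapshotLocation can be patched in place (a sketch; the new resource group takes effect for subsequent backups, not for snapshots already taken):

```shell
kubectl -n velero patch volumesnapshotlocation default --type merge \
  -p '{"spec":{"config":{"resourceGroup":"rsg-lolcorp-uat-az1-aksbkp"}}}'
```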
- kubectl logs deployment/velero -n velero
- velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
- velero backup logs <backupname>
- velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
- velero restore logs <restorename>
Anything else you would like to add:
Environment:
- Velero version (use velero version):
- Velero features (use velero client config get features):
- Kubernetes version (use kubectl version):
- Kubernetes installer & version:
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 18 (8 by maintainers)
awesome, glad you got it working 😃 I’ll close this out.
Hi @skriss - I've been rebuilding my test setup to reproduce this issue on a clean cluster. I will paste the new results tomorrow.