kubernetes: After cluster upgrade from 1.18.9 to 1.19.7 azureFile volume secrets are searched in wrong namespace

What happened:

After a cluster upgrade from 1.18.9 to 1.19.7, azureFile volume secrets are looked up in the wrong namespace.

We have a few pods/deployments that mount a volume pointing to an Azure file share. These pods are in different namespaces. The volumes are specified directly in the deployment spec, e.g. like this:

volumes:
  - name: db-backups
    azureFile:
      secretName: azure-file-secret
      shareName: db-backups
      readOnly: false

The secret azure-file-secret exists in the same namespace as the deployment. With Kubernetes 1.18.9 this worked perfectly. After upgrading to Kubernetes 1.19.7 the pods failed to start. Looking at the events, we saw messages like this:

MountVolume.SetUp failed for volume "db-backups" : Couldn't get secret default/azure-file-secret

As you can see, the volume is searching for the secret in the default namespace instead of the namespace of the deployment. Creating the secret in the default namespace allowed the pods to successfully start.

What you expected to happen:

The pods should have used the secret in the same namespace as they did in 1.18.9.

How to reproduce it (as minimally and precisely as possible):

  1. create a cluster in AKS
  2. create a new namespace “bug”
  3. create an azure file share
  4. create a secret with the required fields in the “bug” namespace (e.g. like in this example)
  5. create a pod with an azureFile volume in the “bug” namespace (e.g. like in this example)
  6. see the pod fail to start
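Steps 2, 4 and 5 can be sketched as a single manifest. This is a minimal sketch and not the manifests from the linked examples: the pod name azurefile-test, the busybox image, the mount path and the placeholder credential values are assumptions made for illustration. The secret keys azurestorageaccountname and azurestorageaccountkey are the ones an azureFile volume reads.

```yaml
# Step 2: namespace used for the reproduction
apiVersion: v1
kind: Namespace
metadata:
  name: bug
---
# Step 4: storage account credentials, stored in the same namespace as the pod
apiVersion: v1
kind: Secret
metadata:
  name: azure-file-secret
  namespace: bug
type: Opaque
stringData:
  azurestorageaccountname: "<storage-account-name>"  # placeholder, fill in
  azurestorageaccountkey: "<storage-account-key>"    # placeholder, fill in
---
# Step 5: pod with an inline azureFile volume
apiVersion: v1
kind: Pod
metadata:
  name: azurefile-test              # hypothetical name
  namespace: bug
spec:
  containers:
    - name: app
      image: busybox                # assumption: any image will do
      command: ["sleep", "3600"]
      volumeMounts:
        - name: db-backups
          mountPath: /mnt/db-backups  # assumption
  volumes:
    - name: db-backups
      azureFile:
        secretName: azure-file-secret
        shareName: db-backups
        readOnly: false
```

On an affected version, the pod's events should then show the secret being looked up as default/azure-file-secret rather than bug/azure-file-secret.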

Anything else we need to know?:

We saw in the documentation and examples that the alternative approach of manually creating a PVC and PV for the file share would allow specifying the secretNamespace. Maybe this field also needs to be allowed when specifying the volume directly in a pod/deployment spec?
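For reference, a minimal sketch of that PV/PVC alternative, assuming the share and secret from above live alongside a pod in the "bug" namespace. The object names db-backups-pv and db-backups-pvc and the 5Gi size are hypothetical; secretNamespace is the field that the PersistentVolume form of azureFile accepts and the inline form does not.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: db-backups-pv               # hypothetical name
spec:
  capacity:
    storage: 5Gi                    # assumption: size chosen for illustration
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""              # static binding, no dynamic provisioning
  azureFile:
    secretName: azure-file-secret
    secretNamespace: bug            # namespace that holds the secret
    shareName: db-backups
    readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-backups-pvc              # hypothetical name
  namespace: bug
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: db-backups-pv         # bind to the PV defined above
  resources:
    requests:
      storage: 5Gi
```

The pod would then reference the claim through a persistentVolumeClaim volume with claimName: db-backups-pvc instead of the inline azureFile block.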

Environment:

  • Kubernetes version: 1.19.7
  • Cloud provider or hardware configuration: AKS

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 9
  • Comments: 41 (23 by maintainers)

Most upvoted comments

@andyzhangx While it would probably be possible to use that approach, declaring the volume directly in the pod spec is still mentioned in the docs/examples, so I would expect it to continue working. And it would take quite some time (which I don't have right now) to migrate all our deployments to the alternative approach.

As one comment in the linked issue mentioned, it is totally unacceptable to introduce a breaking change like this in a minor (or was it even only a patch?) update. I understand the previous behavior was considered a bug, but unfortunately quite a few people already depend on that buggy behavior. In any case, it should be explicitly mentioned in the changelog as a breaking change, and ideally the "inline declaration in pod spec" approach should be properly deprecated if you prefer people to use the explicit PV approach.

We were also hit by this one. While we have worked around the problem, I think it should be fixed. There are many real-world deployments that can run into this.

Yes @andyzhangx the fix is merged. Could you tell us which version will include it? Thanks!

@galiacheng the fix is still rolling out. In the meantime, you could run the following command to disable CSI migration on 1.22, which could mitigate the issue:

kubectl apply -f https://raw.githubusercontent.com/andyzhangx/demo/master/aks/disable-azurefile-csi-migration-flag-ds.yaml

mcaden

@StevenJDH are you referring to this issue? https://github.com/Azure/AKS/issues/2871

We faced this issue too on version 1.20.2.

@andyzhangx you mentioned this is now fixed in 1.18.18, 1.19.10, 1.20.6 and 1.21.0. Upgrading to 1.20.7 fixed it in our case. However, the documentation at https://docs.microsoft.com/en-us/azure/aks/azure-files-volume#mount-file-share-as-an-inline-volume still says:

Note: starting from 1.18.15, 1.19.7, 1.20.2, 1.21.0, secret namespace in inline azureFile volume can only be set as default namespace, to specify a different secret namespace, please use below persistent volume example instead.

So is it safe to assume that, after this fix, version 1.21.0 will still search for the secrets in the same namespace as the pods? That would mean the docs are outdated.

@fire2 thanks for the catch. The doc is outdated and I will update it. From 1.21.0, the driver searches for the secret in the same namespace as the pod.

@ramondeklein I just discovered a few days ago that the new AKS versions are now available:

  • 1.21.1 (preview)
  • 1.20.7
  • 1.19.11 (default)
  • 1.18.19