aws-efs-csi-driver: chown/chgrp on dynamic provisioned pvc fails
/kind bug
What happened?
Using the dynamic provisioning feature introduced by #274, applications that try to chown their PV fail.
What you expected to happen?
With the old efs-provisioner this caused no issues, but with dynamic provisioning in this CSI driver the chown command fails. I must admit I don't understand how the uid/gid mapping works with EFS access points: the pod user does not seem to have any association with the uid/gid on the access point, yet pods can read and write the mounted PV just fine.
How to reproduce it (as minimally and precisely as possible)?
- Use the dynamic provisioning feature introduced by #274
- Create a StorageClass:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
mountOptions:
  - tls
parameters:
  directoryPerms: "700"
  fileSystemId: fs-<id>
  provisioningMode: efs-ap
provisioner: efs.csi.aws.com
```

- Use it with application charts that have a chown step at the beginning, like:
For the first chart, we saw the initContainer failing. We just tried disabling the initContainer on the grafana chart with these chart overrides:
```yaml
initChownData:
  enabled: false
```
And the application worked fine.
For the nexus chart, there's a whole bunch of errors in the logs, like:

```
chgrp: changing group of '/nexus-data/elasticsearch/nexus': Operation not permitted
```
Perhaps these charts don't really need that step with dynamic provisioning. Another nexus image seems to have an option to skip the chown step: https://github.com/travelaudience/docker-nexus/blob/a86261e35734ae514c0236e8f371402e2ea0feec/run#L3-L6
Anything else we need to know?:
Environment
- Kubernetes version (use `kubectl version`):

```
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1", GitCommit:"c4d752765b3bbac2237bf87cf0b1c2e307844666", GitTreeState:"clean", BuildDate:"2020-12-23T02:22:53Z", GoVersion:"go1.15.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.12-eks-7684af", GitCommit:"7684af4ac41370dd109ac13817023cb8063e3d45", GitTreeState:"clean", BuildDate:"2020-10-20T22:57:40Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
```

- Driver version:

```
amazon/aws-efs-csi-driver:master
quay.io/k8scsi/csi-node-driver-registrar:v1.3.0
k8s.gcr.io/sig-storage/csi-provisioner:v2.0.2
```
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 32
- Comments: 69 (12 by maintainers)
Commits related to this issue
- fix: Add option to disable chown on start Similar to the option here: https://github.com/travelaudience/docker-nexus/pull/33. This maybe required to address this issue: https://github.com/kubernetes-... — committed to gazal-k/nexus by gazal-k 3 years ago
- fix: Add option to disable chown on start Similar to what's done here: https://github.com/travelaudience/docker-nexus/pull/33. This maybe required to address this issue: https://github.com/kubernetes... — committed to gazal-k/nexus by gazal-k 3 years ago
How can this be closed? I understand there are not enough resources to work on it now, but it seems severe enough that it ought to be kept in the backlog rather than just be closed. And considering the feedback, it would probably make sense to assign it a higher priority and use some of the available resources to actually solve it.
@pkit's workaround helped me run the bitnami/postgres chart (it requires uid:gid=1001:1001) without modifications, just by creating a new StorageClass and specifying it:

```
--set global.storageClass=postgr-efs
```

Related issues: https://github.com/bitnami/charts/issues/8753, https://github.com/bitnami/bitnami-docker-postgresql/issues/184
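For reference, a sketch of what such a StorageClass could look like, assuming a driver build that supports the `uid`/`gid` StorageClass parameters from #434; the class name matches the `--set` flag above, the IDs are the ones the bitnami image expects, and everything else is illustrative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgr-efs
mountOptions:
  - tls
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-<id>
  directoryPerms: "700"
  uid: "1001"  # squash access-point ownership to the UID postgres runs as
  gid: "1001"
provisioner: efs.csi.aws.com
```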
Unbelievable how hard it is for AWS to fix this.
I'm also seeing this. This means that things like postgres, which fails to start if its data directory is not owned by the `postgres` user, don't work with dynamic provisioning. From https://docs.aws.amazon.com/efs/latest/ug/accessing-fs-nfs-permissions.html, I can reproduce this without a file system policy, and with a file system policy that grants `elasticfilesystem:ClientRootAccess` it doesn't seem to make a difference. Granting `elasticfilesystem:ClientRootAccess` to the driver's and pod's roles also doesn't help.

Thanks for reporting this. We're investigating possible fixes, but in the meantime let me explain the reason this is happening.
Dynamic provisioning shares a single file system among multiple PVs by using EFS Access Points. The way Access Points work is they allow server-side overwrites of user/group information, overriding whatever the user/group of the app/container is. When we create an AP with dynamic provisioning we allocate a unique UID/GID (for instance 50000:50000) to overwrite all operations to, and create a unique directory (e.g. /ap50000) that is owned by that user/group. This ensures that no matter how the container is configured it has read/write access to its root directory.
What is happening in this case is that the application is trying to take its own steps to make its root directory writeable. For instance, if the container has an application user with UID/GID 100:100, when it does an `ls -la` on its FS root directory it sees that it is owned by 50000:50000, not 100:100, so it assumes it needs to do a chown/chmod for it to work. However, even if we allowed this command to go through, the application would lose access to its own volume.
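A hypothetical shell session illustrating that behavior; the UID/GID values and the `/data` mount path are made up for the example:

```sh
# Inside a container whose app user is 100:100, with a dynamically
# provisioned EFS PV mounted at /data (access point owner 50000:50000):
$ id
uid=100(app) gid=100(app) groups=100(app)
$ ls -ldn /data
drwx------ 2 50000 50000 6144 Jan  1 00:00 /data
$ touch /data/file && ls -ln /data/file       # writes succeed: the access
-rw-r--r-- 1 50000 50000 0 Jan  1 00:00 /data/file  # point squashes ownership
$ chown 100:100 /data                         # but changing ownership fails
chown: changing ownership of '/data': Operation not permitted
```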
This is why the original issue above was resolved by disabling the chown/chgrp checks. This method can be used as a workaround for any application, since you can trust the PVs to be writeable out of the box.
@z0rc: You can’t reopen an issue/PR unless you authored it or you are a collaborator.
This issue probably needs to be addressed, but I’m not sure about using EFS for postgres. I have seen less demanding storage use cases where EFS struggles.
Some applications don't support disabling ownership checks. E.g. I'm not aware of any way to disable it in Postgres. In such cases, the only workaround I've found is to create a user for the UID assigned by the driver (something like `useradd --uid "$(stat -c '%u' "$mountpath")"`) and then run the application as that user.

I hate this bot.
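A minimal entrypoint sketch of that workaround; the `/data` mount path, the `voluser` name, and the use of `gosu` are illustrative assumptions, not something the thread specifies:

```sh
#!/bin/sh
# Create a user matching the UID/GID the EFS access point assigned to the
# mount, then re-exec the real process as that user.
set -e
MOUNTPATH=/data                            # wherever the PV is mounted
VOL_UID="$(stat -c '%u' "$MOUNTPATH")"
VOL_GID="$(stat -c '%g' "$MOUNTPATH")"
groupadd --gid "$VOL_GID" volgroup 2>/dev/null || true
useradd --uid "$VOL_UID" --gid "$VOL_GID" --no-create-home voluser
exec gosu voluser "$@"                     # or su-exec/setpriv, if present
```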
/reopen
/remove-lifecycle rotten
I would love to see this PR merged in. Currently hitting this issue while trying to run docker:dind with a PV.
@kaiknight take a look at https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner which you can use in combination with AWS EFS. Isn’t as “clean” as using a CSI driver, but will resolve your permission issue, which seems to be a dealbreaker in your use case.
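For anyone trying that route, a hedged sketch of pointing that provisioner at an EFS file system over plain NFS; the DNS name follows EFS's usual `fs-<id>.efs.<region>.amazonaws.com` pattern, and the chart values should be verified against the project's README:

```sh
helm repo add nfs-subdir-external-provisioner \
  https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-provisioner \
  nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=fs-<id>.efs.us-east-1.amazonaws.com \
  --set nfs.path=/
```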
@kaiknight Unfortunately it looks like it's an AWS problem within the EFS implementation. So it won't be solved soon… or maybe ever.
You will need to alter the umask too, as any non-777 dir inside the root dir will still have the same problems.
I think a better workaround would be to use an application-specific StorageVolume definition with a proper uid/gid
This is because the current helm chart installs version 1.3.5, which doesn't include the configuration for GID and UID yet. In order to use it, make sure that you override the image tag value with the `master` tag, which enables these parameters.
You can override this value in the values.yaml file or with the helm command, and then check what version your pods are running; see the sketch below.
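A hedged sketch of those steps; the `image.tag` key, the helm repo URL, and the `app=efs-csi-node` pod label are assumptions based on the chart's conventional layout, so verify them against the chart's own values.yaml:

```sh
# values.yaml override:
#   image:
#     tag: master

# or directly on the command line:
helm repo add aws-efs-csi-driver https://kubernetes-sigs.github.io/aws-efs-csi-driver/
helm upgrade --install aws-efs-csi-driver \
  aws-efs-csi-driver/aws-efs-csi-driver \
  --namespace kube-system \
  --set image.tag=master

# check which image the node pods are actually running:
kubectl get pods -n kube-system -l app=efs-csi-node \
  -o jsonpath='{.items[*].spec.containers[*].image}'
```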
NOTE: make sure you use this only as a workaround until the new release with the gid and uid parameters is published, since it's not reliable to always be pulling the master tag.
So if I read this rightly #434 (merged) will allow the uid and gid values to be squashed to preferred values, and if these are set to match the user/group that is running, when new files are created they will appear with the user/group that created them and this should avoid typical logic deciding it needs to use chown. (Should this go further and support something like idmapd?)
The second question is: What is the right behaviour for chown in such an environment? Should it fail if the requested state doesn't match the current state (because the user and group are fixed and can't be changed)? Should it always succeed (because the process will have access to the file regardless of the displayed user/group)? Or should that decision be another parameter so that the person setting this up can choose which of those two options is right for them?
Ran into this issue; as gabegorelick mentioned, I only got it to work by chowning the mount path with the UID and GID of the dynamic volume and then setting the postgres user to the same IDs. Not ideal, but it works…

Helm code:
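The original snippet was not preserved here; a hypothetical reconstruction using bitnami/postgresql chart values, where the exact keys vary by chart version and 50000 stands in for whatever IDs the driver assigned:

```yaml
primary:
  podSecurityContext:
    fsGroup: 50000           # GID the access point assigned to the volume
  containerSecurityContext:
    runAsUser: 50000         # run postgres as the volume's owner UID
```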
I used https://github.com/kubernetes-sigs/aws-efs-csi-driver/pull/434. That’s still not merged though, so you’ll have to run a fork if you want it. But that is the only way that I’m aware of to specify a UID for the provisioned volumes.
One workaround is to not do this. Instead, use `runAsUser: 0` and `fsGroup: 0`. Then in your container's entrypoint, invoke postgres (or whatever process you want to run) as the UID of the dynamic volume.

Part of the reason efs-provisioner worked seamlessly is that it relied on a beta annotation, `pv.beta.kubernetes.io/gid` (https://github.com/kubernetes-retired/external-storage/blob/201f40d78a9d3fd57d8a441cfc326988d88f35ec/nfs/pkg/volume/provision.go#L62), that silently does basically what your workaround does: it ensures that the Pod using the PV has the annotated group in its supplemental groups (i.e. if the annotation says '100' and you execute `groups` as the pod user, '100' will be among them). This feature is very old and predates the rigorous KEP/feature tracking system that exists today, and I think it's been forgotten by sig-storage. Certainly I am culpable for relying on it but doing nothing to make it more than a beta annotation; I'll try to bring it up in sig-storage and see if we can rely on it as an alternative solution.
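To make that mechanism concrete, a minimal sketch of a statically defined PV carrying the annotation, in the NFS form efs-provisioner produced; the name, capacity, and region are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-efs-pv
  annotations:
    # kubelet adds this GID to the supplemental groups of any pod that
    # mounts the volume, so group-owned files stay readable and writable
    pv.beta.kubernetes.io/gid: "100"
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: fs-<id>.efs.us-east-1.amazonaws.com
    path: /
```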