longhorn: [BUG] Wordpress helm chart in RWX mode has permission issues

Describe the bug

When deploying the Bitnami WordPress helm chart in RWO mode, everything works correctly. But after switching to RWX mode, I get the white screen of death in WordPress. After logging into the admin console I can see that the theme is not installed, which led me to check the file system permissions in the pod.

File system permissions and ownership in RWO mode (working configuration):

$ ls -alh /bitnami/wordpress/wp-content/
total 32K
drwxrwxr-x  7 1001 1001 4.0K Feb 24 14:04 .
drwxrwsr-x  3 root 1001 4.0K Feb 24 14:04 ..
-rw-rw-r--  1 1001 1001   28 Feb 24 14:04 index.php
drwxrwxr-x  2 1001 1001 4.0K Feb 24 14:04 languages
drwxrwxr-x 12 1001 1001 4.0K Feb 24 14:04 plugins
drwxrwxr-x  5 1001 1001 4.0K Feb 24 14:04 themes
drwxrwxr-x  2 1001 1001 4.0K Feb 24 14:04 upgrade
drwxrwxr-x  3 1001 1001 4.0K Feb 24 14:04 uploads

File system permissions and ownership in RWX mode:

$ ls -alh bitnami/wordpress/wp-content/
total 24K
drwxr-xr-x 5 1001 root 4.0K Feb 24 14:28 .
drwxrwxrwx 3 root root 4.0K Feb 24 14:24 ..
-rw-r--r-- 1 1001 root   28 Feb 24 14:24 index.php
drwxr-xr-x 2 1001 root 4.0K Feb 24 14:24 languages
drwxr-xr-x 8 1001 root 4.0K Feb 24 14:26 plugins
drwxr-xr-x 3 1001 root 4.0K Feb 24 14:28 uploads

In conclusion, RWX mode leaves the volume with group root and no group write permission (755/644 instead of 775/664), and the themes and upgrade directories are missing. I suspect this is related to the NFS share being used in that case.

Environment

  • Longhorn version: 1.2.3
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): kubectl
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: kubeadm
    • Number of management node in the cluster: 3
    • Number of worker node in the cluster: 6
  • Node config
    • OS type and version: Ubuntu 20.04
    • CPU per node: 4
    • Memory per node: 4GB
    • Disk type(e.g. SSD/NVMe): NVMe and mechanical
    • Network bandwidth between the nodes: virtual
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Proxmox (KVM)
  • Number of Longhorn volumes in the cluster: 2

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 16 (10 by maintainers)

Most upvoted comments

Regarding fsGroupPolicy: by default Kubernetes uses ReadWriteOnceWithFSType, which means kubelet performs no permission or ownership changes for RWX volumes.

@linucksrox as a workaround, you can run an init container with the RWX workload to do the chmod+chown
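A minimal sketch of that init-container workaround, assuming the volume is mounted at /bitnami/wordpress and the workload runs as UID/GID 1001 (both taken from the listings in the report). The pod name, claim name, and the busybox image are placeholders, and whether a root chown succeeds depends on how the NFS export treats root, so treat this as a starting point rather than a verified fix:

```yaml
# Hypothetical pod fragment: names, images, and claimName are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: wordpress-rwx-example
spec:
  initContainers:
    - name: fix-permissions
      image: busybox:1.36
      # Give UID/GID 1001 ownership and add group write,
      # matching the 775/664 modes seen on the working RWO volume.
      command:
        - sh
        - -c
        - chown -R 1001:1001 /bitnami/wordpress && chmod -R g+w /bitnami/wordpress
      securityContext:
        runAsUser: 0   # chown needs root inside the init container
      volumeMounts:
        - name: wordpress-data
          mountPath: /bitnami/wordpress
  containers:
    - name: wordpress
      image: bitnami/wordpress
      volumeMounts:
        - name: wordpress-data
          mountPath: /bitnami/wordpress
  volumes:
    - name: wordpress-data
      persistentVolumeClaim:
        claimName: wordpress   # assumed name of the RWX claim
```

The recursive chown runs on every pod start, so on a large volume it adds startup delay.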

@linucksrox I deployed WordPress with helm install wordpress bitnami/wordpress -f values.yaml on both Longhorn RWX and RWO volumes. In both cases the folders were first created with permission 755 and then gradually changed to 775. The permission change appears to be made by the WordPress workload itself.

The fsGroup in values.yaml is already set to 1001, but in the RWX volume the group of the files is still root rather than 1001. This might be related to the fsGroupPolicy=ReadWriteOnceWithFSType setting on the driver.longhorn.io CSIDriver.

Ref: https://kubernetes-csi.github.io/docs/support-fsgroup.html

I also tested an RWX volume provisioned by csi-driver-nfs with fsGroupPolicy=ReadWriteOnceWithFSType and with fsGroupPolicy=File. The fsGroup is respected only when fsGroupPolicy=File.
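For illustration, this is roughly where the setting lives on a CSIDriver object. Only the field under discussion is shown; the other spec fields a real driver sets are omitted, and because Longhorn's CSIDriver is created by longhorn-manager, applying a manifest like this by hand is an assumption and not a supported procedure:

```yaml
# Sketch only: shows the fsGroupPolicy field, not a complete CSIDriver spec.
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: driver.longhorn.io
spec:
  # File: kubelet may apply the pod's fsGroup regardless of fstype or access mode.
  # ReadWriteOnceWithFSType (the default): only applied when an fstype is set
  # and the volume's access modes include ReadWriteOnce.
  fsGroupPolicy: File
```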

The feature described at https://kubernetes-csi.github.io/docs/support-fsgroup.html#delegate-fsgroup-to-csi-driver will have to wait, since it only reached beta in Kubernetes v1.23.

In the meantime, we should check whether setting fsGroupPolicy=File meets our requirement, which is to respect pod.spec.securityContext.fsGroupChangePolicy for both RWO and RWX volumes:

  • if pod.spec.securityContext.fsGroupChangePolicy=OnRootMismatch, it should not recursively change the volume ownership and permissions unless the root directory does not match. (If it does not meet this requirement, it cannot be a solution, because there will be a big delay when mounting the volume; see https://github.com/longhorn/longhorn/issues/2131#issuecomment-778897129)
  • if pod.spec.securityContext.fsGroupChangePolicy=Always, it should always change the volume ownership.
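A small test pod along these lines could be used to check that behavior. It is a sketch under assumptions: the pod name, image, mount path, and claim name (rwx-claim) are hypothetical, and fsGroup 1001 is taken from the WordPress values discussed in this thread:

```yaml
# Hypothetical test pod for observing how fsGroup is applied to a volume.
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-check
spec:
  securityContext:
    fsGroup: 1001
    # OnRootMismatch: skip the recursive chown/chmod when the volume root
    # already has the expected group and permissions.
    # Always: change ownership and permissions on every mount.
    fsGroupChangePolicy: OnRootMismatch
  containers:
    - name: inspect
      image: busybox:1.36
      # Print numeric ownership of the mount, then stay up for inspection.
      command: ["sh", "-c", "ls -aln /data && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: rwx-claim   # assumed pre-existing RWX claim
```

Running it once against an RWO claim and once against an RWX claim, then comparing the group shown for /data, would show whether the driver's fsGroupPolicy lets kubelet apply the fsGroup in each mode.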

Thanks for the analysis, @joshimoo and @derekbit 👍

Right now we are using the release-1.18 code base, which is why we have no custom fsGroupPolicy in the CSIDriverSpec. We just get the default behavior (ReadWriteOnceWithFSType), as @joshimoo and @derekbit mentioned.

The workaround is the one @joshimoo mentioned, and the end goal is to define the expected behavior by providing a setting for fsGroupPolicy. We can also look into delegating volume filesystem permission handling entirely to our CSI driver instead of relying on kubelet (https://kubernetes-csi.github.io/docs/support-fsgroup.html#delegate-fsgroup-to-csi-driver).

I have the same issue on a 1.23.7 RKE2 cluster with longhorn 1.2.4. Wordpress still fails to write to the RWX volume due to permissions. Any workaround? Thank you.

cc @innobead For the fsGroupPolicy spec on the driver, we can revisit this once we have bumped the minimum Kubernetes version for v1.3. See https://kubernetes-csi.github.io/docs/support-fsgroup.html

Remember there were a couple of other issues that we can address when doing the minimum version update, e.g. the Go version bump and the client-go bump.

values.yaml looks fine. The CSIDriver creation is embedded in longhorn-manager, so we need to do more investigation on fsGroupPolicy.

cc @joshimoo @jenting @innobead

@derekbit Sure, I just ran helm install bitnami/wordpress -f values.yaml --version=13.0.15 with the following YAML file (only the ingress hostname differs): values.yaml.txt

After waiting about 5 minutes for everything to deploy, I can access the WordPress admin page. When I then check the permissions, they are set to 0755 and some directories are missing (for example themes). If I change nothing but the access modes in the file to ReadWriteOnce (3 occurrences), it works as expected.