longhorn: [BUG] Wordpress helm chart in RWX mode has permission issues
Describe the bug
When deploying the Bitnami WordPress Helm chart in RWO mode, everything works correctly. But after switching to RWX mode, I get the white screen of death in WordPress. After logging into the admin console I can see that the theme is not installed, which led me to check the file system permissions in the pod.
File system permissions and ownership in RWO mode (working configuration):
$ ls -alh /bitnami/wordpress/wp-content/
total 32K
drwxrwxr-x 7 1001 1001 4.0K Feb 24 14:04 .
drwxrwsr-x 3 root 1001 4.0K Feb 24 14:04 ..
-rw-rw-r-- 1 1001 1001 28 Feb 24 14:04 index.php
drwxrwxr-x 2 1001 1001 4.0K Feb 24 14:04 languages
drwxrwxr-x 12 1001 1001 4.0K Feb 24 14:04 plugins
drwxrwxr-x 5 1001 1001 4.0K Feb 24 14:04 themes
drwxrwxr-x 2 1001 1001 4.0K Feb 24 14:04 upgrade
drwxrwxr-x 3 1001 1001 4.0K Feb 24 14:04 uploads
File system permissions and ownership in RWX mode:
$ ls -alh bitnami/wordpress/wp-content/
total 24K
drwxr-xr-x 5 1001 root 4.0K Feb 24 14:28 .
drwxrwxrwx 3 root root 4.0K Feb 24 14:24 ..
-rw-r--r-- 1 1001 root 28 Feb 24 14:24 index.php
drwxr-xr-x 2 1001 root 4.0K Feb 24 14:24 languages
drwxr-xr-x 8 1001 root 4.0K Feb 24 14:26 plugins
drwxr-xr-x 3 1001 root 4.0K Feb 24 14:28 uploads
So in conclusion, it looks like RWX mode is causing some permission issues for the pod, which I suspect is related to the NFS share being used in that case.
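The difference between the two listings can be checked mechanically. The following is a minimal sketch, assuming GNU-style `ls -l` output as shown above; `check_listing` is a hypothetical helper name, and the expected group (1001) is taken from the working RWO listing.

```shell
#!/bin/sh
# Hypothetical helper: read `ls -l` output on stdin and print every entry
# whose group differs from the expected one or that lacks the group-write bit.
# Usage: ls -al DIR | check_listing EXPECTED_GROUP
check_listing() {
  awk -v want="$1" '
    NF >= 9 {                          # skips the "total ..." line
      group_write = substr($1, 6, 1)   # 6th character of e.g. drwxrwxr-x
      if ($4 != want || group_write != "w")
        printf "%s: mode=%s group=%s\n", $NF, $1, $4
    }'
}
```

Run as `ls -alh /bitnami/wordpress/wp-content/ | check_listing 1001` inside the pod. With the RWO listing above it prints nothing, while every entry of the RWX listing is reported.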
Environment
- Longhorn version: 1.2.3
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): kubectl
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: kubeadm
- Number of management node in the cluster: 3
- Number of worker node in the cluster: 6
- Node config
- OS type and version: Ubuntu 20.04
- CPU per node: 4
- Memory per node: 4GB
- Disk type(e.g. SSD/NVMe): NVMe and mechanical
- Network bandwidth between the nodes: virtual
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Proxmox (KVM)
- Number of Longhorn volumes in the cluster: 2
About this issue
- State: open
- Created 2 years ago
- Comments: 16 (10 by maintainers)
Regarding fsGroupPolicy: Kubernetes defaults to ReadWriteOnceWithFSType, which means the kubelet performs no permission/ownership changes for RWX volumes.
@linucksrox as a workaround, you can run an init container with the RWX workload to do the chmod + chown.
@linucksrox I tried to create the WordPress deployment with helm install wordpress bitnami/wordpress -f values.yaml, using Longhorn RWX and RWO volumes. I noticed that folders were first created with permission 755 and then gradually updated to 775 in both cases. The permission change appears to be made by the WordPress workload.
The fsGroup in values.yaml is already set to 1001, but the group of the files is still root rather than 1001 in the RWX volume. This might be related to the fsGroupPolicy=ReadWriteOnceWithFSType setting in driver.longhorn.io.
Ref: https://kubernetes-csi.github.io/docs/support-fsgroup.html
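The init-container workaround mentioned above would come down to something like the script below, run as root against the mounted volume before the WordPress container starts. This is only a sketch: the function name `fix_volume_perms` is hypothetical, and the path and the 1001 UID/GID in the usage note are assumptions taken from the RWO listing in the report.

```shell
#!/bin/sh
# Hypothetical init-container helper, not part of Longhorn or the Bitnami chart.
# Usage: fix_volume_perms DIR UID GID
fix_volume_perms() {
  dir=$1 uid=$2 gid=$3
  # Hand ownership of every file and directory to uid:gid.
  chown -R "$uid:$gid" "$dir"
  # Add group read/write; capital X adds execute only on directories
  # (and on files that are already executable).
  chmod -R g+rwX "$dir"
}

# Act only when called with explicit arguments, so sourcing has no side effects.
if [ "$#" -eq 3 ]; then
  fix_volume_perms "$1" "$2" "$3"
fi
```

Saved under a name of your choice (say `fix-perms.sh`), it could be invoked as `sh fix-perms.sh /bitnami/wordpress 1001 1001`. On a large volume the recursive pass may take a while.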
I also checked the RWX volume with fsGroupPolicy=ReadWriteOnceWithFSType and fsGroupPolicy=File using csi-driver-nfs. The fsGroup is respected when fsGroupPolicy=File.
It seems the feature https://kubernetes-csi.github.io/docs/support-fsgroup.html#delegate-fsgroup-to-csi-driver will have to wait, since it is only available as beta in Kubernetes v1.23.
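The observations above line up with how the Kubernetes CSI documentation describes the three fsGroupPolicy values. The function below is a simplified model of that description, not kubelet code: `applies_fsgroup` is a hypothetical name, and a single access mode stands in for the volume's full accessModes list.

```shell
#!/bin/sh
# Simplified model of the documented fsGroupPolicy semantics (an assumption
# based on the CSI docs, not the kubelet implementation).
# Usage: applies_fsgroup POLICY ACCESS_MODE FSTYPE
applies_fsgroup() {
  policy=$1 access_mode=$2 fstype=$3
  case "$policy" in
    File) echo yes ;;   # fsGroup applied regardless of fsType or access mode
    None) echo no ;;    # volume mounted without modifications
    ReadWriteOnceWithFSType)
      # Applied only when an fsType is defined and the volume is RWO.
      if [ -n "$fstype" ] && [ "$access_mode" = "ReadWriteOnce" ]; then
        echo yes
      else
        echo no
      fi ;;
    *) echo "unknown policy: $policy" >&2; return 1 ;;
  esac
}
```

Under this model an RWX volume never gets the fsGroup applied with ReadWriteOnceWithFSType, which would explain why the files keep the root group.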
While waiting, we should check whether setting fsGroupPolicy=File meets our requirement: respect the pod.spec.securityContext.fsGroupChangePolicy setting for both RWO and RWX, which means:
- pod.spec.securityContext.fsGroupChangePolicy=OnRootMismatch: it should not recursively change the volume ownership and permissions unless the root directory mismatches. (If it doesn't meet this requirement, this cannot be a solution, because there will be a big delay when mounting the volume; see more at https://github.com/longhorn/longhorn/issues/2131#issuecomment-778897129)
- pod.spec.securityContext.fsGroupChangePolicy=Always: it should always change the volume ownership.
Thanks, @joshimoo and @derekbit, for the analysis 👍
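The two fsGroupChangePolicy values discussed above can be modeled roughly as follows. This is a sketch under stated assumptions: `needs_recursive_change` is a hypothetical name, and the "root matches" test (group equals fsGroup, directory mode starting with `drwxrws`) is a simplification of whatever the kubelet actually checks.

```shell
#!/bin/sh
# Simplified model of fsGroupChangePolicy; not the kubelet's actual code.
# Usage: needs_recursive_change POLICY ROOT_GROUP ROOT_MODE FSGROUP
#   ROOT_GROUP and ROOT_MODE are the group and mode columns that `ls -l`
#   prints for the volume's root directory, e.g. "1001" and "drwxrwsr-x".
needs_recursive_change() {
  policy=$1 root_group=$2 root_mode=$3 fsgroup=$4
  # Anything other than OnRootMismatch is treated as Always.
  if [ "$policy" != "OnRootMismatch" ]; then
    echo yes
    return 0
  fi
  # The root counts as matching when it belongs to fsGroup and is a directory
  # with owner rwx, group rw and setgid plus execute (lowercase "s").
  case "$root_mode" in
    drwxrws*) [ "$root_group" = "$fsgroup" ] && { echo no; return 0; } ;;
  esac
  echo yes
}
```

Fed with the parent-directory lines from the two listings in the report, `needs_recursive_change OnRootMismatch 1001 drwxrwsr-x 1001` prints `no`, while `needs_recursive_change OnRootMismatch root drwxrwxrwx 1001` prints `yes`.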
Right now, we are using the release-1.18 code base, which is why we don't have any custom behavior for ReadWriteOnceWithFSType in the CSIDriverSpec; we just get the default behavior of fsGroupPolicy (ReadWriteOnceWithFSType), as @joshimoo and @derekbit mentioned.
The workaround is as @joshimoo mentioned, and the end goal is to define the expected behavior by providing a setting for fsGroupPolicy. Also, we can look into whether we would like to leave volume fs permissions for our CSI driver to handle completely instead of relying on the kubelet (https://kubernetes-csi.github.io/docs/support-fsgroup.html#delegate-fsgroup-to-csi-driver).
ref:
I have the same issue on a 1.23.7 RKE2 cluster with longhorn 1.2.4. Wordpress still fails to write to the RWX volume due to permissions. Any workaround? Thank you.
cc @innobead For the fsGroupPolicy specs on the driver, we can revisit once we have done the minimum Kubernetes version update for v1.3: https://kubernetes-csi.github.io/docs/support-fsgroup.html
Remember, there were a couple of other issues that we can address when doing the min version update, i.e. the Go version bump and the client-go bump.
values.yaml looks fine. The CSIDriver creation is embedded in longhorn-manager, so we need to do more investigation on fsGroupPolicy.
cc @joshimoo @jenting @innobead
@derekbit Sure, I just ran helm install bitnami/wordpress -f values.yaml --version=13.0.15 with the following yaml file (just with a different ingress hostname): values.yaml.txt
After waiting about 5 minutes for everything to deploy, I can access the WordPress admin page. Then I check the permissions and they are set to 0755, with some directories missing (for example themes). If I change nothing but the access modes in the file to ReadWriteOnce (3 occurrences), then it works as expected.