longhorn: [BUG] Longhorn 1.2.0 - wrong volume permissions inside container / broken fsGroup

Describe the bug After upgrading Longhorn to 1.2.0, some containers are unable to start correctly (e.g. Prometheus).

Looks like the root cause is wrong Longhorn volume permissions inside the container when the container is not running as root.

Even with fsGroup specified, permissions are not set on the volume.

To Reproduce

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: broken-longhorn
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: longhorn

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: broken-longhorn
  namespace: default
  labels:
    app.kubernetes.io/name: broken-longhorn
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app.kubernetes.io/name: broken-longhorn
  template:
    metadata:
      labels:
        app.kubernetes.io/name: broken-longhorn
    spec:
      containers:
        - name: broken-longhorn
          image: ubuntu:focal-20210723
          command:
          - "/bin/sh"
          - "-ec"
          - |
            tail -f /dev/null
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /data
              name: data
      securityContext:
        runAsUser: 65534
        runAsNonRoot: true
        runAsGroup: 65534
        fsGroup: 65534
        fsGroupChangePolicy: Always
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: broken-longhorn
% kubectl get pvc broken-longhorn -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"broken-longhorn","namespace":"default"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}},"storageClassName":"longhorn"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: driver.longhorn.io
  creationTimestamp: "2021-09-01T07:48:46Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: broken-longhorn
  namespace: default
  resourceVersion: "25249959"
  uid: 9ab5ca66-0794-4ad2-8aa5-73e96fa603fc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: longhorn
  volumeMode: Filesystem
  volumeName: pvc-9ab5ca66-0794-4ad2-8aa5-73e96fa603fc
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound

% kubectl get pv pvc-9ab5ca66-0794-4ad2-8aa5-73e96fa603fc -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: driver.longhorn.io
  creationTimestamp: "2021-09-01T07:48:48Z"
  finalizers:
  - kubernetes.io/pv-protection
  - external-attacher/driver-longhorn-io
  name: pvc-9ab5ca66-0794-4ad2-8aa5-73e96fa603fc
  resourceVersion: "25250154"
  uid: d4db7690-4763-45bf-a5c7-9b7ae0b5d584
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: broken-longhorn
    namespace: default
    resourceVersion: "25249908"
    uid: 9ab5ca66-0794-4ad2-8aa5-73e96fa603fc
  csi:
    driver: driver.longhorn.io
    volumeAttributes:
      fromBackup: ""
      numberOfReplicas: "3"
      staleReplicaTimeout: "30"
      storage.kubernetes.io/csiProvisionerIdentity: 1630473362980-8081-driver.longhorn.io
    volumeHandle: pvc-9ab5ca66-0794-4ad2-8aa5-73e96fa603fc
  persistentVolumeReclaimPolicy: Delete
  storageClassName: longhorn
  volumeMode: Filesystem
status:
  phase: Bound

% kubectl get csidriver driver.longhorn.io -o yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  annotations:
    driver.longhorn.io/kubernetes-version: v1.20.7+k3s1
    driver.longhorn.io/version: v1.2.0
  creationTimestamp: "2021-08-31T17:47:21Z"
  name: driver.longhorn.io
  resourceVersion: "24953648"
  uid: 274cd12a-6aca-47a9-bfd8-32261eb5033a
spec:
  attachRequired: true
  fsGroupPolicy: ReadWriteOnceWithFSType
  podInfoOnMount: true
  volumeLifecycleModes:
  - Persistent
$ kubectl exec -t -i broken-longhorn-c4ccbbb6f-79djg -- bash
nobody@broken-longhorn-c4ccbbb6f-79djg:/$ ls -la /data/
total 24
drwxr-xr-x 3 root root  4096 Sep  1 07:49 .
drwxr-xr-x 1 root root  4096 Sep  1 07:49 ..
drwx------ 2 root root 16384 Sep  1 07:49 lost+found
nobody@broken-longhorn-c4ccbbb6f-79djg:/$ touch /data/test
touch: cannot touch '/data/test': Permission denied

Expected behavior When fsGroup is provided, it should be used to chown the destination mount.

Environment:

  • Longhorn version: 1.2.0
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): helm
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: k3os/k3s
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): baremetal

Additional context

% kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:52:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7+k3s1", GitCommit:"aa768cbdabdb44c95c5c1d9562ea7f5ded073bc0", GitTreeState:"clean", BuildDate:"2021-05-20T01:07:13Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 8
  • Comments: 20 (9 by maintainers)

Commits related to this issue

Most upvoted comments

Workaround: manually add a new flag --default-fstype=ext4 to the csi-provisioner deployment in the longhorn-system namespace. It should look like this:

...
      containers:
      - args:
        - --v=2
        - --csi-address=$(ADDRESS)
        - --timeout=1m50s
        - --leader-election
        - --leader-election-namespace=$(POD_NAMESPACE)
        - --default-fstype=ext4
        env:
 ...
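
If you prefer to patch the live deployment instead of editing it by hand, something like the following should work. This is only a sketch: it assumes the provisioner container is the first entry in the containers list (index 0), so double-check the container order in your cluster before running it.

% kubectl -n longhorn-system patch deployment csi-provisioner --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--default-fstype=ext4"}]'

Newly provisioned PVs should then get spec.csi.fsType: ext4. PVs created while the flag was missing keep their empty fsType and are not fixed retroactively.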

Root cause:

  • The fsType field is missing in the PVs created by Longhorn v1.2.0, and this prevents Kubernetes from changing the volume ownership and permissions.
  • The missing fsType in PVs seems to come from the csi-provisioner. In the old csi-provisioner (v1.6.0), the defaultFSType is hard-coded to ext4. However, in the new csi-provisioner (v2.1.2) it is empty by default (see the quick check below).
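
A quick way to see whether existing PVs are affected (my own suggestion, not from the Longhorn docs) is to print the spec.csi.fsType column; affected volumes show <none>:

% kubectl get pv -o custom-columns='NAME:.metadata.name,DRIVER:.spec.csi.driver,FSTYPE:.spec.csi.fsType'

Since the CSIDriver uses fsGroupPolicy: ReadWriteOnceWithFSType, the kubelet only applies fsGroup to ReadWriteOnce volumes that have an fsType set, which is why the empty field breaks the ownership change.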

In my opinion, if you can’t/won’t submit a quick patch release, 1.2.0 should be pulled, as it’s broken in a non-obvious way. In my case I thought there was some kind of issue with the GitLab Helm chart and wasted literally hours on this; I’m afraid I won’t be alone. Looking forward to upgrading to 1.2.1.

@mstrent While this issue indeed has a high impact:

  1. The workaround is non-intrusive and should solve the problem immediately without side effects.
  2. We’re accelerating the release of v1.2.1 to less than 2 weeks from now (09/24) to fix this issue and a few other issues we’ve found after the v1.2.0 release.

Retagging or re-releasing a version is generally a bad idea: there is no way to upgrade from and to the same version, and it won’t help any existing users who have already hit the bug. A lot of things would get mixed up if we chose to do that.

Sorry for the inconvenience. v1.2.1 will be there soon.

Is pointing people to a workaround enough if 1.2.1 is still weeks away? This is a fatal enough flaw that I’d think 1.2 should either be pulled or re-released with the fix.

/dev/longhorn/pvc-fb7df8b3-0090-4fdf-b74f-309fd8056563 is apparently in use by the system

@samip5 I believe you have a different problem. Something is hijacking the Longhorn block device. There are some debugging steps here #2983

https://github.com/longhorn/longhorn/issues/1210#issuecomment-671591451: Those instructions are not applicable, as there is no major:minor version before the device name? Oh, my bad.

Culprit: multipathd
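
For the multipathd case, the mitigation commonly suggested in Longhorn’s troubleshooting docs (please verify against the docs for your version) is to stop multipathd from grabbing the Longhorn-backed devices, e.g. by blacklisting plain sd* devnodes in /etc/multipath.conf and restarting the daemon:

blacklist {
    devnode "^sd[a-z0-9]+"
}

% systemctl restart multipathd

Be careful if the node relies on multipath for other disks; a blanket sd* blacklist may be too broad there.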

Retagging is usually a no-go, but you can still release 1.2.1 now and just go with 1.2.2 for whatever is “planned” for 1.2.1. This is also fixed in the Helm package; what about at least upgrading that? The Helm chart version and the app version are not the same.

My point is that you are a storage provider, and you should be able to release quick bug fixes like this. 1.2.x is for patch releases; they shouldn’t need to be planned far ahead, just release!

I can confirm this happens for automatically provisioned volumes as well (i.e. StatefulSets).

[hubbe@ma3a ~]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.4", GitCommit:"3cce4a82b44f032d0cd1a1790e6d2f5a55d20aae", GitTreeState:"clean", BuildDate:"2021-08-11T18:10:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}

statefulset example:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pvc-ownership-test
spec:
  selector:
    matchLabels:
      app: pvc-ownership-test
  serviceName: pvc-ownership-test
  template:
    metadata:
      labels:
        app: pvc-ownership-test
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
      containers:
        - image: docker.io/library/busybox:latest
          name: pvc-ownership-test
          command: ["/bin/sh"]
          args: ["-c", "sleep 6000000"]
          volumeMounts:
            - name: test
              mountPath: /test
  volumeClaimTemplates:
    - metadata:
        name: test
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
        storageClassName: longhorn
[hubbe@ma3a ~]$ kubectl exec -it pvc-ownership-test-0 -- sh

/ $ id
uid=1000 gid=1000 groups=1000
/ $ ls -ahl /test
total 24K    
drwxr-xr-x    3 root     root        4.0K Sep  1 09:38 .
drwxr-xr-x    1 root     root        4.0K Sep  1 09:38 ..
drwx------    2 root     root       16.0K Sep  1 09:38 lost+found

Test steps:

  1. Install/Upgrade Longhorn using kubectl/helm
  2. Deploy statefulset at https://github.com/longhorn/longhorn/issues/2964#issuecomment-910117570
  3. Exec into the pod and verify that /test has the correct permissions and that new files can be read and written under /test (see the example commands below)
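
For step 3, a rough check could look like this (pod name taken from the output above; adjust if yours differs):

% kubectl exec -it pvc-ownership-test-0 -- sh -c 'id && ls -ld /test && touch /test/write-test && echo write OK'

With a working fsGroup, /test should be group-owned by GID 1000 and the touch should succeed instead of failing with “Permission denied”.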