rook: 1.9 Rook upgrade. Permission denied

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: Previously (on 1.8.9), my initContainer created a directory on the persistent volume without problems:

      initContainers:
      - name: plugins-extractor
        image: {{ .Values.global.imageDashboards }}
        command:
        - /bin/sh
        args:
        - -c
        - "mkdir -p /grafana/plugins && cp -avr /var/lib/grafana/plugins/. /grafana/plugins"
        volumeMounts:
        - name: var-lib-grafana
          mountPath: /grafana
...
  volumeClaimTemplates:
  - metadata:
      name: var-lib-grafana
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: core-rook-block
      resources:
        requests:
          storage: 1Gi

After upgrade to 1.9.x:

plugins-extractor mkdir: can't create directory '/grafana/plugins': Permission denied

Expected behavior:

Note that the image sets USER to non-root.

The directory is created successfully, without permission problems.

How to reproduce it (minimal and precise): Try to create directory inside mounted PV with non-root user.
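A minimal reproduction could look like the sketch below. It assumes the `core-rook-block` storage class from the report. The names `repro-pvc` and `repro-nonroot`, the `busybox` image, and UID/GID 1000 are hypothetical; `runAsUser` stands in for an image whose `USER` is non-root. No `fsGroup` is set, so nothing asks Kubernetes to adjust ownership of the volume. Whether the `mkdir` fails will depend on the Rook and ceph-csi versions in use.

```yaml
# Hypothetical PVC on the storage class named in the report.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: repro-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: core-rook-block
  resources:
    requests:
      storage: 1Gi
---
# Pod that tries to create a directory on the volume as a non-root user.
apiVersion: v1
kind: Pod
metadata:
  name: repro-nonroot
spec:
  restartPolicy: Never
  containers:
  - name: mkdir-test
    image: busybox
    securityContext:
      runAsUser: 1000    # assumption: stands in for the image's non-root USER
      runAsGroup: 1000
    command:
    - /bin/sh
    args:
    - -c
    - "mkdir -p /data/plugins && ls -ld /data/plugins"
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: repro-pvc
```

If the behavior described here applies, the container log should show the same `Permission denied` message as above; otherwise the `ls -ld` output shows the new directory.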

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 35 (18 by maintainers)

Most upvoted comments

I will close this for now. If I have time, I will try to reproduce this strange behavior later.

IMHO, changing the behavior (whether it's a bug fix or not) without notifying users is bad. We should document this as a breaking change for existing applications, as a caution to users. As a storage admin, you do not have control over all the apps running in the cluster, so this one change can disrupt the whole upgrade. We have the following options:

  • Bring back the old behavior with some flag so that users can update to the latest version, which might have a few bug fixes or new features
    • Remove this flag in some next release(s) with notification about the behavioral change
  • Don't update cephcsi to 3.7.x until the required security context has been added to all existing workloads and the change has been communicated to all application users.

@travisn thoughts?

Will try later. I do not want to disturb my team during work hours. Could you try on 1.8.9 and tell me whether it works both ways or only with the securityContext?

I think it's a change in cephcsi 3.6.0, which is present in 1.9 but not in 1.8. I can test it and confirm tomorrow.

Something definitely changed. Before Rook 1.9, I was able to create a fresh cephblock PVC and use it as an existingClaim in a deployment, and the volume appeared to be chowned to the UID/GID of the running container.

Now with rook 1.9 I have to be explicit and set a pod security context like below for the application to be able to access the existingClaim. The UID and GID are the default user/group running the container.

    securityContext:
      runAsUser: 568
      runAsGroup: 568
      fsGroup: 568
      fsGroupChangePolicy: "OnRootMismatch"

After the pod has started and run once, you can remove the securityContext and everything still runs normally.
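For the StatefulSet in the original report, the same pod-level securityContext would go under `spec.template.spec`, so that it covers both the initContainer and the main container. This is a sketch only. UID/GID 568 is copied from the comment above and must be replaced with whatever user the image actually runs as. The StatefulSet name, labels, service name, image references, and the main container's mount path are hypothetical.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: grafana                 # hypothetical name
spec:
  serviceName: grafana          # hypothetical headless service
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      # Pod-level context: fsGroup asks the kubelet to make the volume
      # group-accessible to GID 568 before any container starts.
      securityContext:
        runAsUser: 568          # assumption: replace with the image's UID
        runAsGroup: 568
        fsGroup: 568
        fsGroupChangePolicy: "OnRootMismatch"
      initContainers:
      - name: plugins-extractor
        image: registry.example/dashboards:tag   # placeholder image
        command:
        - /bin/sh
        args:
        - -c
        - "mkdir -p /grafana/plugins && cp -avr /var/lib/grafana/plugins/. /grafana/plugins"
        volumeMounts:
        - name: var-lib-grafana
          mountPath: /grafana
      containers:
      - name: grafana
        image: registry.example/grafana:tag      # placeholder image
        volumeMounts:
        - name: var-lib-grafana
          mountPath: /var/lib/grafana            # assumption: not shown in the report
  volumeClaimTemplates:
  - metadata:
      name: var-lib-grafana
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: core-rook-block
      resources:
        requests:
          storage: 1Gi
```

With `OnRootMismatch`, the recursive ownership change is skipped when the volume root already has the expected owner and permissions. That would fit the observation that the securityContext can be removed after the first successful start, since the ownership set on the volume persists.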