rook: Container mounting CephFS with fsGroup stuck in ContainerCreating

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: Starting the container with this definition:

apiVersion: v1
kind: Pod
metadata:
  name: test3
  namespace: jupyterlab
spec:
  containers:
  - args:
    - "sleep"
    - "1000000000"
    image: busybox
    name: notebook
    volumeMounts:
    - mountPath: /shared
      name: jupyter
  securityContext:
    fsGroup: 100
  volumes:
  - flexVolume:
      driver: ceph.rook.io/rook
      fsType: ceph
      options:
        clusterNamespace: rook
        fsName: nautilusfs
        path: /jupyter
    name: jupyter

The container stays in “ContainerCreating” for a long time and never starts.

Logs from rook-ceph-agent on the node:

2018-11-02 19:22:24.659207 I | flexdriver: mounting ceph filesystem nautilusfs on /var/lib/kubelet/pods/a0a805aa-ded4-11e8-ac6e-0cc47a6be994/volumes/ceph.rook.io~rook/jupyter
2018-11-02 19:22:24.743989 I | cephmon: parsing mon endpoints: rook-ceph-mon68=10.111.179.96:6790,rook-ceph-mon62=10.110.34.138:6790,rook-ceph-mon70=10.105.80.195:6790,rook-ceph-mon72=10.105.71.196:6790,rook-ceph-mon75=10.100.58.15:6790
2018-11-02 19:22:24.744058 I | op-mon: loaded: maxMonID=75, mons=map[rook-ceph-mon62:0xc420216180 rook-ceph-mon70:0xc420216220 rook-ceph-mon72:0xc420216340 rook-ceph-mon75:0xc4202163e0 rook-ceph-mon68:0xc420216100], mapping=&{Node:map[rook-ceph-mon68:0xc4205d2690 rook-ceph-mon70:0xc4205d26c0 rook-ceph-mon72:0xc4205d26f0 rook-ceph-mon75:0xc4205d2720 rook-ceph-mon62:0xc4205d2540] Port:map[]}
2018-11-02 19:22:24.745863 I | flexdriver: mounting ceph filesystem nautilusfs on 10.111.179.96:6790,10.110.34.138:6790,10.105.80.195:6790,10.105.71.196:6790,10.100.58.15:6790:/jupyter to /var/lib/kubelet/pods/a0a805aa-ded4-11e8-ac6e-0cc47a6be994/volumes/ceph.rook.io~rook/jupyter
2018-11-02 19:22:25.351412 I | flexdriver:
2018-11-02 19:22:25.351541 I | flexdriver: ceph filesystem nautilusfs has been attached and mounted

Once I comment out the fsGroup line, everything works fine.

What’s weird is that the issue is intermittent: sometimes it occurs, and other times everything works fine.

Expected behavior: Containers can mount cephFS with fsGroup defined.

How to reproduce it (minimal and precise): See above

Environment:

  • OS (e.g. from /etc/os-release): CentOS 7.5
  • Kernel (e.g. uname -a): 4.15.15
  • Cloud provider or hardware configuration: baremetal
  • Rook version (use rook version inside of a Rook Pod): 0.8.2
  • Kubernetes version (use kubectl version): 1.11.3
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): kubeadm
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): healthy (some S3 issues)

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 33 (28 by maintainers)

Most upvoted comments

I’ve already opened one, see link above your message

Fixed in CSI.

Still hitting this; I'm manually chowning all new volumes for new users. All new Jupyter notebooks are broken.
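
The manual workaround mentioned above could be sketched as a Pod spec. This is only a sketch under stated assumptions, not a fix confirmed by the Rook maintainers: it assumes that removing fsGroup avoids the hang (as the report says) and that the only purpose of fsGroup: 100 was to make /shared writable by GID 100. The init container name fix-shared-perms is hypothetical, and the ownership change is deliberately limited to the top-level directory rather than the whole tree.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test3
  namespace: jupyterlab
spec:
  # No pod-level securityContext.fsGroup here: the report says the pod
  # starts normally once that line is removed.
  initContainers:
  # Hypothetical helper container (name is an assumption): sets the group
  # and setgid bit on the mount point only, non-recursively.
  - name: fix-shared-perms
    image: busybox
    command: ["sh", "-c", "chgrp 100 /shared && chmod 2775 /shared"]
    volumeMounts:
    - mountPath: /shared
      name: jupyter
  containers:
  - args:
    - "sleep"
    - "1000000000"
    image: busybox
    name: notebook
    volumeMounts:
    - mountPath: /shared
      name: jupyter
  volumes:
  # Volume definition copied unchanged from the original report.
  - flexVolume:
      driver: ceph.rook.io/rook
      fsType: ceph
      options:
        clusterNamespace: rook
        fsName: nautilusfs
        path: /jupyter
    name: jupyter
```

Because the init container touches only /shared itself, files that already exist below it keep their current ownership; the setgid bit means new files created in that directory inherit GID 100.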

How can we move forward? Should somebody talk to the Kubernetes developers? Can any changes in Rook solve this?