ceph-csi: [csi-cephfs] Plugin Pod dies

First of all, thanks for the csi-cephfs plugin implementations.

I have been using csi-cephfs version 0.3.0 in our production Kubernetes 1.11.5-gke.5 cluster for about a month. Most of the time the plugin works great, except that the plugin Pod occasionally dies and is re-created by the DaemonSet. When this happens, I have to re-deploy the Pods and PVCs that use CephFS, because the Pods get a Transport endpoint is not connected error (see #91). This has happened 3-4 times so far.
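For anyone hitting the same symptom: a stale FUSE mount typically makes stat(2) fail with ENOTCONN, so a quick node-level check can confirm which mount paths are broken before re-deploying. This is just a sketch; the helper name `check_mount` and the path you pass it are illustrative, not part of ceph-csi.

```shell
#!/bin/sh
# Hypothetical helper: report whether a mount path is still usable.
# When the csi-cephfs plugin Pod dies, its FUSE mounts go stale and
# stat(2) on them fails with "Transport endpoint is not connected".
check_mount() {
    if stat "$1" >/dev/null 2>&1; then
        echo "healthy: $1"
    else
        echo "stale: $1"
    fi
}

# Example: check an ordinary directory (a real run would target the
# kubelet's per-volume mount paths on the node).
check_mount /tmp
```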

Is there any way to mitigate this issue?

Has anyone considered storing the plugin’s state (including mount paths) on a node-local volume? That way, the plugin Pod might be able to restore its previous state when it is re-created.
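As a rough illustration of that idea, the DaemonSet could mount a hostPath directory into the plugin container so state written there survives Pod restarts. This is only a hedged sketch of the proposal, not how ceph-csi actually does it; the path `/var/lib/csi-cephfs-state` and volume name are made up.

```yaml
# Sketch: fragment of a DaemonSet Pod spec persisting plugin state
# on the node, so a re-created Pod can find its previous state.
spec:
  containers:
    - name: csi-cephfsplugin
      volumeMounts:
        - name: plugin-state          # hypothetical volume name
          mountPath: /var/lib/csi-cephfs-state
  volumes:
    - name: plugin-state
      hostPath:
        path: /var/lib/csi-cephfs-state   # survives Pod restarts
        type: DirectoryOrCreate
```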

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 24 (10 by maintainers)

Most upvoted comments

Thanks @huaizong for #282. The PR will help us recover when the CSI plugin Pod unexpectedly exits. Does anyone know when this will be released?

The PR has already been merged into the csi-v1.0 and master branches.

@gman0 @Madhu-1 maybe we can release a minor version, e.g. v1.0.1?

The v1.0.0 image in quay should already be built with this fix. Every PR merge refreshes the image, so it should include the fix.

In the near future we should move to a versioned release scheme for minor updates to the 1.0 branch.