katib: ReadWriteMany unsupported for PVCs on GCP

/kind bug

What steps did you take and what happened: Creating tfevent-pvc fails on GCP with this error: Failed to provision volume with StorageClass "standard": invalid AccessModes [ReadWriteMany]: only AccessModes [ReadWriteOnce ReadOnlyMany] are supported
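The rejection can be reproduced offline with a minimal sketch. It assumes the claim is named tfevent-pvc and uses the "standard" StorageClass, as in the error above; the 10Gi size and the helper names build_pvc and rejected_modes are hypothetical and not taken from Katib. The set of supported modes is copied from the error message.

```python
# Access modes the provisioner accepted, copied from the error message above.
SUPPORTED_BY_GCE_PD = {"ReadWriteOnce", "ReadOnlyMany"}


def build_pvc(name, access_modes, storage="10Gi", storage_class="standard"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    The storage size is a hypothetical placeholder, not a Katib default.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": list(access_modes),
            "resources": {"requests": {"storage": storage}},
        },
    }


def rejected_modes(pvc, supported):
    """List the requested access modes that the backend does not support."""
    return [m for m in pvc["spec"]["accessModes"] if m not in supported]


failing = build_pvc("tfevent-pvc", ["ReadWriteMany"])
working = build_pvc("tfevent-pvc", ["ReadWriteOnce"])

print(rejected_modes(failing, SUPPORTED_BY_GCE_PD))  # the mode GCP refuses
print(rejected_modes(working, SUPPORTED_BY_GCE_PD))  # nothing refused
```

Switching the claim to ReadWriteOnce only avoids the provisioning error; it does not by itself let pods on different nodes share the volume.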

What did you expect to happen: The PVC to be created without errors.

Anything else you would like to add: Since C2D doesn’t support v0.7.0 yet, this version (v1alpha2) should work on GCP without failures.

Environment:

  • Kubeflow version: v0.6.2 (latest in C2D)
  • Minikube version: n/a
  • Kubernetes version (use kubectl version): client v1.13.11, server v1.12.10-gke.17
  • OS (e.g. from /etc/os-release): macOS v10.14.6

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

This is a common problem across cloud providers and storage backends in general. Several heavily used backends (GCP PD, AWS EBS, Ceph RBD) do not support ReadWriteMany because they do not allow one block device to be attached read-write to multiple nodes. If you scale up and the scheduler spreads the pods across nodes, containers on the other nodes won’t start because the volume cannot be attached there.
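The scheduling failure described in this comment can be illustrated with a toy model. This is a simplified sketch, not how the Kubernetes attach logic is implemented: it assumes a ReadWriteOnce volume ends up attached to the node of the first pod, and the pod and node names are hypothetical.

```python
def pods_that_can_start(pod_nodes, access_mode):
    """Toy model: return the pods able to mount one shared volume.

    pod_nodes maps pod name -> node name. With ReadWriteOnce the volume is
    attached to a single node (assumed here to be the first pod's node), so
    only pods on that node can mount it. With ReadWriteMany every pod can.
    """
    if access_mode == "ReadWriteMany":
        return sorted(pod_nodes)
    if access_mode == "ReadWriteOnce":
        if not pod_nodes:
            return []
        first_node = next(iter(pod_nodes.values()))
        return sorted(p for p, n in pod_nodes.items() if n == first_node)
    raise ValueError(f"unhandled access mode: {access_mode}")


# Hypothetical placement after the scheduler spreads pods over two nodes.
placement = {"trial-a": "node-1", "trial-b": "node-2", "trial-c": "node-1"}

print(pods_that_can_start(placement, "ReadWriteOnce"))
print(pods_that_can_start(placement, "ReadWriteMany"))
```

In this model the pod on node-2 is the one left waiting under ReadWriteOnce, which matches the symptom of containers that never start once the workload spans more than one node.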

It might make sense to note in the example or in the Katib documentation that a storage backend supporting ReadWriteMany (e.g. NFS) is required, or to consider a different option such as object storage.