gcp-compute-persistent-disk-csi-driver: CSINode creation fails because spec.drivers[*].nodeID can exceed the maximum allowed length of 128 characters
For GCP projects and nodes with long names, the nodeID can become longer than 128 characters. When this happens, the CSINode object cannot be created because of this validation: https://github.com/kubernetes/kubernetes/blob/d822b8b230bce0178400e8949ff3f45423d315d0/pkg/apis/storage/validation/validation.go#L345-L347. As a result, no volume managed by the CSI driver can be attached and used on this node.
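The failing check is a plain length comparison. Below is a minimal sketch of it, assuming the 128-character limit quoted in the error message. The names `maxNodeIDLength` and `validateNodeID` are illustrative and are not the actual Kubernetes identifiers.

```go
package main

import "fmt"

// maxNodeIDLength mirrors the limit quoted in the error message.
// The name is illustrative, not the upstream identifier.
const maxNodeIDLength = 128

// validateNodeID is a simplified stand-in for the CSINode nodeID length check.
func validateNodeID(nodeID string) error {
	if len(nodeID) > maxNodeIDLength {
		return fmt.Errorf("must be %d characters or less", maxNodeIDLength)
	}
	return nil
}

func main() {
	// nodeID copied from the registrar log in this report.
	nodeID := "projects/012-34-56789abcdefghij-klmnopq/zones/us-central1-a/instances/shoot--12345678--123-56789ab-cpu-worker-z1-7c4f48599f-q6vbk"
	fmt.Println(len(nodeID), validateNodeID(nodeID))
}
```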
In my opinion, the problem is the nodeID format used by the driver (https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/d7d9f2e35d897c00a860963a40a000bf059902df/pkg/common/utils.go#L44), which contains more information than is probably needed.
According to the spec:
// The identifier of the node as understood by the SP.
// This field is REQUIRED.
// This field MUST contain enough information to uniquely identify
// this specific node vs all other nodes supported by this plugin.
// This field SHALL be used by the CO in subsequent calls, including
// `ControllerPublishVolume`, to refer to this node.
// The SP is NOT responsible for global uniqueness of node_id across
// multiple SPs.
My understanding of the spec is that the CSI driver does not have to care about the global uniqueness of the node, so the node name alone should be enough, shouldn't it? Or at least the fixed segments projects, zones, and instances could be removed from the nodeID, which would free up 25 characters.
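To make the 25-character saving concrete, here is a small sketch comparing the two formats. `currentFmt` is inferred from the nodeID visible in the registrar log, and `compactFmt` is a hypothetical shortened format that only illustrates the proposal. The project, zone, and instance values are made up.

```go
package main

import "fmt"

const (
	// currentFmt matches the shape of the nodeID seen in the failing registration.
	currentFmt = "projects/%s/zones/%s/instances/%s"
	// compactFmt is a hypothetical format without the fixed segment names.
	compactFmt = "%s/%s/%s"
)

// nodeID renders a node identifier with the given format.
func nodeID(format, project, zone, instance string) string {
	return fmt.Sprintf(format, project, zone, instance)
}

func main() {
	// Made-up example values.
	project, zone, instance := "my-project", "us-central1-a", "my-node"

	long := nodeID(currentFmt, project, zone, instance)
	short := nodeID(compactFmt, project, zone, instance)

	fmt.Println(long, len(long))
	fmt.Println(short, len(short))
	fmt.Println("characters saved:", len(long)-len(short))
}
```

The saving does not depend on the input values: the current format has 27 fixed characters and the compact one has 2.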
I quickly checked the name length limits:
- The maximum allowed project ID length is 30 (ref: https://cloud.google.com/resource-manager/docs/creating-managing-projects).
- The instance name should be no more than 63 characters (ref: https://cloud.google.com/compute/docs/naming-resources).
- The zone name is managed by Google, but currently the longest one is northamerica-northeast1-a with 25 characters.
With the two / separators of the shortened format, the worst-case total length is 30 + 25 + 63 + 2 = 120 characters, which is less than the allowed 128.
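The worst-case arithmetic can be checked with a short sketch. It assumes the limits listed above (30 for the project ID, 63 for the instance name, and the longest current zone name). The constant and function names are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	maxProjectIDLen = 30                          // project ID limit listed above
	maxInstanceLen  = 63                          // instance name limit listed above
	longestZone     = "northamerica-northeast1-a" // longest zone name at the time of writing
	maxNodeIDLen    = 128                         // CSINode nodeID limit
)

// worstCaseLen returns the nodeID length for maximum-length inputs.
func worstCaseLen(format string) int {
	project := strings.Repeat("p", maxProjectIDLen)
	instance := strings.Repeat("i", maxInstanceLen)
	return len(fmt.Sprintf(format, project, longestZone, instance))
}

func main() {
	for _, format := range []string{
		"projects/%s/zones/%s/instances/%s", // current format
		"%s/%s/%s",                          // hypothetical shortened format
	} {
		n := worstCaseLen(format)
		fmt.Printf("%-36s worst case %d, fits: %t\n", format, n, n <= maxNodeIDLen)
	}
}
```

Under these assumptions the shortened format always fits, while the current format can reach 145 characters. A longer zone name introduced later would reduce the remaining margin of 8 characters.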
Here are sample logs from the csi-node-driver-registrar of a k8s cluster on GCP (managed by Gardener):
I0505 13:45:10.412325 1 main.go:110] Version: v1.3.0-0-g6e9fff3e
I0505 13:45:10.412405 1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0505 13:45:10.412427 1 connection.go:151] Connecting to unix:///csi/csi.sock
I0505 13:45:10.412865 1 main.go:127] Calling CSI driver to discover driver name
I0505 13:45:10.412881 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0505 13:45:10.412886 1 connection.go:181] GRPC request: {}
I0505 13:45:10.414462 1 connection.go:183] GRPC response: {"name":"pd.csi.storage.gke.io","vendor_version":"v0.7.0-gke.0"}
I0505 13:45:10.414820 1 connection.go:184] GRPC error: <nil>
I0505 13:45:10.414827 1 main.go:137] CSI driver name: "pd.csi.storage.gke.io"
I0505 13:45:10.414915 1 node_register.go:51] Starting Registration Server at: /registration/pd.csi.storage.gke.io-reg.sock
I0505 13:45:10.415103 1 node_register.go:60] Registration Server started at: /registration/pd.csi.storage.gke.io-reg.sock
I0505 13:45:10.870866 1 main.go:77] Received GetInfo call: &InfoRequest{}
I0505 13:45:11.870960 1 main.go:77] Received GetInfo call: &InfoRequest{}
I0505 13:45:13.585060 1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: error updating CSINode object with CSI driver node info: error updating CSINode: timed out waiting for the condition; caused by: CSINode.storage.k8s.io "shoot--12345678--123-56789ab-cpu-worker-z1-7c4f48599f-q6vbk" is invalid: spec.drivers[0].nodeID: Invalid value: "projects/012-34-56789abcdefghij-klmnopq/zones/us-central1-a/instances/shoot--12345678--123-56789ab-cpu-worker-z1-7c4f48599f-q6vbk": must be 128 characters or less,}
E0505 13:45:13.585115 1 main.go:89] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: error updating CSINode object with CSI driver node info: error projects/012-34-56789abcdefghij-klmnopq/zones/us-central1-a/instances/shoot--12345678--123-56789ab-cpu-worker-z1-7c4f48599f-q6vbk": must be 128 characters or less, restarting registration container.
/kind bug
About this issue
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 24 (14 by maintainers)
Agreed, we should definitely discuss this in the next CSI meeting @saad-ali. No changes to sidecars are anticipated.