kubernetes: CSI block: Why is NodePublishVolume called in SetUpDevice?
What happened: I’m doing a retroactive review of the CSI block implementation and I have a few questions:
Why are there 4 different paths that point to the same block device? I see:
* global map path
* staging target path
* publish target path
* pod path
This seems like an unnecessary amount of redirection. Ideally, global map path == staging target path, and publish target path == pod path.
Along the same lines, I see that publish target path is not a per-pod path, and that NodeStageVolume and NodePublishVolume are called in the same SetUpDevice() function. This makes the NodePublish call effectively useless, because it’s just bind mounting the volume to another global location, which is what NodeStage already provides. This is also different from filesystem semantics, where NodeStage/Unstage is serialized per volume, and NodePublish/Unpublish is serialized per pod.
What you expected to happen: Can we simplify this and try to align with how the filesystem semantics work? The more intermediate steps we have, the more likely one of them is going to fail. Also each additional mount consumes kernel resources, and will make operations that need to list all mounts slower.
@kubernetes/sig-storage-bugs /assign @mkimuram @vladimirvivien @bswartz
About this issue
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 24 (21 by maintainers)
@bswartz @mkimuram @wongma7 and I discussed this in more detail today. Summary:
globalMapPath and podPath are needed for Kubernetes to refcnt the number of pods using the device to gate the unstage call. The path returned by nodeStageVolume is not actually a device, it’s a directory where a plugin keeps information about the device.
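To make the refcounting argument concrete, here is a minimal Go sketch of the gating logic only. It is not kubelet code: the summary above says Kubernetes tracks the count through globalMapPath and podPath, whereas this sketch assumes an in-memory set, and `volumeRefs`, `addPod`, `removePod`, and the volume and pod names are all hypothetical.

```go
package main

import "fmt"

// volumeRefs is a hypothetical in-memory stand-in for the per-volume
// bookkeeping described above. It is not a kubelet type.
type volumeRefs struct {
	pods map[string]map[string]struct{} // volume ID -> set of pod UIDs
}

func newVolumeRefs() *volumeRefs {
	return &volumeRefs{pods: make(map[string]map[string]struct{})}
}

// addPod records that pod uses volume. It reports whether this is the
// first user of the volume, i.e. whether a stage call would be needed.
func (r *volumeRefs) addPod(volume, pod string) bool {
	set, ok := r.pods[volume]
	if !ok {
		set = make(map[string]struct{})
		r.pods[volume] = set
	}
	first := len(set) == 0
	set[pod] = struct{}{}
	return first
}

// removePod drops pod from volume. It reports whether no users remain,
// i.e. whether the unstage call is allowed to proceed.
func (r *volumeRefs) removePod(volume, pod string) bool {
	set, ok := r.pods[volume]
	if !ok {
		return true // unknown volume: nothing is holding it
	}
	delete(set, pod)
	if len(set) == 0 {
		delete(r.pods, volume)
		return true
	}
	return false
}

func main() {
	refs := newVolumeRefs()
	fmt.Println(refs.addPod("vol-1", "pod-a"))    // true: first user, stage
	fmt.Println(refs.addPod("vol-1", "pod-b"))    // false: already staged
	fmt.Println(refs.removePod("vol-1", "pod-a")) // false: pod-b still uses it
	fmt.Println(refs.removePod("vol-1", "pod-b")) // true: last user, unstage
}
```

The point of the sketch is that unstage is gated on the last pod going away, which is why some per-pod record has to exist in addition to the per-volume one.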