external-provisioner: Pod is not created under selected zone of Volume
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened: I am testing dynamic provisioning with the EBS CSI driver using delayed binding. Most of the time the pod is created in the same zone as the volume, but in one run the pod failed to start because it was scheduled into a different zone than the volume's.
What you expected to happen: With volume scheduling enabled, the volume and the pod should always land in the same topology domain.
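For reference, with topology-aware provisioning the PV created by the external-provisioner is expected to carry node affinity for the zone the volume landed in, so the scheduler cannot place the pod elsewhere. A sketch of the expected shape (zone value taken from the logs below; field names are how CSI accessible topology is translated in Kubernetes 1.12, shown for illustration):

```yaml
# Illustrative fragment of the expected PV spec — not taken from the
# actual PV in this cluster.
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: com.amazon.aws.csi.ebs/zone
              operator: In
              values:
                - us-east-1a
```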
How to reproduce it (as minimally and precisely as possible): Non-deterministic so far
Anything else we need to know?: Provisioner log:
I1015 20:30:34.630580 1 controller.go:991] provision "default/late-claim" class "late-sc": started
I1015 20:30:34.643414 1 controller.go:121] GRPC call: /csi.v0.Identity/GetPluginCapabilities
I1015 20:30:34.643430 1 controller.go:122] GRPC request:
I1015 20:30:34.643634 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"late-claim", UID:"1d44c823-d0b9-11e8-81f1-0a75e9a76798", APIVersion:"v1", ResourceVersion:"1694", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/late-claim"
I1015 20:30:34.644379 1 controller.go:124] GRPC response: capabilities:<service:<type:CONTROLLER_SERVICE > > capabilities:<service:<type:ACCESSIBILITY_CONSTRAINTS > >
I1015 20:30:34.644443 1 controller.go:125] GRPC error: <nil>
I1015 20:30:34.644453 1 controller.go:121] GRPC call: /csi.v0.Controller/ControllerGetCapabilities
I1015 20:30:34.644459 1 controller.go:122] GRPC request:
I1015 20:30:34.645083 1 controller.go:124] GRPC response: capabilities:<rpc:<type:CREATE_DELETE_VOLUME > > capabilities:<rpc:<type:PUBLISH_UNPUBLISH_VOLUME > >
I1015 20:30:34.645139 1 controller.go:125] GRPC error: <nil>
I1015 20:30:34.645151 1 controller.go:121] GRPC call: /csi.v0.Identity/GetPluginInfo
I1015 20:30:34.645190 1 controller.go:122] GRPC request:
I1015 20:30:34.645621 1 controller.go:124] GRPC response: name:"com.amazon.aws.csi.ebs" vendor_version:"0.0.1"
I1015 20:30:34.645658 1 controller.go:125] GRPC error: <nil>
I1015 20:30:34.661737 1 controller.go:428] CreateVolumeRequest {Name:pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798 CapacityRange:required_bytes:4294967296 VolumeCapabilities:[mount:<> access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[] ControllerCreateSecrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1b" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > > XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1015 20:30:34.661862 1 controller.go:121] GRPC call: /csi.v0.Controller/CreateVolume
I1015 20:30:34.661868 1 controller.go:122] GRPC request: name:"pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" capacity_range:<required_bytes:4294967296 > volume_capabilities:<mount:<> access_mode:<mode:SINGLE_NODE_WRITER > > accessibility_requirements:<requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1b" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > > >
I1015 20:30:34.841760 1 leaderelection.go:227] successfully renewed lease default/com.amazon.aws.csi.ebs
I1015 20:30:35.114422 1 controller.go:124] GRPC response: volume:<capacity_bytes:4294967296 id:"vol-0c696d140008a61a8" accessible_topology:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > >
I1015 20:30:35.114527 1 controller.go:125] GRPC error: <nil>
I1015 20:30:35.114540 1 controller.go:484] create volume rep: {CapacityBytes:4294967296 Id:vol-0c696d140008a61a8 Attributes:map[] ContentSource:<nil> AccessibleTopology:[segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1015 20:30:35.114631 1 controller.go:546] successfully created PV {GCEPersistentDisk:nil AWSElasticBlockStore:nil HostPath:nil Glusterfs:nil NFS:nil RBD:nil ISCSI:nil Cinder:nil CephFS:nil FC:nil Flocker:nil FlexVolume:nil AzureFile:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil PortworxVolume:nil ScaleIO:nil Local:nil StorageOS:nil CSI:&CSIPersistentVolumeSource{Driver:com.amazon.aws.csi.ebs,VolumeHandle:vol-0c696d140008a61a8,ReadOnly:false,FSType:ext4,VolumeAttributes:map[string]string{storage.kubernetes.io/csiProvisionerIdentity: 1539635296092-8081-com.amazon.aws.csi.ebs,},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,}}
I1015 20:30:35.114740 1 controller.go:1091] provision "default/late-claim" class "late-sc": volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" provisioned
I1015 20:30:35.114760 1 controller.go:1105] provision "default/late-claim" class "late-sc": trying to save persistentvvolume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798"
I1015 20:30:35.134870 1 controller.go:1112] provision "default/late-claim" class "late-sc": persistentvolume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" saved
I1015 20:30:35.134930 1 controller.go:1153] provision "default/late-claim" class "late-sc": succeeded
I1015 20:30:35.135246 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"late-claim", UID:"1d44c823-d0b9-11e8-81f1-0a75e9a76798", APIVersion:"v1", ResourceVersion:"1694", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798
EBS Driver Log:
I1015 20:28:16.294223 1 driver.go:52] Driver: com.amazon.aws.csi.ebs
I1015 20:28:16.294360 1 mount_linux.go:199] Detected OS without systemd
I1015 20:28:16.294928 1 driver.go:107] Listening for connections on address: &net.UnixAddr{Name:"/var/lib/csi/sockets/pluginproxy/csi.sock", Net:"unix"}
I1015 20:30:34.644708 1 controller.go:175] ControllerGetCapabilities: called with args &csi.ControllerGetCapabilitiesRequest{XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
I1015 20:30:34.662445 1 controller.go:31] CreateVolume: called with args &csi.CreateVolumeRequest{Name:"pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798", CapacityRange:(*csi.CapacityRange)(0xc0001ab560), VolumeCapabilities:[]*csi.VolumeCapability{(*csi.VolumeCapability)(0xc0001b0b80)}, Parameters:map[string]string(nil), ControllerCreateSecrets:map[string]string(nil), VolumeContentSource:(*csi.VolumeContentSource)(nil), AccessibilityRequirements:(*csi.TopologyRequirement)(0xc0001d78b0), XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
POD event:
Name:               app
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               ip-172-20-127-156.ec2.internal/172.20.127.156
Start Time:         Mon, 15 Oct 2018 13:30:35 -0700
Labels:             <none>
Annotations:        kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container app
Status:             Pending
IP:
Containers:
  app:
    Container ID:
    Image:          centos
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Command:
      /bin/sh
    Args:
      -c
      while true; do echo $(date -u) >> /data/out.txt; sleep 5; done
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:          100m
    Environment:    <none>
    Mounts:
      /data from persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rw8jc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  persistent-storage:
    Type:        PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:   late-claim
    ReadOnly:    false
  default-token-rw8jc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rw8jc
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18s default-scheduler Successfully assigned default/app to ip-172-20-127-156.ec2.internal
Warning FailedAttachVolume 17s attachdetach-controller AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
status code: 400, request id: 33c4a9cc-37d1-4e78-b37b-c9df81f659f9
Warning FailedAttachVolume 17s attachdetach-controller AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
status code: 400, request id: 8baabab8-e0e1-4063-9107-ea86cb7c9fda
Warning FailedAttachVolume 16s attachdetach-controller AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
status code: 400, request id: fff0faa0-df0e-4ad8-af27-0483267b09f7
Warning FailedAttachVolume 14s attachdetach-controller AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
status code: 400, request id: 2b03cea9-1ccb-4f65-91f8-bca33dab29f1
Warning FailedAttachVolume 10s attachdetach-controller AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
status code: 400, request id: 8b1129ab-1289-493a-a02b-981aa9d9478f
Warning FailedAttachVolume 2s attachdetach-controller AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
status code: 400, request id: 3a1f317d-8240-4f16-99ff-12982b1d673c
>> cat late-bind-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: late-sc
provisioner: com.amazon.aws.csi.ebs
volumeBindingMode: WaitForFirstConsumer
>> cat late-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: late-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: late-sc
  resources:
    requests:
      storage: 4Gi
>> cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: late-claim
Environment:
- Kubernetes version (use kubectl version): client v1.12.0, server v1.12.1
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools: cluster is set up using kops
- Others:
  - external-registrar: v0.4.0
  - external-provisioner: v0.4.0
  - external-attacher: v0.4.0
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 21 (7 by maintainers)
After upgrading provisioner/attacher/registrar to v0.4.1, I can see preferred topology too.
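For anyone hitting the same mismatch, the fix reported above was bumping the sidecar images to v0.4.1. A minimal sketch of the image bumps, assuming the stock quay.io/k8scsi sidecar images and that the sidecars run as containers in the driver's StatefulSet/DaemonSet (container names are illustrative; adjust to your own manifest):

```yaml
# Illustrative only — container names and registry are assumptions,
# only the v0.4.1 version comes from the report above.
containers:
  - name: csi-provisioner
    image: quay.io/k8scsi/csi-provisioner:v0.4.1
  - name: csi-attacher
    image: quay.io/k8scsi/csi-attacher:v0.4.1
  - name: driver-registrar
    image: quay.io/k8scsi/driver-registrar:v0.4.1
```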