kubernetes: VerifyVolumesAreAttached is failing and looping on remounts for openstack cinder
Is this a request for help?: Not necessarily, things appear to be working
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): #28962 looks similar
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T07:31:07Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration: OpenStack with CinderV1 API
- OS (e.g. from /etc/os-release): Container Linux by CoreOS 1235.6.0
- Kernel (e.g. uname -a): Linux kube-master-02.openstacklocal 4.7.3-coreos-r2 #1 SMP Sun Jan 8 00:32:25 UTC 2017 x86_64 Intel Xeon E312xx (Sandy Bridge) GenuineIntel GNU/Linux
- Install tools: Kargo
- Others:
What happened: I was following the StatefulSet ZooKeeper tutorial. The pods booted up and appear stable, but the UI shows errors like: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"zk-0". list of unattached/unmounted volumes=[datadir]. This led me to check the hyperkube controller-manager's logs, where I see a continuous stream of attach/detach messages for the ZooKeeper volumes:
I0115 21:46:51.541418 1 node_status_updater.go:135] Updating status for node "kube-node-04" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/vdb\",\"name\":\"kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb\"}]}}" VolumesAttached: [{kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb /dev/vdb}]
I0115 21:46:55.566696 1 attacher.go:140] VolumesAreAttached: check volume "efeddaeb-ed27-4e92-9733-f46251cee3cb" (specName: "pvc-dc5e1785-dafe-11e6-b025-fa163e0158f7") is no longer attached
I0115 21:46:55.566758 1 operation_executor.go:565] VerifyVolumesAreAttached determined volume "kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb" (spec.Name: "pvc-dc5e1785-dafe-11e6-b025-fa163e0158f7") is no longer attached to node %!q(MISSING), therefore it was marked as detached.
I0115 21:46:55.610216 1 reconciler.go:213] Started AttachVolume for volume "kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb" to node "kube-node-04"
I0115 21:46:56.180376 1 attacher.go:92] Attach operation is successful. volume "efeddaeb-ed27-4e92-9733-f46251cee3cb" is already attached to node "a1c515d3-2f79-4067-a7c7-e7cac564a60b".
I0115 21:46:56.508940 1 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb" (spec.Name: "pvc-dc5e1785-dafe-11e6-b025-fa163e0158f7") from node "kube-node-04".
I0115 21:46:56.721459 1 node_status_updater.go:135] Updating status for node "kube-node-04" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/vdb\",\"name\":\"kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb\"}]}}" VolumesAttached: [{kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb /dev/vdb}]
I0115 21:47:00.617475 1 attacher.go:140] VolumesAreAttached: check volume "efeddaeb-ed27-4e92-9733-f46251cee3cb" (specName: "pvc-dc5e1785-dafe-11e6-b025-fa163e0158f7") is no longer attached
I0115 21:47:00.617563 1 operation_executor.go:565] VerifyVolumesAreAttached determined volume "kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb" (spec.Name: "pvc-dc5e1785-dafe-11e6-b025-fa163e0158f7") is no longer attached to node %!q(MISSING), therefore it was marked as detached.
I0115 21:47:00.835167 1 reconciler.go:213] Started AttachVolume for volume "kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb" to node "kube-node-04"
I0115 21:47:01.457503 1 attacher.go:92] Attach operation is successful. volume "efeddaeb-ed27-4e92-9733-f46251cee3cb" is already attached to node "a1c515d3-2f79-4067-a7c7-e7cac564a60b".
I0115 21:47:01.743547 1 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/cinder/efeddaeb-ed27-4e92-9733-f46251cee3cb" (spec.Name: "pvc-dc5e1785-dafe-11e6-b025-fa163e0158f7") from node "kube-node-04".
What you expected to happen: Things should be quiet; the cluster should not keep retrying the attach. Although it appears to be a no-op, it is noisy and I worry that the constant API calls will add load to our OpenStack cluster.
How to reproduce it (as minimally and precisely as possible):
Try the ZooKeeper config on an OpenStack cluster. In my case, I had to create my own storage class (for example, a Ceph-backed one) and then update the YAML to reference that storage class (see the sketch after the command below):
kubectl create -f http://k8s.io/docs/tutorials/stateful-application/zookeeper.yaml on an OpenStack cluster.
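For reference, a minimal sketch of the storage-class step (not from the original report): it assumes the kubernetes.io/cinder provisioner and the 1.5-era beta storage-class annotation on the claim template; the name ceph-backed and the availability zone are illustrative.
$ cat <<EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: ceph-backed            # illustrative name
provisioner: kubernetes.io/cinder
parameters:
  availability: nova           # optional: Cinder availability zone
EOF
# Then, before creating zookeeper.yaml, point the StatefulSet's volumeClaimTemplates
# at the class, e.g. via the beta annotation on the claim template:
#   volume.beta.kubernetes.io/storage-class: ceph-backed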
Anything else we need to know:
About this issue
- State: closed
- Created 7 years ago
- Comments: 17 (10 by maintainers)
This is fixed in https://github.com/kubernetes/kubernetes/pull/39998; we need to cherry-pick it to the release-1.5 branch. @xrl if you don't mind, you can cherry-pick the commit to the release-1.5 branch and open a PR. You can run the cherry-pick script, replacing your GitHub id, branch name and PR number, roughly as sketched below.
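A rough sketch of that flow, assuming a local checkout of kubernetes/kubernetes with an upstream remote and the repo's hack/cherry_pick_pull.sh helper; substitute your own GitHub id (this is an illustration, not the comment's exact command):
$ git fetch upstream
$ GITHUB_USER=<your-github-id> hack/cherry_pick_pull.sh upstream/release-1.5 39998
# cherry-picks the merge commit of PR 39998 onto upstream/release-1.5 and opens the cherry-pick PR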
I think this is similar to what we discovered in https://github.com/kubernetes/kubernetes/pull/39551/files. By default, Kubernetes verifies that volumes are actually attached to nodes every 5 seconds. As a workaround, you can increase that interval.
But increasing the polling duration should be done with some caution (see the sketch below). cc @jingxu97 @chrislovecnm
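For illustration only (not from the original thread): the interval in question corresponds to the kube-controller-manager flag --attach-detach-reconcile-sync-period; raising it reduces the verification calls against the cloud API at the cost of slower detection of out-of-band detaches. A minimal sketch, with 1m as an example value:
# raise the attach/detach verification interval; keep all other existing
# controller-manager flags unchanged (the "..." is a placeholder for them)
$ kube-controller-manager --attach-detach-reconcile-sync-period=1m ...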