kubernetes: vSphere Cloud Provider (VCP) stopped working after update to Kubernetes 1.11.0
Is this a BUG REPORT or FEATURE REQUEST?: /kind bug /sig cloud-provider
What happened: I had VCP working perfectly on Kubernetes v1.9.6. The installation was done using kubeadm on our vSphere/vCenter environment.
Yesterday I updated from 1.9.6 -> 1.10.4 -> 1.11.0, but after the update I am not able to get VCP working again. I tried a complete reinstall but it is still not working. It is complaining about "VM Not Found" as shown below.
What you expected to happen: VCP to work as it did on v1.9.6.
How to reproduce it (as minimally and precisely as possible): Not sure; this is a production environment, so I can't do a lot of tries.
Anything else we need to know?:
Jul 7 05:08:41 localhost journal: E0707 05:08:41.853267 1 datacenter.go:78] Unable to find VM by UUID. VM UUID:
Jul 7 05:08:41 localhost journal: E0707 05:08:41.853374 1 nodemanager.go:414] Error "No VM found" node info for node "engine01" not found
Jul 7 05:08:41 localhost journal: E0707 05:08:41.853416 1 vsphere_util.go:134] Error while obtaining Kubernetes node nodeVmDetail details. error : No VM found
Jul 7 05:08:41 localhost journal: E0707 05:08:41.853444 1 vsphere.go:1160] Failed to get shared datastore: No VM found
Jul 7 05:08:41 localhost journal: I0707 05:08:41.854301 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-disk", UID:"a7c88c13-813d-11e8-aa8c-0050568166d0", APIVersion:"v1", ResourceVersion:"31538424", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' Failed to provision volume with StorageClass "thin-disk": No VM found
Jul 7 05:08:43 localhost journal: I0707 05:08:43.768259 1 reconciler.go:291] attacherDetacher.AttachVolume started for volume "pvc-a3091746-6a16-11e8-87de-0050568166d0" (UniqueName: "kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-a3091746-6a16-11e8-87de-0050568166d0.vmdk") from node "engine02"
Jul 7 05:08:43 localhost journal: E0707 05:08:43.785846 1 datacenter.go:78] Unable to find VM by UUID. VM UUID:
Jul 7 05:08:43 localhost journal: E0707 05:08:43.785913 1 nodemanager.go:282] Error "No VM found" node info for node "engine02" not found
Jul 7 05:08:43 localhost journal: E0707 05:08:43.785938 1 vsphere.go:550] Cannot find node "engine02" in cache. Node not found!!!
Jul 7 05:08:43 localhost journal: E0707 05:08:43.786012 1 attacher.go:80] Error attaching volume "[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-a3091746-6a16-11e8-87de-0050568166d0.vmdk" to node "engine02": No VM found
Jul 7 05:31:40 localhost journal: E0707 05:31:40.976785 1 datacenter.go:78] Unable to find VM by UUID. VM UUID:
Jul 7 05:31:40 localhost journal: E0707 05:31:40.976856 1 nodemanager.go:414] Error "No VM found" node info for node "engine01" not found
Jul 7 05:31:40 localhost journal: E0707 05:31:40.976883 1 vsphere_util.go:134] Error while obtaining Kubernetes node nodeVmDetail details. error : No VM found
Jul 7 05:31:40 localhost journal: E0707 05:31:40.976900 1 vsphere.go:1160] Failed to get shared datastore: No VM found
Jul 7 05:31:40 localhost journal: I0707 05:31:40.977444 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-disk", UID:"a7c88c13-813d-11e8-aa8c-0050568166d0", APIVersion:"v1", ResourceVersion:"31538424", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' Failed to provision volume with StorageClass "thin-disk": No VM found
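Note that the "VM UUID:" field in the datacenter.go:78 lines above is empty, which suggests the node objects carry no vSphere provider ID for the cloud provider to look up. A quick way to check this (a sketch; the node names are the ones appearing in the logs):

```sh
# For VCP the providerID should look like vsphere://<vm-uuid>; an empty column means
# the cloud provider has nothing to look the VM up by.
kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID
```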
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration: vsphere
- OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
- Kernel (e.g. uname -a): Linux engine03 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Others:
About this issue
- State: closed
- Created 6 years ago
- Comments: 55 (8 by maintainers)
@w-leads great news for me 👍, after updating the node info everything worked perfectly. I don't know if it is correct to set it manually the way I did.
One thing I noticed while updating Kubernetes through kubeadm is that on the master it worked perfectly without any issue, while on the nodes the kubelet did not start and I had to manually create the file /var/lib/kubelet/kubeadm-flags.env with KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
Thanks for your help
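For reference, a sketch of recreating that file on a node (the KUBELET_KUBEADM_ARGS value is the one quoted above; restarting the kubelet afterwards is an assumption about how the service is run):

```sh
# Recreate the env file that the kubeadm kubelet systemd drop-in sources (run as root).
cat > /var/lib/kubelet/kubeadm-flags.env <<'EOF'
KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
EOF
systemctl restart kubelet
```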
Hello.
After updating the node providerID, everything worked perfectly.
To update this info, I used this command:
To show the VM UUID, connect to the VM and execute:
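A minimal sketch of such commands, assuming the node is named engine01 and that the provider ID takes the form vsphere://<vm-uuid>:

```sh
# One common way to read the VM UUID from inside the guest (derived from the SMBIOS serial):
cat /sys/class/dmi/id/product_serial | sed -e 's/^VMware-//' -e 's/-/ /' | \
  awk '{ print tolower($1$2$3$4 "-" $5$6 "-" $7$8 "-" $9$10 "-" $11$12$13$14$15$16) }'

# Then record it on the node object so VCP can find the VM (engine01 and <vm-uuid> are placeholders):
kubectl patch node engine01 -p '{"spec":{"providerID":"vsphere://<vm-uuid>"}}'
```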
@divyenpatel,
I specified the configuration before running kubeadm; the config file I passed to kubeadm looked something like the configuration sketched below. I also specified the following in /etc/sysconfig/kubelet: KUBELET_EXTRA_ARGS=--cloud-provider=vsphere --cloud-config=/etc/kubernetes/pki/vsphere.conf
I ran kubeadm init on the first master node, and then used kubeadm join with the proper arguments on the remaining two master nodes and the three worker nodes (per the current 1.13 documentation on kubernetes.io). When I checked the provider IDs of all the nodes via kubectl, they all had the proper provider ID matching each machine's UUID.
Besides the workaround of manually patching each node, was there a fix in newer versions? I just tried k8s 1.11.3 and am still facing this issue (with the same vSphere settings that work perfectly on 1.10.x). I think only VMware experts like @divyenpatel can help.
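For reference, a minimal sketch of the kind of kubeadm configuration @Vislor describes above for enabling the in-tree vSphere provider on the control plane (the exact file is not shown; the field names assume the kubeadm v1beta1 ClusterConfiguration API used around 1.13, and a vsphere.conf placed under /etc/kubernetes/pki so it is already mounted into the static pods):

```sh
cat > kubeadm-config.yaml <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: vsphere
    cloud-config: /etc/kubernetes/pki/vsphere.conf
controllerManager:
  extraArgs:
    cloud-provider: vsphere
    cloud-config: /etc/kubernetes/pki/vsphere.conf
EOF
kubeadm init --config kubeadm-config.yaml
```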
Let me suggest a small fix to the above script so that it supports spaces in the FOLDER path.
@Vislor
In this case the kubelet is already running with the vsphere cloud provider enabled, and node registration happens after that, which sets the provider ID correctly.
But if you already have a Kubernetes cluster deployed with kubeadm without VCP enabled, and you later try to enable VCP, I think the nodes do not get a provider ID, and we have to patch the nodes manually, or remove the nodes from the API server and register them back by restarting the kubelet on the nodes.
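A sketch of the second option (removing a node and letting it re-register with the provider ID set), assuming engine02 is the node being fixed and that draining it is acceptable:

```sh
# Run from a machine with kubectl access:
kubectl drain engine02 --ignore-daemonsets --delete-local-data
kubectl delete node engine02

# Then on engine02 itself, with --cloud-provider=vsphere already configured for the kubelet:
systemctl restart kubelet
# On restart the kubelet re-registers the node and sets spec.providerID from the VM UUID.
```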
@dbason 1.12 looks fine to me.