oci-cloud-controller-manager: Failed to get ProviderID by nodeName - StorageClass "oci-bv"
BUG REPORT
Versions
CCM Version:
v0.9
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.5", GitCommit:"e6503f8d8f769ace2f338794c914a96fc335df0f", GitTreeState:"clean", BuildDate:"2020-06-26T03:47:41Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.12", GitCommit:"7cd5e9086de8ae25d6a1514d0c87bac67ca4a481", GitTreeState:"clean", BuildDate:"2020-11-12T09:11:15Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
- OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
- Kernel (e.g. uname -a):
Linux master01 5.4.0-1028-oracle #29~18.04.1-Ubuntu SMP Tue Oct 6 13:05:53 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
- Others:
What happened?
We are trying to implement the new CSI plugin on a self-managed Kubernetes v1.18.5 cluster, following this guide: https://github.com/oracle/oci-cloud-controller-manager/blob/master/container-storage-interface.md
When we create a PVC and deploy a Pod, a new Block Volume is created, but it is never attached to the corresponding node.
Inspecting the Pod shows the following error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m42s default-scheduler Successfully assigned default/app1 to worker03
Warning FailedAttachVolume 92s (x10 over 5m42s) attachdetach-controller AttachVolume.Attach failed for volume "csi-b1b9fabc-faca-4dab-8c96-f97c7c321d43" : rpc error: code = InvalidArgument desc = failed to get ProviderID by nodeName. error : missing provider id for node worker03
Warning FailedMount 82s (x2 over 3m39s) kubelet, worker03 Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[persistent-storage default-token-drrj8]: timed out waiting for the condition
Looking at the node labels:
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
master01 Ready master 5h25m v1.18.5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=ME-JEDDAH-1-AD-1,kubernetes.io/arch=amd64,kubernetes.io/hostname=master01,kubernetes.io/os=linux,node-role.kubernetes.io/master=
worker01 Ready <none> 5h24m v1.18.5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=ME-JEDDAH-1-AD-1,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker01,kubernetes.io/os=linux
worker03 Ready <none> 5h24m v1.18.5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=ME-JEDDAH-1-AD-1,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker03,kubernetes.io/os=linux
What you expected to happen?
We expected the "oci-bv" StorageClass to provision an OCI Block Volume, attach it to the correct node, and then mount it into the Pod.
How to reproduce it (as minimally and precisely as possible)?
1- A clean self-managed Kubernetes cluster v1.18.5
2- OCI Console -> Identity -> Dynamic Groups
All {instance.compartment.id = 'ocid1.compartment.oc1..XXXXXXXXXXXX'}
3- OCI Console -> Identity -> Policies
Allow dynamic-group oci_csi_group to read vnic-attachments in compartment Test_Compartment
Allow dynamic-group oci_csi_group to read vnics in compartment Test_Compartment
Allow dynamic-group oci_csi_group to read instances in compartment Test_Compartment
Allow dynamic-group oci_csi_group to read subnets in compartment Test_Compartment
Allow dynamic-group oci_csi_group to use volumes in compartment Test_Compartment
Allow dynamic-group oci_csi_group to use instances in compartment Test_Compartment
Allow dynamic-group oci_csi_group to manage volume-attachments in compartment Test_Compartment
Allow dynamic-group oci_csi_group to manage volumes in compartment Test_Compartment
Allow dynamic-group oci_csi_group to manage file-systems in compartment Test_Compartment
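As a sanity check that the dynamic group and policies work, the OCI CLI can be run from one of the nodes using instance-principal authentication (having the CLI installed on the node is an assumption, not part of the guide):
~$ oci compute instance list --compartment-id ocid1.compartment.oc1..XXXXXXXXXXXX --auth instance_principal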
4- Create the generic secret oci-volume-provisioner with the following config:
auth:
  region: me-jeddah-1
  tenancy: ocid1.tenancy.oc1..XXXXXX
  useInstancePrincipals: true
compartment: ocid1.compartment.oc1..XXXXXX (Test_Compartment)
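The secret itself was created with a command along these lines (the local file name config.yaml is an assumption):
~$ kubectl -n kube-system create secret generic oci-volume-provisioner --from-file=config.yaml=config.yaml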
5- Apply the manifests
~$ kubectl apply -f https://raw.githubusercontent.com/oracle/oci-cloud-controller-manager/master/manifests/container-storage-interface/oci-csi-node-rbac.yaml
~$ kubectl apply -f https://raw.githubusercontent.com/oracle/oci-cloud-controller-manager/master/manifests/container-storage-interface/oci-csi-controller-driver.yaml
~$ kubectl apply -f https://raw.githubusercontent.com/oracle/oci-cloud-controller-manager/master/manifests/container-storage-interface/oci-csi-node-driver.yaml
~$ kubectl apply -f https://raw.githubusercontent.com/oracle/oci-cloud-controller-manager/master/manifests/container-storage-interface/storage-class.yaml
6- Verify
~$ kubectl -n kube-system get pod | grep oci
csi-oci-controller-56ddc7fc8d-gl2qp 3/3 Running 0 51m
csi-oci-node-4v9pf 2/2 Running 0 51m
csi-oci-node-8wdtp 2/2 Running 0 51m
csi-oci-node-kfcgg 2/2 Running 0 51m
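The StorageClass from step 5 should also exist at this point:
~$ kubectl get storageclass oci-bv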
7- Manually set the failure-domain.beta.kubernetes.io/zone label on each node
kubectl label nodes master01 failure-domain.beta.kubernetes.io/zone=ME-JEDDAH-1-AD-1 --overwrite
kubectl label nodes worker01 failure-domain.beta.kubernetes.io/zone=ME-JEDDAH-1-AD-1 --overwrite
kubectl label nodes worker03 failure-domain.beta.kubernetes.io/zone=ME-JEDDAH-1-AD-1 --overwrite
8- Deploy the Pod and PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: oci-bv-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: oci-bv
  resources:
    requests:
      storage: 50Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app1
spec:
  containers:
    - name: app1
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: oci-bv-claim
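We applied it with (the file name app1-with-pvc.yaml is an assumption):
~$ kubectl apply -f app1-with-pvc.yaml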
9- kubectl describe pvc/oci-bv-claim
Normal Provisioning 10m (x2 over 10m) blockvolume.csi.oraclecloud.com_master01_2a582dbf-7cec-467a-81af-c6d02efb6144 External provisioner is provisioning volume for claim "default/oci-bv-claim"
Normal ProvisioningSucceeded 10m (x2 over 10m) blockvolume.csi.oraclecloud.com_master01_2a582dbf-7cec-467a-81af-c6d02efb6144 Successfully provisioned volume csi-b1b9fabc-faca-4dab-8c96-f97c7c321d43
10- kubectl describe pod app1
Warning FailedAttachVolume 24s (x13 over 10m) attachdetach-controller AttachVolume.Attach failed for volume "csi-b1b9fabc-faca-4dab-8c96-f97c7c321d43" : rpc error: code = InvalidArgument desc = failed to get ProviderID by nodeName. error : missing provider id for node worker03
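One way to confirm whether a node actually has a provider ID set (worker03 used as an example); an empty result matches the error above:
~$ kubectl get node worker03 -o jsonpath='{.spec.providerID}'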
Anything else we need to know?
About this issue
- State: closed
- Created 4 years ago
- Comments: 61 (28 by maintainers)
Hi @Bytelegion,
Can you please share the following details in a new issue:
Thanks.
As per the README at https://github.com/oracle/oci-cloud-controller-manager, I think you are missing the "Preparing Your Cluster" step: to deploy the Cloud Controller Manager (CCM), your cluster must be configured to use an external cloud-provider.
This involves:
- Setting the --cloud-provider=external flag on the kubelet on all nodes in your cluster.
- Setting the --provider-id=<instanceID> flag on the kubelet on all nodes in your cluster, where <instanceID> is the instance OCID of that node (unique for each node).
- Setting the --cloud-provider=external flag on the kube-controller-manager in your Kubernetes control plane.
I would say try without setting --provider-id first and see what happens, but the other two steps are important.
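For example, on a kubeadm-based Ubuntu node this could look roughly like the following (the metadata endpoint, file paths, and restart steps are assumptions; adjust for your setup):
# Get this node's instance OCID from the OCI instance metadata service
~$ curl -s http://169.254.169.254/opc/v1/instance/id
# Add the kubelet flags, e.g. in /etc/default/kubelet on kubeadm/Ubuntu
KUBELET_EXTRA_ARGS="--cloud-provider=external --provider-id=<instance OCID from above>"
~$ sudo systemctl restart kubelet
# On control-plane nodes, add --cloud-provider=external to kube-controller-manager,
# e.g. in /etc/kubernetes/manifests/kube-controller-manager.yaml for kubeadm clusters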