azuredisk-csi-driver: Driver install fails on Azure Red Hat OpenShift

What happened:

I installed the master version of the driver on my ARO 4.5 cluster. I do not think my install snippet has changed, but today the driver failed to install.

oc -n kube-system get pod -l app=csi-azuredisk-controller -o wide 
NAME                                      READY   STATUS             RESTARTS   AGE   IP           NODE                                           NOMINATED NODE   READINESS GATES
csi-azuredisk-controller-54b7c649-7v92d   1/6     CrashLoopBackOff   50         31m   172.32.2.5   aro-azarc-101-x7jmv-worker-westeurope3-g492c   <none>           <none>
csi-azuredisk-controller-54b7c649-h2bsn   1/6     CrashLoopBackOff   49         31m   172.32.2.4   aro-azarc-101-x7jmv-worker-westeurope2-bk6x7   <none>           <none>
oc logs csi-azuredisk-controller-54b7c649-7v92d -n kube-system -c azuredisk
I0107 13:39:43.118894   13104 request.go:621] Throttling request took 1.0730318s, request: GET:https://api.w6a05gb7.westeurope.aroapp.io:6443/apis/operators.coreos.com/v1alpha2?timeout=32s
W0107 12:37:58.996885       1 main.go:59] nodeid is empty
I0107 12:37:58.997178       1 main.go:86] set up prometheus server on [::]:29604
I0107 12:37:58.997475       1 azuredisk.go:118]
DRIVER INFORMATION:
-------------------
Build Date: "2021-01-04T02:41:39Z"
Compiler: gc
Driver Name: disk.csi.azure.com
Driver Version: v0.11.0
Git Commit: c42faa334b9e6d7cd4642c07e7aa437161b363cf
Go Version: go1.15.4
Platform: linux/amd64
Topology Key: topology.disk.csi.azure.com/zone

Streaming logs below:
I0107 12:37:59.091148       1 azure.go:56] reading cloud config from secret
I0107 12:37:59.104832       1 azure_auth.go:229] Using public cloud environment
I0107 12:37:59.104869       1 azure_auth.go:117] azure: using client_id+client_secret to retrieve access token
I0107 12:37:59.104925       1 azure_interfaceclient.go:61] Azure InterfacesClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.104952       1 azure_interfaceclient.go:64] Azure InterfacesClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.104967       1 azure_vmsizeclient.go:61] Azure VirtualMachineSizesClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.104971       1 azure_vmsizeclient.go:64] Azure VirtualMachineSizesClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.104981       1 azure_snapshotclient.go:62] Azure SnapshotClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.104989       1 azure_snapshotclient.go:65] Azure SnapshotClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.104997       1 azure_storageaccountclient.go:67] Azure StorageAccountClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105004       1 azure_storageaccountclient.go:70] Azure StorageAccountClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105016       1 azure_diskclient.go:67] Azure DisksClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105023       1 azure_diskclient.go:70] Azure DisksClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105059       1 azure_vmclient.go:62] Azure VirtualMachine client (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105082       1 azure_vmclient.go:65] Azure VirtualMachine client (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105090       1 azure_vmssclient.go:62] Azure VirtualMachineScaleSetClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105097       1 azure_vmssclient.go:65] Azure VirtualMachineScaleSetClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105107       1 azure_vmssvmclient.go:63] Azure vmssVM client (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105114       1 azure_vmssvmclient.go:66] Azure vmssVM client (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105121       1 azure_routeclient.go:60] Azure RoutesClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105125       1 azure_routeclient.go:63] Azure RoutesClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105134       1 azure_subnetclient.go:61] Azure SubnetsClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105138       1 azure_subnetclient.go:64] Azure SubnetsClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105145       1 azure_routetableclient.go:60] Azure RouteTablesClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105149       1 azure_routetableclient.go:63] Azure RouteTablesClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105155       1 azure_loadbalancerclient.go:62] Azure LoadBalancersClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105159       1 azure_loadbalancerclient.go:65] Azure LoadBalancersClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105165       1 azure_securitygroupclient.go:62] Azure SecurityGroupsClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105168       1 azure_securitygroupclient.go:65] Azure SecurityGroupsClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105175       1 azure_publicipclient.go:62] Azure PublicIPAddressesClient (read ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105182       1 azure_publicipclient.go:65] Azure PublicIPAddressesClient (write ops) using rate limit config: QPS=1, bucket=5
I0107 12:37:59.105265       1 azure.go:62] could not read cloud config from secret
I0107 12:37:59.105284       1 azure.go:65] AZURE_CREDENTIAL_FILE env var set as /etc/kubernetes/cloud.conf
I0107 12:37:59.105313       1 azure.go:83] read cloud config from file: /etc/kubernetes/cloud.conf successfully
I0107 12:37:59.105653       1 azure_auth.go:232] Using AzurePublicCloud environment
I0107 12:37:59.105697       1 azure.go:418] Azure cloud provider is starting without credentials
I0107 12:37:59.105708       1 azure.go:452] Azure cloudprovider using try backoff: retries=6, exponent=1.500000, duration=6, jitter=1.000000
I0107 12:37:59.105727       1 azuredisk.go:128] disable UseInstanceMetadata for controller
I0107 12:37:59.105849       1 mount_linux.go:188] Detected OS without systemd
I0107 12:37:59.105863       1 driver.go:80] Enabling controller service capability: CREATE_DELETE_VOLUME
I0107 12:37:59.105868       1 driver.go:80] Enabling controller service capability: PUBLISH_UNPUBLISH_VOLUME
I0107 12:37:59.105872       1 driver.go:80] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0107 12:37:59.105875       1 driver.go:80] Enabling controller service capability: LIST_SNAPSHOTS
I0107 12:37:59.105878       1 driver.go:80] Enabling controller service capability: CLONE_VOLUME
I0107 12:37:59.105882       1 driver.go:80] Enabling controller service capability: EXPAND_VOLUME
I0107 12:37:59.105885       1 driver.go:80] Enabling controller service capability: LIST_VOLUMES
I0107 12:37:59.105888       1 driver.go:80] Enabling controller service capability: LIST_VOLUMES_PUBLISHED_NODES
I0107 12:37:59.105894       1 driver.go:99] Enabling volume access mode: SINGLE_NODE_WRITER
I0107 12:37:59.105901       1 driver.go:90] Enabling node service capability: STAGE_UNSTAGE_VOLUME
I0107 12:37:59.105907       1 driver.go:90] Enabling node service capability: EXPAND_VOLUME
I0107 12:37:59.105912       1 driver.go:90] Enabling node service capability: GET_VOLUME_STATS
I0107 12:37:59.106155       1 server.go:117] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
I0107 12:37:59.295185       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginInfo
I0107 12:37:59.295251       1 utils.go:85] GRPC request: {}
I0107 12:37:59.298010       1 identityserver.go:33] Using default GetPluginInfo
I0107 12:37:59.298019       1 utils.go:90] GRPC response: {"name":"disk.csi.azure.com","vendor_version":"v0.11.0"}
I0107 12:37:59.300099       1 utils.go:84] GRPC call: /csi.v1.Identity/Probe
I0107 12:37:59.300126       1 utils.go:85] GRPC request: {}
I0107 12:37:59.300171       1 utils.go:90] GRPC response: {"ready":{"value":true}}
I0107 12:37:59.302589       1 utils.go:84] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0107 12:37:59.302663       1 utils.go:85] GRPC request: {}
I0107 12:37:59.302707       1 controllerserver.go:530] Using default ControllerGetCapabilities
I0107 12:37:59.302715       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":6}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":3}}},{"Type":{"Rpc":{"type":10}}}]}
I0107 12:37:59.341107       1 utils.go:84] GRPC call: /csi.v1.Identity/Probe
I0107 12:37:59.341135       1 utils.go:85] GRPC request: {}
I0107 12:37:59.341190       1 utils.go:90] GRPC response: {"ready":{"value":true}}
I0107 12:37:59.343199       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginInfo
I0107 12:37:59.343245       1 utils.go:85] GRPC request: {}
I0107 12:37:59.343302       1 identityserver.go:33] Using default GetPluginInfo
I0107 12:37:59.343310       1 utils.go:90] GRPC response: {"name":"disk.csi.azure.com","vendor_version":"v0.11.0"}
I0107 12:37:59.345456       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0107 12:37:59.345474       1 utils.go:85] GRPC request: {}
I0107 12:37:59.345516       1 identityserver.go:59] Using default capabilities
I0107 12:37:59.345526       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
I0107 12:37:59.348709       1 utils.go:84] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0107 12:37:59.348724       1 utils.go:85] GRPC request: {}
I0107 12:37:59.348758       1 controllerserver.go:530] Using default ControllerGetCapabilities
I0107 12:37:59.348766       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":6}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":3}}},{"Type":{"Rpc":{"type":10}}}]}
I0107 12:37:59.794768       1 utils.go:84] GRPC call: /csi.v1.Identity/Probe
I0107 12:37:59.794813       1 utils.go:85] GRPC request: {}
I0107 12:37:59.794896       1 utils.go:90] GRPC response: {"ready":{"value":true}}
I0107 12:37:59.796826       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginInfo
I0107 12:37:59.796847       1 utils.go:85] GRPC request: {}
I0107 12:37:59.796891       1 identityserver.go:33] Using default GetPluginInfo
I0107 12:37:59.796900       1 utils.go:90] GRPC response: {"name":"disk.csi.azure.com","vendor_version":"v0.11.0"}
I0107 12:37:59.798344       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0107 12:37:59.798360       1 utils.go:85] GRPC request: {}
I0107 12:37:59.798399       1 identityserver.go:59] Using default capabilities
I0107 12:37:59.798409       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
I0107 12:37:59.800421       1 utils.go:84] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0107 12:37:59.800436       1 utils.go:85] GRPC request: {}
I0107 12:37:59.800473       1 controllerserver.go:530] Using default ControllerGetCapabilities
I0107 12:37:59.800481       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":6}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":3}}},{"Type":{"Rpc":{"type":10}}}]}
I0107 12:37:59.861019       1 utils.go:84] GRPC call: /csi.v1.Identity/Probe
I0107 12:37:59.861058       1 utils.go:85] GRPC request: {}
I0107 12:37:59.861119       1 utils.go:90] GRPC response: {"ready":{"value":true}}
I0107 12:37:59.862978       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginInfo
I0107 12:37:59.862994       1 utils.go:85] GRPC request: {}
I0107 12:37:59.863033       1 identityserver.go:33] Using default GetPluginInfo
I0107 12:37:59.863039       1 utils.go:90] GRPC response: {"name":"disk.csi.azure.com","vendor_version":"v0.11.0"}
I0107 12:37:59.891118       1 utils.go:84] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0107 12:37:59.891138       1 utils.go:85] GRPC request: {}
I0107 12:37:59.891180       1 identityserver.go:59] Using default capabilities
I0107 12:37:59.891189       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
I0107 12:37:59.894510       1 utils.go:84] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0107 12:37:59.894611       1 utils.go:85] GRPC request: {}
I0107 12:37:59.894671       1 controllerserver.go:530] Using default ControllerGetCapabilities
I0107 12:37:59.894681       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":6}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":3}}},{"Type":{"Rpc":{"type":10}}}]}
I0107 12:37:59.995299       1 utils.go:84] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0107 12:37:59.995342       1 utils.go:85] GRPC request: {}
I0107 12:37:59.995398       1 controllerserver.go:530] Using default ControllerGetCapabilities
I0107 12:37:59.995407       1 utils.go:90] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":6}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":3}}},{"Type":{"Rpc":{"type":10}}}]}
I0107 12:38:19.691883       1 utils.go:84] GRPC call: /csi.v1.Controller/ListVolumes
I0107 12:38:19.691914       1 utils.go:85] GRPC request: {}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x167a44c]

goroutine 42 [running]:
sigs.k8s.io/azuredisk-csi-driver/pkg/azuredisk.(*Driver).ListVolumes(0xc000492ea0, 0x1c5ada0, 0xc00052a5a0, 0xc00052e200, 0xc000492ea0, 0x2658d5e, 0xc0004fe310)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/pkg/azuredisk/controllerserver.go:556 +0x6c
github.com/container-storage-interface/spec/lib/go/csi._Controller_ListVolumes_Handler.func1(0x1c5ada0, 0xc00052a5a0, 0x1932160, 0xc00052e200, 0x0, 0x0, 0x1a39b50, 0x10)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:5597 +0x89
sigs.k8s.io/azuredisk-csi-driver/pkg/csi-common.logGRPC(0x1c5ada0, 0xc00052a5a0, 0x1932160, 0xc00052e200, 0xc000261a80, 0xc000261aa0, 0xc000542b78, 0x49afa6, 0x190fb60, 0xc00052a5a0)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/pkg/csi-common/utils.go:86 +0x1f9
github.com/container-storage-interface/spec/lib/go/csi._Controller_ListVolumes_Handler(0x19f2960, 0xc000492ea0, 0x1c5ada0, 0xc00052a5a0, 0xc000310d20, 0x1af9f98, 0x1c5ada0, 0xc00052a5a0, 0x0, 0x0)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:5599 +0x150
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000083080, 0x1c6ec20, 0xc00066c480, 0xc000538100, 0xc000462930, 0x26f6e18, 0x0, 0x0, 0x0)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/vendor/google.golang.org/grpc/server.go:1024 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc000083080, 0x1c6ec20, 0xc00066c480, 0xc000538100, 0x0)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/vendor/google.golang.org/grpc/server.go:1313 +0xd34
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0002653b0, 0xc000083080, 0x1c6ec20, 0xc00066c480, 0xc000538100)
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/vendor/google.golang.org/grpc/server.go:722 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
        /root/go/src/sigs.k8s.io/azuredisk-csi-driver/vendor/google.golang.org/grpc/server.go:720 +0xa5

What you expected to happen:

The install should complete successfully, with all pods up and running.

How to reproduce it:


oc create configmap azure-cred-file --from-literal=path="/etc/kubernetes/cloud.conf" -n kube-system

oc adm policy add-scc-to-user privileged system:serviceaccount:kube-system:csi-azuredisk-node-sa
oc describe scc privileged

driver_version=master #v0.10.0
curl -skSL https://raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/$driver_version/deploy/install-driver.sh | bash -s $driver_version --

oc logs csi-azuredisk-controller-54b7c649-7v92d -n kube-system -c azuredisk

Anything else we need to know?:

see also https://github.com/kubernetes-sigs/azuredisk-csi-driver/issues/398

Environment:

  • CSI Driver version: Master
  • Kubernetes version (use kubectl version):

Client Version: openshift-clients-4.5.0-202006231303.p0-16-g3f6a83fb7
Server Version: 4.5.16
Kubernetes Version: v1.18.3+2fbd7c7

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 26 (9 by maintainers)

Most upvoted comments

@andyzhangx, agreed, the ARO case should be better documented:

  1. We can explain that the ConfigMap cannot be used because /etc/kubernetes/cloud.conf does not contain the Azure credentials “aadClientId” and “aadClientSecret”.
  2. We should document using the Secret instead; we can reuse the snippet I have shared in this issue.
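
As a rough illustration of point 1, here is a minimal sketch of a pre-install check. It assumes the cloud config is a JSON file, and the helper name has_aad_credentials is hypothetical (it is not part of the driver or of oc). It only tests whether both credential fields are present and non-empty. The Secret name and key in the trailing comment are assumptions to verify against the driver documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the driver): returns 0 if the given
# cloud config file contains non-empty "aadClientId" and "aadClientSecret"
# values, and 1 otherwise. Assumes the file is JSON.
has_aad_credentials() {
  local file="$1" key
  # An unreadable or missing file counts as "no credentials".
  [ -r "$file" ] || return 1
  for key in aadClientId aadClientSecret; do
    # Look for "key": "value" where value has at least one character.
    grep -Eq "\"${key}\"[[:space:]]*:[[:space:]]*\"[^\"]+\"" "$file" || return 1
  done
  return 0
}

# If the check fails, the ConfigMap approach cannot work and a Secret holding
# a complete cloud config is needed instead, for example (Secret name and key
# are assumptions here, check the driver docs before use):
#   oc create secret generic azure-cloud-provider \
#     --from-file=cloud-config=./azure.json -n kube-system
```

On a node (or wherever a copy of the file is available), running `has_aad_credentials /etc/kubernetes/cloud.conf || echo "no AAD credentials in cloud.conf"` would indicate whether the ConfigMap approach has any chance of working before the driver is installed.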