longhorn: [BUG] Volumes don't mount with mTLS enabled

Describe the bug (🐛 if you encounter this issue)

If I create a longhorn-grpc-tls secret to enable mTLS, various bits of Longhorn seem to break. PersistentVolumes are successfully created from PersistentVolumeClaims, but the mount operation times out:

Events:
  Type     Reason              Age                  From                     Message
  ----     ------              ----                 ----                     -------
  Warning  FailedScheduling    25m                  default-scheduler        0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
  Warning  FailedScheduling    25m                  default-scheduler        0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
  Normal   Scheduled           25m                  default-scheduler        Successfully assigned default/test to sfackler-virtual-machine
  Warning  FailedMount         2m55s (x4 over 23m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[vol], unattached volumes=[kube-api-access-mpj8w vol]: timed out waiting for the condition
  Warning  FailedAttachVolume  73s (x19 over 23m)   attachdetach-controller  AttachVolume.Attach failed for volume "pvc-2e8677f6-4e71-4126-a8c6-0f43fa114c17" : rpc error: code = DeadlineExceeded desc = volume pvc-2e8677f6-4e71-4126-a8c6-0f43fa114c17 failed to attach to node sfackler-virtual-machine with attachmentID csi-ee1ea16a498fa14971ff791c088d358b646df394f69931a4c6d9013bf3cbff17
  Warning  FailedMount         37s (x7 over 18m)    kubelet                  Unable to attach or mount volumes: unmounted volumes=[vol], unattached volumes=[vol kube-api-access-mpj8w]: timed out waiting for the condition

Looking at the volume in the Longhorn UI, an error repeatedly appears saying

failed to list snapshot: proxyServer=10.42.0.31:8501 destination=10.42.0.31:10010: failed to list snapshots: rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: EOF"

From what I can tell, the instance manager’s proxy gRPC endpoint is set up to use TLS, but longhorn-manager is trying to connect with unencrypted gRPC, since #3975 is not yet implemented. I can kubectl exec into the longhorn-manager container and connect successfully to the instance manager with curl, using TLS and the configured cert:

root@sfackler-virtual-machine:~# kubectl exec -it -n longhorn-system longhorn-manager-p4gfs -- bash
longhorn-manager-p4gfs:/ # curl --cert /tls-files/tls.crt --key /tls-files/tls.key -kv --http2-prior-knowledge https://10.42.0.31:8501
*   Trying 10.42.0.31:8501...
* Connected to 10.42.0.31 (10.42.0.31) port 8501 (#0)
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=longhorn-backend
*  start date: Nov  4 00:24:09 2023 GMT
*  expire date: Nov  3 00:24:09 2024 GMT
*  issuer: CN=longhorn-backend
*  SSL certificate verify result: self signed certificate (18), continuing anyway.
* using HTTP/1.x
* h2h3 [:method: GET]
* h2h3 [:path: /]
* h2h3 [:scheme: https]
* h2h3 [:authority: 10.42.0.31:8501]
* h2h3 [user-agent: curl/8.0.1]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x557587f0fb80)
> GET / HTTP/2
> Host: 10.42.0.31:8501
> user-agent: curl/8.0.1
> accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 415 
< content-type: application/grpc
< grpc-status: 3
< grpc-message: invalid gRPC request content-type ""
< 
* Connection #0 to host 10.42.0.31 left intact
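
The mismatch described above can be illustrated with a small sketch. It is not taken from the Longhorn or grpc-go code; it only assumes that a plaintext gRPC client opens the connection with the HTTP/2 client preface, while a TLS endpoint expects a TLS handshake record first. The names classify_first_bytes and server_accepts are hypothetical helpers made up for this illustration.

```python
# Illustrative only: this models the first bytes on the wire, not Longhorn's code.

# Fixed byte sequence that every HTTP/2 client sends first on a plaintext connection.
HTTP2_CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
# TLS record content type for handshake messages such as ClientHello.
TLS_HANDSHAKE_RECORD = 0x16


def classify_first_bytes(data: bytes) -> str:
    """Guess what kind of client opened a connection from its first bytes."""
    if data.startswith(HTTP2_CLIENT_PREFACE):
        return "plaintext-http2"
    if len(data) >= 2 and data[0] == TLS_HANDSHAKE_RECORD and data[1] == 0x03:
        return "tls"
    return "unknown"


def server_accepts(server_uses_tls: bool, data: bytes) -> bool:
    """Return True when the client's opening bytes match what the server expects."""
    expected = "tls" if server_uses_tls else "plaintext-http2"
    return classify_first_bytes(data) == expected


if __name__ == "__main__":
    plaintext_client = HTTP2_CLIENT_PREFACE
    tls_client = bytes([0x16, 0x03, 0x01, 0x02, 0x00])  # start of a ClientHello record
    print("TLS server, plaintext client accepted:", server_accepts(True, plaintext_client))
    print("TLS server, TLS client accepted:", server_accepts(True, tls_client))
```

Under that assumption, a TLS endpoint that receives the plaintext preface would abort the handshake and close the connection, which would be consistent with the client reporting EOF while waiting for the server preface. The curl session above takes the other branch: the TLS handshake completes and the server answers at the gRPC layer.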

To Reproduce

  1. Create a longhorn-grpc-tls secret (I generated this one with Helm’s genSelfSignedCert function):
apiVersion: v1
data:
  ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZFekNDQS91Z0F3SUJBZ0lSQUtJZnhlU1Rqd0JzSWM1MzNmUW84aXN3RFFZSktvWklodmNOQVFFTEJRQXcKR3pFWk1CY0dBMVVFQXhNUWJHOXVaMmh2Y200dFltRmphMlZ1WkRBZUZ3MHlNekV4TURRd01ESTBNRGxhRncweQpOREV4TURNd01ESTBNRGxhTUJzeEdUQVhCZ05WQkFNVEVHeHZibWRvYjNKdUxXSmhZMnRsYm1Rd2dnRWlNQTBHCkNTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCQVFDWG1rdEtGRWlqcFlVNXlvdG1TV3FjOGNETldTbTgKWkdFL04yY2hJWmkyK2M0Y3lzNElRRmZ6ZnBvWW0vQlRJNGVKb1daTlRiSFk5cVpsbEo1SUgyMlc2VFBiRS9zcQp1YlhDSGxnV1diRWxjd2MvcmUxNVFscTZoNk9wbGJlZUpsdldDblo0ZjJDZ2w0c1hERkxZNEVMNEIzU213MytpCmNTR2swUE15RWNwRXBTZlNRZWNORHY1THlaYTRoZFl4VnlLN2grL2p0UUJaY2FqdUY0b1czeTVsRHlUY2lISUIKQUFJTHVwMjdXY1RMUFhPMVowUWtKeHhrM1lhK3ZQaWt1d2s1Q3hmZXdja2h1Rk5IUjJaOW5XbVBqNUtST3FsVApiREs1WXlUQ3ZKNFUvTFRYNk9VOWhQTEJWcWQ2MDZua2dOdHZ6TWJ6NTFIdmx5VlM3ZnAvL0NFekFnTUJBQUdqCmdnSlFNSUlDVERBT0JnTlZIUThCQWY4RUJBTUNCYUF3SFFZRFZSMGxCQll3RkFZSUt3WUJCUVVIQXdFR0NDc0cKQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd2dnSUxCZ05WSFJFRWdnSUNNSUlCL29JUWJHOXVaMmh2Y200dApZbUZqYTJWdVpJSWdiRzl1WjJodmNtNHRZbUZqYTJWdVpDNXNiMjVuYUc5eWJpMXplWE4wWlcyQ0pHeHZibWRvCmIzSnVMV0poWTJ0bGJtUXViRzl1WjJodmNtNHRjM2x6ZEdWdExuTjJZNElSYkc5dVoyaHZjbTR0Wm5KdmJuUmwKYm1TQ0lXeHZibWRvYjNKdUxXWnliMjUwWlc1a0xteHZibWRvYjNKdUxYTjVjM1JsYllJbGJHOXVaMmh2Y200dApabkp2Ym5SbGJtUXViRzl1WjJodmNtNHRjM2x6ZEdWdExuTjJZNElYYkc5dVoyaHZjbTR0Wlc1bmFXNWxMVzFoCmJtRm5aWEtDSjJ4dmJtZG9iM0p1TFdWdVoybHVaUzF0WVc1aFoyVnlMbXh2Ym1kb2IzSnVMWE41YzNSbGJZSXIKYkc5dVoyaHZjbTR0Wlc1bmFXNWxMVzFoYm1GblpYSXViRzl1WjJodmNtNHRjM2x6ZEdWdExuTjJZNElZYkc5dQpaMmh2Y200dGNtVndiR2xqWVMxdFlXNWhaMlZ5Z2loc2IyNW5hRzl5YmkxeVpYQnNhV05oTFcxaGJtRm5aWEl1CmJHOXVaMmh2Y200dGMzbHpkR1Z0Z2l4c2IyNW5hRzl5YmkxeVpYQnNhV05oTFcxaGJtRm5aWEl1Ykc5dVoyaHYKY200dGMzbHpkR1Z0TG5OMlk0SU1iRzl1WjJodmNtNHRZM05wZ2h4c2IyNW5hRzl5YmkxamMya3ViRzl1WjJodgpjbTR0YzNsemRHVnRnaUJzYjI1bmFHOXliaTFqYzJrdWJHOXVaMmh2Y200dGMzbHpkR1Z0TG5OMlk0SVFiRzl1CloyaHZjbTR0WW1GamEyVnVaSWNFZndBQUFUQU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFJSEJZV3VYR0EwL1kKK0hDOWFKc2dqZVRKcWxyZk5aeVF1UWJENnVQaFhGWXk4TW
NpR0J5QlBZZ2s5M1QrNWF4eHVyeHdKQnBNUHNwcApxRHBBM3BScWdNOW02WjBIcDd0WHVGNnB1bHJaYmUzdXJwNzVWZnNIb3I5bXo4R1FYQlpHTEJxQ0ZXNTQzQkdSCmhZWXQ3Z3JQNjBYT1hGUkk5clpjdm9NOWFkWFc2WjAybmxKMVkzcEJpTjYzSFBCYzNLL0U3OUJzeUZjYjVyK2QKcVd6OHRsdlZVTmU3VXY5a3BUK1FmdGFvamdmRWl3V2llSCtKY0YrZW5OSnlacCtLSWFLV3Q5TmFTZE1ObjY1Mwp6aWVDZ1NtMWNNby9Ua2hiQ2dBYmRSWUFtcGo4V3l0Q0JPQ3V5dTczTXZSU1JhTVIwMnl2VHcrQmtoYkVacDBGCkhpRjdJNmxHdGc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZFekNDQS91Z0F3SUJBZ0lSQUtJZnhlU1Rqd0JzSWM1MzNmUW84aXN3RFFZSktvWklodmNOQVFFTEJRQXcKR3pFWk1CY0dBMVVFQXhNUWJHOXVaMmh2Y200dFltRmphMlZ1WkRBZUZ3MHlNekV4TURRd01ESTBNRGxhRncweQpOREV4TURNd01ESTBNRGxhTUJzeEdUQVhCZ05WQkFNVEVHeHZibWRvYjNKdUxXSmhZMnRsYm1Rd2dnRWlNQTBHCkNTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCQVFDWG1rdEtGRWlqcFlVNXlvdG1TV3FjOGNETldTbTgKWkdFL04yY2hJWmkyK2M0Y3lzNElRRmZ6ZnBvWW0vQlRJNGVKb1daTlRiSFk5cVpsbEo1SUgyMlc2VFBiRS9zcQp1YlhDSGxnV1diRWxjd2MvcmUxNVFscTZoNk9wbGJlZUpsdldDblo0ZjJDZ2w0c1hERkxZNEVMNEIzU213MytpCmNTR2swUE15RWNwRXBTZlNRZWNORHY1THlaYTRoZFl4VnlLN2grL2p0UUJaY2FqdUY0b1czeTVsRHlUY2lISUIKQUFJTHVwMjdXY1RMUFhPMVowUWtKeHhrM1lhK3ZQaWt1d2s1Q3hmZXdja2h1Rk5IUjJaOW5XbVBqNUtST3FsVApiREs1WXlUQ3ZKNFUvTFRYNk9VOWhQTEJWcWQ2MDZua2dOdHZ6TWJ6NTFIdmx5VlM3ZnAvL0NFekFnTUJBQUdqCmdnSlFNSUlDVERBT0JnTlZIUThCQWY4RUJBTUNCYUF3SFFZRFZSMGxCQll3RkFZSUt3WUJCUVVIQXdFR0NDc0cKQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd2dnSUxCZ05WSFJFRWdnSUNNSUlCL29JUWJHOXVaMmh2Y200dApZbUZqYTJWdVpJSWdiRzl1WjJodmNtNHRZbUZqYTJWdVpDNXNiMjVuYUc5eWJpMXplWE4wWlcyQ0pHeHZibWRvCmIzSnVMV0poWTJ0bGJtUXViRzl1WjJodmNtNHRjM2x6ZEdWdExuTjJZNElSYkc5dVoyaHZjbTR0Wm5KdmJuUmwKYm1TQ0lXeHZibWRvYjNKdUxXWnliMjUwWlc1a0xteHZibWRvYjNKdUxYTjVjM1JsYllJbGJHOXVaMmh2Y200dApabkp2Ym5SbGJtUXViRzl1WjJodmNtNHRjM2x6ZEdWdExuTjJZNElYYkc5dVoyaHZjbTR0Wlc1bmFXNWxMVzFoCmJtRm5aWEtDSjJ4dmJtZG9iM0p1TFdWdVoybHVaUzF0WVc1aFoyVnlMbXh2Ym1kb2IzSnVMWE41YzNSbGJZSXIKYkc5dVoyaHZjbTR0Wlc1bmFXNWxMVzFoYm1GblpYSXViRzl1WjJodmNtNHRjM2x6ZEdWdExuTjJZNElZYkc5dQpaMmh2Y200dGNtVndiR2xqWVMxdFlXNWhaMlZ5Z2loc2IyNW5hRzl5YmkxeVpYQnNhV05oTFcxaGJtRm5aWEl1CmJHOXVaMmh2Y200dGMzbHpkR1Z0Z2l4c2IyNW5hRzl5YmkxeVpYQnNhV05oTFcxaGJtRm5aWEl1Ykc5dVoyaHYKY200dGMzbHpkR1Z0TG5OMlk0SU1iRzl1WjJodmNtNHRZM05wZ2h4c2IyNW5hRzl5YmkxamMya3ViRzl1WjJodgpjbTR0YzNsemRHVnRnaUJzYjI1bmFHOXliaTFqYzJrdWJHOXVaMmh2Y200dGMzbHpkR1Z0TG5OMlk0SVFiRzl1CloyaHZjbTR0WW1GamEyVnVaSWNFZndBQUFUQU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFJSEJZV3VYR0EwL1kKK0hDOWFKc2dqZVRKcWxyZk5aeVF1UWJENnVQaFhGWXk4T
WNpR0J5QlBZZ2s5M1QrNWF4eHVyeHdKQnBNUHNwcApxRHBBM3BScWdNOW02WjBIcDd0WHVGNnB1bHJaYmUzdXJwNzVWZnNIb3I5bXo4R1FYQlpHTEJxQ0ZXNTQzQkdSCmhZWXQ3Z3JQNjBYT1hGUkk5clpjdm9NOWFkWFc2WjAybmxKMVkzcEJpTjYzSFBCYzNLL0U3OUJzeUZjYjVyK2QKcVd6OHRsdlZVTmU3VXY5a3BUK1FmdGFvamdmRWl3V2llSCtKY0YrZW5OSnlacCtLSWFLV3Q5TmFTZE1ObjY1Mwp6aWVDZ1NtMWNNby9Ua2hiQ2dBYmRSWUFtcGo4V3l0Q0JPQ3V5dTczTXZSU1JhTVIwMnl2VHcrQmtoYkVacDBGCkhpRjdJNmxHdGc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
  tls.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3dJQkFBS0NBUUVBbDVwTFNoUklvNldGT2NxTFprbHFuUEhBelZrcHZHUmhQemRuSVNHWXR2bk9ITXJPCkNFQlg4MzZhR0p2d1V5T0hpYUZtVFUyeDJQYW1aWlNlU0I5dGx1a3oyeFA3S3JtMXdoNVlGbG14SlhNSFA2M3QKZVVKYXVvZWpxWlczbmlaYjFncDJlSDlnb0plTEZ3eFMyT0JDK0FkMHBzTi9vbkVocE5Eek1oSEtSS1VuMGtIbgpEUTcrUzhtV3VJWFdNVmNpdTRmdjQ3VUFXWEdvN2hlS0Z0OHVaUThrM0loeUFRQUNDN3FkdTFuRXl6MXp0V2RFCkpDY2NaTjJHdnJ6NHBMc0pPUXNYM3NISkliaFRSMGRtZloxcGo0K1NrVHFwVTJ3eXVXTWt3cnllRlB5MDEramwKUFlUeXdWYW5ldE9wNUlEYmI4ekc4K2RSNzVjbFV1MzZmL3doTXdJREFRQUJBb0lCQVFDQWlQd2VvZFg0a2FURQpHOXRXN1JZc1hMaFlJcW5GSmVKaG84cVhoNUdnU1dvY2RVSjhNbm1mWkE2b29NWUE1MVhLTmdLenRoVDgzQnEyCmMyeER3QW05Y3BsWnZMWXVRbWc5WGxiWEZGS2lhc1dSa3hpTnY5bUczdXUvSThZYm0zQXZxSTFMbXN2RlBOZGIKd2tJWHlRUmVvSXVodkkxaG44T2pwdGthOFlScDc4TUpyb08wZFRISGRyQVFHV0RZS1BaSmVnZ21sRnI4ZE94cQpRRmMrQWJZNkdNSHpESFF6TnVYV21FdDhDaE9tZWRaS3NkSnVIamRVbEZQTGc3WHduMUFqUms4UmFUdEhkbm5HCmh1VnpqY2NhYWhCaWFZYU8yRWlBV3NvU0pDajY1cmlGYVFSc2tULzEwVDdkcTMrQnJKNFRBanFQM2ZNUmdGVUcKSFFJaldQUmhBb0dCQU1vTkRyWFkxMytCb1I3R3R1ZEowNEd3NFRKZDNWWno5RHVtN05PUml0K2FJVjFhM3R4TwptalcyNXJsWWZNTmtLLzQxRHUxNHBTeklMQy9vNVNGRnQ1alRUUjJPVEMrRkVtclVaNXZTbE4zenBoU2ZRRXNSCmJLRzREb1BUVEFZQ21tLzQwaWJXR3lBVVFvd3lUcmRBNmxsaEgwSzR2TmtVeDhPUGJERitweEZqQW9HQkFNQVUKNkNseUQxWnlJOEMvc0dJTi9XeGphR3VUZDk5V0gzeCtyUHYrL05vK3ErQ2ltUHFzdlZJN3RXTXJyMGZBbEY2WQpyem01SEJVSmxmd0RsTXl4MXo5cmhhaFlEMVhJQU14T3prakxEcDFZanZPdU1uRlpNUmVvU2tpUzVmQjAxUTVPCnZOYmNhUmpoYTVFYnp5ajRtK0xFNXFxYXE5QzlhemJHdklvejNTSHhBb0dBWlE3UTI3MWdVNkwzZmxndnBWRTAKbTdwbmVIU2dQeHh4L09BSnRld214SjNuc0RUQ0lQaWpndGcvWUZiVTJEbWpFMXRnWXdBanhWazlXSjBvOVZKVQplUGkrcWxqQTNFZTNwWDBsY1RlTFE5UVlybG5VbzNkTW1UcGc4Q2hmN3VXZ2J2N0p4YWp6R2tGbjQ0MUo5N2hkClBtVW9hSXZUME5QbThuWXF6RHFudnpjQ2dZQlZHcTVHZHJmZStGRm8vRVY5SEcrMVQxSWJuOG9UMVFlOFZDLzIKc3VKN0hCdHhPdm1HejNST2RCQUk4WGtHMUllb1pnUDFFbFFqNmd4ZkRZTW5Nb0NKSSsyalNlajVlOVVHc0wvcgpOclN4K0dQV0NjOWxzenZ5SEdsVlFHaktvaWtuM3JFQ2pjT3U5MmwrY0pEWHVWYndJWWVGL1dPMDU4Z1NDL25MClRmRmpNUUtCZ0hYbkgxbVRDWnNuODVHR0tEMjM3M
mZpTWdqcXhJdVZwSjVmU0lkbHBjSnhLYm1JblJ4akludDUKMVBKdmdlb0FmOC9Ja0hNZVFiWUUxRDMzVDJubnZpb3pMeWt2NjZTZGNBNWpqRzVvT1RzWVhoMWFOeXF1d0NBRQplTkRmSGJjWi9VZ1p1bjcrWXREQTBTS29BYWNmWG94eWY3SUF4TE1rNzh2S0s4b3NQQ2JFCi0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==
kind: Secret
metadata:
  name: longhorn-grpc-tls
  namespace: longhorn-system
type: kubernetes.io/tls
  2. Install Longhorn. I used the Helm chart, trying both 1.5.1 and 1.5.2, with a few overrides to run on one node:
ingress:
  enabled: true
  host: localhost

persistence:
  defaultClassReplicaCount: 1

csi:
  attacherReplicaCount: 1
  provisionerReplicaCount: 1
  resizerReplicaCount: 1
  snapshotterReplicaCount: 1

defaultSettings:
  defaultReplicaCount: 1

longhornUI:
  replicas: 1
  3. Create a PersistentVolumeClaim and Pod:
apiVersion: v1
kind: PersistentVolumeClaim
metadata: 
  name: test    
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 2Gi
---     
apiVersion: v1
kind: Pod
metadata:     
  name: test
spec:
  containers:
    - name: test
      image: alpine:latest
      command:
        - sh
        - -c
        - echo hello world > /data/hello && cat /data/hello
      volumeMounts:
        - name: vol
          mountPath: /data
  restartPolicy: Never
  volumes:  
    - name: vol
      persistentVolumeClaim:
        claimName: test
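
Because the secret from step 1 is the only difference from a working install, it can help to confirm that its fields decode cleanly before applying it. The following is a minimal sketch, assuming the manifest's data map has already been loaded into a Python dict. The function name pem_labels is hypothetical and is not part of Longhorn or kubectl. The three key names are the ones used in the manifest above.

```python
import base64
import re

# Keys present in the longhorn-grpc-tls manifest shown in step 1.
EXPECTED_KEYS = ("ca.crt", "tls.crt", "tls.key")
PEM_BEGIN = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_labels(secret_data: dict) -> dict:
    """Map each expected key to the label of the first PEM block it contains.

    Raises KeyError if a key is missing and ValueError if a value is not
    valid base64 or does not decode to a PEM block.
    """
    labels = {}
    for key in EXPECTED_KEYS:
        if key not in secret_data:
            raise KeyError(f"secret is missing {key}")
        # Drop any whitespace introduced by line wrapping before decoding.
        compact = "".join(secret_data[key].split())
        try:
            decoded = base64.b64decode(compact, validate=True).decode("ascii")
        except (ValueError, UnicodeDecodeError) as exc:
            raise ValueError(f"{key} is not base64-encoded ASCII") from exc
        match = PEM_BEGIN.search(decoded)
        if match is None:
            raise ValueError(f"{key} does not contain a PEM block")
        labels[key] = match.group(1)
    return labels


if __name__ == "__main__":
    # Tiny stand-in values; a real check would pass the manifest's data map.
    def encode(label: str) -> str:
        pem = f"-----BEGIN {label}-----\nAAAA\n-----END {label}-----\n"
        return base64.b64encode(pem.encode("ascii")).decode("ascii")

    sample = {
        "ca.crt": encode("CERTIFICATE"),
        "tls.crt": encode("CERTIFICATE"),
        "tls.key": encode("RSA PRIVATE KEY"),
    }
    print(pem_labels(sample))
```

This only checks encoding and PEM framing. It does not verify that the key matches the certificate or that the certificate's names cover the Longhorn services.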

Expected behavior

The test pod will run, exiting with logs of “hello world”. This is the behavior I see when running the identical setup without the TLS secret.

Support bundle for troubleshooting

supportbundle_4613bf50-f4f2-42d2-80e0-78ee1f9c587d_2023-11-04T01-11-14Z.zip

Environment

  • Longhorn version: 1.5.1 and 1.5.2
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Helm
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: rke2 version v1.26.10+rke2r1 (9b01c8e4e9f781a2654075f43cb59b922fed1713)
    • Number of management nodes in the cluster: 1
    • Number of worker nodes in the cluster: 0
  • Node config
    • OS type and version: Ubuntu 22.04
    • Kernel version: 6.2.0-36-generic #37~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Oct 9 15:34:04 UTC 2
    • CPU per node: 2
    • Memory per node: 4GB
    • Disk type(e.g. SSD/NVMe/HDD): SSD
    • Network bandwidth between the nodes: N/A
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): VMware Workstation
  • Number of Longhorn volumes in the cluster: 1
  • Impacted Longhorn resources:
    • Volume names: pvc-2e8677f6-4e71-4126-a8c6-0f43fa114c17

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 16 (15 by maintainers)

Most upvoted comments

I tested with the latest changes and I think this is working as expected. Thanks for all the collaboration!

It took me a bit of time to get an mTLS secret that would work for testing this issue generated. I created a wiki page that will hopefully streamline this process for future developers: https://github.com/longhorn/longhorn/wiki/Generate-Test-mTLS-Certificates.

@chriscchien Good call on self-assigning.

cc @longhorn/qa

It looks like the behavior was tested ~4 days before the proxy was first added to the instance-manager: https://github.com/longhorn/longhorn-instance-manager/commit/183c39fc1d93a4950955ae6635c3c541d106a672

Verified pass on Longhorn master (longhorn-manager 305ff2) and Longhorn v1.6.x (longhorn-manager 6e3605).

  1. Create the secret longhorn-grpc-tls in namespace longhorn-system, then deploy Longhorn master/v1.6.x-head
    • Volume attach/detach worked well
    • e2e core tests passed
  2. Deploy Longhorn master/v1.6.x-head, then create the secret longhorn-grpc-tls in namespace longhorn-system
    • Created volume1; attach/detach of volume1 worked well
    • Deleted one longhorn-manager pod and instance-manager pod
      • Creating a new volume worked properly
      • Reattaching volume1 worked without problems, and the data was correct

Thanks!

Ah, I see. So the functionality was working when https://github.com/longhorn/longhorn-instance-manager/pull/109 merged on May 4, 2022. But it was likely silently broken again when https://github.com/longhorn/longhorn-instance-manager/pull/105 merged on May 16, 2022. And we didn’t have either a manual or automated test to detect it.

@ejweber Please help with this and review @sfackler's PR.