longhorn: [BUG] Error starting manager: Internal error occurred:

Describe the bug (πŸ› if you encounter this issue)

Longhorn-manager is failing to start on first install

A clear and concise description of what the bug is.

longhorn-manager cannot connect to the admission-webhook because of the bellow error (there are no errors on the admission-webhook pods )

To Reproduce

Default values, version1.4.0, manually port-forwarding to the admission-webhook pod svc and doing a curl works

Steps to reproduce the behavior:

  1. Go to β€˜β€¦β€™
  2. Click on β€˜β€¦β€™
  3. Perform β€˜β€¦β€™
  4. See error

Expected behavior

A clear and concise description of what you expected to happen.

Log or Support bundle

time="2023-07-03T08:28:22Z" level=info msg="cannot list the content of the src directory /var/lib/rancher/longhorn/engine-binaries for the copy, will do nothing: failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net bash -c ls /var/lib/rancher/longhorn/engine-binaries/*], output , stderr ls: cannot access /var/lib/rancher/longhorn/engine-binaries/*: No such file or directory\n: exit status 2"
99
time="2023-07-03T08:28:22Z" level=info msg="Skip the leader election for the upgrade since the current Longhorn system is already up to date"
98
W0703 08:28:22.868957       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
97
I0703 08:28:22.872608       1 shared_informer.go:240] Waiting for caches to sync for longhorn datastore
96
E0703 08:28:22.892357       1 kubernetes_node_controller.go:227] couldn't get kubernetes node k8snode01: node "k8snode01" not found
95
I0703 08:28:22.972721       1 shared_informer.go:247] Caches are synced for longhorn datastore
94
time="2023-07-03T08:28:22Z" level=info msg="Initializing metrics collector system" node=k8snode01
93
W0703 08:28:22.972817       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
92
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn backing image data source controller"
91
I0703 08:28:22.973267       1 shared_informer.go:240] Waiting for caches to sync for longhorn backing image data source
90
I0703 08:28:22.973275       1 shared_informer.go:247] Caches are synced for longhorn backing image data source
89
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn replica controller" controller=longhorn-replica node=k8snode01
88
I0703 08:28:22.973335       1 shared_informer.go:240] Waiting for caches to sync for longhorn replicas
87
I0703 08:28:22.973339       1 shared_informer.go:247] Caches are synced for longhorn replicas
86
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn engine controller" controller=longhorn-engine node=k8snode01
85
I0703 08:28:22.973357       1 shared_informer.go:240] Waiting for caches to sync for longhorn engines
84
I0703 08:28:22.973360       1 shared_informer.go:247] Caches are synced for longhorn engines
83
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn volume controller" controller=longhorn-volume node=k8snode01
82
I0703 08:28:22.973380       1 shared_informer.go:240] Waiting for caches to sync for longhorn engines
81
I0703 08:28:22.973383       1 shared_informer.go:247] Caches are synced for longhorn engines
80
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Engine Image controller" controller=longhorn-engine-image node=k8snode01
79
I0703 08:28:22.973412       1 shared_informer.go:240] Waiting for caches to sync for longhorn engine images
78
I0703 08:28:22.973416       1 shared_informer.go:247] Caches are synced for longhorn engine images
77
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn node controller"
76
I0703 08:28:22.973440       1 shared_informer.go:240] Waiting for caches to sync for longhorn node
75
I0703 08:28:22.973443       1 shared_informer.go:247] Caches are synced for longhorn node
74
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backup Target controller" controller=longhorn-backup-target node=k8snode01
73
I0703 08:28:22.973924       1 shared_informer.go:240] Waiting for caches to sync for longhorn-backup-target
72
I0703 08:28:22.973931       1 shared_informer.go:247] Caches are synced for longhorn-backup-target
71
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backup Volume controller" controller=longhorn-backup-volume node=k8snode01
70
I0703 08:28:22.973958       1 shared_informer.go:240] Waiting for caches to sync for longhorn-backup-volume
69
I0703 08:28:22.973961       1 shared_informer.go:247] Caches are synced for longhorn-backup-volume
68
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backup controller" controller=longhorn-backup node=k8snode01
67
I0703 08:28:22.974006       1 shared_informer.go:240] Waiting for caches to sync for longhorn-backup
66
I0703 08:28:22.974009       1 shared_informer.go:247] Caches are synced for longhorn-backup
65
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Recurring Job controller"
64
I0703 08:28:22.974038       1 shared_informer.go:240] Waiting for caches to sync for longhorn recurring jobs
63
I0703 08:28:22.974042       1 shared_informer.go:247] Caches are synced for longhorn recurring jobs
62
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Orphan controller" controller=longhorn-orphan node=k8snode01
61
I0703 08:28:22.974075       1 shared_informer.go:240] Waiting for caches to sync for longhorn-orphan
60
I0703 08:28:22.974079       1 shared_informer.go:247] Caches are synced for longhorn-orphan
59
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Snapshot Controller" controller=longhorn-snapshot node=k8snode01
58
I0703 08:28:22.974120       1 shared_informer.go:240] Waiting for caches to sync for longhorn-snapshot
57
I0703 08:28:22.974126       1 shared_informer.go:247] Caches are synced for longhorn-snapshot
56
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Support Bundle controller" controller=longhorn-support-bundle node=k8snode01
55
I0703 08:28:22.974157       1 shared_informer.go:240] Waiting for caches to sync for longhorn-support-bundle
54
I0703 08:28:22.974161       1 shared_informer.go:247] Caches are synced for longhorn-support-bundle
53
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn SystemBackup controller" controller=longhorn-system-backup node=k8snode01
52
I0703 08:28:22.974193       1 shared_informer.go:240] Waiting for caches to sync for longhorn-system-backup
51
I0703 08:28:22.974198       1 shared_informer.go:247] Caches are synced for longhorn-system-backup
50
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn SystemRestore controller" controller=longhorn-system-restore node=k8snode01
49
I0703 08:28:22.974245       1 shared_informer.go:240] Waiting for caches to sync for longhorn-system-restore
48
I0703 08:28:22.974248       1 shared_informer.go:247] Caches are synced for longhorn-system-restore
47
time="2023-07-03T08:28:22Z" level=info msg="Start kubernetes controller"
46
I0703 08:28:22.974270       1 shared_informer.go:240] Waiting for caches to sync for kubernetes
45
I0703 08:28:22.974273       1 shared_informer.go:247] Caches are synced for kubernetes
44
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Kubernetes node controller"
43
I0703 08:28:22.974315       1 shared_informer.go:240] Waiting for caches to sync for longhorn kubernetes node
42
I0703 08:28:22.974319       1 shared_informer.go:247] Caches are synced for longhorn kubernetes node
41
time="2023-07-03T08:28:22Z" level=info msg="Start longhorn-kubernetes-pod-controller" controller=longhorn-kubernetes-pod node=k8snode01
40
I0703 08:28:22.974354       1 shared_informer.go:240] Waiting for caches to sync for longhorn-kubernetes-pod-controller
39
I0703 08:28:22.974359       1 shared_informer.go:247] Caches are synced for longhorn-kubernetes-pod-controller
38
time="2023-07-03T08:28:22Z" level=info msg=Start controller=longhorn-kubernetes-configmap-controller node=k8snode01
37
I0703 08:28:22.974378       1 shared_informer.go:240] Waiting for caches to sync for longhorn-kubernetes-configmap-controller
36
I0703 08:28:22.974382       1 shared_informer.go:247] Caches are synced for longhorn-kubernetes-configmap-controller
35
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Kubernetes secret controller" controller=longhorn-kubernetes-secret-controller node=k8snode01
34
I0703 08:28:22.974657       1 shared_informer.go:240] Waiting for caches to sync for longhorn-kubernetes-secret-controller
33
I0703 08:28:22.974662       1 shared_informer.go:247] Caches are synced for longhorn-kubernetes-secret-controller
32
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
31
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
30
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
29
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
28
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
27
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn websocket controller" controller=longhorn-websocket node=k8snode01
26
I0703 08:28:22.975017       1 shared_informer.go:240] Waiting for caches to sync for longhorn websocket
25
I0703 08:28:22.975022       1 shared_informer.go:247] Caches are synced for longhorn websocket
24
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Setting controller" controller=longhorn-setting node=k8snode01
23
I0703 08:28:22.975040       1 shared_informer.go:240] Waiting for caches to sync for longhorn settings
22
I0703 08:28:22.975044       1 shared_informer.go:247] Caches are synced for longhorn settings
21
time="2023-07-03T08:28:22Z" level=warning msg="Failed to select node for sync backup target" controller=longhorn-setting error="no ready nodes available" node=k8snode01
20
time="2023-07-03T08:28:22Z" level=warning msg="Error syncing Longhorn setting longhorn-system/backup-target" controller=longhorn-setting error="failed to sync setting for longhorn-system/backup-target: failed to sync backup target: no ready nodes available" node=k8snode01
19
time="2023-07-03T08:28:22Z" level=warning msg="Failed to select node for sync backup target" controller=longhorn-setting error="no ready nodes available" node=k8snode01
18
time="2023-07-03T08:28:22Z" level=warning msg="Error syncing Longhorn setting longhorn-system/backup-target-credential-secret" controller=longhorn-setting error="failed to sync setting for longhorn-system/backup-target-credential-secret: failed to sync backup target: no ready nodes available" node=k8snode01
17
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn instance manager controller"
16
I0703 08:28:22.975341       1 shared_informer.go:240] Waiting for caches to sync for longhorn instance manager
15
I0703 08:28:22.975347       1 shared_informer.go:247] Caches are synced for longhorn instance manager
14
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn share manager controller" controller=longhorn-share-manager node=k8snode01
13
I0703 08:28:22.975370       1 shared_informer.go:240] Waiting for caches to sync for longhorn-share-manager-controller
12
I0703 08:28:22.975373       1 shared_informer.go:247] Caches are synced for longhorn-share-manager-controller
11
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backing Image controller"
10
I0703 08:28:22.975394       1 shared_informer.go:240] Waiting for caches to sync for longhorn backing images
9
I0703 08:28:22.975397       1 shared_informer.go:247] Caches are synced for longhorn backing images
8
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn backing image manager controller"
7
I0703 08:28:22.975416       1 shared_informer.go:240] Waiting for caches to sync for longhorn backing image manager
6
I0703 08:28:22.975420       1 shared_informer.go:247] Caches are synced for longhorn backing image manager
5
time="2023-07-03T08:28:22Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"longhorn-system\", Name:\"k8snode03\", UID:\"71463fd1-bb3c-464a-87f4-4fc820d30240\", APIVersion:\"longhorn.io/v1beta2\", ResourceVersion:\"6642667\", FieldPath:\"\"}): type: 'Warning' reason: 'Delete' Deleting node k8snode03"
4
time="2023-07-03T08:28:22Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"longhorn-system\", Name:\"k8snode04\", UID:\"554c2ad1-1619-4f6d-bb3a-82c00bca22a7\", APIVersion:\"longhorn.io/v1beta2\", ResourceVersion:\"6642668\", FieldPath:\"\"}): type: 'Warning' reason: 'Delete' Deleting node k8snode04"
3
time="2023-07-03T08:28:22Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"longhorn-system\", Name:\"k8snode05\", UID:\"f2bbe973-7df1-4e25-a442-4428212fa167\", APIVersion:\"longhorn.io/v1beta2\", ResourceVersion:\"6642669\", FieldPath:\"\"}): type: 'Warning' reason: 'Delete' Deleting node k8snode05"
2
time="2023-07-03T08:28:33Z" level=warning msg="Error syncing ConfigMap longhorn-system/longhorn-default-setting" controller=longhorn-kubernetes-configmap-controller error="longhorn-kubernetes-configmap-controller: failed to sync longhorn-system/longhorn-default-setting: failed to update built-in settings with customized values: Internal error occurred: failed calling webhook \"validator.longhorn.io\": Post \"https://longhorn-admission-webhook.longhorn-system.svc:9443/v1/webhook/validaton?timeout=10s\": context deadline exceeded" node=k8snode01
1
time="2023-07-03T08:28:33Z" level=fatal msg="Error starting manager: Internal error occurred: failed calling webhook \"validator.longhorn.io\": Post \"https://longhorn-admission-webhook.longhorn-system.svc:9443/v1/webhook/validaton?timeout=10s\": context deadline exceeded"

If applicable, add the Longhorn managers’ log or support bundle when the issue happens. You can generate a Support Bundle using the link at the footer of the Longhorn UI.

Environment

  • Longhorn version: 1.4.0
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): ArgoCD
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE 1.22
    • Number of management node in the cluster: 3
    • Number of worker node in the cluster: 5
  • Node config
    • OS type and version: RHEL 7.9
    • CPU per node: 4
    • Memory per node: 16
    • Disk type(e.g. SSD/NVMe): NVMe
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Openstack
  • Number of Longhorn volumes in the cluster: none created so far

Additional context

Add any other context about the problem here.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

Most likely there is some underlying networking issue on the openstack … the API is having similar connection issues to other cluster targets. I guess we can close this one and have a chat with the private cloud infra team.

Thanks

Close the issue and feel free to open it if you need any support.