longhorn: [BUG] Error starting manager: Internal error occurred:
Describe the bug (π if you encounter this issue)
Longhorn-manager is failing to start on first install
A clear and concise description of what the bug is.
longhorn-manager cannot connect to the admission-webhook because of the bellow error (there are no errors on the admission-webhook pods )
To Reproduce
Default values, version1.4.0, manually port-forwarding to the admission-webhook pod svc and doing a curl works
Steps to reproduce the behavior:
- Go to ββ¦β
- Click on ββ¦β
- Perform ββ¦β
- See error
Expected behavior
A clear and concise description of what you expected to happen.
Log or Support bundle
time="2023-07-03T08:28:22Z" level=info msg="cannot list the content of the src directory /var/lib/rancher/longhorn/engine-binaries for the copy, will do nothing: failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net bash -c ls /var/lib/rancher/longhorn/engine-binaries/*], output , stderr ls: cannot access /var/lib/rancher/longhorn/engine-binaries/*: No such file or directory\n: exit status 2"
99
time="2023-07-03T08:28:22Z" level=info msg="Skip the leader election for the upgrade since the current Longhorn system is already up to date"
98
W0703 08:28:22.868957 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
97
I0703 08:28:22.872608 1 shared_informer.go:240] Waiting for caches to sync for longhorn datastore
96
E0703 08:28:22.892357 1 kubernetes_node_controller.go:227] couldn't get kubernetes node k8snode01: node "k8snode01" not found
95
I0703 08:28:22.972721 1 shared_informer.go:247] Caches are synced for longhorn datastore
94
time="2023-07-03T08:28:22Z" level=info msg="Initializing metrics collector system" node=k8snode01
93
W0703 08:28:22.972817 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
92
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn backing image data source controller"
91
I0703 08:28:22.973267 1 shared_informer.go:240] Waiting for caches to sync for longhorn backing image data source
90
I0703 08:28:22.973275 1 shared_informer.go:247] Caches are synced for longhorn backing image data source
89
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn replica controller" controller=longhorn-replica node=k8snode01
88
I0703 08:28:22.973335 1 shared_informer.go:240] Waiting for caches to sync for longhorn replicas
87
I0703 08:28:22.973339 1 shared_informer.go:247] Caches are synced for longhorn replicas
86
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn engine controller" controller=longhorn-engine node=k8snode01
85
I0703 08:28:22.973357 1 shared_informer.go:240] Waiting for caches to sync for longhorn engines
84
I0703 08:28:22.973360 1 shared_informer.go:247] Caches are synced for longhorn engines
83
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn volume controller" controller=longhorn-volume node=k8snode01
82
I0703 08:28:22.973380 1 shared_informer.go:240] Waiting for caches to sync for longhorn engines
81
I0703 08:28:22.973383 1 shared_informer.go:247] Caches are synced for longhorn engines
80
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Engine Image controller" controller=longhorn-engine-image node=k8snode01
79
I0703 08:28:22.973412 1 shared_informer.go:240] Waiting for caches to sync for longhorn engine images
78
I0703 08:28:22.973416 1 shared_informer.go:247] Caches are synced for longhorn engine images
77
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn node controller"
76
I0703 08:28:22.973440 1 shared_informer.go:240] Waiting for caches to sync for longhorn node
75
I0703 08:28:22.973443 1 shared_informer.go:247] Caches are synced for longhorn node
74
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backup Target controller" controller=longhorn-backup-target node=k8snode01
73
I0703 08:28:22.973924 1 shared_informer.go:240] Waiting for caches to sync for longhorn-backup-target
72
I0703 08:28:22.973931 1 shared_informer.go:247] Caches are synced for longhorn-backup-target
71
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backup Volume controller" controller=longhorn-backup-volume node=k8snode01
70
I0703 08:28:22.973958 1 shared_informer.go:240] Waiting for caches to sync for longhorn-backup-volume
69
I0703 08:28:22.973961 1 shared_informer.go:247] Caches are synced for longhorn-backup-volume
68
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backup controller" controller=longhorn-backup node=k8snode01
67
I0703 08:28:22.974006 1 shared_informer.go:240] Waiting for caches to sync for longhorn-backup
66
I0703 08:28:22.974009 1 shared_informer.go:247] Caches are synced for longhorn-backup
65
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Recurring Job controller"
64
I0703 08:28:22.974038 1 shared_informer.go:240] Waiting for caches to sync for longhorn recurring jobs
63
I0703 08:28:22.974042 1 shared_informer.go:247] Caches are synced for longhorn recurring jobs
62
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Orphan controller" controller=longhorn-orphan node=k8snode01
61
I0703 08:28:22.974075 1 shared_informer.go:240] Waiting for caches to sync for longhorn-orphan
60
I0703 08:28:22.974079 1 shared_informer.go:247] Caches are synced for longhorn-orphan
59
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Snapshot Controller" controller=longhorn-snapshot node=k8snode01
58
I0703 08:28:22.974120 1 shared_informer.go:240] Waiting for caches to sync for longhorn-snapshot
57
I0703 08:28:22.974126 1 shared_informer.go:247] Caches are synced for longhorn-snapshot
56
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Support Bundle controller" controller=longhorn-support-bundle node=k8snode01
55
I0703 08:28:22.974157 1 shared_informer.go:240] Waiting for caches to sync for longhorn-support-bundle
54
I0703 08:28:22.974161 1 shared_informer.go:247] Caches are synced for longhorn-support-bundle
53
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn SystemBackup controller" controller=longhorn-system-backup node=k8snode01
52
I0703 08:28:22.974193 1 shared_informer.go:240] Waiting for caches to sync for longhorn-system-backup
51
I0703 08:28:22.974198 1 shared_informer.go:247] Caches are synced for longhorn-system-backup
50
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn SystemRestore controller" controller=longhorn-system-restore node=k8snode01
49
I0703 08:28:22.974245 1 shared_informer.go:240] Waiting for caches to sync for longhorn-system-restore
48
I0703 08:28:22.974248 1 shared_informer.go:247] Caches are synced for longhorn-system-restore
47
time="2023-07-03T08:28:22Z" level=info msg="Start kubernetes controller"
46
I0703 08:28:22.974270 1 shared_informer.go:240] Waiting for caches to sync for kubernetes
45
I0703 08:28:22.974273 1 shared_informer.go:247] Caches are synced for kubernetes
44
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Kubernetes node controller"
43
I0703 08:28:22.974315 1 shared_informer.go:240] Waiting for caches to sync for longhorn kubernetes node
42
I0703 08:28:22.974319 1 shared_informer.go:247] Caches are synced for longhorn kubernetes node
41
time="2023-07-03T08:28:22Z" level=info msg="Start longhorn-kubernetes-pod-controller" controller=longhorn-kubernetes-pod node=k8snode01
40
I0703 08:28:22.974354 1 shared_informer.go:240] Waiting for caches to sync for longhorn-kubernetes-pod-controller
39
I0703 08:28:22.974359 1 shared_informer.go:247] Caches are synced for longhorn-kubernetes-pod-controller
38
time="2023-07-03T08:28:22Z" level=info msg=Start controller=longhorn-kubernetes-configmap-controller node=k8snode01
37
I0703 08:28:22.974378 1 shared_informer.go:240] Waiting for caches to sync for longhorn-kubernetes-configmap-controller
36
I0703 08:28:22.974382 1 shared_informer.go:247] Caches are synced for longhorn-kubernetes-configmap-controller
35
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Kubernetes secret controller" controller=longhorn-kubernetes-secret-controller node=k8snode01
34
I0703 08:28:22.974657 1 shared_informer.go:240] Waiting for caches to sync for longhorn-kubernetes-secret-controller
33
I0703 08:28:22.974662 1 shared_informer.go:247] Caches are synced for longhorn-kubernetes-secret-controller
32
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
31
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
30
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
29
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
28
time="2023-07-03T08:28:22Z" level=warning msg="Cannot found the default backup target" controller=longhorn-kubernetes-secret-controller node=k8snode01
27
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn websocket controller" controller=longhorn-websocket node=k8snode01
26
I0703 08:28:22.975017 1 shared_informer.go:240] Waiting for caches to sync for longhorn websocket
25
I0703 08:28:22.975022 1 shared_informer.go:247] Caches are synced for longhorn websocket
24
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Setting controller" controller=longhorn-setting node=k8snode01
23
I0703 08:28:22.975040 1 shared_informer.go:240] Waiting for caches to sync for longhorn settings
22
I0703 08:28:22.975044 1 shared_informer.go:247] Caches are synced for longhorn settings
21
time="2023-07-03T08:28:22Z" level=warning msg="Failed to select node for sync backup target" controller=longhorn-setting error="no ready nodes available" node=k8snode01
20
time="2023-07-03T08:28:22Z" level=warning msg="Error syncing Longhorn setting longhorn-system/backup-target" controller=longhorn-setting error="failed to sync setting for longhorn-system/backup-target: failed to sync backup target: no ready nodes available" node=k8snode01
19
time="2023-07-03T08:28:22Z" level=warning msg="Failed to select node for sync backup target" controller=longhorn-setting error="no ready nodes available" node=k8snode01
18
time="2023-07-03T08:28:22Z" level=warning msg="Error syncing Longhorn setting longhorn-system/backup-target-credential-secret" controller=longhorn-setting error="failed to sync setting for longhorn-system/backup-target-credential-secret: failed to sync backup target: no ready nodes available" node=k8snode01
17
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn instance manager controller"
16
I0703 08:28:22.975341 1 shared_informer.go:240] Waiting for caches to sync for longhorn instance manager
15
I0703 08:28:22.975347 1 shared_informer.go:247] Caches are synced for longhorn instance manager
14
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn share manager controller" controller=longhorn-share-manager node=k8snode01
13
I0703 08:28:22.975370 1 shared_informer.go:240] Waiting for caches to sync for longhorn-share-manager-controller
12
I0703 08:28:22.975373 1 shared_informer.go:247] Caches are synced for longhorn-share-manager-controller
11
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn Backing Image controller"
10
I0703 08:28:22.975394 1 shared_informer.go:240] Waiting for caches to sync for longhorn backing images
9
I0703 08:28:22.975397 1 shared_informer.go:247] Caches are synced for longhorn backing images
8
time="2023-07-03T08:28:22Z" level=info msg="Starting Longhorn backing image manager controller"
7
I0703 08:28:22.975416 1 shared_informer.go:240] Waiting for caches to sync for longhorn backing image manager
6
I0703 08:28:22.975420 1 shared_informer.go:247] Caches are synced for longhorn backing image manager
5
time="2023-07-03T08:28:22Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"longhorn-system\", Name:\"k8snode03\", UID:\"71463fd1-bb3c-464a-87f4-4fc820d30240\", APIVersion:\"longhorn.io/v1beta2\", ResourceVersion:\"6642667\", FieldPath:\"\"}): type: 'Warning' reason: 'Delete' Deleting node k8snode03"
4
time="2023-07-03T08:28:22Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"longhorn-system\", Name:\"k8snode04\", UID:\"554c2ad1-1619-4f6d-bb3a-82c00bca22a7\", APIVersion:\"longhorn.io/v1beta2\", ResourceVersion:\"6642668\", FieldPath:\"\"}): type: 'Warning' reason: 'Delete' Deleting node k8snode04"
3
time="2023-07-03T08:28:22Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"longhorn-system\", Name:\"k8snode05\", UID:\"f2bbe973-7df1-4e25-a442-4428212fa167\", APIVersion:\"longhorn.io/v1beta2\", ResourceVersion:\"6642669\", FieldPath:\"\"}): type: 'Warning' reason: 'Delete' Deleting node k8snode05"
2
time="2023-07-03T08:28:33Z" level=warning msg="Error syncing ConfigMap longhorn-system/longhorn-default-setting" controller=longhorn-kubernetes-configmap-controller error="longhorn-kubernetes-configmap-controller: failed to sync longhorn-system/longhorn-default-setting: failed to update built-in settings with customized values: Internal error occurred: failed calling webhook \"validator.longhorn.io\": Post \"https://longhorn-admission-webhook.longhorn-system.svc:9443/v1/webhook/validaton?timeout=10s\": context deadline exceeded" node=k8snode01
1
time="2023-07-03T08:28:33Z" level=fatal msg="Error starting manager: Internal error occurred: failed calling webhook \"validator.longhorn.io\": Post \"https://longhorn-admission-webhook.longhorn-system.svc:9443/v1/webhook/validaton?timeout=10s\": context deadline exceeded"
If applicable, add the Longhorn managersβ log or support bundle when the issue happens. You can generate a Support Bundle using the link at the footer of the Longhorn UI.
Environment
- Longhorn version: 1.4.0
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): ArgoCD
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE 1.22
- Number of management node in the cluster: 3
- Number of worker node in the cluster: 5
- Node config
- OS type and version: RHEL 7.9
- CPU per node: 4
- Memory per node: 16
- Disk type(e.g. SSD/NVMe): NVMe
- Network bandwidth between the nodes:
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Openstack
- Number of Longhorn volumes in the cluster: none created so far
Additional context
Add any other context about the problem here.
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 19 (9 by maintainers)
Most likely there is some underlying networking issue on the openstack β¦ the API is having similar connection issues to other cluster targets. I guess we can close this one and have a chat with the private cloud infra team.
Thanks
Close the issue and feel free to open it if you need any support.