longhorn: [BUG] When the manager starts, it reports "Failed to release lock"

When a manager starts at the same time as other managers, some of them finish the upgrade process but then report:

E0521 01:05:33.053845 1 leaderelection.go:282] Failed to release lock: Lease.coordination.k8s.io "longhorn-manager-upgrade-lock" is invalid: spec.leaseDurationSeconds: Invalid value: 0: must be greater than 0

This delays manager startup on multiple nodes.
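
For readers unfamiliar with the message, here is a minimal sketch of the kind of check that produces it. It is not the Kubernetes or Longhorn code: `leaseSpec`, `validateLeaseSpec` and `releaseRecord` are hypothetical names, and the model is simplified to the single field named in the error. It assumes that the release path writes a lease record whose duration is left at its zero value, which is one plausible reading of the log line rather than a confirmed root cause.

```go
package main

import "fmt"

// leaseSpec is a hypothetical, stripped-down stand-in for a Lease spec.
// Only the field named in the error message is modeled (assumption).
type leaseSpec struct {
	HolderIdentity       string
	LeaseDurationSeconds int32
}

// validateLeaseSpec mimics the rule quoted in the log line:
// spec.leaseDurationSeconds must be greater than 0.
func validateLeaseSpec(s leaseSpec) error {
	if s.LeaseDurationSeconds <= 0 {
		return fmt.Errorf("spec.leaseDurationSeconds: Invalid value: %d: must be greater than 0",
			s.LeaseDurationSeconds)
	}
	return nil
}

// releaseRecord builds the record a holder would write when giving up the
// lock: the holder identity is cleared and the duration is set explicitly.
func releaseRecord(durationSeconds int32) leaseSpec {
	return leaseSpec{HolderIdentity: "", LeaseDurationSeconds: durationSeconds}
}

func main() {
	// A record with a zero duration is rejected, as in the reported error.
	if err := validateLeaseSpec(releaseRecord(0)); err != nil {
		fmt.Println("Failed to release lock:", err)
	}
	// A record with any positive duration passes this check.
	if err := validateLeaseSpec(releaseRecord(1)); err == nil {
		fmt.Println("lock released")
	}
}
```

Under this assumption, the failed release means the lease is not handed over immediately, and the other managers have to wait for it to expire before taking the lock, which would be consistent with the startup delay described above.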

Longhorn version: v0.8.1

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 25 (18 by maintainers)

Most upvoted comments

Hello,

After upgrading to v1.0.0 we are seeing the same error you mentioned, and it is also causing an outage of the longhorn-manager:

time="2020-06-04T08:05:15Z" level=info msg="Start upgrading"
time="2020-06-04T08:05:15Z" level=info msg="No API version upgrade is needed"
time="2020-06-04T08:05:15Z" level=info msg="Finish upgrading"
E0604 08:05:15.614795       1 leaderelection.go:282] Failed to release lock: Lease.coordination.k8s.io "longhorn-manager-upgrade-lock" is invalid: spec.leaseDurationSeconds: Invalid value: 0: must be greater than 0
time="2020-06-04T08:05:15Z" level=info msg="Upgrade leader lost: ip-10-226-29-79.eu-central-1.compute.internal"
E0604 08:05:16.512706       1 kubernetes_node_controller.go:256] Couldn't get nodes ip-10-226-29-79.eu-central-1.compute.internal: node "ip-10-226-29-79.eu-central-1.compute.internal" not found
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn Engine Image controller"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn Setting controller"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn node controller"
time="2020-06-04T08:05:17Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn engine controller"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn websocket controller"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn Kubernetes node controller"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn replica controller"
time="2020-06-04T08:05:17Z" level=info msg="Start Longhorn volume controller"
time="2020-06-04T08:05:17Z" level=info msg="Starting Longhorn instance manager controller"
time="2020-06-04T08:05:17Z" level=info msg="Start kubernetes controller"
time="2020-06-04T08:05:17Z" level=debug msg="Start monitoring instance manager instance-manager-r-9becc455"
time="2020-06-04T08:05:17Z" level=debug msg="Start monitoring pvc-3d9132c7-024d-498c-a296-68454f8f6618-e-33b76109"
time="2020-06-04T08:05:17Z" level=debug msg="Start monitoring instance manager instance-manager-e-42ecd884"
time="2020-06-04T08:05:17Z" level=debug msg="Start backup store monitoring for s3://d3vw-i-com-europe-utilities-frankfurt-longhorn-backupstore@eu-central-1/pop-frankfurt/"
time="2020-06-04T08:05:21Z" level=debug msg="Skip rebuilding for volume pvc-3d9132c7-024d-498c-a296-68454f8f6618 because there is rebuilding in process"
time="2020-06-04T08:05:23Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:29Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:35Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:41Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:45Z" level=debug msg="Skip rebuilding for volume pvc-3d9132c7-024d-498c-a296-68454f8f6618 because there is rebuilding in process"
time="2020-06-04T08:05:47Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:53Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:05:59Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:05Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:11Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:15Z" level=debug msg="Skip rebuilding for volume pvc-3d9132c7-024d-498c-a296-68454f8f6618 because there is rebuilding in process"
time="2020-06-04T08:06:17Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:23Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:27Z" level=debug msg="Skip rebuilding for volume pvc-3d9132c7-024d-498c-a296-68454f8f6618 because there is rebuilding in process"
time="2020-06-04T08:06:29Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:35Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:41Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:45Z" level=debug msg="Skip rebuilding for volume pvc-3d9132c7-024d-498c-a296-68454f8f6618 because there is rebuilding in process"
time="2020-06-04T08:06:47Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:53Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:06:59Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:07:05Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:07:11Z" level=debug msg="Waiting for engine image longhornio/longhorn-engine:v1.0.0 to be ready"
time="2020-06-04T08:07:15Z" level=debug msg="Skip rebuilding for volume pvc-3d9132c7-024d-498c-a296-68454f8f6618 because there is rebuilding in process"
time="2020-06-04T08:07:17Z" level=fatal msg="Error starting manager: failed to wait for engine image longhornio/longhorn-engine:v1.0.0: Wait for engine image longhornio/longhorn-engine:v1.0.0 timed out"
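
The tail of this log looks like a poll-with-timeout loop: judging by the timestamps, the engine image is checked roughly every 6 seconds and the manager exits with a fatal error about 2 minutes after the first check. A minimal sketch of that pattern follows. It is not the longhorn-manager code: `waitUntilReady` and its parameters are hypothetical, and the interval and timeout are inferred from the timestamps only.

```go
package main

import (
	"fmt"
	"time"
)

// waitUntilReady calls check once per interval until it returns true or the
// timeout elapses. The name and signature are illustrative assumptions.
func waitUntilReady(name string, check func() bool, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if check() {
			return nil
		}
		fmt.Printf("Waiting for engine image %s to be ready\n", name)
		time.Sleep(interval)
	}
	return fmt.Errorf("Wait for engine image %s timed out", name)
}

func main() {
	// Short durations keep the demo fast; the log suggests values closer
	// to 6*time.Second and 2*time.Minute.
	neverReady := func() bool { return false }
	err := waitUntilReady("longhornio/longhorn-engine:v1.0.0", neverReady,
		10*time.Millisecond, 50*time.Millisecond)
	if err != nil {
		fmt.Println("Error starting manager:", err)
	}
}
```

Read this way, the fatal line says that the engine image never became ready within the timeout. The log alone does not show whether that is connected to the earlier lock-release error.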