lighthouse: InsufficientPeers resulting in missing sync committees
Description
Our Lighthouse nodes occasionally start to miss attestations, and today they also missed sync committee messages. The beacon node logs suggest that the node fails to publish these messages because of InsufficientPeers, despite being connected to 50+ peers. Restarting the node solves the issue.
Version
Lighthouse v2.0.1 (using vouch as the validator client)
Present Behaviour
Dec 07 13:30:41 lhs-val01 lighthouse[997]: Dec 07 13:30:41.000 INFO Synced slot: 2671351, block: 0xb807…9b44, epoch: 83479, finalized_epoch: 83477, finalized_root: 0x0e20…8ae0, peers: 55, service: slot_notifier
Dec 07 13:30:49 lhs-val01 lighthouse[997]: Dec 07 13:30:49.652 INFO New block received hash: 0x81272318e6771c48a8cdd656255c1c61afd45d9a6cc5d037d28059235600e05e, slot: 2671352
Dec 07 13:30:50 lhs-val01 lighthouse[997]: Dec 07 13:30:50.467 WARN Could not publish message error: InsufficientPeers, service: libp2p
Dec 07 13:30:51 lhs-val01 lighthouse[997]: Dec 07 13:30:51.487 WARN Could not publish message error: InsufficientPeers, service: libp2p
Dec 07 13:30:51 lhs-val01 lighthouse[997]: Dec 07 13:30:51.490 WARN Could not publish message error: InsufficientPeers, service: libp2p
Dec 07 13:30:51 lhs-val01 lighthouse[997]: Dec 07 13:30:51.491 WARN Could not publish message error: InsufficientPeers, service: libp2p
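One plausible reading of these warnings, in line with a later reply in this thread about sync-committee peers: the `peers: 55` figure counts all connected peers, whereas publishing needs at least one peer subscribed to the specific gossip topic. The Python sketch below is a toy model of that distinction, not Lighthouse or libp2p code; the `publish` helper, the `InsufficientPeers` exception class, and the topic names are hypothetical.

```python
class InsufficientPeers(Exception):
    """Raised when no connected peer is subscribed to the target topic."""


def publish(topic: str, peers: dict[str, set[str]]) -> list[str]:
    """Return the peers a message on `topic` would be sent to.

    `peers` maps a peer id to the set of topics that peer subscribes to.
    """
    recipients = sorted(p for p, topics in peers.items() if topic in topics)
    if not recipients:
        # Many peers may be connected, but none of them is on this topic.
        raise InsufficientPeers(topic)
    return recipients


# Hypothetical node: 55 connected peers, all subscribed to the block topic,
# none subscribed to the sync committee topic.
peers = {f"peer{i}": {"beacon_block"} for i in range(55)}

print("connected peers:", len(peers))
print("block recipients:", len(publish("beacon_block", peers)))
try:
    publish("sync_committee_0", peers)
except InsufficientPeers as err:
    print("Could not publish message, error: InsufficientPeers, topic:", err)
```

In this model the total peer count stays healthy while publishing to one topic still fails, which is why the peer count alone may not rule out this error.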
Steps to resolve
Restarting the beacon node usually resolves the issue.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 3
- Comments: 20 (9 by maintainers)
Thanks for looking into this. I used Somer Esat’s guide, with the NTP service configured:
System clock synchronized: yes
NTP service: active
I once had a case where this service was active but not synchronized; I’ll watch for that the next time I see bad effectiveness.
For anyone reading this, this is the kind of error to look for:
Sep 05 08:31:55 eth3 lighthouse[26048]: Sep 05 07:31:55.984 DEBG Invalid attestation from network type: "aggregated", peer_id: 16Uiu2HAm4zThK1zDn8jhGoLiycPZfYNiJyi4p1xcBHEY7q7jAHC4, block: 0x259168fa37ca9548abf2812bba8d551c386a12ac380bc7d91a400ee0fa134f98, reason: FutureSlot { attestation_slot: Slot(4628258), latest_permissible_slot: Slot(4628257) }
Edit: I can confirm it’s an NTP issue:
systemd-timesyncd[713]: Failed to set up connection socket: Address family not supported by protocol
It was caused by disabling IPv6.
Thanks for reporting @fkbenjamin. There are some network-wide issues with maintaining a healthy count of sync-committee peers at present. There are nascent talks about modifying the spec to increase the number of nodes which are required to connect to sync committees.
I’ll leave this open so we can track the issue in Lighthouse specifically.
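The `FutureSlot` rejection quoted earlier in the thread can be reproduced with simple slot arithmetic, which shows why a lagging system clock makes valid attestations look as if they come from the future. The sketch below assumes mainnet parameters (genesis time 1606824023, 12-second slots) and a 0.5-second clock tolerance; these constants and the helper names are assumptions for illustration and are not taken from this thread or from Lighthouse's source.

```python
GENESIS_TIME = 1606824023   # assumed: mainnet beacon chain genesis (Unix seconds)
SECONDS_PER_SLOT = 12       # assumed: mainnet slot length
CLOCK_DISPARITY = 0.5       # assumed: tolerated clock difference, in seconds


def slot_start(slot: int) -> int:
    """Unix time at which `slot` begins."""
    return GENESIS_TIME + slot * SECONDS_PER_SLOT


def latest_permissible_slot(local_unix_time: float) -> int:
    """Highest attestation slot a node would accept given its own clock."""
    elapsed = local_unix_time + CLOCK_DISPARITY - GENESIS_TIME
    return int(elapsed // SECONDS_PER_SLOT)


attestation_slot = 4628258          # slot from the log line in this thread
real_time = slot_start(attestation_slot)

# A node with an accurate clock accepts the attestation for the current slot.
accurate = latest_permissible_slot(real_time)

# A node whose clock runs 3 s behind (hypothetical skew) still believes it is
# in the previous slot, so the same attestation is rejected as FutureSlot.
skewed = latest_permissible_slot(real_time - 3.0)

print("accurate clock:", accurate, "accepts:", attestation_slot <= accurate)
print("skewed clock:  ", skewed, "accepts:", attestation_slot <= skewed)
```

Under these assumptions, a skew of a few seconds is enough to yield `latest_permissible_slot: Slot(4628257)` for an attestation in slot 4628258, matching the shape of the logged error.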
We’re also seeing similar issues. We’re using Lighthouse as the VC.
Restarting the beacon node fixes it: