lnd: failing link: unable to update commitment: cannot add duplicate keystone with error: internal error

Background

The error message “failing link: unable to update commitment: cannot add duplicate keystone with error: internal error” appeared in my logs, for reasons I don’t understand. Afterwards the channel was unusable, and my peer (running CLN) immediately force-closed it at 01:26.

The force-close transaction contains one outgoing HTLC (timeout, according to lncli closedchannels) with size 200003 sat.

My peer says:

I’m getting hit by several force closes for all kinds of reasons in the last couple of days. Ours is just “internal error” in my log.

Relevant snippet from my logs:

01:26:09.438 [INF] PEER: NodeKey(PUBKEY) loading ChannelPoint(CHANPOINT)
01:26:09.439 [DBG] CNCT: New ChainEventSubscription(id=15) for ChannelPoint(CHANPOINT)
01:26:09.439 [INF] HSWC: ChannelLink(CHANPOINT): starting
01:26:09.439 [INF] CNCT: Attempting to update ContractSignals for ChannelPoint(CHANPOINT)
01:26:09.439 [INF] HSWC: ChannelLink(CHANPOINT): HTLC manager started, bandwidth=3663215408 mSAT
01:26:09.439 [INF] HSWC: ChannelLink(CHANPOINT): attempting to re-synchronize
01:26:09.439 [INF] PEER: Negotiated chan series queries with PUBKEY
01:26:09.519 [ERR] RPCS: [connectpeer]: error connecting to peer: already connected to peer: PUBKEY@IP2:48304
01:26:09.519 [ERR] RPCS: [/lnrpc.Lightning/ConnectPeer]: already connected to peer: PUBKEY@IP2:48304
01:26:09.839 [INF] HSWC: ChannelLink(CHANPOINT): received re-establishment message from remote side
01:26:09.851 [DBG] HSWC: ChannelLink(CHANPOINT): loaded 0 fwd pks
01:26:11.237 [DBG] HSWC: ChannelLink(CHANPOINT): queueing keystone of ADD open circuit: (Chan ID=0:0:0, HTLC ID=6133225)->(Chan ID=CHAN_ID, HTLC ID=4619)
01:26:11.991 [DBG] HSWC: ChannelLink(CHANPOINT): removing Add packet (Chan ID=0:0:0, HTLC ID=6133225) from mailbox
01:26:13.366 [DBG] HSWC: ChannelLink(CHANPOINT): settle-fail-filter &{1 [0]}
01:26:13.366 [DBG] HSWC: ChannelLink(CHANPOINT): Failed to send 500059997 mSAT
01:26:15.297 [DBG] CNCT: ChannelArbitrator(CHANPOINT): attempting state step with trigger=chainTrigger from state=StateDefault
01:26:15.297 [DBG] CNCT: ChannelArbitrator(CHANPOINT): new block (height=734163) examining active HTLC's
01:26:15.297 [DBG] CNCT: ChannelArbitrator(CHANPOINT): checking commit chain actions at height=734163, in_htlc_count=0, out_htlc_count=2
01:26:15.297 [DBG] CNCT: ChannelArbitrator(CHANPOINT): no actions for chain trigger, terminating
01:26:15.297 [DBG] CNCT: ChannelArbitrator(CHANPOINT): terminating at state=StateDefault
01:26:16.485 [DBG] HSWC: ChannelLink(CHANPOINT): settle-fail-filter &{1 [0]}
01:26:16.485 [DBG] HSWC: ChannelLink(CHANPOINT): Failed to send 1000047000 mSAT
01:26:26.905 [DBG] HSWC: ChannelLink(CHANPOINT): queueing keystone of ADD open circuit: (Chan ID=0:0:0, HTLC ID=6133240)->(Chan ID=CHAN_ID, HTLC ID=4620)

01:26:26.956 [ERR] HSWC: ChannelLink(CHANPOINT): failing link: unable to update commitment: cannot add duplicate keystone with error: internal error

01:26:26.956 [INF] HSWC: ChannelLink(CHANPOINT): exited
01:26:26.957 [INF] HSWC: ChannelLink(CHANPOINT): stopping
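
The "cannot add duplicate keystone" message suggests that a keystone was queued for an outgoing circuit key (Chan ID, HTLC ID) that already had one. The following is a toy model of that condition, not lnd's implementation: the class, method, and exception names are hypothetical, and the keys are plain tuples taken from the log lines above.

```python
class DuplicateKeystoneError(Exception):
    """Hypothetical error mirroring the log message text."""


class ToyCircuitMap:
    """Toy model: maps an outgoing circuit key to its incoming circuit key."""

    def __init__(self):
        # out_key -> in_key, one entry per opened circuit
        self._opened = {}

    def open_circuit(self, in_key, out_key):
        # Refuse a second keystone for an outgoing key that is still open.
        if out_key in self._opened:
            raise DuplicateKeystoneError("cannot add duplicate keystone")
        self._opened[out_key] = in_key

    def close_circuit(self, out_key):
        # Remove the circuit and return its incoming key (None if unknown).
        return self._opened.pop(out_key, None)


if __name__ == "__main__":
    cm = ToyCircuitMap()
    cm.open_circuit(("0:0:0", 6133225), ("CHAN_ID", 4619))
    cm.open_circuit(("0:0:0", 6133240), ("CHAN_ID", 4620))
    try:
        # Reusing an outgoing HTLC ID that is still open is rejected.
        cm.open_circuit(("0:0:0", 6133241), ("CHAN_ID", 4620))
    except DuplicateKeystoneError as err:
        print("rejected:", err)
```

In this model the error can only occur when an outgoing HTLC ID is reused before the earlier circuit has been closed. Whether that is what happened here after the re-synchronization is an assumption, not something the log confirms.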

Your environment

lnd 0.14.3-beta-rc1
Linux server 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux
bitcoind v23

Steps to reproduce

Have a non-anchor channel with a CLN peer behind Tor. Have a somewhat flaky connection. Send HTLCs to the peer.

About this issue

State: closed
Created 2 years ago
Reactions: 4
Comments: 27 (13 by maintainers)

Most upvoted comments

I run the latest CLN (0.11.1) - my node has force closed multiple channels when my peers told me they had an internal error.

According to BOLT 1 (https://github.com/lightning/bolts/blob/master/01-messaging.md#requirements-2)

The receiving node:

upon receiving error:
    if channel_id is all zero:
        MUST fail all channels with the sending node.
    otherwise:
        MUST fail the channel referred to by channel_id, if that channel is with the sending node.

So the CLN behaviour conforms to the spec, AFAICT.
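
The receive-side rule quoted above can be sketched as a small selection function. This is a minimal illustration, not code from CLN or lnd: the function name and the representation of a peer's channels as a set of 32-byte IDs are assumptions, and what "failing" a channel involves (for example a force close) is left to the caller.

```python
# A channel_id of 32 zero bytes addresses every channel with the sender.
ALL_ZERO_CHANNEL_ID = bytes(32)


def channels_to_fail(peer_channel_ids, error_channel_id):
    """Return the channel IDs that must be failed after receiving an error.

    peer_channel_ids: set of channel IDs we have open with the sending node.
    error_channel_id: the channel_id field of the received error message.
    """
    if error_channel_id == ALL_ZERO_CHANNEL_ID:
        # All-zero channel_id: fail all channels with the sending node.
        return set(peer_channel_ids)
    if error_channel_id in peer_channel_ids:
        # Otherwise: fail only the referenced channel, if it is with the sender.
        return {error_channel_id}
    # The referenced channel is not with this peer: nothing to fail.
    return set()


if __name__ == "__main__":
    chan_a = bytes([1]) * 32
    chan_b = bytes([2]) * 32
    print(len(channels_to_fail({chan_a, chan_b}, ALL_ZERO_CHANNEL_ID)))
    print(channels_to_fail({chan_a, chan_b}, chan_a) == {chan_a})
```

Under this rule, a peer that receives an error naming the channel has no option to keep that channel open, which is consistent with the force close described in this issue.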

zerofeerouting on May 23, 2022

Ok, I might have an idea of what’s going on. If I’m right this is an lnd problem, not a c-l problem

Crypt-iQ on May 2, 2022

I have had about 20 of these over the last couple of days. This is a really pressing issue.

zerofeerouting on May 23, 2022

Noted a related thread in #6482 where in some cases we may not be properly cancelling inbound HTLCs if we attempt to send a commitment but the remote peer never replies. This is a bit trickier since we’ve technically already sent out that valid commitment, so we need to be playing that HTLC (may lead to a force close since we want to be able to safely time out that incoming HTLC).

Roasbeef on May 4, 2022

Looks like this was introduced inadvertently in this PR (according to @Crypt-iQ): https://github.com/lightningnetwork/lnd/pull/4183

Roasbeef on May 3, 2022