fedimint: Weird communication errors while setting up federation

If I setup a federation using two or more machines in the same datacenter, everything works fine. But if one of the machines is a home computer on another continent, then weird stuff starts to happen. In this example, DKG actually failed.

Note 1: all ports are open and properly forwarded. Communication seems to be okay in both directions. Home internet is 1Gbit/s fiber, so no bandwidth issues besides international link constraints

Note 2: logs abbreviate for repeated or sequential messages by replacing them with (...)

home computer logs:

fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:10.714149Z  INFO connection{remote_addr=<redacted>:47346 conn_id=17}: net::peer::dkg: Peer 0 running distributed key generation...
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:12.543892Z  INFO run_io_thread: net::peer: Replacing existing connection
(...)
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:21.222195Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:21.239828Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(90) received=MessageId(1)
(...)
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:21.245097Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(90) received=MessageId(89)
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:22.010248Z  INFO run_io_thread: net::peer: Replacing existing connection
(...)
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:26.160320Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:27.275235Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(530) received=MessageId(136)
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:27.275263Z  INFO run_io_thread: net::peer: Replacing existing connection
(...)
fedimintd-mutinynet-fedimintd-1    | 2023-07-20T19:02:58.968762Z  INFO run_io_thread: net::peer: Replacing existing connection

Remote datacenter computer:

fedimintd_1          | 2023-07-20T19:02:11.899547Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: net::peer::dkg: Peer 1 running distributed key generation...
fedimintd_1          | 2023-07-20T19:02:12.643151Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:13.667591Z  INFO connection{remote_addr=<redacted>:53844 conn_id=14}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:13.676290Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:14.065690Z ERROR connection{remote_addr=<redacted>:53844 conn_id=14}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 14
fedimintd_1          | 2023-07-20T19:02:14.293186Z  INFO run_io_thread: net::peer: Replacing existing connection
(...)
fedimintd_1          | 2023-07-20T19:02:16.813292Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:17.242244Z  INFO connection{remote_addr=<redacted>:53854 conn_id=15}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:17.641922Z ERROR connection{remote_addr=<redacted>:53854 conn_id=15}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 15
fedimintd_1          | 2023-07-20T19:02:18.241657Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:18.251738Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(113) received=MessageId(1)
(...)
fedimintd_1          | 2023-07-20T19:02:18.267630Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(113) received=MessageId(92)
fedimintd_1          | 2023-07-20T19:02:18.869571Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:19.693103Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:20.072910Z  INFO connection{remote_addr=<redacted>:53858 conn_id=16}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:20.483714Z ERROR connection{remote_addr=<redacted>:53858 conn_id=16}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 16
fedimintd_1          | 2023-07-20T19:02:20.501468Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:20.515497Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(186) received=MessageId(113)
(...)
fedimintd_1          | 2023-07-20T19:02:20.557392Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(186) received=MessageId(158)
fedimintd_1          | 2023-07-20T19:02:21.289447Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:21.940945Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:23.274142Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:23.467870Z  INFO connection{remote_addr=<redacted>:47002 conn_id=17}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:23.866845Z ERROR connection{remote_addr=<redacted>:47002 conn_id=17}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 17
fedimintd_1          | 2023-07-20T19:02:24.490124Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:24.692470Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(238) received=MessageId(186)
(...)
fedimintd_1          | 2023-07-20T19:02:24.730179Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(238) received=MessageId(237)
fedimintd_1          | 2023-07-20T19:02:26.302726Z  INFO connection{remote_addr=<redacted>:47018 conn_id=18}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:26.312642Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:26.723068Z ERROR connection{remote_addr=<redacted>:47018 conn_id=18}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 18
fedimintd_1          | 2023-07-20T19:02:27.164129Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:28.351101Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:28.362582Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(561) received=MessageId(368)
(...)
fedimintd_1          | 2023-07-20T19:02:28.474514Z  INFO run_io_thread: net::peer: Received old message expected=MessageId(561) received=MessageId(560)
fedimintd_1          | 2023-07-20T19:02:29.062812Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: net::peer::dkg: Sending confirmations to other peers.
fedimintd_1          | 2023-07-20T19:02:29.063067Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: net::peer::dkg: Waiting for confirmations from other peers.
fedimintd_1          | 2023-07-20T19:02:29.723137Z  INFO connection{remote_addr=<redacted>:47034 conn_id=19}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:30.121731Z ERROR connection{remote_addr=<redacted>:47034 conn_id=19}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 19
fedimintd_1          | 2023-07-20T19:02:30.151219Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:31.130850Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:31.998347Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:32.553553Z  INFO connection{remote_addr=<redacted>:46852 conn_id=20}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:32.949790Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:33.059406Z ERROR connection{remote_addr=<redacted>:46852 conn_id=20}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 20
fedimintd_1          | 2023-07-20T19:02:34.184564Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:35.255629Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:35.461529Z  INFO connection{remote_addr=<redacted>:46868 conn_id=21}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:36.038083Z ERROR connection{remote_addr=<redacted>:46868 conn_id=21}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 21
fedimintd_1          | 2023-07-20T19:02:36.479843Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:37.448694Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:38.830992Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:39.045601Z  INFO connection{remote_addr=<redacted>:46878 conn_id=22}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:39.587477Z ERROR connection{remote_addr=<redacted>:46878 conn_id=22}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 22
fedimintd_1          | 2023-07-20T19:02:39.808029Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:40.746505Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:41.981746Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:42.034322Z  INFO connection{remote_addr=<redacted>:38974 conn_id=23}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:42.525462Z ERROR connection{remote_addr=<redacted>:38974 conn_id=23}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 23
fedimintd_1          | 2023-07-20T19:02:42.945827Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:43.838556Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:45.226698Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:45.443084Z  INFO connection{remote_addr=<redacted>:38976 conn_id=24}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:45.979793Z ERROR connection{remote_addr=<redacted>:38976 conn_id=24}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 24
fedimintd_1          | 2023-07-20T19:02:46.198558Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:47.126712Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:48.402344Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:48.420062Z  INFO connection{remote_addr=<redacted>:38982 conn_id=25}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:48.993393Z ERROR connection{remote_addr=<redacted>:38982 conn_id=25}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 25
fedimintd_1          | 2023-07-20T19:02:49.413231Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:50.268860Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:51.638846Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:51.881008Z  INFO connection{remote_addr=<redacted>:60856 conn_id=26}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:52.405201Z ERROR connection{remote_addr=<redacted>:60856 conn_id=26}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 26
fedimintd_1          | 2023-07-20T19:02:52.621806Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:53.557667Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:54.793772Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:54.843049Z  INFO connection{remote_addr=<redacted>:60858 conn_id=27}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:55.191758Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
(...)
fedimintd_1          | 2023-07-20T19:02:55.391306Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
fedimintd_1          | 2023-07-20T19:02:55.391598Z ERROR connection{remote_addr=<redacted>:60858 conn_id=27}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 27
fedimintd_1          | 2023-07-20T19:02:55.811309Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:55.864001Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
(...)
fedimintd_1          | 2023-07-20T19:02:56.276473Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
fedimintd_1          | 2023-07-20T19:02:56.695335Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:56.742026Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
(...)
fedimintd_1          | 2023-07-20T19:02:58.448011Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
fedimintd_1          | 2023-07-20T19:02:58.448121Z  INFO connection{remote_addr=<redacted>:60862 conn_id=28}: jsonrpsee_server::server: Accepting new connection 6/10
fedimintd_1          | 2023-07-20T19:02:58.846899Z ERROR connection{remote_addr=<redacted>:60862 conn_id=28}: jsonrpsee_server::transport::ws: WS transport error: i/o error: Transport endpoint is not connected (os error 107); terminate connection: 28
fedimintd_1          | 2023-07-20T19:02:58.869064Z  INFO run_io_thread: net::peer: Replacing existing connection
fedimintd_1          | 2023-07-20T19:02:58.922201Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
(...)
fedimintd_1          | 2023-07-20T19:02:59.075063Z  WARN net::peer: Peer 0 has 10001 pending messages. Dropping new message.
fedimintd_1          | 2023-07-20T19:02:59.075453Z ERROR connection{remote_addr=<redacted>:34162 conn_id=2}: net::peer::dkg: Timeout waiting for dkg completion confirmation from other peers
fedimintd_1          | 2023-07-20T19:02:59.075876Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: net::peer: Distributed key generation has completed successfully!
fedimintd_1          | 2023-07-20T19:02:59.076230Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: timing: Operation timing name=distributed-gen duration_ms=47177
fedimintd_1          | 2023-07-20T19:02:59.079018Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: task: Waiting for task to finish task=io-thread-peer-0
fedimintd_1          | 2023-07-20T19:02:59.079722Z  INFO run_io_thread: net::peer: Shutting down peer connection state machine peer=PeerId(0)
fedimintd_1          | 2023-07-20T19:02:59.081718Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: task: Task finished task=io-thread-peer-0
fedimintd_1          | 2023-07-20T19:02:59.082166Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: task: Waiting for task to finish task=listen task
fedimintd_1          | 2023-07-20T19:02:59.082599Z  INFO connection{remote_addr=<redacted>:34162 conn_id=2}: task: Task finished task=listen task

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 15 (15 by maintainers)

Commits related to this issue

Most upvoted comments

Anyone with the affected setup would like to try #2841 ?

Works for me!

Anyone with the affected setup would like to try https://github.com/fedimint/fedimint/pull/2841 ?

This got to be some logical error somewhere, not a performance issue. And if it’s performance it would rather be latency (some timeout somewhere).