relayer: Getting "failed to get trusted headers" error

Relayer version: 1.0.0-rc. I am trying to connect the musselnet-2 and stargate-final networks via IBC. This is the error I am currently getting:

rly tx link musselnet-2_stargate-final
I[2021-01-18|00:02:01.812] ★ Clients created: client(07-tendermint-0) on chain[musselnet-2] and client(07-tendermint-0) on chain[stargate-final]
I[2021-01-18|00:02:42.784] failed to get trusted headers: chain(stargate-final): post failed: Post "http://x.x.x.x:26657": EOF
I[2021-01-18|00:02:42.784] retrying transaction...
I[2021-01-18|00:05:22.766] failed to get trusted headers: chain(stargate-final): post failed: Post "http://x.x.x.x:26657": EOF
I[2021-01-18|00:05:22.766] retrying transaction...
I[2021-01-18|00:05:52.751] failed to get trusted headers: chain(stargate-final): post failed: Post "http://x.x.x.x:26657": EOF
I[2021-01-18|00:05:52.751] retrying transaction...

When I traced the bug, I found that this line is causing the error: https://github.com/cosmos/relayer/blob/v1.0.0-rc1/relayer/headers.go#L122. It looks like it might be related to this pull request: https://github.com/cosmos/cosmos-sdk/pull/8341

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 30 (26 by maintainers)

Most upvoted comments

Release v0.8.2 appears to have fixed this for us. Haven’t noticed any issues and have been hitting it pretty hard for a couple hours. Nice work!

We should update the error message to indicate that the headers have likely been pruned.

@kogisin thanks for the report!! This was very useful. It made me realize I forgot to check if a client is expired or frozen when reusing existing clients.

Since bifrost-2 requires trusting-period to be lower than unbonding-period which was 24 hours

This is a Tendermint light client security requirement. The trusting period must always be less than the unbonding period.

Here is what is happening:

  1. Clients are being created with the default configs
  2. For testing purposes, it is decided to abandon this client and restart
  3. A new linking attempt is made using the same default config
  4. To avoid redundancy, the code checks to see if a client created already matches the necessary configs
  5. What I forgot to check is that the client is not expired.
  6. The code uses the on-chain client to query the trusted height in order to construct the proof correctly. The on-chain client has a very old height, so the historical info for that height is long gone and the header query fails.

Fixes:

  • Check that the client is not expired (there’s already a helper function we can use)
  • Update error messages. There are a few things that could have gone wrong here. We should try to pinpoint exactly where and update the error message so it says “your client is expired” or “check that your node isn’t pruning height” etc.
  • Ensure off-chain clients are being updated before creating on-chain clients

@orkunkl thanks for trying. Did the heights in the error message change?

I will try to reproduce your error this week and then continue debugging.

As a side note, you’ll need to update your channel order to be UNORDERED. ICS-20 does not support ordered channels.

I solved the issue by changing our full node’s pruning strategy to “nothing”.