umee: chain reaches consensus failure on binary rebuild and restart

Summary of Bug

When running the beta version of the chain locally, rebuilding the binary in beta mode, then stopping and restarting the chain, it hits a consensus failure caused by an error from the x/leverage module:

ERR CONSENSUS FAILURE!!! err="-0.000002092846270928 years: negative time elapsed since last interest time" module=consensus ...

Not sure if this is a major issue, but still worth jotting down.

Version

Please provide the output of the following commands:

  • $ umeed version
    price-feeder/v0.1.0-c66f8961
  • $ go version
    go version go1.17.3 darwin/arm64
  • $ uname -a
    Darwin Adams-MacBook-Pro-2.local 21.3.0 Darwin Kernel Version 21.3.0: Wed Jan 5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_ARM64_T6000 arm64

Steps to Reproduce

Steps to reproduce the behavior:

  1. Launch beta umee network
     UMEE_ENABLE_BETA=true starport chain serve -c ./starport.ci.beta.yml -v --reset-once
  2. Rebuild binary with beta support
     UMEE_ENABLE_BETA=true make install
  3. Stop & Restart umee network
     UMEE_ENABLE_BETA=true starport chain serve -c ./starport.ci.beta.yml -v

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 29 (29 by maintainers)

Most upvoted comments

I see, I didn’t look at https://github.com/cosmos/gaia/issues/1533

In that case I agree. Let’s:

  • change log to Warn
  • add a TODO to solve it once the bug is resolved by Tendermint.

But exporting / importing stuff from blocks is done in the tendermint/cosmos-sdk layer, right?

No. We control how to handle import / export. https://github.com/umee-network/umee/blob/main/x/leverage/genesis.go

The inaccurate time isn’t in leverage export - it’s coming from BlockTime() itself going backwards. That’s also why Rafael was able to reproduce with gaiad.

Possible approaches:

  • Option 0: Remove the error and allow negative time to elapse. Bad because it would allow double-interest over the same time period.
  • Option 1: Fix BlockTime behavior on export. We would need to open an issue on cosmos sdk probably (we can do that anyway) but shouldn’t rely on it for our fix.
  • Option 2: Use block height. This would lead to inaccuracies each block (6.5sec being treated as 5sec on average) and reduce yields. Also interest would fail to accrue over long periods (e.g. 7 days) where chain is down.
  • Option 3: Modify AccrueAllInterest to treat “negative time elapsed” as “zero time elapsed”, aborting the function without modifying state until an EndBlock where BlockTime > LastInterestTime.
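Option 3 could look roughly like the sketch below. It is not the actual x/leverage code: interestState, AccruedSeconds and accrueLenient are hypothetical names made up for illustration, and real interest math is replaced by a simple count of elapsed seconds.

```go
package main

// interestState is a hypothetical stand-in for the module's stored state.
type interestState struct {
	LastInterestTime int64 // Unix seconds of the last successful accrual
	AccruedSeconds   int64 // illustrative: total seconds interest has accrued over
}

// accrueLenient sketches Option 3: when the block time is not after
// LastInterestTime, it returns early without touching state. It reports
// whether accrual was skipped.
func accrueLenient(s *interestState, blockTime int64) (skipped bool) {
	elapsed := blockTime - s.LastInterestTime
	if elapsed <= 0 {
		// Negative (or zero) elapsed time is treated as "nothing to do".
		// LastInterestTime is left alone, so the same period is never
		// accrued twice once block time catches up again.
		return true
	}
	s.AccruedSeconds += elapsed // real interest math would go here
	s.LastInterestTime = blockTime
	return false
}
```

Because LastInterestTime is only advanced on a successful accrual, a block whose time is 66 seconds in the past is skipped, and the next block with a later time accrues only from the stored LastInterestTime onward.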

Closing this back up, @RafilxTenfen was able to get the fork past this just by updating the genesis file :^)

Reopening this since @RafilxTenfen has seen it while trying to do a fork of our testnet

We were talking about it, and the only way for this to happen is for the LastInterestTime in state to be greater than ctx.BlockTime().Unix(). It comes down to the behavior of BlockTime and EndBlocker, during which AccrueAllInterest accrues interest using the difference of the two and then sets LastInterestTime to ctx.BlockTime().Unix().
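To make that sequence concrete, here is a minimal sketch of the flow, with no Cosmos SDK types. The keeper struct and endBlock method are hypothetical stand-ins, and secondsPerYear assumes a 365-day year; only the error text is taken from the log above.

```go
package main

import "fmt"

// secondsPerYear is an assumption: a 365-day year.
const secondsPerYear = 31536000

// keeper is a hypothetical stand-in that holds only LastInterestTime.
type keeper struct {
	lastInterestTime int64 // Unix seconds
}

// endBlock sketches the described flow: compute the time elapsed since the
// last accrual, fail if it is negative, otherwise record the block time.
func (k *keeper) endBlock(blockTime int64) error {
	elapsed := blockTime - k.lastInterestTime
	if elapsed < 0 {
		years := float64(elapsed) / secondsPerYear
		return fmt.Errorf("%.18f years: negative time elapsed since last interest time", years)
	}
	// Interest accrual over `elapsed` seconds would happen here.
	k.lastInterestTime = blockTime
	return nil
}
```

In this sketch, the error can only fire if a block arrives with a time earlier than the one stored by a previous endBlock call, which is the situation seen after the restart.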

The error message posted, by the way, works out to roughly a 66-second time reversal (assuming a 31,536,000-second year), longer than a block time, and might be consistent with the time the chain was stopped before relaunch. Still can’t figure out why BlockTime would reverse though.