go-spacemesh: Smeshing fails on restarting Node
Description
After a successful setup, smeshing fails on the next Node run.
After smeshing is set up, the Node creates postdata_0.bin and postdata_metadata.json in the directory that the User defined as the smeshing-datadir.
On the next run, the Node looks for key.bin in this directory instead of the previously used default location ~/post/data/key.bin. As a consequence, it can’t find it and creates a new key.bin there with a different (new) Node ID. Because the new Node ID does not match the ID the existing post data was created with, post setup fails:
failed to complete post setup {"node_id": "7ca3802629e39fe48e334e104b84590683db25dfe639afd92615719e6084dfd2", "module": "atxBuilder", "errmsg": "`ID` config mismatch; expected: 7ca3802629e39fe48e334e104b84590683db25dfe639afd92615719e6084dfd2, found: 8d46e54777dd4d60beeca9fd02b3ab1a5c2797af43d801844a3b22c71cc075a0, datadir: /some/user/defined/path/for/post/data", "name": "atxBuilder"}
Then, if the User cleans up the defined post directory (or leaves only key.bin there), it works fine.
Steps to reproduce
- Run the Node from scratch using some node-config.json without smeshing props
- Run smrepl --server localhost:9092
- Run post setup. Follow the steps, specify a non-default directory for the post data.
- Put the new section smeshing in the node-config
- Restart Node process
Actual Behavior
Node isn’t Smeshing. Post data “corrupted”. The Node log (related parts):
2021-10-04T20:27:01.205+0300 INFO 00000.defaultLogger App version: v0.2.2-beta.1. Git: 38056f5-dirty - 38056f59d331e35e1e576c12040f522c610aff35 . Go Version: go1.15.13. OS: darwin-amd64
...
2021-10-04T20:27:02.065+0300 INFO starting spacemesh {"data-dir": "/Users/brusher/Library/Application Support/Electron/node-data/205", "post-dir": "/Users/brusher/spacemesh", "hostname": "Kirill-557.local", "name": ""}
2021-10-04T20:27:02.065+0300 INFO 00000.defaultLogger Looking for identity file at `/Users/brusher/spacemesh/key.bin`
2021-10-04T20:27:02.065+0300 INFO 00000.defaultLogger Identity file not found. Creating new identity...
2021-10-04T20:27:02.066+0300 INFO 00000.defaultLogger created new identity {"public_key": "29137", "name": ""}
...
2021-10-04T20:27:02.399+0300 INFO 29137.clock started notifying {"node_id": "29137508efea26a1777a84c7d13f53f7d287b8cb3cacf789c09c3a3d5dc2dd9e"}
2021-10-04T20:27:02.399+0300 INFO 29137.post post setup session starting {"node_id": "29137508efea26a1777a84c7d13f53f7d287b8cb3cacf789c09c3a3d5dc2dd9e", "module": "post", "data_dir": "/Users/brusher/spacemesh", "num_units": "4", "labels_per_unit": "1024", "bits_per_label": "8", "provider": "1", "name": "post"}
...
2021-10-04T20:27:02.400+0300 INFO 00000.defaultLogger starting new grpc server on :9092
2021-10-04T20:27:02.400+0300 ERROR 29137.atxBuilder failed to complete post setup {"node_id": "29137508efea26a1777a84c7d13f53f7d287b8cb3cacf789c09c3a3d5dc2dd9e", "module": "atxBuilder", "errmsg": "`ID` config mismatch; expected: 29137508efea26a1777a84c7d13f53f7d287b8cb3cacf789c09c3a3d5dc2dd9e, found: 6859a5fe4f53226bd43c607053dcd62cb3bdf2eda5d9b01b4f1de6d1715d4e05, datadir: /Users/brusher/spacemesh", "name": "atxBuilder"}
...
Expected Behavior
Smeshing works well on the second run without any tricks to make it work.
Here we should think about how it should work:
- The NodeID / SmesherID should be the same and should be stored in some default location only (so we don’t need to store it in post data dir and worry that someone can delete it).
  In case we do not assume that the NodeID and reward address should change depending on the open wallet, I suggest storing this file in a single and even more secure place than ~/post/data/key.bin, e.g. %APPDATA%/spacemesh/key.bin (~/Library/Application Support/spacemesh/key.bin on macOS, ~/AppData/Roaming/spacemesh/key.bin on Windows, ~/.config on Linux). I think this is the best option, but I may be missing something.
- The key.bin file should be copied to the post data-dir right when smeshing is set up.
Environment
macOS 10.13.6
go-spacemesh v.0.2.2-beta1
About this issue
- State: closed
- Created 3 years ago
- Comments: 30 (30 by maintainers)
💯
I’ll add creating an issue for go-spacemesh to my todo list.
Cool. Let’s summarize:
- key.bin file in the specified directory:
- ~/post/data/key.bin)
- ~/post/data/key.bin) using shasum. If it differs, update the config and restart the node. If it is the same, just call StartSmeshing.
- ~/post/data/key.bin not found: update the config and restart the node as well
If it sounds good, I’ll create an issue for Smapp and then paste a link here. About the issue related to go-sm, I propose to summarize everything in the new issue, post the link here and close this one 😃
@brusherru I suggested checking first whether the datadir changed…
The “kludge” you suggested is perfect. Then we don’t need a short term fix in the node.
I had a chat with @noamnelke about this. There are multiple considerations here:
PostId and should be renamed to users as such in clients and dash/explore. The current name is misleading and was born out of historical misuse of this id (see bullet 2 below).
@noamnelke - please review this summary