neo: Network protocol suggestion: getfullblocks

Some improvements are proposed in https://github.com/neo-project/neo/issues/366. I’d like to discuss one more that is also backwards compatible.

The current getblocks command is inefficient and has a misleading name. The name suggests you will receive blocks, but in reality you get hashes (via an inv message). You then repackage the same hashes from that inv message into a getdata command to finally receive the block messages. This is a 4-step process:

  1. -> getblocks
  2. <- inv with hash list
  3. -> getdata with the received hash list
  4. <- n times block (for all requested hashes)

The same result can be achieved in just 2 steps.

Proposal

I’d like to propose a getfullblocks command (or something along those lines) that does the above in 2 steps (specifically only steps 1 and 4). It is nearly identical to the current getblocks, but instead of sending an inv it immediately returns the requested blocks.
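To make the saving concrete, here is a minimal sketch that models both flows as lists of messages. It is written in Python so it stays self-contained and does not depend on Neo internals. The message names come from this issue; the counting model, the function names, and the choice to ignore message headers and length prefixes are assumptions for illustration only.

```python
from __future__ import annotations

HASH_SIZE = 32  # a UInt256 block hash is 32 bytes


def current_flow(n_blocks: int) -> list[str]:
    """getblocks -> inv -> getdata -> n block messages."""
    return ["-> getblocks", "<- inv", "-> getdata"] + ["<- block"] * n_blocks


def proposed_flow(n_blocks: int) -> list[str]:
    """getfullblocks -> n block messages (no inv/getdata hop)."""
    return ["-> getfullblocks"] + ["<- block"] * n_blocks


def round_trips(flow: list[str]) -> int:
    """Each outgoing request ("->") starts one request/response exchange."""
    return sum(1 for msg in flow if msg.startswith("->"))


def redundant_hash_bytes(n_blocks: int) -> int:
    """Hash bytes sent in inv and echoed back in getdata (headers ignored)."""
    return 2 * HASH_SIZE * n_blocks


if __name__ == "__main__":
    n = 500
    for name, flow in (("current", current_flow(n)), ("proposed", proposed_flow(n))):
        print(f"{name}: {len(flow)} messages, {round_trips(flow)} round trip(s)")
    print(f"hash bytes avoided per batch: {redundant_hash_bytes(n)}")
```

Under this model, a batch of 500 blocks drops from two request/response exchanges to one and avoids sending the same 500 hashes twice.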

Untested example, but it should be close enough to explain the idea:

private void OnGetFullBlocksMessageReceived(GetBlocksPayload payload)
{
    // Same lookup as OnGetBlocksMessageReceived(): find the block to start after.
    UInt256 hash = payload.HashStart[0];
    if (hash == payload.HashStop) return;
    BlockState state = Blockchain.Singleton.Store.GetBlocks().TryGet(hash);
    if (state == null) return; // unknown start hash, nothing to send

    // Walk forward from the start block, capped at the number of hashes an inv would carry.
    for (uint i = 1; i <= InvPayload.MaxHashesCount; i++)
    {
        uint index = state.TrimmedBlock.Index + i;
        if (index > Blockchain.Singleton.Height)
            break;
        hash = Blockchain.Singleton.GetBlockHash(index);
        if (hash == null) break;
        if (hash == payload.HashStop) break;

        // code extracted from OnGetDataMessageReceived()
        Blockchain.Singleton.RelayCache.TryGet(hash, out IInventory inventory);
        if (inventory == null)
            inventory = Blockchain.Singleton.GetBlock(hash);
        if (inventory is Block block)
        {
            if (bloom_filter == null)
            {
                Context.Parent.Tell(Message.Create("block", inventory));
            }
            else
            {
                BitArray flags = new BitArray(block.Transactions.Select(p => bloom_filter.Test(p)).ToArray());
                Context.Parent.Tell(Message.Create("merkleblock", MerkleBlockPayload.Create(block, flags)));
            }
        }
        // end code extracted from OnGetDataMessageReceived()
    }
}

The above is basically a combination of OnGetBlocksMessageReceived() and OnGetDataMessageReceived().
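Since the C# snippet is untested, the selection logic can be checked in isolation with a toy model. The sketch below mirrors only the start/stop/limit handling of the handler above. ToyChain, fake_hash, and the MAX_BLOCKS value of 500 are hypothetical stand-ins, not Neo APIs, and the function returns block heights instead of sending block messages.

```python
from __future__ import annotations

import hashlib

MAX_BLOCKS = 500  # stand-in for InvPayload.MaxHashesCount; the value is an assumption


def fake_hash(index: int) -> bytes:
    """Deterministic 32-byte placeholder for a block hash (not Neo's real hashing)."""
    return hashlib.sha256(index.to_bytes(4, "little")).digest()


class ToyChain:
    """In-memory chain where the block at height i is identified by fake_hash(i)."""

    def __init__(self, height: int) -> None:
        self.hashes = [fake_hash(i) for i in range(height + 1)]
        self.index_of = {h: i for i, h in enumerate(self.hashes)}

    @property
    def height(self) -> int:
        return len(self.hashes) - 1


def on_get_full_blocks(
    chain: ToyChain, hash_start: bytes, hash_stop: bytes, limit: int = MAX_BLOCKS
) -> list[int]:
    """Return the heights of the blocks that would be sent back, in order."""
    if hash_start == hash_stop:
        return []
    start = chain.index_of.get(hash_start)
    if start is None:  # unknown start hash: nothing to send
        return []
    to_send = []
    for i in range(1, limit + 1):
        index = start + i
        if index > chain.height:
            break  # ran past the tip of the chain
        if chain.hashes[index] == hash_stop:
            break  # the stop block itself is not sent
        to_send.append(index)
    return to_send


if __name__ == "__main__":
    chain = ToyChain(height=10)
    print(on_get_full_blocks(chain, fake_hash(3), fake_hash(7)))   # blocks after 3, before 7
    print(on_get_full_blocks(chain, fake_hash(3), bytes(32)))      # zero stop hash: up to the tip
```

As in the C# version, the start block is excluded (the requester already has it) and the stop block is excluded as well.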

Benefits

  • less network traffic
  • self-explanatory command instead of misleading command
  • less business logic for retrieving blocks from the network.

Possible improvements

Allow querying data using a start_index and stop_index. This would mean we cannot reuse GetBlocksPayload and would need a new payload instead.

The benefits of these parameters are:

  • smaller getfullblocks message, as you do not need any UInt256 hashes to start requesting data, just an int for the index
  • does not require knowing any hashes to get started on block data syncing. That means there is no need to know the genesis block hash to start from scratch, and no need to know the inv payload structure to deserialise a hash that was broadcast on the network. You can skip all unnecessary data and go straight to retrieving blocks.
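The size difference can be sketched with two small encoders. The hash-based layout below (a one-byte count, the start hashes, then the stop hash) is an assumption about how a GetBlocksPayload-style body is laid out. The index-based layout with two little-endian uint32 fields is purely hypothetical, since the issue only names start_index and stop_index without fixing a wire format.

```python
from __future__ import annotations

import struct

HASH_SIZE = 32  # a UInt256 hash is 32 bytes


def encode_hash_request(hash_start: list[bytes], hash_stop: bytes) -> bytes:
    """Assumed hash-based body: count byte, start hashes, stop hash."""
    if len(hash_start) >= 0xFD:
        raise ValueError("sketch only supports a single-byte count")
    if any(len(h) != HASH_SIZE for h in hash_start) or len(hash_stop) != HASH_SIZE:
        raise ValueError("hashes must be 32 bytes")
    return bytes([len(hash_start)]) + b"".join(hash_start) + hash_stop


def encode_index_request(start_index: int, stop_index: int) -> bytes:
    """Hypothetical index-based body: two little-endian uint32 values."""
    return struct.pack("<II", start_index, stop_index)


def decode_index_request(body: bytes) -> tuple[int, int]:
    """Inverse of encode_index_request."""
    start_index, stop_index = struct.unpack("<II", body)
    return start_index, stop_index


if __name__ == "__main__":
    by_hash = encode_hash_request([bytes(HASH_SIZE)], bytes(HASH_SIZE))
    by_index = encode_index_request(0, 500)
    print(len(by_hash), "bytes by hash vs", len(by_index), "bytes by index")
```

With a single start hash, the hash-based body in this sketch is 65 bytes against 8 bytes for the index-based one, and the index-based request for heights 0 to 500 needs no prior knowledge of any hash.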

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 24 (24 by maintainers)

Most upvoted comments

Why can’t other projects also sync headers?

It’s not that they can’t, it’s that they don’t always have a need for it. Not every node is interested in also sharing headers/blocks on the network. The most obvious example is a tracker like neo-scan/neotracker. A statistics website might just want to collect data, not share it. Enabling sharing of data also increases the load on the host, whose operator might not be interested in that.

Is it the best for the network? Arguably not. But it is possible and there are legitimate reasons for it, so people will do it.


I do think getting headers first is probably a better design in general. For instance, it makes it less likely to get the wrong block at a height.

Can you elaborate on why it would be less likely to get the wrong block? The first thing that comes to mind is that recently many testnet clients were stuck on a faulty block 2116137 even though they used the headers-first, blocks-later sequence. I’m not really seeing what a single header can provide that a block (which contains all the information) cannot.

This design makes sense, let me explain.

Suppose there are 4 nodes: A, B, C, and D. Node A wants blocks 100~200 from the other 3 nodes, so it will send:

getblocks [100~200]

The problem here is that node A does not know if other nodes have these blocks, so it must send getblocks messages to all connected nodes.

Then Node B replies with:

inv [100~150]

Node B only has the blocks at heights 100 to 150.

Node C replies with:

inv [100~200]

Node C has all the blocks.

Node D says nothing because it is also new here and has none of these blocks.

Then Node A sends messages to B and C:

getdata [100~150] // to B
getdata [151~200] // to C

This way it can download blocks from multiple nodes at the same time.

If there are no inv and getdata messages, then all nodes may send blocks to node A at the same time, increasing the load on the network.
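The scheduling step in this example can be sketched as follows. This is one possible strategy, not how Neo actually assigns requests: peers that advertised the fewest blocks are served first, and each peer is asked only for heights that no earlier peer has been asked for. The function name and the use of heights instead of hashes are assumptions for illustration.

```python
from __future__ import annotations

from typing import Iterable


def assign_getdata(
    wanted: Iterable[int], advertised: dict[str, Iterable[int]]
) -> tuple[dict[str, list[int]], list[int]]:
    """Split wanted heights across peers based on what each one advertised via inv.

    Returns (plan, missing): plan maps a peer to the heights to request from it,
    missing lists heights that no peer advertised.
    """
    remaining = set(wanted)
    inventories = {peer: set(heights) for peer, heights in advertised.items()}
    plan: dict[str, list[int]] = {}
    # Most constrained peers first, so peers with more blocks cover the rest.
    for peer in sorted(inventories, key=lambda p: len(inventories[p])):
        take = sorted(remaining & inventories[peer])
        if take:
            plan[peer] = take
            remaining -= set(take)
    return plan, sorted(remaining)


if __name__ == "__main__":
    invs = {
        "B": range(100, 151),  # inv [100~150]
        "C": range(100, 201),  # inv [100~200]
        "D": range(0),         # no inv received
    }
    plan, missing = assign_getdata(range(100, 201), invs)
    for peer, heights in plan.items():
        print(f"getdata [{heights[0]}~{heights[-1]}] // to {peer}")
    print("still missing:", missing)
```

With the inventories from the example, B is asked for 100~150, C for 151~200, and D receives no request, so each block is downloaded exactly once.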