bee: /chunks/xxxxxxx?targets= causes crash
Summary
I was running a v1.0.0 bee node with global-pinning-enable: true and requested a chunk, with targets, that the node did not have:
curl http://192.168.10.185:8080/chunks/5e7132318b6f5482fb02e2ed48c293800f0774470387412fbd3db1a3366736bd?targets=f99db6
This returned the expected response:
{"message":"chunk recovery initiated. retry after sometime.","code":202}
At this point, the bee node consumes copious amounts of CPU and then abruptly terminates with:
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xe012ab]
goroutine 129928953 [running]:
github.com/ethersphere/bee/pkg/pss.(*pss).Send(0xc0006d70e0, 0x18e53e0, 0xc00003e0a0, 0xb848d6c7d44639f3, 0xd02c4e295a5b9716, 0x776eafbe3b931776, 0xcf20d5397b98a205, 0xc005b29f60, 0x20, 0x160, ...)
github.com/ethersphere/bee/pkg/pss/pss.go:98 +0x14b
github.com/ethersphere/bee/pkg/recovery.NewCallback.func1(0xc005b29f60, 0x20, 0x160, 0xc109039d00, 0x1, 0x1)
github.com/ethersphere/bee/pkg/recovery/repair.go:38 +0xd1
created by github.com/ethersphere/bee/pkg/netstore.(*store).Get
github.com/ethersphere/bee/pkg/netstore/netstore.go:56 +0x2ec
Steps to reproduce
Run a bee node with global-pinning-enable: true and request /chunks/xxxxxx?targets=yyyyyy, where yyyyyy must be an even count of hex digits. Wait for the response, then wait even longer for the crash to occur. I believe the additional delay is caused by pss.Wrap inside pss.Send attempting to target the appropriate neighborhood.
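The "even count of hex digits" requirement can be checked before issuing the request. The following is a minimal sketch, not bee's own validation code: decodeTarget is a hypothetical helper, and it assumes a single target prefix per call.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// decodeTarget is a hypothetical helper (not part of bee). It accepts a
// target prefix only if it is a non-empty hex string with an even number
// of digits, i.e. a whole number of bytes.
func decodeTarget(s string) ([]byte, error) {
	if s == "" {
		return nil, fmt.Errorf("empty target")
	}
	// hex.DecodeString rejects odd-length input and non-hex characters.
	b, err := hex.DecodeString(s)
	if err != nil {
		return nil, fmt.Errorf("invalid target %q: %w", s, err)
	}
	return b, nil
}

func main() {
	for _, t := range []string{"f99db6", "f99db", "zz"} {
		b, err := decodeTarget(t)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("ok: %q is %d byte(s)\n", t, len(b))
	}
}
```

The target f99db6 from the report decodes to 3 bytes, while the five-digit f99db is rejected.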
Expected behavior
I would expect the recovery request to be issued successfully via pss.
Actual behavior
The application crashed.
I have tracked it down through the source code and it seems that pss.Send now requires a stamper, but the recovery repair callback passes nil. As soon as the stamper is used in pss.Send, the application terminates.
https://github.com/ethersphere/bee/blob/6f3a38292e5e0fa3cd4856bf7dc9f5ba27c6c293/pkg/pss/pss.go#L98
About this issue
- State: closed
- Created 3 years ago
- Comments: 15 (9 by maintainers)
@ldeffenb this is not so high up on our priority list. There are some usability problems with the current implementation of global pinning that are a bit difficult to solve right now, and after an in-depth look they might even require protocol changes to some degree so that good UX can ensue.
@acud As @ldeffenb mentioned, the recovery.NewCallback function that the /chunks API uses sends a PSS message with the RECOVERY topic to retrieve the chunk from the network. So it needs to stamp the payload before doing so, but a nil stamper is currently passed in. In order to stamp, we need a valid batch ID. Is the correct implementation to read the batch ID from the request header?
I will look into the segfault, but please keep in mind that targets specifies the leading address bytes that the node will try to mine. If you specify 1 byte (two hex characters), it means that it will mine a chunk address with proximity order 16 to the specified target prefix; if you specify 2 bytes you're already at 32, and you are trying to mine an address with po 48, i.e. you're going to have to go through 281474976710656 chunks and hash them in order to get an address at po 48 (if you're lucky!)
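The figure quoted above is 2^48. As a rough sketch, assuming chunk addresses behave like uniformly random hashes, the expected number of candidates to hash before one matches a given number of leading bits is 2 raised to that number of bits. expectedAttempts is a hypothetical helper, not part of bee.

```go
package main

import (
	"fmt"
	"math/big"
)

// expectedAttempts returns 2^bits, the expected number of uniformly random
// hashes needed before one matches a fixed prefix of the given bit length.
func expectedAttempts(bits uint) *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), bits)
}

func main() {
	// The proximity orders mentioned in the comment above.
	for _, bits := range []uint{16, 32, 48} {
		fmt.Printf("%2d matching bits: about %s attempts\n", bits, expectedAttempts(bits))
	}
}
```

Each extra matching bit doubles the expected work, which fits the sustained CPU use reported before the crash.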