mtproto: Getting panic: (*serialize.BadMsgNotification) 0xc0000381c0 [recovered] on Auth example

I get the following panic when I try to run the example auth code:

panic: (*serialize.BadMsgNotification) 0xc0000381c0 [recovered]
	panic: (*serialize.BadMsgNotification) 0xc0000381c0

goroutine 19 [running]:
github.com/xelaj/mtproto.(*MTProto).recoverGoroutine(0xc0001cc6c0)
	/home/das/go/pkg/mod/github.com/xelaj/mtproto@v0.0.0-20201129002337-2c42bdacb467/mtproto_utils.go:77 +0x125
panic(0x761fe0, 0xc0000381c0)
	/usr/lib/go/src/runtime/panic.go:969 +0x1b9
github.com/xelaj/mtproto.(*MTProto).processResponse(0xc0001cc6c0, 0x5fdb37fd706ac401, 0x0, 0x8425c0, 0xc00010e000, 0xc00010e000, 0x0)
	/home/das/go/pkg/mod/github.com/xelaj/mtproto@v0.0.0-20201129002337-2c42bdacb467/mtproto.go:350 +0x87e
github.com/xelaj/mtproto.(*MTProto).startReadingResponses.func1(0xc0001cc6c0, 0x842c40, 0xc0001d7dc0)
	/home/das/go/pkg/mod/github.com/xelaj/mtproto@v0.0.0-20201129002337-2c42bdacb467/mtproto.go:277 +0x296
created by github.com/xelaj/mtproto.(*MTProto).startReadingResponses
	/home/das/go/pkg/mod/github.com/xelaj/mtproto@v0.0.0-20201129002337-2c42bdacb467/mtproto.go:253 +0x70
exit status 2

What is the possible cause and how do I solve it?

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 28 (20 by maintainers)

Most upvoted comments

From logging inside sendPacket, I’ve found two issues that together seem to cause this. First, the current logic increments seqNo only if the message was encrypted. The Telegram spec, however, says the increment should be conditional on the message requiring acknowledgement, with no mention of encryption: “those requiring acknowledgment, and in particular those that are not containers”.
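To make the spec’s rule concrete, here is a minimal sketch of seq_no generation as I read the MTProto docs: content-related messages (those requiring acknowledgement) get seq_no = 2*count + 1 and bump the counter, while service messages such as acks and containers get 2*count without bumping it. The `seqNoGen` type and `next` method are hypothetical names for illustration, not the library’s API:

```go
package main

import "fmt"

// seqNoGen tracks how many content-related messages have been sent.
type seqNoGen struct{ contentRelated int64 }

// next returns the seq_no for the next outgoing message. Only
// content-related messages (those requiring acknowledgement)
// advance the counter.
func (g *seqNoGen) next(contentRelated bool) int64 {
	if contentRelated {
		n := 2*g.contentRelated + 1
		g.contentRelated++
		return n
	}
	return 2 * g.contentRelated
}

func main() {
	var g seqNoGen
	fmt.Println(g.next(true))  // content message: 1
	fmt.Println(g.next(false)) // ack/container: 2, counter unchanged
	fmt.Println(g.next(true))  // next content message: 3
}
```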

Secondly, the error occurs after two messages are sent with identical msgIDs, not because of seqNo misordering or thread safety as thought before. I have identified that in some cases the response comes back as an RpcError, which makeRequest handles by making a second, recursive call to m.makeRequest (mtproto.go:199). This happens quickly enough that GenerateMessageId returns an identical ID: the UnixNano function does not have sufficient resolution. From experience on a previous project, this value is often rounded; https://github.com/golang/go/issues/27301 covers this issue.

There are two potential fixes. One is the method suggested in that thread: store a start time and use time.Sub to get true nanosecond precision from the monotonic clock. The second, which I’ve tested, is to move the seqNoMutex lock to before GenerateMessageId and change GenerateMessageId to store the last generated value; if the new value is identical to the last one, simply increment it by 4.

```go
var lastMessageId int64

// GenerateMessageId essentially returns a unix timestamp, but in a terribly
// specific way.
// TODO: why on earth is the bitwise AND with -4 needed??
func GenerateMessageId() (r int64) {
	const billion = 1000 * 1000 * 1000
	unixnano := time.Now().UnixNano()
	seconds := unixnano / billion
	nanoseconds := unixnano % billion
	r = (seconds << 32) | (nanoseconds & -4)
	if r == lastMessageId {
		r += 4
	}
	lastMessageId = r
	return r
}
```
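For comparison, the first approach could look like the sketch below: capture a start time once and derive nanoseconds from time.Since, which uses Go’s monotonic clock and keeps full nanosecond resolution even where time.Now().UnixNano() is rounded (per golang/go#27301). The `processStart` and `generateMessageID` names are mine, not the library’s, and this alone does not guarantee uniqueness if two calls land within the same 4ns window:

```go
package main

import (
	"fmt"
	"time"
)

// processStart is captured once; time.Since(processStart) reads the
// monotonic clock rather than the (possibly coarse) wall clock.
var processStart = time.Now()

func generateMessageID() int64 {
	const billion = 1000 * 1000 * 1000
	// Reconstruct a wall-clock nanosecond value from the monotonic delta.
	unixnano := processStart.UnixNano() + int64(time.Since(processStart))
	seconds := unixnano / billion
	nanoseconds := unixnano % billion
	// Same layout as the original: seconds in the high 32 bits, and the
	// low two bits cleared, since MTProto requires client msg_ids to be
	// divisible by 4.
	return (seconds << 32) | (nanoseconds & -4)
}

func main() {
	a := generateMessageID()
	b := generateMessageID()
	fmt.Println(a <= b, a%4 == 0) // prints "true true"
}
```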

This works and solves the issue, since duplicate IDs can no longer occur. However, it would potentially break if any function other than sendPacket called this utility function, because lastMessageId is only protected by seqNoMutex. If this fix is used, I’d suggest moving it into network.go as a private method.
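A sketch of what that private, self-contained version might look like: the generator carries its own mutex so it no longer depends on callers holding seqNoMutex, and I’ve tightened the equality check to `<=` so the sequence stays strictly increasing even if the coarse clock returns a smaller value. The `msgIDGen` type and `next` method are hypothetical names, not the library’s API:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// msgIDGen guards lastMessageId with its own mutex, so any caller
// (not just sendPacket) can generate IDs safely.
type msgIDGen struct {
	mu   sync.Mutex
	last int64
}

// next returns a strictly increasing msg_id divisible by 4.
func (g *msgIDGen) next() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	const billion = 1000 * 1000 * 1000
	unixnano := time.Now().UnixNano()
	r := (unixnano/billion)<<32 | (unixnano%billion)&-4
	if r <= g.last { // covers both duplicates and coarse-clock repeats
		r = g.last + 4
	}
	g.last = r
	return r
}

func main() {
	var g msgIDGen
	a := g.next()
	b := g.next()
	fmt.Println(a < b) // prints "true": IDs are strictly increasing
}
```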

Unfortunately I’m not able to fully test this fix, because https://github.com/xelaj/mtproto/issues/86 blocks further testing. That said, after implementing this change I’ve gone from receiving the BadMsgNotification on almost every run to never seeing it.

Let me know if you’d like a PR of a suggested full fix.