go-mysql: replication block when network down
when the network is down (such as unplug the network cable) and recover after, the replication client will block at ReadPacket, this is the info of pprof:
goroutine 19 [IO wait, 14 minutes]:
net.runtime_pollWait(0x7fe48c176b70, 0x72, 0xc82177e000)
/home/cdyf/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc82176a760, 0x72, 0x0, 0x0)
/home/cdyf/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc82176a760, 0x0, 0x0)
/home/cdyf/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc82176a700, 0xc82177e000, 0x1000, 0x1000, 0x0, 0x7fe48c171050, 0xc82000a0f8)
/home/cdyf/go/src/net/fd_unix.go:250 +0x23a
net.(*conn).Read(0xc821768120, 0xc82177e000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/home/cdyf/go/src/net/net.go:172 +0xe4
bufio.(*Reader).fill(0xc8217783c0)
/home/cdyf/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Read(0xc8217783c0, 0xc821c66148, 0x4, 0x4, 0x8, 0x0, 0x0)
/home/cdyf/go/src/bufio/bufio.go:207 +0x260
io.ReadAtLeast(0x7fe48c131078, 0xc8217783c0, 0xc821c66148, 0x4, 0x4, 0x4, 0x0, 0x0, 0x0)
/home/cdyf/go/src/io/io.go:297 +0xe6
io.ReadFull(0x7fe48c131078, 0xc8217783c0, 0xc821c66148, 0x4, 0x4, 0x20, 0x0, 0x0)
/home/cdyf/go/src/io/io.go:315 +0x62
mysql_byroad/vendor/github.com/siddontang/go-mysql/packet.(*Conn).ReadPacketTo(0xc82101e8c0, 0x7fe48c175d78, 0xc821a388c0, 0x0, 0x0)
/home/cdyf/goworkspace/src/mysql_byroad/vendor/github.com/siddontang/go-mysql/packet/conn.go:81 +0xf1
mysql_byroad/vendor/github.com/siddontang/go-mysql/packet.(*Conn).ReadPacket(0xc82101e8c0, 0x0, 0x0, 0x0, 0x0, 0x0)
/home/cdyf/goworkspace/src/mysql_byroad/vendor/github.com/siddontang/go-mysql/packet/conn.go:35 +0x92
mysql_byroad/vendor/github.com/siddontang/go-mysql/replication.(*BinlogSyncer).onStream(0xc821717480, 0xc82101eca0)
/home/cdyf/goworkspace/src/mysql_byroad/vendor/github.com/siddontang/go-mysql/replication/binlogsyncer.go:479 +0xa8
created by mysql_byroad/vendor/github.com/siddontang/go-mysql/replication.(*BinlogSyncer).startDumpStream
/home/cdyf/goworkspace/src/mysql_byroad/vendor/github.com/siddontang/go-mysql/replication/binlogsyncer.go:220 +0x111
I think this is because mysql server closed the connection and replication client didn’t receive any close data because of bad network, so it blocked at reading data from this connection. Is there any solution for this condition?
About this issue
- Original URL
- State: open
- Created 8 years ago
- Comments: 18 (3 by maintainers)
I tested in my local machine, mysql server does not close the binlog dump connection even if the underlying tcp connection is lost(or in close_wait state). until there is a new event to be send, mysql will check and clean the bad connections Seemingly no good solution for this Add tcp keepalive can avoid unplug the network cable situation but has no effect on the removal of bad connections
Hi @3manuek
The HEARTBEAT_EVENT is sent by the master periodically. We can add
MASTER_HEARTBEAT_PERIODconfiguration. If set (>0, maybe should better > the time configured in MySQL), we will set read timeout for the connection, and we will retry after the connection is closed.