salt: Minion connection does not survive IP address change

[from http://bugs.debian.org/690525]

If my provider disconnects the DSL line and subsequently assigns a new IP address, the salt-minion connection to the server dies and is not re-established.

Following the disconnect, the minion repeatedly tries to establish a new connection, but the remote shuts down each attempt:

2.216183  10.178.17.2 -> 77.109.139.93 TCP 76 42836 > 4505 [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=294780260 TSecr=0 WS=16
2.244185 77.109.139.93 -> 10.178.17.2  TCP 76 4505 > 42836 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1452 SACK_PERM=1 TSval=250894922 TSecr=294780260 WS=16
2.244185  10.178.17.2 -> 77.109.139.93 TCP 68 42836 > 4505 [ACK] Seq=1 Ack=1 Win=14608 Len=0 TSval=294780267 TSecr=250894922
2.244185  10.178.17.2 -> 77.109.139.93 TCP 93 42836 > 4505 [PSH, ACK] Seq=1 Ack=1 Win=14608 Len=25 TSval=294780267 TSecr=250894922
2.272187 77.109.139.93 -> 10.178.17.2  TCP 68 4505 > 42836 [ACK] Seq=1 Ack=26 Win=14480 Len=0 TSval=250894929 TSecr=294780267
2.272187 77.109.139.93 -> 10.178.17.2  TCP 68 4505 > 42836 [FIN, ACK] Seq=1 Ack=26 Win=14480 Len=0 TSval=250894929 TSecr=294780267
2.272187  10.178.17.2 -> 77.109.139.93 TCP 68 42836 > 4505 [FIN, ACK] Seq=26 Ack=2 Win=14608 Len=0 TSval=294780274 TSecr=250894929
2.300190 77.109.139.93 -> 10.178.17.2  TCP 68 4505 > 42836 [ACK] Seq=2 Ack=27 Win=14480 Len=0 TSval=250894936 TSecr=294780274
[and the next port:]
2.404198  10.178.17.2 -> 77.109.139.93 TCP 76 42837 > 4505 [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=294780307 TSecr=0 WS=16
[…]

On the master side, this corresponds to several hundred lingering connections, according to netstat:

tcp        0      0 77.109.139.93:4505      82.135.64.143:42250     TIME_WAIT   -

# netstat -natp | grep -c :4505.*TIME_WAIT
484
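
For anyone who wants to track that number over time rather than eyeball it, here is a minimal Python sketch of the same count. It assumes the usual Linux `netstat -natp` column layout (proto, recv-q, send-q, local, remote, state) and, unlike the grep above, only counts lines whose local port is 4505. The function name `count_time_wait` is made up for this example.

```python
def count_time_wait(netstat_output: str, port: int = 4505) -> int:
    """Count TCP lines in TIME_WAIT whose local address uses the given port."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state [pid/program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, state = fields[3], fields[5]
        # rsplit keeps this working for addresses that contain extra colons
        if local.rsplit(":", 1)[-1] == str(port) and state == "TIME_WAIT":
            count += 1
    return count


# The single netstat line quoted in this report
sample = "tcp        0      0 77.109.139.93:4505      82.135.64.143:42250     TIME_WAIT   -"
print(count_time_wait(sample))
```

Feeding it the full output of `netstat -natp` should give a figure comparable to the grep count, as long as the master only listens on that port.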

And obviously, the master cannot communicate with the minion.

I am running both minion and master with --log-level=trace, but there is nothing in the output about this.

There are a number of issues related to this:

  1. I think that the master should periodically ping the minion. If the minion does not respond, the master should at least log this and probably tear down the connection.
  2. The minion should log when it finds an unusable connection, when it tears it down and when it re-establishes the connection.
  3. The master should log why it closes a connection attempt like above.
  4. The master should not really deny a connection attempt by an authenticated host.
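
To make points 1 and 2 more concrete, here is a rough Python sketch of the kind of liveness bookkeeping I have in mind. It is not Salt or ZeroMQ code: `HeartbeatMonitor`, `record_reply` and `expire_stale` are hypothetical names, and the sketch only covers the "notice and log a silent peer" part, not the actual ping transport.

```python
import logging
import time
from typing import Callable, Dict, List

log = logging.getLogger("heartbeat_sketch")  # hypothetical logger name


class HeartbeatMonitor:
    """Remembers when each peer last replied and flags peers that went silent."""

    def __init__(self, timeout: float, clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the logic can be exercised without sleeping
        self.last_seen: Dict[str, float] = {}

    def record_reply(self, peer_id: str) -> None:
        """Call whenever a ping reply (or any traffic) arrives from the peer."""
        self.last_seen[peer_id] = self.clock()

    def expire_stale(self) -> List[str]:
        """Log and forget every peer that has been silent longer than the timeout."""
        now = self.clock()
        stale = sorted(
            peer for peer, seen in self.last_seen.items() if now - seen > self.timeout
        )
        for peer in stale:
            log.warning(
                "peer %s silent for more than %.0fs; dropping connection state",
                peer,
                self.timeout,
            )
            del self.last_seen[peer]
        return stale


# Demo with a fake clock: minion-a replies at t=0, minion-b at t=20, check at t=45
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout=30.0, clock=lambda: fake_now[0])
monitor.record_reply("minion-a")
fake_now[0] = 20.0
monitor.record_reply("minion-b")
fake_now[0] = 45.0
print(monitor.expire_stale())  # only the peer silent for more than 30s is listed
```

The same structure would work on the minion side, with the master as the only peer and a reconnect triggered (and logged) whenever `expire_stale` returns it.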

I hope this is not a deep bug in ZMQ, but something trivial to fix.

About this issue

  • State: closed
  • Created 12 years ago
  • Comments: 17 (14 by maintainers)

Most upvoted comments

Suppose I want to use Salt to manage the network configuration of the minions and have them change the IP address of the network interface used by the minion itself. The flow would be (1) change IP, (2) restart minion, (3) reconnect to master. This seems like a reasonable configuration-management use case. I’m guessing it’s possible with Salt, but is it known or expected to work?