istio: The ingress gateway won't close the downstream connection when the upstream backend server crashes

Bug Description

We had a client doing a list-watch against the API server over HTTP/2, and for some reason the server restarted. The FIN packet (produced when the server closed the connection) reached only the ingress gateway; the gateway did not pass it on to the client. The client still believed the connection was established, but it would never receive another packet from the server.

Based on this behavior, we ran some experiments to investigate. To simplify testing, we disabled sidecar injection and did not configure any HTTP idle timeout on either of the two connections (client to ingress gateway, and ingress gateway to server). The client accessed the server using HTTP/2, or HTTP/1.1 with a Keep-Alive header. After a while, we killed the application process and observed the two connections. The upstream connection was closed normally, but the downstream connection was kept open because the ingress gateway did not tell the client that the server had crashed.

According to our tests, we believe the sidecar Envoy can pass the FIN packet through, but the ingress gateway Envoy cannot.

We ran tcpdump on the ingress gateway node and got the results listed below:

08:57:37.377633 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [S], seq 3541859721, win 65535, options [mss 1086,nop,wscale 6,nop,nop,TS val 1431424447 ecr 0,sackOK,eol], length 0
08:57:37.377683 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [S.], seq 2015933617, ack 3541859722, win 65160, options [mss 1460,sackOK,TS val 287268209 ecr 1431424447,nop,wscale 7], length 0
08:57:37.610415 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 1, win 2064, options [nop,nop,TS val 1431424676 ecr 287268209], length 0
08:57:37.611209 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [P.], seq 1:229, ack 1, win 2064, options [nop,nop,TS val 1431424676 ecr 287268209], length 228: HTTP: GET / HTTP/1.1
08:57:37.611230 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, options [nop,nop,TS val 287268442 ecr 1431424676], length 0
08:57:37.611592 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [S], seq 1754948961, win 64240, options [mss 1460,sackOK,TS val 814370055 ecr 0,nop,wscale 7], length 0
08:57:37.613273 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [S.], seq 1714948868, ack 1754948962, win 65160, options [mss 1460,sackOK,TS val 1870209083 ecr 814370055,nop,wscale 7], length 0
08:57:37.613315 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 1, win 502, options [nop,nop,TS val 814370056 ecr 1870209083], length 0
08:57:37.613387 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [P.], seq 1:242, ack 1, win 502, options [nop,nop,TS val 814370056 ecr 1870209083], length 241: HTTP
08:57:37.613590 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [.], ack 242, win 508, options [nop,nop,TS val 1870209084 ecr 814370056], length 0
08:57:37.616947 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [P.], seq 1:5590, ack 242, win 508, options [nop,nop,TS val 1870209088 ecr 814370056], length 5589: HTTP
08:57:37.616971 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 5590, win 479, options [nop,nop,TS val 814370060 ecr 1870209088], length 0
08:57:37.619442 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [P.], seq 242:5730, ack 5590, win 501, options [nop,nop,TS val 814370062 ecr 1870209088], length 5488: HTTP
08:57:37.619760 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [.], ack 5730, win 483, options [nop,nop,TS val 1870209091 ecr 814370062], length 0
08:57:37.620684 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [P.], seq 5590:10968, ack 5730, win 501, options [nop,nop,TS val 1870209092 ecr 814370062], length 5378: HTTP
08:57:37.620703 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 10968, win 479, options [nop,nop,TS val 814370064 ecr 1870209092], length 0
08:57:37.621039 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [P.], seq 5730:8302, ack 10968, win 501, options [nop,nop,TS val 814370064 ecr 1870209092], length 2572: HTTP
08:57:37.621245 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [.], ack 8302, win 497, options [nop,nop,TS val 1870209092 ecr 814370064], length 0
08:57:37.621498 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [P.], seq 10968:12096, ack 8302, win 501, options [nop,nop,TS val 1870209092 ecr 814370064], length 1128: HTTP
08:57:37.621511 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 12096, win 501, options [nop,nop,TS val 814370064 ecr 1870209092], length 0
08:57:37.621871 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [P.], seq 12096:12376, ack 8302, win 501, options [nop,nop,TS val 1870209093 ecr 814370064], length 280: HTTP
08:57:37.621879 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 12376, win 501, options [nop,nop,TS val 814370065 ecr 1870209093], length 0
08:57:37.622444 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [P.], seq 1:284, ack 229, win 508, options [nop,nop,TS val 287268453 ecr 1431424676], length 283: HTTP: HTTP/1.1 200 OK
08:57:37.856270 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, options [nop,nop,TS val 1431424919 ecr 287268453], length 0
08:57:53.004357 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, length 0
08:57:53.004373 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, options [nop,nop,TS val 287283835 ecr 1431424919], length 0
08:58:08.500918 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, length 0
08:58:08.500929 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, options [nop,nop,TS val 287299332 ecr 1431424919], length 0
08:58:20.421068 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [P.], seq 12376:12407, ack 8302, win 501, options [nop,nop,TS val 1870251891 ecr 814370065], length 31: HTTP
08:58:20.421094 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 12407, win 501, options [nop,nop,TS val 814412864 ecr 1870251891], length 0
08:58:20.421181 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [P.], seq 8302:8333, ack 12407, win 501, options [nop,nop,TS val 814412864 ecr 1870251891], length 31: HTTP
08:58:20.421218 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [F.], seq 8333, ack 12407, win 501, options [nop,nop,TS val 814412864 ecr 1870251891], length 0
08:58:20.421871 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [.], ack 8302, win 501, options [nop,nop,TS val 1870251893 ecr 814412864,nop,nop,sack 1 {8333:8334}], length 0
08:58:20.421875 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [.], ack 8334, win 501, options [nop,nop,TS val 1870251893 ecr 814412864], length 0
08:58:20.421918 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [F.], seq 12407, ack 8334, win 501, options [nop,nop,TS val 1870251893 ecr 814412864], length 0
08:58:20.421948 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [.], ack 12408, win 501, options [nop,nop,TS val 814412865 ecr 1870251893], length 0
08:58:24.308294 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, length 0
08:58:24.308310 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, options [nop,nop,TS val 287315139 ecr 1431424919], length 0
08:58:39.850042 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, length 0
08:58:39.850056 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, options [nop,nop,TS val 287330681 ecr 1431424919], length 0
08:58:55.308049 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, length 0
08:58:55.308063 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, options [nop,nop,TS val 287346139 ecr 1431424919], length 0
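
To make the capture above easier to read, here is a minimal sketch that groups tcpdump text lines by connection and reports which endpoints sent a FIN. The function name `fin_senders` and the shortened `SAMPLE` lines are illustrative assumptions; the sample lines are abbreviated copies of lines from the capture, not new data.

```python
import re
from collections import defaultdict

# Matches the "IP src > dst: Flags [..]" part of a tcpdump text line.
LINE_RE = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]+)\]")


def fin_senders(lines):
    """Map each connection (unordered endpoint pair) to the endpoints that sent a FIN."""
    result = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a TCP line in the expected format
        src, dst, flags = match.groups()
        senders = result[frozenset((src, dst))]  # registers the connection even without a FIN
        if "F" in flags:
            senders.add(src)
    return dict(result)


# Abbreviated copies of lines from the capture (options and some fields trimmed).
SAMPLE = [
    "08:58:20.421068 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [P.], seq 12376:12407, ack 8302, length 31: HTTP",
    "08:58:20.421218 IP 10.20.20.20.46050 > 10.30.30.30.8080: Flags [F.], seq 8333, ack 12407, length 0",
    "08:58:20.421918 IP 10.30.30.30.8080 > 10.20.20.20.46050: Flags [F.], seq 12407, ack 8334, length 0",
    "08:58:24.308294 IP 10.10.10.10.62108 > 10.20.20.20.8080: Flags [.], ack 284, win 2059, length 0",
    "08:58:24.308310 IP 10.20.20.20.8080 > 10.10.10.10.62108: Flags [.], ack 229, win 508, length 0",
]

if __name__ == "__main__":
    for conn, senders in fin_senders(SAMPLE).items():
        print(sorted(conn), "FIN sent by:", sorted(senders) or "nobody")
```

On these sample lines, the gateway-to-server connection (port 46050) shows a FIN from both endpoints, while the client-to-gateway connection (port 62108) shows none, which matches the behavior described in this report.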

Version

istio version: 1.10
kubectl version: 1.18.8

Additional Information

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 38 (20 by maintainers)

Most upvoted comments

A dropped FIN is quite common and should not lead to problems. When the app attempts to send (and it should send periodically, at least for keepalive purposes), after some time it should get an RST, or retry sending until the maximum retry count, and then close.

This is a very tricky area: it is hard to test and reproduce, some networks intentionally throttle FIN/RST, and the TCP timeout is pretty large. HTTP/2 has a far better heartbeat, but HTTP/1 is still broadly used.
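
As an illustration of the application-level liveness idea in this comment, here is a minimal sketch of a client-side read deadline. The zero-length exchanges roughly every 15 seconds on the downstream connection in the capture look like TCP keepalive probes that the gateway itself answers, so a transport-level probe alone would probably not reveal the dead backend; an application-level deadline is one alternative. The helper name `recv_or_give_up` and the timeout values are assumptions made for this example, not part of Istio, Envoy, or any client library.

```python
import socket


def recv_or_give_up(sock, max_silence_s, bufsize=4096):
    """Wait for data on sock for at most max_silence_s seconds.

    Returns the received bytes, b"" if the peer closed the connection,
    or None if nothing arrived in time (the caller should then treat the
    connection as dead, close it, and reconnect).
    """
    sock.settimeout(max_silence_s)
    try:
        return sock.recv(bufsize)
    except socket.timeout:
        return None


if __name__ == "__main__":
    # A local socket pair stands in for the client <-> gateway connection.
    client, gateway = socket.socketpair()
    print(recv_or_give_up(client, 0.2))  # silent peer: None
    gateway.sendall(b"event")
    print(recv_or_give_up(client, 0.2))  # data arrives: b'event'
    gateway.close()
    print(recv_or_give_up(client, 0.2))  # orderly close (FIN seen): b''
    client.close()
```

A deadline like this only works if the server is expected to send something within the chosen interval; for a quiet watch stream, the client would need its own periodic request or an HTTP/2 PING to tell an idle connection from a dead one.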