fluentd: sometimes out_forward heartbeat(tcp) failed with fluentd>0.14.9
sometimes out_forward heartbeat(tcp) failed
when sender fluentd: 0.14.11 and 0.14.13 aggregator fluentd: 0.14.11 and 0.14.13
This problem does not have occurred in sender v0.14.8.
fluentd.log
2017-03-01 06:49:37 +0000 [warn]: recovered forwarding server '' host="log-aggregator.local" port=10224
2017-03-01 06:49:37 +0000 [warn]: recovered forwarding server '' host="log-aggregator.local" port=10224
2017-03-01 06:49:45 +0000 [warn]: retry succeeded. chunk_id="549a573de08a4928f31d10f236155293"
2017-03-01 06:49:51 +0000 [warn]: retry succeeded. chunk_id="549a579f25b7d32e5db46d5b6ab57ce5"
2017-03-01 06:54:18 +0000 [warn]: detached forwarding server '' host="log-aggregator.local" port=10224 hard_timeout=true
2017-03-01 06:54:18 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.00606061401777 slow_flush_log_threshold=20.0 plugin_id="object:2b093717af00"
2017-03-01 06:55:18 +0000 [warn]: failed to flush the buffer. retry_time=0 next_retry_seconds=2017-03-01 06:55:19 +0000 chunk="549a5c80fadbc463279c0c9408e0c210" error_class=Fluent::Plugin::ForwardOutput::NoNodesAvailable error="no nodes are available"
2017-03-01 06:56:18 +0000 [warn]: suppressed same stacktrace
2017-03-01 06:56:18 +0000 [warn]: failed to flush the buffer. retry_time=0 next_retry_seconds=2017-03-01 06:55:19 +0000 chunk="549a5c80fadbc463279c0c9408e0c210" error_class=Fluent::Plugin::ForwardOutput::NoNodesAvailable error="no nodes are available"
2017-03-01 06:56:18 +0000 [warn]: suppressed same stacktrace
2017-03-01 06:57:18 +0000 [warn]: detached forwarding server '' host="log-aggregator.local" port=10224 hard_timeout=true
2017-03-01 07:00:18 +0000 [warn]: failed to flush the buffer. retry_time=1 next_retry_seconds=2017-03-01 06:56:19 +0000 chunk="549a5c80fadbc463279c0c9408e0c210" error_class=Fluent::Plugin::ForwardOutput::NoNodesAvailable error="no nodes are available"
2017-03-01 07:01:18 +0000 [warn]: suppressed same stacktrace
2017-03-01 07:07:18 +0000 [warn]: failed to flush the buffer. retry_time=2 next_retry_seconds=2017-03-01 07:03:20 +0000 chunk="549a5c80fadbc463279c0c9408e0c210" error_class=Fluent::Plugin::ForwardOutput::NoNodesAvailable error="no nodes are available"
2017-03-01 07:07:18 +0000 [warn]: suppressed same stacktrace
2017-03-01 07:08:18 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=780.0749890550505 slow_flush_log_threshold=20.0 plugin_id="object:2b093717af00"
2017-03-01 07:08:18 +0000 [warn]: failed to flush the buffer. retry_time=3 next_retry_seconds=2017-03-01 07:07:22 +0000 chunk="549a5c80fadbc463279c0c9408e0c210" error_class=Fluent::Plugin::ForwardOutput::NoNodesAvailable error="no nodes are available"
2017-03-01 07:08:18 +0000 [warn]: suppressed same stacktrace
tcpdump in sender-server
sender-server$ tcpdump -s 0 -nn port 10224
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
07:05:31.338389 IP 172.17.0.3.36562 > 10.80.1.45.10224: Flags [F.], seq 2513917648, ack 2404494309, win 229, options [nop,nop,TS val 136819588 ecr 32246235], length 0
07:05:44.186393 IP 172.17.0.3.36562 > 10.80.1.45.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136822800 ecr 32246235], length 0
07:06:09.850392 IP 172.17.0.3.36562 > 10.80.1.45.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136829216 ecr 32246235], length 0
07:06:18.528050 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 230348335, ack 1336453961, win 229, options [nop,nop,TS val 136831385 ecr 32070779], length 0
07:06:18.730397 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136831436 ecr 32070779], length 0
07:06:18.934388 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136831487 ecr 32070779], length 0
07:06:19.342383 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136831589 ecr 32070779], length 0
07:06:20.158387 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136831793 ecr 32070779], length 0
07:06:21.794381 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136832202 ecr 32070779], length 0
07:06:25.066536 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136833020 ecr 32070779], length 0
07:06:31.610412 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136834656 ecr 32070779], length 0
07:06:44.698389 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136837928 ecr 32070779], length 0
07:07:10.906398 IP 172.17.0.3.48160 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136844480 ecr 32070779], length 0
07:07:18.527420 IP 10.80.19.80.54804 > 10.80.31.242.10224: Flags [S], seq 1102294848, win 29200, options [mss 1460,sackOK,TS val 136846385 ecr 0,nop,wscale 7], length 0
07:07:18.527641 IP 10.80.31.242.10224 > 10.80.19.80.54804: Flags [S.], seq 2300741022, ack 1102294849, win 26847, options [mss 8961,sackOK,TS val 32290337 ecr 136846385,nop,wscale 8], length 0
07:07:18.527676 IP 10.80.19.80.54804 > 10.80.31.242.10224: Flags [.], ack 1, win 229, options [nop,nop,TS val 136846385 ecr 32290337], length 0
07:07:18.534713 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 1614423095, ack 1935553673, win 229, options [nop,nop,TS val 136846387 ecr 32071528], length 0
07:07:18.738405 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136846438 ecr 32071528], length 0
07:07:18.942403 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136846489 ecr 32071528], length 0
07:07:19.350413 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136846591 ecr 32071528], length 0
07:07:20.166387 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136846795 ecr 32071528], length 0
07:07:21.802406 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136847204 ecr 32071528], length 0
07:07:25.074384 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136848022 ecr 32071528], length 0
07:07:31.610408 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136849656 ecr 32071528], length 0
07:07:44.698388 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136852928 ecr 32071528], length 0
07:08:10.874391 IP 172.17.0.3.48310 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136859472 ecr 32071528], length 0
07:08:18.293572 IP 10.80.31.242.10224 > 10.80.19.80.54804: Flags [F.], seq 1, ack 1, win 105, options [nop,nop,TS val 32305278 ecr 136846385], length 0
07:08:18.294385 IP 10.80.19.80.54804 > 10.80.31.242.10224: Flags [.], ack 2, win 229, options [nop,nop,TS val 136861327 ecr 32305278], length 0
07:08:18.535001 IP 10.80.19.80.54804 > 10.80.31.242.10224: Flags [F.], seq 1, ack 2, win 229, options [nop,nop,TS val 136861387 ecr 32305278], length 0
07:08:18.535208 IP 10.80.31.242.10224 > 10.80.19.80.54804: Flags [.], ack 2, win 105, options [nop,nop,TS val 32305339 ecr 136861387], length 0
07:08:18.539957 IP 10.80.19.80.50448 > 10.80.1.45.10224: Flags [S], seq 2820230188, win 29200, options [mss 1460,sackOK,TS val 136861388 ecr 0,nop,wscale 7], length 0
07:08:18.542520 IP 10.80.1.45.10224 > 10.80.19.80.50448: Flags [S.], seq 2297201150, ack 2820230189, win 26847, options [mss 8961,sackOK,TS val 32306338 ecr 136861388,nop,wscale 8], length 0
07:08:18.542546 IP 10.80.19.80.50448 > 10.80.1.45.10224: Flags [.], ack 1, win 229, options [nop,nop,TS val 136861389 ecr 32306338], length 0
07:08:18.542739 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 2020798759, ack 1505973842, win 229, options [nop,nop,TS val 136861389 ecr 32072279], length 0
07:08:18.746408 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136861440 ecr 32072279], length 0
07:08:18.950400 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136861491 ecr 32072279], length 0
07:08:19.362390 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136861594 ecr 32072279], length 0
07:08:20.178382 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136861798 ecr 32072279], length 0
07:08:21.814400 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136862207 ecr 32072279], length 0
07:08:25.082385 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136863024 ecr 32072279], length 0
07:08:31.630389 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136864661 ecr 32072279], length 0
07:08:44.730388 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136867936 ecr 32072279], length 0
07:09:10.906437 IP 172.17.0.3.48410 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136874480 ecr 32072279], length 0
07:09:18.130913 IP 10.80.1.45.10224 > 10.80.19.80.50448: Flags [F.], seq 1, ack 1, win 105, options [nop,nop,TS val 32321235 ecr 136861389], length 0
07:09:18.134411 IP 10.80.19.80.50448 > 10.80.1.45.10224: Flags [.], ack 2, win 229, options [nop,nop,TS val 136876287 ecr 32321235], length 0
07:09:18.567265 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 1740187083, ack 3998838662, win 229, options [nop,nop,TS val 136876395 ecr 32074529], length 0
07:09:18.770410 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136876446 ecr 32074529], length 0
07:09:18.974390 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136876497 ecr 32074529], length 0
07:09:19.382381 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136876599 ecr 32074529], length 0
07:09:20.198396 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136876803 ecr 32074529], length 0
07:09:21.834405 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136877212 ecr 32074529], length 0
07:09:25.106399 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136878030 ecr 32074529], length 0
07:09:31.642416 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136879664 ecr 32074529], length 0
07:09:44.730373 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136882936 ecr 32074529], length 0
07:10:10.874392 IP 172.17.0.3.48670 > 10.80.31.242.10224: Flags [F.], seq 0, ack 1, win 229, options [nop,nop,TS val 136889472 ecr 32074529], length 0
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 16 (7 by maintainers)
Commits related to this issue
- Close socket before delete waiting socket from list. fix #1484 — committed to fluent/fluentd by repeatedly 7 years ago
Thanks for the report. So v0.14.10 changes cause this problem. I will check socket helper and related code.
I’m not sure the expected sequence on this situation, client side close, but
Problem2
sequence also happens on other situation. For example, python HTTP server and ruby TCP client also has same sequence.So with this situation, first FIN’s ACK is skipped on Linux? Mac has different sequence
@bungoume Sorry for the late. https://github.com/fluent/fluentd/pull/1581 I wrote a patch. Could you test this?