fluent-bit: Fluent bit stops sending logs to Loki when any of the outputs goes offline
Bug Report
Describe the bug
When any one of the Elasticsearch output plugins goes offline, no new logs can be queried from the Loki instance (running in a container). When the offline output is commented out and Fluent Bit is restarted, logs start flowing again.
To Reproduce
- Use a similar configuration as below
- Some logs are produced for the first few minutes, but within 20-30 minutes no new logs appear.
[SERVICE]
Flush 5
Daemon Off
Log_Level info
Parsers_File parsers-eos.conf
Plugins_File plugins.conf
HTTP_Server Off
HTTP_Listen 0.0.0.0
HTTP_Port 2020
[INPUT]
Name forward
Listen 0.0.0.0
Port 24224
Chunk_Size 32
Buffer_Size 64
[INPUT]
Name systemd
Path /run/log/journal/
Read_From_Tail On
Strip_Underscores On
Mem_Buf_Limit 10MB
Tag journal
[OUTPUT]
name loki
match *
host localhost
port 3100
labels job=fluentbit
# Elasticsearch running on Azure
[OUTPUT]
Name es
Match *
Host ..cloudapp.azure.com
Port 9200
Type docker
Logstash_Format On
Time_Key @fbTimestamp
Trace_Output Off
Trace_Error On
# Elasticsearch running on local machine
[OUTPUT]
Name es
Match *
Host 192.168.1.1
Port 9200
Type docker
Logstash_Format On
Time_Key @fbTimestamp
Retry_Limit 1
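To pin down when the Loki output stalls, one option is to poll Loki for the newest entry carrying the `job=fluentbit` label. The sketch below is not part of the original report. It assumes Loki is reachable at `localhost:3100` as in the config above and uses Loki's `/loki/api/v1/query_range` HTTP endpoint. The helper names (`build_query_url`, `newest_timestamp_ns`, `seconds_since_last_log`) are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

LOKI_URL = "http://localhost:3100"  # host/port taken from the loki [OUTPUT] above
QUERY = '{job="fluentbit"}'         # label taken from the loki [OUTPUT] above


def build_query_url(base_url, query, start_ns, end_ns, limit=1):
    """Build a query_range URL that asks Loki for the newest matching entries."""
    params = urllib.parse.urlencode({
        "query": query,
        "start": start_ns,
        "end": end_ns,
        "limit": limit,
        "direction": "backward",  # newest entries first
    })
    return f"{base_url}/loki/api/v1/query_range?{params}"


def newest_timestamp_ns(response):
    """Return the newest entry timestamp (ns) in a query_range response, or None."""
    newest = None
    for stream in response.get("data", {}).get("result", []):
        for ts, _line in stream.get("values", []):
            ts = int(ts)  # Loki returns nanosecond timestamps as strings
            if newest is None or ts > newest:
                newest = ts
    return newest


def seconds_since_last_log(base_url=LOKI_URL, query=QUERY, window_s=3600):
    """Query a live Loki; return the age in seconds of the newest entry, or None."""
    end_ns = time.time_ns()
    start_ns = end_ns - window_s * 10**9
    url = build_query_url(base_url, query, start_ns, end_ns)
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = json.load(resp)
    newest = newest_timestamp_ns(body)
    return None if newest is None else (end_ns - newest) / 1e9
```

Calling `seconds_since_last_log()` once a minute while one Elasticsearch host is unreachable should show the age growing steadily once the stall begins. A result of `None` means no entries were found in the last hour.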
Expected behavior
Logs keep flowing to the connected Loki output even when another output is offline.
Your Environment
- Version used: Fluent Bit v1.6.9, /grafana-loki:2.1.0
- Operating System and version: Linux aarch64
About this issue
- State: closed
- Created 3 years ago
- Reactions: 5
- Comments: 24 (8 by maintainers)
Hi, this is still happening with fluent-bit 1.9.5 and loki 2.5.0.
Config:
This issue was closed because it has been stalled for 5 days with no activity.
I believe this should be reopened and looked at thoroughly
Here is my fluent-bit conf:
The error 500 happens somewhat randomly (probably linked to https://github.com/grafana/loki/issues/6227) and makes the output stop permanently. If I delete the loki service for a few minutes, fluent-bit throws:
and the plugin does not recover. Errors 400 also happen for different reasons, but the loki output plugin recovers afterwards.
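One way to check whether an output has really stopped, rather than only retrying slowly, is to compare two snapshots of Fluent Bit's built-in monitoring endpoint. This sketch is not from the thread. It assumes the HTTP server is enabled (`HTTP_Server On`, port 2020, whereas the config in the original report has it Off) and that `/api/v1/metrics` returns per-output counters including `proc_records`. The function names are hypothetical.

```python
import json
import urllib.request

FLUENT_BIT_URL = "http://localhost:2020"  # assumes HTTP_Server On, HTTP_Port 2020


def fetch_metrics(base_url=FLUENT_BIT_URL):
    """Fetch one snapshot of Fluent Bit's JSON metrics from a live instance."""
    with urllib.request.urlopen(f"{base_url}/api/v1/metrics", timeout=10) as resp:
        return json.load(resp)


def stalled_outputs(previous, current):
    """Return names of outputs whose proc_records did not grow between snapshots."""
    stalled = []
    prev_out = previous.get("output", {})
    for name, counters in current.get("output", {}).items():
        before = prev_out.get(name, {}).get("proc_records", 0)
        if counters.get("proc_records", 0) <= before:
            stalled.append(name)
    return sorted(stalled)
```

Taking one snapshot, waiting a minute or so while the inputs are known to be producing records, and then passing both snapshots to `stalled_outputs` would list an output such as `loki.0` if it has stopped delivering. The `errors` and `retries` counters in the same payload can help tell a permanent stop from ongoing retries.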
Keeping this up because the issue is still present
Actually no, it's working. It has nothing to do with the exponential retry strategy values; after a moment it reconnects successfully.
Same here
I second this; experiencing the same issue with Loki.
Still interested in this issue
still interested in this issue