fluentd: memory usage seems to be growing indefinitely with simple forward configuration
Describe the bug
Memory usage is growing indefinitely with a very simple @type forward configuration. I'm shipping about 10-30G of logs daily, mostly during evening hours. td-agent starts out using about 100M of RES memory, but after about 12h it's at 500M, even though for half of that time it's mostly idle because there are not many logs during the night. Memory is never freed. When I switch to any 'local' output like file or stdout, memory usage is very stable. I've seen the same behavior with the elasticsearch output, so I guess it may be something connected with network outputs … or just my stupidity 😃 There are no errors in the log file. I've tried setting RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9, but it didn't fix the problem; memory usage is still growing. I have two workers in my configuration, and this problem only occurs on the one with a high amount of logs generated during day hours (worker 0). That worker is reading about 50 files at once. It may be relevant that I get a lot of "pattern not matched" warnings
- I'm in the middle of standardizing the log format for all apps.
To Reproduce
Run with a high volume of imperfectly formatted logs (many lines that do not match the parse pattern).
Expected behavior
Stable memory usage.
Your Environment
- Fluentd or td-agent version:
td-agent 1.9.2
- Operating system:
NAME="Amazon Linux" VERSION="2" ID="amzn" ID_LIKE="centos rhel fedora" VERSION_ID="2" PRETTY_NAME="Amazon Linux 2" ANSI_COLOR="0;33" CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2" HOME_URL="https://amazonlinux.com/"
- Kernel version:
4.14.171-136.231.amzn2.x86_64
Your Configuration
<system>
  workers 2
  log_level warn
</system>
<worker 0>
  <source>
    @type tail
    path "/somepath/*/*.log"
    read_from_head true
    pos_file /var/log/td-agent/mpservice-es-pos-file
    tag mpservice-raw.*
    enable_stat_watcher false
    <parse>
      @type multiline
      time_key time
      time_format %Y-%m-%d %H:%M:%S,%L
      timeout 0.5
      format_firstline /^\d{4}-\d{2}-\d{2}/
      format1 /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \| (?<level>.*) \| (?<class>.*) \| (?<thread>.*) \| (?<message>.*)/
    </parse>
  </source>
  <match *.**>
    @type forward
    <server>
      host ***
      port 24224
    </server>
    <buffer>
      @type memory
      flush_interval 2s
      flush_thread_count 2
    </buffer>
  </match>
</worker>
# webapps ES
<worker 1>
  <source>
    @type tail
    path "/somepath/*/*[a-z].log"
    read_from_head true
    pos_file /var/log/td-agent/webapps-es-pos-file
    tag webapps-raw.*
    enable_stat_watcher false
    <parse>
      @type multiline
      format_firstline /^\d{4}-\d{2}-\d{2}/
      format1 /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \| (?<level>.*) \| (?<class>.*) \| (?<thread>.*) \| (?<message>.*)/
    </parse>
  </source>
  <match *.**>
    @type forward
    <server>
      host ***
      port 24224
    </server>
    <buffer>
      @type memory
      flush_interval 2s
      flush_thread_count 2
    </buffer>
  </match>
</worker>
Your Error Log
no errors in logs
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 4
- Comments: 19 (11 by maintainers)
@4ndr4s Does this happen with the elasticsearch and file buffer combo? Your graph shows it is not a memory leak. The issue author and you both use a memory buffer, so if the incoming speed is faster than the output speed, memory usage grows.
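For illustration, here is a minimal sketch of how the forward <match> block could be switched to a file buffer with explicit limits, in the spirit of the comment above. The buffer path and size values are assumptions chosen for the example, not taken from the original report:

<match *.**>
  @type forward
  <server>
    host ***
    port 24224
  </server>
  <buffer>
    # buffer chunks on disk instead of in the worker's memory
    @type file
    # hypothetical buffer directory; any path writable by td-agent works
    path /var/log/td-agent/buffer/forward
    # example caps, not values from the report
    total_limit_size 1g
    chunk_limit_size 8m
    flush_interval 2s
    flush_thread_count 2
    # when the limit is reached, drop the oldest chunk rather than grow further
    overflow_action drop_oldest_chunk
  </buffer>
</match>

With a file buffer, any backlog accumulates on disk rather than in the process's resident memory, and total_limit_size together with overflow_action bounds how far it can grow when the downstream forward target is slower than the input.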