fluent-bit: Fluentbit stuck with high data load
Bug Report
Describe the bug
When providing a high data load( 2000 files, each file having 10000 1.5kb of multiline records) to tail plugin, fluent-bit slows down and there is a huge spike in memory, resulting in latency increase. The increase in memory is linear to the increase in down chunks in filesystem storage.
Once the tail plugin completes ingesting all the records, fluent-bit starts processing at the normal rate again. When the rss memory reaches the mem_buf_limit, the cache memory starts to spike up and cpu utilization goes down, resulting in the lower processing rate of input chunks.
To Reproduce
High influx of data to tail plugin leading to memory spike and slowing down of tail plugin’s processing of data.
- Steps to reproduce the problem:
git clone https://github.com/arpantw/fluent-bit
cd fluent-bit
git checkout bug/tail-plugin-stuck
cd lib/chunkio
docker-compose up --build -d
Navigate to localhost:9090 and monitor fluent bit with the given queries below in the screenshots
Expected behavior Fluentbit should not get stuck and continue processing data
Screenshots


When the ip plugin completes ingesting the data, the op plugin starts processing it at ~15:30-16:20



Your Environment
- Version used: 1.8.2
- Configuration:
[SERVICE]
Flush 1
Daemon off
Log_Level info
Parsers_File parser.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
storage.path /storage
storage.metrics on
storage.backlog.mem_limit 100M
[INPUT]
Name tail
Path /data/**/*.txt
Path_Key file_name
Tag perf
Buffer_Chunk_Size 96KB
Buffer_Max_Size 96KB
Skip_Long_Lines On
Multiline On
Rotate_Wait 9
Parser_Firstline multiline_perf
Refresh_Interval 1
Mem_Buf_Limit 100MB
storage.type filesystem
DB /storage/tail-0.db
DB.locking true
Read_from_Head True
[OUTPUT]
Name stdout
Match perf
[PARSER]
Name multiline_perf
Format regex
Regex /^(?<key>\d{10,}.*)/
- Environment name and version: Docker
- Operating System and version: mac OS 10.15.7
- Plugins: Tail
Additional context
- Used cio perf tool to generate the data load
- We’ve also seen that this allocation of memory to a chunk happens very frequently, and can be possibly optimised to allocate at a higher offset so that there are fewer memory re-allocations for a particular chunk.
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 4096 to 36864
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 36864 to 69632
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 69632 to 102400
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 102400 to 135168
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 135168 to 167936
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 167936 to 200704
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 200704 to 233472
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 233472 to 266240
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 266240 to 299008
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 299008 to 331776
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 331776 to 364544
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 364544 to 397312
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 397312 to 430080
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 430080 to 462848
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 462848 to 495616
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 495616 to 528384
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 528384 to 561152
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 561152 to 593920
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 593920 to 626688
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 626688 to 659456
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 659456 to 692224
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 692224 to 724992
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 724992 to 757760
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 757760 to 790528
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 790528 to 823296
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 823296 to 856064
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 856064 to 888832
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 888832 to 921600
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 921600 to 954368
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 954368 to 987136
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 987136 to 1019904
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1019904 to 1052672
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1052672 to 1085440
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1085440 to 1118208
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1118208 to 1150976
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1150976 to 1183744
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1183744 to 1216512
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1216512 to 1249280
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1249280 to 1282048
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1282048 to 1314816
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1314816 to 1347584
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1347584 to 1380352
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1380352 to 1413120
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1413120 to 1445888
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1445888 to 1478656
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1478656 to 1511424
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1511424 to 1544192
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1544192 to 1576960
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1576960 to 1609728
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1609728 to 1642496
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1642496 to 1675264
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1675264 to 1708032
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1708032 to 1740800
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1740800 to 1773568
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1773568 to 1806336
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1806336 to 1839104
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1839104 to 1871872
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1871872 to 1904640
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1904640 to 1937408
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1937408 to 1970176
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 1970176 to 2002944
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [input chunk] skip ingesting data with 0 bytes
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 2002944 to 2035712
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 2031616 to 2064384
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 2035712 to 2068480
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 2039808 to 2072576
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 2043904 to 2076672
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.215382300.flb mapped OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] alloc_size from 2048000 to 2080768
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] [cio file] synced at: tail.0/1-1628842060.215382300.flb
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.308437500.flb adjusting size OK
fluent-bit_1 | [2021/08/13 08:07:40] [debug] [storage] tail.0:1-1628842060.308437500.flb mapped OK
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 11
- Comments: 31 (20 by maintainers)
Commits related to this issue
- in-tail: release event loop between files (#3947) This patch fixes a problem where in_tail will block the event loop until all files have been processed. This problem becomes more obvious when the nu... — committed to pcolazurdo/fluent-bit by pcolazurdo 3 years ago
- in-tail: release event loop between files (#3947) This patch fixes a problem where in_tail will block the event loop until all files have been processed. This problem becomes more obvious when the nu... — committed to pcolazurdo/fluent-bit by pcolazurdo 3 years ago
- in-tail: release event loop between files (#3947) This patch fixes a problem where in_tail will block the event loop until all files have been processed. This problem becomes more obvious when the nu... — committed to PettitWesley/fluent-bit by pcolazurdo 3 years ago
- in_tail: enhance handling of high number of static files (#3947) When processing static files, we don't know what kind of content they have and what kind of 'latency' might add to process all of them... — committed to fluent/fluent-bit by edsiper 2 years ago
- in_tail: enhance handling of high number of static files (#3947) When processing static files, we don't know what kind of content they have and what kind of 'latency' might add to process all of them... — committed to fluent/fluent-bit by edsiper 2 years ago
- in_tail: enhance handling of high number of static files (#3947) When processing static files, we don't know what kind of content they have and what kind of 'latency' might add to process all of them... — committed to fluent/fluent-bit by edsiper 2 years ago
New PR has been submitted to address the issue: #4590
So one thing we thought of was that may be the disk is saturated and IO contention is a problem. So I tried disabling logging for fluent bit since that ultimately goes to disk via the json-file log driver. But this didn’t help. Results are exactly the same.