vector: High CPU and memory consumption with buffer enabled

Vector Version

vector 0.13.0 (v0.13.0 x86_64-unknown-linux-gnu 2021-04-21)

Vector Configuration File

[api]
  enabled = true

[sources.journal]
  type = "journald"
  exclude_units = ["kernel","vector","telegraf","chronyd","crond","init.scope","dnf-makecache.service","rsyslog"]

[sources.app_std]
  type = "file"
  include = ["/opt/app/*/logs/stdout.log"]
  fingerprint.strategy = "device_and_inode"

  [sources.app_std.multiline]
    start_pattern = '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    mode = "halt_before"
    condition_pattern = '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    timeout_ms = 999999999

[sources.app]
  type = "file"
  include = ["/opt/app/*/logs/*/*.log"]
  fingerprint.strategy = "device_and_inode"

  [sources.app.multiline]
    start_pattern = '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    mode = "halt_before"
    condition_pattern = '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    timeout_ms = 999999999

[transforms.app_filter]
  type = "filter"
  inputs = ["app"]
  condition.type = "check_fields"
  condition."message.not_contains" = "NOT_FOUND - no exchange"
  condition."message.not_regex" = " (INFO|DEBUG|TRACE) "

[sinks.app_prod]
  type = "vector"
  inputs = ["app_filter","app_std"]
  address = "server:5001"
  healthcheck = true
#  buffer.max_size = 10100000
#  buffer.type = "disk"
#  buffer.when_full = "block"

[sinks.system]
  type = "vector"
  inputs = ["journal"]
  address = "server:5000"
  healthcheck = true
#  buffer.max_size = 10100000
#  buffer.type = "disk"
#  buffer.when_full = "block"

Debug Output

https://gist.github.com/alpi-ua/0a128a6868f20595538908a50e8e5871

Expected Behavior

Vector releases resources when no logs are being produced, and works properly with buffers enabled.

Actual Behavior

Vector consumes resources while idle when buffers are enabled. Without the disk buffer, Vector works fine.
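
For reference, the failing setup is presumably the configuration above with the commented-out buffer lines enabled. A sketch of one sink in that form, assuming those three lines were the only change:

```toml
[sinks.app_prod]
  type = "vector"
  inputs = ["app_filter", "app_std"]
  address = "server:5001"
  healthcheck = true
  # Disk buffer settings, uncommented from the configuration above
  buffer.type = "disk"
  buffer.max_size = 10100000   # bytes (about 10 MB)
  buffer.when_full = "block"   # apply backpressure instead of dropping events
```

The same three lines would apply to `sinks.system`.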

Additional info:

With lsof I can see that Vector is holding on to old buffer files that are supposed to be deleted once it has a stable connection to its output. After purging the buffer directory, new files are created.

References

https://github.com/timberio/vector/issues/3594 https://github.com/timberio/vector/discussions/7159

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 23 (12 by maintainers)

Most upvoted comments

@jszwedko forgive me, as I was saying above, I switched to a memory-based buffer and CPU usage appears to be much improved. I never noticed issues with memory, but I wasn't looking. I would like to note, though, that since moving to the memory buffer a number of days ago, CPU usage has stayed consistent, whereas before much more CPU would have been chewed up by now.

@jszwedko while my source (file) can be checkpointed, my files rotate out quickly and I was depending on Vector to lower the chance of losing data. For now, I have switched to a memory-based buffer and CPU usage appears to be much improved. That said, this is not viable for us for long-term usage.
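
The memory-buffer workaround mentioned in this comment might look like the sketch below. The commenter's own configuration is not shown in the thread, so the reporter's `app_prod` sink is used as a stand-in, and `max_events = 500` is an assumed, illustrative value.

```toml
[sinks.app_prod]
  type = "vector"
  inputs = ["app_filter", "app_std"]
  address = "server:5001"
  healthcheck = true
  # Workaround: in-memory buffer instead of the disk buffer
  buffer.type = "memory"
  buffer.max_events = 500      # assumption: illustrative value, not from the thread
  buffer.when_full = "block"
```

Events held in a memory buffer are lost if Vector stops, which is why this is only a stopgap when source files rotate out quickly.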

@ktff do you mind taking a look at this one since you’ve been working with the buffers? I wasn’t able to reproduce it, but maybe you have some insight.

The current data in the directory is only about 30 MB.

I seem to be having the same issue with 0.13.1 (v0.13.1 x86_64-unknown-linux-musl 2021-04-29).