fluent-bit: [http_client] cannot increase buffer on k8s

Bug Report

Describe the bug

We occasionally get warnings like this thrown by fluent-bit 1.8.2:

[ warn] [http_client] cannot increase buffer: current=512000 requested=544768 max=512000

Note that we do not use kinesis firehose.

To Reproduce

  • Not reliably reproducible. It does not seem to depend on the number of pods or the general load of the cluster, and we could not determine a specific trigger.

Expected behavior

  • No warnings are thrown.

Your Environment

  • Version used: 1.8.2 (installed via official helm chart)
  • Configuration:
    service: |
      [SERVICE]
          Flush 5
          Daemon Off
          Log_Level info
          Parsers_File parsers.conf
          Parsers_File custom_parsers.conf
          HTTP_Server On
          HTTP_Listen 0.0.0.0
          HTTP_Port 2020
          storage.path /var/log/fluentbit-buffer/
          storage.metrics On
    inputs: |
      [INPUT]
          Name tail
          Path /var/log/containers/*.log
          Exclude_Path /var/log/containers/calico-node*.log
          storage.type  filesystem
          Parser my-parser
          Tag kube.*
          Mem_Buf_Limit 5MB
          Skip_Long_Lines On
    filters: |
      [FILTER]
          Name kubernetes
          Match kube.*
          Merge_Log On
          Keep_Log Off
          Annotations Off
          K8S-Logging.Parser On
          K8S-Logging.Exclude On
    outputs: |
      [OUTPUT]
          Name es
          Match kube.*
          Host <redacted>
          Logstash_Format On
          Retry_Limit 10
          Trace_Error On
          Replace_Dots On
          Suppress_Type_Name On
    customParsers: |
      [PARSER]
          Name my-parser
          Format regex
          Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$
  • Environment name and version (e.g. Kubernetes? What version?): Kubernetes 1.19.11
  • Server type and version: MS AKS cluster
  • Filters and plugins: tail, kubernetes, es

Additional context

I’m starting to think the fluent-bit response buffer might be a little too small for some of the responses Elasticsearch sends, especially when an error occurs; Elasticsearch’s JSON responses can get quite large…
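For illustration only: if that hypothesis is right, the relevant knob would be the es output’s Buffer_Size (the docs quoted further down in this thread describe it as the buffer used to read the Elasticsearch response). A minimal sketch, not a verified fix; the 1M value is just an example:

    [OUTPUT]
        Name         es
        Match        kube.*
        Host         <redacted>
        # raise the response read buffer; the max=512000 in the warning points at a 512k cap
        Buffer_Size  1M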

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 9
  • Comments: 20 (4 by maintainers)

Most upvoted comments

you’re hitting this error in the flb_http_client code:

https://github.com/fluent/fluent-bit/blob/651e330609182970f11d14603bf6d58ae64f513e/src/flb_http_client.c#L1199-L1202
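the gist of that check is simply a hard cap on how far the response buffer is allowed to grow. a minimal standalone sketch (paraphrased, with invented names like resp_buf/buf_increase; not the actual fluent-bit source):

/*
 * Minimal standalone sketch of the cap check, NOT the real flb_http_client
 * code. All names here (resp_buf, buf_increase, ...) are invented for
 * illustration.
 */
#include <stdio.h>
#include <stdlib.h>

struct resp_buf {
    char  *data;
    size_t size;      /* current allocation */
    size_t size_max;  /* hard cap, e.g. what a plugin sets via flb_http_buffer_size() */
};

/* Try to grow the buffer; refuse (and warn) once the cap would be exceeded. */
static int buf_increase(struct resp_buf *b, size_t requested)
{
    if (requested > b->size_max) {
        fprintf(stderr,
                "[ warn] [http_client] cannot increase buffer: "
                "current=%zu requested=%zu max=%zu\n",
                b->size, requested, b->size_max);
        return -1;
    }

    char *tmp = realloc(b->data, requested);
    if (!tmp) {
        return -1;
    }
    b->data = tmp;
    b->size = requested;
    return 0;
}

int main(void)
{
    /* e.g. a 32k cap, like the kubernetes filter's default Buffer_Size */
    struct resp_buf b = { malloc(32000), 32000, 32000 };

    /* an HTTP response bigger than the cap triggers the warning */
    if (buf_increase(&b, 64768) != 0) {
        /* at this point the caller has to give up on (or truncate) the response */
    }

    free(b.data);
    return 0;
}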

flb_http_client is used in a few different places by various fluentbit components (e.g. when shipping logs to stackdriver/splunk/newrelic/etc. or when interrogating the kubernetes api).

only a few of the modules that use flb_http_client override the default buffer size (of 4k):

https://github.com/fluent/fluent-bit/search?q=flb_http_buffer_size

one of them is the filter_kubernetes module:

https://github.com/fluent/fluent-bit/blob/1fa0e94a09e4155f8a6d8a0efe36a5668cdc074e/plugins/filter_kubernetes/kube_meta.c#L387

in our case the problem was that we have a fairly large kubernetes cluster. so when retrieving pod metadata from the kubernetes api, the flb_http_client ran out of buffer space.

the solution was:

[FILTER]
    Name         kubernetes
    Buffer_Size  512k   # was 32k
    ...

(512k is probably a bit overkill but it’ll allow us to grow the cluster without ever having to worry about this again)
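side note: if i read the filter docs correctly, Buffer_Size also accepts 0, which removes the cap entirely and lets the buffer grow as needed. that avoids guessing a fixed size, at the cost of an unbounded allocation for a single API response:

[FILTER]
    Name         kubernetes
    # 0 = no limit, buffer expands as needed; verify against your fluent-bit version's docs
    Buffer_Size  0
    ...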

@gabileibo - yep! We noticed that and also that INPUT.tail accepts both a Buffer_Chunk_Size and Buffer_Max_Size. We ended up with:

    inputs: |
      [INPUT]
          Name tail
          Path /var/log/containers/*.log
          multiline.parser docker, cri
          Tag kube.*
          Mem_Buf_Limit 5MB
          Skip_Long_Lines On
          Buffer_Chunk_Size 64KB
          Buffer_Max_Size 128KB
      [INPUT]
          Name systemd
          Tag host.*
          Systemd_Filter _SYSTEMD_UNIT=kubelet.service
          Read_From_Tail On
    filters: |
      [FILTER]
          Name kubernetes
          Match kube.*
          Merge_Log On
          Keep_Log Off
          K8S-Logging.Parser On
          K8S-Logging.Exclude On
          Buffer_Size 256KB

Also found the upstream values.yaml helpful.

Not ideal since it involves some code duplication, but it makes the warnings & errors go bye-bye.

This should be reopened; the problem still persists with fluent-bit helm chart 0.19.20 (app version 1.8.13).

I don’t believe that OUTPUT.Buffer_Size is the culprit here.

From https://docs.fluentbit.io/manual/pipeline/outputs/elasticsearch, the Elasticsearch output’s Buffer_Size is

"the buffer size used to read the response from the Elasticsearch HTTP service."

It seems apparent the buffer in question is a different one.

For example, we set Buffer_Size 64KB for the Elasticsearch output, re-deployed, and still see:

[2022/03/09 17:49:31] [ warn] [http_client] cannot increase buffer: current=32000 requested=64768 max=32000

all over the logs.

Notice the max=32000 displayed here, which indicates this buffer is not influenced by the Elasticsearch output’s Buffer_Size at all…
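If that max=32000 lines up with the kubernetes filter’s 32k default (as it seems to), then the knob that actually matters is the filter’s Buffer_Size, as described earlier in this thread. A minimal sketch, values illustrative only:

[FILTER]
    Name         kubernetes
    Match        kube.*
    # bump from the 32k default; size to your cluster
    Buffer_Size  256k

After raising it, the max= value in any remaining warnings should change accordingly, which is a quick way to confirm you are tuning the right buffer.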

on a related note i’d like to give kudos to the maintainers/contributors of fluentbit! 💪

i found the fluentbit source code to be very well organized and therefore very easy to understand.

it took me no more than half an hour to track down what our problem was by looking through the source code… and i’m an absolute n00b when it comes to fluentbit.

I don’t understand. Isn’t 2MB valid?

Looks correct to me: [screenshot]