aws-for-fluent-bit: Frequent errors when using output plugin cloudwatch_logs

Describe the question/issue

I know this is not a new issue, but I’m wondering whether the root cause has been identified. We are using the cloudwatch_logs plugin to send logs from our Pods to AWS CloudWatch. The errors look like this:

[2021/12/01 01:53:50] [ warn] [engine] failed to flush chunk '1-1638323628.679735747.flb', retry in 10 seconds: task_id=0, input=emitter_for_rewrite_tag.5 > output=http.2 (out_id=2)
[2021/12/01 02:02:41] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096
[2021/12/01 02:02:41] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Recieved code 200 but response was invalid, x-amzn-RequestId header not found
[2021/12/01 02:02:41] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:02:41] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:02:41] [ warn] [engine] failed to flush chunk '1-1638324161.187277303.flb', retry in 7 seconds: task_id=0, input=tail.0 > output=cloudwatch_logs.3 (out_id=3)
[2021/12/01 02:02:55] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096
[2021/12/01 02:02:55] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Recieved code 200 but response was invalid, x-amzn-RequestId header not found
[2021/12/01 02:02:55] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:02:55] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:02:55] [ warn] [engine] failed to flush chunk '1-1638324175.42999390.flb', retry in 7 seconds: task_id=0, input=tail.0 > output=cloudwatch_logs.3 (out_id=3)
[2021/12/01 02:04:29] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096
[2021/12/01 02:04:29] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Recieved code 200 but response was invalid, x-amzn-RequestId header not found
[2021/12/01 02:04:29] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:04:29] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:04:29] [ warn] [engine] failed to flush chunk '1-1638324268.678872073.flb', retry in 9 seconds: task_id=0, input=tail.0 > output=cloudwatch_logs.3 (out_id=3)
[2021/12/01 02:04:38] [ warn] [http_client] malformed HTTP response from logs.eu-central-1.amazonaws.com:443 on connection #34
[2021/12/01 02:04:38] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:04:38] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:04:38] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:04:38] [ warn] [engine] chunk '1-1638324268.678872073.flb' cannot be retried: task_id=0, input=tail.0 > output=cloudwatch_logs.3
[2021/12/01 02:07:49] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096
[2021/12/01 02:07:49] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Recieved code 200 but response was invalid, x-amzn-RequestId header not found
[2021/12/01 02:07:49] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:07:49] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:07:49] [ warn] [engine] failed to flush chunk '1-1638324469.352128961.flb', retry in 11 seconds: task_id=0, input=tail.0 > output=cloudwatch_logs.3 (out_id=3)
[2021/12/01 02:08:00] [ warn] [http_client] malformed HTTP response from logs.eu-central-1.amazonaws.com:443 on connection #102
[2021/12/01 02:08:00] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:08:00] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:08:00] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:08:00] [ warn] [engine] chunk '1-1638324469.352128961.flb' cannot be retried: task_id=0, input=tail.0 > output=cloudwatch_logs.3
[2021/12/01 02:10:31] [ warn] [http_client] malformed HTTP response from logs.eu-central-1.amazonaws.com:443 on connection #102
[2021/12/01 02:10:31] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:10:31] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send log events
[2021/12/01 02:10:31] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Failed to send events
[2021/12/01 02:10:31] [ warn] [engine] failed to flush chunk '1-1638324630.866492143.flb', retry in 9 seconds: task_id=0, input=tail.0 > output=cloudwatch_logs.3 (out_id=3)
[2021/12/01 02:11:41] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096
[2021/12/01 02:11:41] [error] [output:cloudwatch_logs:cloudwatch_logs.3] Recieved code 200 but response was invalid, x-amzn-RequestId header not found
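
To see which failure dominates in a log like the one above, it can help to tally entries by level, component, and message. The following is a minimal sketch in Python, not part of Fluent Bit or the cloudwatch_logs plugin. The names `ENTRY`, `parse_entries`, and `summarize` are hypothetical, and the regular expression assumes the `[timestamp] [level] [component] message` layout seen in these lines. It also tolerates several entries pasted onto one line.

```python
import re
from collections import Counter

# One log entry: "[YYYY/MM/DD HH:MM:SS] [level] [component] message".
# The message runs until the next timestamp or the end of the text, so
# entries that were pasted onto a single line are still split correctly.
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\]\s+"
    r"\[\s*(?P<level>\w+)\]\s+"
    r"\[(?P<component>[^\]]+)\]\s+"
    r"(?P<message>.*?)(?=\s*(?:\[\d{4}/\d{2}/\d{2} |\Z))",
    re.DOTALL,
)


def parse_entries(text):
    """Return (level, component, message) tuples for every entry in text."""
    return [
        (m["level"], m["component"], " ".join(m["message"].split()))
        for m in ENTRY.finditer(text)
    ]


def summarize(text):
    """Count entries, masking chunk names and numbers so repeats group together."""
    counts = Counter()
    for level, component, message in parse_entries(text):
        key = re.sub(r"'[^']*\.flb'", "'<chunk>'", message)
        key = re.sub(r"\d+", "N", key)
        counts[(level, component, key)] += 1
    return counts


# A short excerpt from the log above, used only as sample input.
sample = (
    "[2021/12/01 02:02:41] [ warn] [http_client] cannot increase buffer: "
    "current=4096 requested=36864 max=4096 "
    "[2021/12/01 02:02:41] [error] [output:cloudwatch_logs:cloudwatch_logs.3] "
    "Failed to send log events\n"
    "[2021/12/01 02:02:55] [ warn] [http_client] cannot increase buffer: "
    "current=4096 requested=36864 max=4096\n"
)

for (level, component, message), n in summarize(sample).most_common():
    print(n, level, component, message)
```

Feeding the full log into `summarize` instead of `sample` gives one count per distinct message shape, which makes it easier to see whether the buffer warnings or the malformed-response warnings occur more often.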

`[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Parser custom_cri
    Tag kube.*
    Mem_Buf_Limit 5MB
    Skip_Long_Lines On
[INPUT]
    Name systemd
    Tag host.*
    Systemd_Filter _SYSTEMD_UNIT=kubelet.service
    Read_From_Tail On

[FILTER]
    Name record_modifier
    Match *
    Record cluster_identifier {{ .Values.cluster }}
[FILTER]
    Name kubernetes
    Match kube.*
    Merge_Log On
    Keep_Log Off
    K8S-Logging.Parser On
    K8S-Logging.Exclude On
[FILTER]
    Name                lua
    Match               kube.*
    script              kubernetes_map_keys.lua
    call                kubernetes_map_keys
[FILTER]
    Name                lua
    Match               *
    script              kubernetes_map_keys.lua
    call                add_source_type
[FILTER]
    Name                lua
    Match               *
    script              kubernetes_map_keys.lua
    call                add_aligned_sent_received
[FILTER]
    Name                rewrite_tag
    Match               kube.*
    Rule                $source_type ^(istio-envoy-proxy-access-log)$ istio-envoy false

[OUTPUT]
    Name s3
    Match kube.*
    bucket upscale-sre-fluentbit-logs-dev-use1
    region {{ .Values.region }}
    total_file_size 100M
    upload_timeout 5m
    s3_key_format /%Y/%m/%d/%H/
    s3_key_format_tag_delimiters .

[OUTPUT]
    Name                cloudwatch_logs
    Match               kube.*
    region              {{ .Values.region }}
    log_group_name      fluent-bit-cloudwatch/{{ .Values.clusterName }}
    log_stream_prefix   from-fluent-bit-
    auto_create_group   true`

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 3
  • Comments: 29 (15 by maintainers)

Most upvoted comments

@vkadi the recommendations and Fluent Bit config options can be used in EKS as well. The Fluent Bit config language is the same no matter where you deploy it.

@yangyang919 Sorry, I’m not really sure what’s causing this for you.

While working on another issue, I recently did some testing/benchmarking and memory leak checking, and I found that our latest stable version has very stable memory usage and does not have any leaks: https://github.com/aws/aws-for-fluent-bit/blob/mainline/AWS_FOR_FLUENT_BIT_STABLE_VERSION

Please try that version if you haven’t already.

Hi @PettitWesley ,

We have upgraded Fluent Bit to version 1.8.10, and new issues have appeared: the Pod gets OOMKilled and goes into frequent back-off restarts. If we remove the cloudwatch_logs output section, everything is normal.

Our config is below. Do you see any problem with it?

`[SERVICE]
    Daemon Off
    Flush 1
    Log_Level warn
    Parsers_File parsers.conf
    Parsers_File custom_parsers.conf
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020
    Health_Check On

[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Parser custom_cri
    Tag kube.*
    Mem_Buf_Limit 128MB
    Skip_Long_Lines On

[INPUT]
    Name systemd
    Tag host.*
    Systemd_Filter _SYSTEMD_UNIT=kubelet.service
    Read_From_Tail On

[FILTER]
    Name record_modifier
    Match *
    Record cluster_identifier stage-use1-saml

[FILTER]
    Name kubernetes
    Match kube.*
    Merge_Log On
    Keep_Log Off
    K8S-Logging.Parser On
    K8S-Logging.Exclude On

[FILTER]
    Name                lua
    Match               kube.*
    script              kubernetes_map_keys.lua
    call                kubernetes_map_keys

[FILTER]
    Name                lua
    Match               *
    script              kubernetes_map_keys.lua
    call                add_source_type

[FILTER]
    Name                lua
    Match               *
    script              kubernetes_map_keys.lua
    call                add_aligned_sent_received

[FILTER]
    Name                rewrite_tag
    Match               kube.*
    Rule                $source_type ^(istio-envoy-proxy-access-log)$ istio-envoy false

[OUTPUT]
    Name http
    Match kube.*
    Retry_Limit False
    Host host
    port 443
    uri /
    http_User user
    http_Passwd pwd
    format json
    tls on
    tls.debug 1

[OUTPUT]
    Name http
    Match approuter
    Retry_Limit False
    Host host
    port 443
    uri /
    http_User user
    http_Passwd pwd
    format json
    tls on
    tls.debug 1

[OUTPUT]
    Name http
    Match istio-envoy
    Retry_Limit False
    Host host
    port 443
    uri /customindex/istio-envoy
    http_User user
    http_Passwd pwd
    format json
    tls on
    tls.debug 1






[OUTPUT]
    Name                cloudwatch_logs
    Match               kube.*
    region              us-east-1
    log_group_name      fluent-bit-cloudwatch/stage-use1
    log_stream_prefix   from-fluent-bit-
    auto_create_group   true

[OUTPUT]
    Name                cloudwatch_logs
    Match               approuter
    region              us-east-1
    log_group_name      fluent-bit-cloudwatch/stage-use1
    log_stream_prefix   from-fluent-bit-
    auto_create_group   true`