fluent-bit: S3 output intermittently fails with SignatureDoesNotMatch, broken pipe, or HTTP version errors
Bug Report
Describe the bug
I have Fluent Bit deployed in a Kubernetes cluster sending a large volume of logs to an S3 bucket. Most logs are transmitted successfully, but Fluent Bit regularly logs a “PutObject request failed” error. (This is odd because use_put_object is set to false.) Fluent Bit logs the HTTP 403 response it received from S3, which contains this error text: “The request signature we calculated does not match the signature you provided. Check your key and signing method.”
There are also a lot of “broken pipe” errors, although it is unclear whether they are related.
Even more bizarrely, there are intermittent HTTP 505 “Version not supported” errors. I have no idea how these could be intermittent; surely the HTTP version is always the same?
To Reproduce
- Rubular link if applicable: N/A
- Example log message if applicable:
[2023/02/28 23:16:52] [error] [output:s3:s3.1] PutObject request failed
[2023/02/28 23:16:52] [error] [output:s3:s3.1] PutObject API responded with error='SignatureDoesNotMatch', message='The request signature we calculated does not match the signature you provided. Check your key and signing method.'
[2023/02/28 23:16:52] [error] [/src/fluent-bit/src/flb_http_client.c:1201 errno=32] Broken pipe
[2023/02/28 23:31:05] [error] [output:s3:s3.1] PutObject API responded with error='HttpVersionNotSupported', message='The HTTP version specified is not supported.'
[2023/02/28 23:31:05] [error] [output:s3:s3.1] Raw PutObject response: HTTP/1.1 505 HTTP Version not supported
- Steps to reproduce the problem: I’m afraid I don’t have a concise set of steps to reproduce; our environment is fairly large and complex, and this issue only seems to appear under heavy load.
Expected behavior: Fluent Bit sends log files to the S3 bucket.
Screenshots: N/A
Your Environment
- Version used: 2.0.9
- Configuration:
[OUTPUT]
    Name              s3
    Match             *
    bucket            ${log_bucket_name}
    region            ${log_bucket_region}
    total_file_size   10M
    s3_key_format     /${cluster_name}/$TAG/%Y/%m/%d/%H/%M/%S
    use_put_object    false
- Environment name and version (e.g. Kubernetes? What version?): AWS EKS cluster, Kubernetes 1.22
- Server type and version: EC2 instances of various types
- Operating System and version: Bottlerocket 1.12.0
- Filters and plugins: tail input, modify filter, kubernetes filter
Additional context: I know that Fluent Bit retries requests to S3, but I am seeing occasional messages like this:
[2023/02/28 23:30:56] [ warn] [output:s3:s3.1] Chunk file failed to send 5 times, will not retry
So I am concerned that I am losing log messages.
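As a possible mitigation while the root cause is investigated, the retry budget on the output can be raised. This is only a sketch, assuming the generic Retry_Limit scheduler option (which applies to Fluent Bit outputs in general) also covers these S3 chunk uploads; the value 10 is arbitrary:

[OUTPUT]
    Name           s3
    Match          *
    # ... same settings as in the configuration above ...
    Retry_Limit    10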
I realize that these could be three different issues; however, they seem to occur together, and I’m wondering if they could have a common cause.
Incidentally, I found this comment (over a year old) which reports the same behavior: https://github.com/fluent/fluent-bit/issues/4505#issuecomment-1000376903
Could it be that when a chunk is added to the upload queue by add_to_queue, a raw copy of the tag is made at line 1584 into a buffer that is too small to hold the null terminator?
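If that hypothesis is right, the failure mode would look something like the sketch below. This is not the actual Fluent Bit code; the struct and function names are invented for illustration, and it only shows the general missing-terminator pattern and a possible fix:

/* Minimal sketch of the suspected bug class (hypothetical names, not the
 * real fluent-bit source). If the tag buffer is allocated with only tag_len
 * bytes, the copied string has no '\0' terminator, so a later strlen() or
 * "%s" on it reads past the end of the buffer and can append garbage bytes
 * to the S3 key or to the request that was signed. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct queue_entry {
    char  *tag;
    size_t tag_len;
};

/* Suspected pattern: allocation leaves no room for the '\0' terminator. */
static int copy_tag_unterminated(struct queue_entry *e, const char *tag, size_t len)
{
    e->tag = malloc(len);              /* missing the +1 for '\0' */
    if (!e->tag) return -1;
    memcpy(e->tag, tag, len);          /* bytes copied, string not terminated */
    e->tag_len = len;
    return 0;
}

/* Fixed pattern: allocate len + 1 and terminate explicitly. */
static int copy_tag_terminated(struct queue_entry *e, const char *tag, size_t len)
{
    e->tag = malloc(len + 1);
    if (!e->tag) return -1;
    memcpy(e->tag, tag, len);
    e->tag[len] = '\0';
    e->tag_len = len;
    return 0;
}

int main(void)
{
    struct queue_entry e;
    const char *tag = "kube.var.log.containers.app";

    /* The buggy variant copies the bytes but leaves the string unterminated,
     * so it must never be passed to strlen() or printf("%s"). */
    if (copy_tag_unterminated(&e, tag, strlen(tag)) == 0) {
        free(e.tag);
    }

    if (copy_tag_terminated(&e, tag, strlen(tag)) == 0) {
        printf("queued tag: %s\n", e.tag);   /* safe: string is terminated */
        free(e.tag);
    }
    return 0;
}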
I will investigate whether this report is related: https://github.com/aws/aws-for-fluent-bit/issues/541