fluent-bit: S3 Output Compression not working
Bug Report
Describe the bug
Using td-agent-bit version 1.7.8 with the S3 output, the compression setting seems to be ignored, even when using use_put_object true
To Reproduce: Here is my configuration for the s3 output block.
[OUTPUT]
name s3
match *
region us-east-2
bucket my-bucket-name
s3_key_format /fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S/$UUID.gz
use_put_object On
total_file_size 40M
upload_timeout 1m
compression gzip
Regardless of whether the compression setting is absent (implying none) or set to gzip, the uploaded files are always cleartext/uncompressed.
Expected behavior: Logs would be compressed with gzip before upload.
Your Environment
- Version used: 1.7.8
- Configuration: (See above)
- Environment name and version (e.g. Kubernetes? What version?): RPM install
- Server type and version: AWS t3a instance
- Operating System and version: Centos 8, fully patched as of 2021-06-23
- Filters and plugins: none
I can find nothing in the error logs about a failed compression. On every upload I get a 'happy' message: Successfully uploaded object. However, the file is still cleartext. I saw references in @PettitWesley's thread in #2700 that this was working, so I am unsure whether this is a regression or something else.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 2
- Comments: 31 (13 by maintainers)
This is caused by the Content-Encoding: gzip attribute attached to the log.gz file that fluent-bit uploads. When you download a file tagged with Content-Encoding: gzip, the user agent (e.g. Chrome, curl) automatically decodes the content, just as it would a gzipped stream over HTTP, because the Content-Encoding: gzip header is appended to the response headers. So yes, the object actually is compressed on S3. An easy workaround is to remove the .gz extension from s3_key_format. There seems to be no way to turn off Content-Encoding: gzip. WHY CHROME, WHY!?
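One way to check for yourself whether an object is really compressed, independent of any user agent's Content-Encoding handling, is to look at the raw bytes for the gzip magic number. A minimal stdlib sketch; the boto3 call mentioned in the comment is just one (untested here) way to obtain the raw bytes, and the bucket/key would be your own:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream

def looks_gzipped(data: bytes) -> bool:
    """Return True if the byte string starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC

# Demo with local data. In practice, pass the raw object bytes, e.g. from
# boto3: s3.get_object(Bucket=..., Key=...)["Body"].read() -- an API client
# does NOT transparently decode Content-Encoding the way a browser does.
plain = b'{"log": "hello"}\n'
compressed = gzip.compress(plain)

print(looks_gzipped(plain))       # False
print(looks_gzipped(compressed))  # True
```

If this returns True on the raw object bytes, the file is compressed on S3 and any cleartext you see in a browser is just transparent decoding.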
I don’t think this is working. I have a similar configuration to the ones reported before:
And I get my files in S3, e.g. s3://mybucket/fluentbit/log/kube/2022/02/23/01/35/48/15BRQR03.gz
Then I select the file in S3, and under the object actions I choose "Query with S3 Select". In S3 Select I configure it like this:
Note that I select JSON (one record per line) and GZIP compression, as that is the expected format; however, it returns an error saying GZIP is not applicable.
However, if I change the compression to None, the same query returns a proper response:
While I'm on a Mac, these queries run inside AWS and the files never touch my laptop, so I can say with some certainty that the files are not being gzipped.
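Another way to settle this without trusting any HTTP client is to download the object (for example with `aws s3 cp`, which does not apply Content-Encoding decoding) and try to gunzip it locally. A minimal stdlib sketch; the demo uses a temp file, and in practice you would pass the path of the downloaded object:

```python
import gzip
import tempfile

def is_really_gzipped(path: str) -> bool:
    """Try to decompress the whole file; True only if it is valid gzip data."""
    with open(path, "rb") as f:
        raw = f.read()
    try:
        gzip.decompress(raw)
        return True
    except (gzip.BadGzipFile, OSError):
        return False

# Demo on a local temp file standing in for a downloaded S3 object.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(gzip.compress(b'{"log": "hello"}\n'))
    gz_path = f.name

print(is_really_gzipped(gz_path))  # True
```

If this prints False for a downloaded object, the upload really was cleartext, matching the S3 Select result above.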