fluent-bit: Azure storage output append blob fails after 50,000 blocks

Bug Report

Describe the bug

Followed these instructions on how to send logs to an append blob in Azure Storage.

It works great. However, once a blob reaches about 20-30 MB it is no longer appended to. Restarting the pods creates a new blob, which works until the limit is hit again.

To Reproduce

  • Example log message:
[2020/10/16 13:11:34] [error] [output:azure_blob:azure_blob.0] cannot append content to blob
<?xml version="1.0" encoding="utf-8"?><Error><Code>BlockCountExceedsLimit</Code><Message>The committed block count cannot exceed the maximum limit of 50,000 blocks.
RequestId:15cdaf1a-f01e-004f-28bd-a3bd01000000
Time:2020-10-16T13:11:34.7990633Z</Message></Error>
[2020/10/16 13:11:34] [ warn] [engine] chunk '1-1602853882.622679709.flb' cannot be retried: task_id=0, input=tail.0 > output=azure_blob.0

Expected behavior

After an append blob reaches the maximum number of blocks, the plugin should create a new blob and continue rather than fail.

    /// <summary>
    /// Gets or sets a value for a condition that specifies the maximum size allowed for an append blob when a new block is committed. The append
    /// will succeed only if the size of the blob after the append operation is less than or equal to the specified size.
    /// </summary>
    /// <value>The maximum size in bytes, or <c>null</c> if no value is set.</value>
    /// <remarks>This condition only applies to append blobs.</remarks>
    public long? IfMaxSizeLessThanOrEqual { get; set; }

Reference: https://stackoverflow.com/questions/49627812/how-to-handle-append-blob-exception
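Purely for illustration, here is the rollover behavior the report asks for, sketched against the legacy Microsoft.Azure.Storage.Blob (v11) .NET client that the IfMaxSizeLessThanOrEqual snippet above appears to come from. This is not the plugin's code (the azure_blob output is written in C and talks to the Blob REST API directly); the class name, helper name and timestamp-suffix naming scheme are invented for the example:

    using System;
    using System.Threading.Tasks;
    using Microsoft.Azure.Storage;
    using Microsoft.Azure.Storage.Blob;

    static class AppendBlobRollover
    {
        // Hypothetical helper: append to the current blob and, if the service
        // rejects the append because the 50,000-block limit was hit
        // (HTTP 409 Conflict, error code BlockCountExceedsLimit), roll over
        // to a fresh blob and retry the append once.
        public static async Task AppendWithRolloverAsync(
            CloudBlobContainer container, string baseName, string content)
        {
            CloudAppendBlob blob = container.GetAppendBlobReference(baseName);
            if (!await blob.ExistsAsync())
                await blob.CreateOrReplaceAsync();

            try
            {
                await blob.AppendTextAsync(content);
            }
            catch (StorageException ex)
                when (ex.RequestInformation != null &&
                      ex.RequestInformation.HttpStatusCode == 409)
            {
                // Start a new, timestamp-suffixed append blob instead of failing.
                CloudAppendBlob rolled = container.GetAppendBlobReference(
                    baseName + "." + DateTimeOffset.UtcNow.ToString("yyyyMMddHHmmss"));
                await rolled.CreateOrReplaceAsync();
                await rolled.AppendTextAsync(content);
            }
        }
    }

Implementing the same rollover inside the output plugin is what the expected behavior above describes.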

Your Environment

Additional context

The plan was to use the official Azure Storage plugin, as the ideal option, to send logs to an append blob. As it stands, fluent-bit cannot be used for this if append operations fail.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 16
  • Comments: 45 (15 by maintainers)

Most upvoted comments

Assigned to milestone Fluent Bit v1.7, the fix might come earlier…

hi, I will take a look at this issue

The plugin itself should be marked stale, because it is effectively useless while this bug is present.

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

No, the issue hasn’t gone away just because it has been ignored for a month…

What’s the status on this? From what I can tell it doesn’t seem to be included in v1.7 (it’s not in the release notes, and there don’t seem to be any relevant commits in the repo either).

Bump.

Hitting this issue with version 2.1.10.

After the 50,000-block limit is reached for a particular blob object, the only way to recover is to recreate the object (download and re-upload it) with larger blocks (max 4 MB per block). I would like to suggest adding some configuration options to the official plugin so this condition can be avoided (see the illustrative sketch right after this list):

  • Configure the key name, like in the S3 plugin, to distribute data across objects
  • Configure the chunk size (min/max) to reduce the number of append operations
  • Configure a max file size to avoid reaching the limits on Azure
  • Configure compression (gzip) to save cost on Blob Storage
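
For comparison, the S3 output referenced in the first bullet already exposes roughly this set of knobs. This is only a sketch: the bucket and region values are placeholders, and depending on the fluent-bit version, gzip compression may additionally require use_put_object On:

[OUTPUT]
    Name               s3
    Match              *
    bucket             my-log-bucket
    region             eu-west-1
    s3_key_format      /$TAG/%Y/%m/%d/%H
    upload_chunk_size  10M
    total_file_size    250M
    compression        gzip

An azure_blob equivalent of s3_key_format, total_file_size and compression would also remove the need for the tag-rewriting workaround discussed further down.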

Ubereil’s suggestion is helpful here: https://github.com/fluent/fluent-bit/issues/2692#issuecomment-1061701119

Edit: reference for the scalability and performance targets for Blob storage, where the numbers above come from: https://learn.microsoft.com/azure/storage/blobs/scalability-targets#scale-targets-for-blob-storage. Per those targets an append blob tops out at 50,000 blocks of at most 4 MiB each (roughly 195 GiB), but because each append operation commits only a small block, the count runs out at the 20-30 MB blob sizes reported above (a few hundred bytes per block on average).

Thanks for the support.

Still facing the same issue in 2.1.2.

Still broken in 2.0.8

It baffles me that such a broken plugin is still listed as “Official and Microsoft Certified Azure Storage Blob connector” on the fluent-bit docs website https://docs.fluentbit.io/manual/pipeline/outputs/azure_blob. People believe the plugin is actually certified and working…

Are there other workarounds, for example having a filter that appends a timestamp to the tag so that blobs rotate every hour?

This is the workaround we use: a filter that attaches the date to the tag. It’s not great that it’s needed (and it wasn’t trivial to figure out how to do it), but it works reasonably well as long as we don’t get too many logs.

Here’s the filter we use (IIRC we have fields in the logs called LogType and Source):

[FILTER]
    Name   rewrite_tag
    Match  *
    Rule   $LogTimeReadable ^(\d\d\d\d-\d\d-\d\d).*$ azureblob.$TAG.$LogType.$Source.$1 true
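
For the hourly rotation asked about above, the same rule can capture the hour as well. A sketch only, assuming LogTimeReadable starts with a date followed by a time (e.g. 2020-10-16 13:11:34 or 2020-10-16T13:11:34); the extra capture group appends the hour to the tag so a new blob is started every hour:

[FILTER]
    Name   rewrite_tag
    Match  *
    Rule   $LogTimeReadable ^(\d\d\d\d-\d\d-\d\d).(\d\d).*$ azureblob.$TAG.$LogType.$Source.$1.$2 true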

Time for the monthly bump.