azure-sdk-for-go: CreateBlockBlobFromReader fails due to missing content length parameter
Usage:
blob.CreateBlockBlobFromReader(reader, nil)
Output:
storage: service returned error: StatusCode=411, ErrorCode=MissingContentLengthHeader, ErrorMessage=Content-Length HTTP header is missing
fmt.Printf("Content length: %d", blob.Properties.ContentLength) - this outputs 0
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 25 (13 by maintainers)
Azure Storage blobs and AWS S3 do not allow an arbitrarily large file to be uploaded in a single HTTP operation, so both services require uploading in blocks. Furthermore, I/O operations can always fail (for many reasons), so they must be retried. To retry an upload, the data must sit in a memory buffer, and there must be a way to seek back to the beginning of that buffer. So, for data coming from a non-seekable source, that data must first be buffered in memory (a seekable source) before the upload to Azure Storage is initiated. This is the required, mandatory building block. Note that each block upload operation only needs the size (content length) of that block, not the length of the full file. Once Azure's PutBlockList operation is called, Azure assembles all the blocks into a single blob and sets the full-size content length automatically.
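As a rough sketch of that building block (assuming the legacy github.com/Azure/azure-sdk-for-go/storage package and its Blob.PutBlock method; the helper name and the retry policy are made up for illustration): because the chunk sits in a memory buffer, every retry re-sends exactly the same bytes, and only this block's own size is needed, not the size of the whole file.

package blobupload

import "github.com/Azure/azure-sdk-for-go/storage"

// putBlockWithRetry stages one uncommitted block of a block blob, retrying
// the upload a fixed number of times. The chunk is held entirely in memory,
// so each attempt re-sends the same bytes.
func putBlockWithRetry(blob *storage.Blob, blockID string, chunk []byte, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = blob.PutBlock(blockID, chunk, nil); err == nil {
			return nil
		}
	}
	return err
}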
Now, on top of this building block, a function can easily be implemented that streams a large blob by splitting it into blocks, uploading each block, and then, after the last block, calling PutBlockList to assemble the full-size blob. I don't know whether the current Go SDK for Azure Storage has this function, but our future Go SDK definitely will.
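For illustration, here is a minimal sketch of such a streaming helper, assuming the legacy storage package's Blob.PutBlock and Blob.PutBlockList methods; the streamBlockBlob name, the 4 MB block size, and the block-ID scheme are arbitrary choices for the sketch, not anything the SDK prescribes.

package blobstream

import (
	"encoding/base64"
	"fmt"
	"io"

	"github.com/Azure/azure-sdk-for-go/storage"
)

// streamBlockBlob reads fixed-size chunks from r, stages each one as an
// uncommitted block, and finally commits the block list so Azure assembles
// the blob and sets its content length.
func streamBlockBlob(blob *storage.Blob, r io.Reader) error {
	const blockSize = 4 * 1024 * 1024 // 4 MB per block (arbitrary)
	buf := make([]byte, blockSize)
	var blocks []storage.Block

	for i := 0; ; i++ {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			// Block IDs must be base64-encoded and all the same length.
			id := base64.StdEncoding.EncodeToString([]byte(fmt.Sprintf("%08d", i)))
			if perr := blob.PutBlock(id, buf[:n], nil); perr != nil {
				return perr
			}
			blocks = append(blocks, storage.Block{ID: id, Status: storage.BlockStatusUncommitted})
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			// Commit the staged blocks into a single blob.
			return blob.PutBlockList(blocks, nil)
		}
		if err != nil {
			return err
		}
	}
}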
What happens if the reader I pass is huge? From the implementation, it looks like the SDK reads the entire reader into memory just to compute the content length. If Azure allows uploads of up to ~4 TB, shouldn't the SDK allow that too?
The whole Go philosophy of io.Reader is to stream data, not to read it all at once (think Unix pipes). Otherwise I'd simply pass the whole byte slice, not an io.Reader.
Not to make service comparisons, but the AWS S3 SDK takes the approach of reading from the Reader, chunking the input every N bytes, and uploading the parts, much like an AppendBlob would.
Our use case is streaming database backups (several GBs) to Azure on the fly (without saving to disk), so an io.ReadSeeker is not an option either.
What do you guys think? Many thanks!
It's working for me now, thanks!
@Blackbaud-ChrisJenkins - until there is a fix in the SDK you can pass the size yourself, as in https://github.com/Azure/azure-sdk-for-go/pull/627
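For anyone landing here, a hedged sketch of that workaround, assuming (per the zero ContentLength printed above and the linked PR) that CreateBlockBlobFromReader takes the Content-Length header from blob.Properties.ContentLength:

package blobupload

import (
	"io"

	"github.com/Azure/azure-sdk-for-go/storage"
)

// uploadWithKnownSize sets the blob size explicitly before uploading from a
// reader, so the Content-Length header is not sent as zero.
func uploadWithKnownSize(blob *storage.Blob, r io.Reader, size int64) error {
	blob.Properties.ContentLength = size // known size of the reader, e.g. from os.Stat
	return blob.CreateBlockBlobFromReader(r, nil)
}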
@marstr I just ran into this issue, running
so I think the answer to your question is no