azure-sdk-for-java: [BUG] Random Blob Upload Errors for Moderate Workloads

Describe the bug Uploads fail at random when uploading 30 files sequentially. The largest file is 100 MB, some are between 1 MB and 100 MB, and some are under 1 MB. The error is always “Request body emitted ${n+1} bytes, more than the expected ${n} bytes.” Uploading all 30 files occasionally succeeds, but usually does not.

Exception or Stack Trace

2019-12-05 20:22:58 ERROR Managed to upload 17 files 
2019-12-05 20:22:58 ERROR com.azure.core.exception.UnexpectedLengthException: Request body emitted 7962674 bytes, more than the expected 7962673 bytes.
com.nielsen.redacted.InputUploadException: com.azure.core.exception.UnexpectedLengthException: Request body emitted 7962674 bytes, more than the expected 7962673 bytes.
	at com.nielsen.Redacted.method(Redacted.java:496)

To Reproduce Steps to reproduce the behavior:

  1. SDK Versions: both 12.0.0 and 12.0.0-preview-4
  2. Java 11, including containerized openjdk:11-jre-slim
  3. Set up multiple InputStreams (in particular, from a remote connection rather than the file system).
  4. Synchronously and sequentially invoke BlockBlobClient.upload(inputStream, size)

Code Snippet

if (contentStream.getSize() < TWO_HUNDRED_FORTY_MB) { // API limit is 256 MB
    BlockBlobItem blobItem = blockBlobClient.upload(stream, contentStream.getSize());
}
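
Since the exception reports one byte more than the declared length, it may help to measure how many bytes the stream really produces. The following is a minimal diagnostic sketch, not a fix. CountingInputStream is a hypothetical helper written for this illustration (it is not part of the Azure SDK); it tallies the bytes read through it so the total can be compared with the size passed to upload().

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical diagnostic wrapper: counts the bytes actually read through it.
 * skip() and mark()/reset() are not tracked in this sketch.
 */
final class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++; // one byte delivered
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n; // only count bytes actually delivered, never the requested length
        }
        return n;
    }

    /** Total number of bytes delivered to the caller so far. */
    long getCount() {
        return count;
    }
}
```

Wrapping the stream before passing it to upload() and logging getCount() next to contentStream.getSize() when the exception is caught would show whether the source delivers more bytes than the size it declares.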

Expected behavior Expected a higher success rate; 95% would be fine. The observed rate is under 50%.

Screenshots N/A

Setup (please complete the following information):

  • OS: [Linux]
  • IDE: [IntelliJ]
  • Version of the Library used: 12.0.0

Additional context Things are more reliable when using blockBlobClient.getBlobOutputStream(), but that is much slower, unless there’s a trick to using it with arbitrary input streams.
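
For reference, one generic way to feed an arbitrary InputStream into an OutputStream is a buffered copy. This is a sketch under assumptions: StreamCopy and its bufferSize parameter are hypothetical names for this illustration, and the sink would be whatever blockBlobClient.getBlobOutputStream() returns. Whether the extra buffering changes the upload speed reported here is untested.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies any InputStream into any OutputStream. */
final class StreamCopy {
    private StreamCopy() {
    }

    /**
     * Copies source to sink through a buffer of bufferSize bytes and
     * returns the number of bytes copied. Neither stream is closed here;
     * the caller stays responsible for closing both.
     */
    static long copy(InputStream source, OutputStream sink, int bufferSize) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(sink, bufferSize);
        long total = source.transferTo(out); // InputStream.transferTo requires Java 9+
        out.flush(); // push any bytes still held in the buffer to the sink
        return total;
    }
}
```

The returned byte count can also be compared with the size the source claims to have, which gives a second check on the length mismatch from the stack trace.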

Information Checklist Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 23 (13 by maintainers)

Most upvoted comments

Thanks, and likewise. I’ll see if I can catch it.

I did try another experiment where I wrapped the input stream with something that copies bytes to the filesystem as well as to the blob, and noticed strange behavior such as the files having a lot of empty data written at the end, even up to a few GB. That was probably an implementation problem on my end though.
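
For anyone repeating that experiment, here is a minimal sketch of such a wrapper. MirroringInputStream is a hypothetical name, and nothing here is known about the wrapper actually used above. One common source of trailing empty data in wrappers like this is writing the whole buffer, or the requested length, instead of the number of bytes read() returned; the sketch writes only the returned count.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical wrapper: mirrors every byte read from the source into a second stream. */
final class MirroringInputStream extends FilterInputStream {
    private final OutputStream mirror;

    MirroringInputStream(InputStream in, OutputStream mirror) {
        super(in);
        this.mirror = mirror;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            mirror.write(b); // skip the end-of-stream marker
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            // Write exactly the n bytes that were read, not len and not buf.length.
            mirror.write(buf, off, n);
        }
        return n;
    }
}
```

If the mirrored file then matches the declared size while the SDK still reports an extra byte, that would point away from the source stream.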

Similar to your suggestion, I’ll also see if I can break at the call to available() when time permits.
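
As background for that check: per the InputStream Javadoc, available() only estimates how many bytes can be read without blocking, so for a remote stream it can be far smaller than what is left. The sketch below illustrates this with TricklingInputStream, a made-up stand-in for a remote connection that hands out a limited chunk per call. It does not show what the SDK does with available().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a remote stream: delivers at most chunk bytes per bulk read. */
final class TricklingInputStream extends InputStream {
    private final InputStream source;
    private final int chunk;

    TricklingInputStream(byte[] data, int chunk) {
        if (chunk <= 0) {
            throw new IllegalArgumentException("chunk must be positive");
        }
        this.source = new ByteArrayInputStream(data);
        this.chunk = chunk;
    }

    @Override
    public int read() throws IOException {
        return source.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        return source.read(buf, off, Math.min(len, chunk)); // short reads, like a socket
    }

    @Override
    public int available() throws IOException {
        return Math.min(source.available(), chunk); // an estimate, not the remaining length
    }
}

final class AvailableDemo {
    private AvailableDemo() {
    }

    /** Reads the stream to its end and returns the true number of bytes it held. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }
}
```

With 1000 bytes of data and a chunk of 100, available() reports 100 up front while drain() reads all 1000, so any length derived from available() would undercount a stream like this.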