aws-sdk-js: Multipart upload fails after recovering from retry


Describe the bug The upload() function fails after coming out of a retry state (e.g. triggered by network issues/oscillations) during a multipart upload process.

It seems that it never recovers from the retry state and new retry attempts keep coming up (on the previously failing parts) even after the conditions are good again (e.g. network is back).

My starting point was the Developer Guide: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/s3-example-creating-buckets.html

Is the issue in the browser/Node.js? Node.js

If on Node.js, are you running this on AWS Lambda? No

Details of the browser/Node.js version 7.8.0

SDK version number 2.603.0

To Reproduce (observed behavior)

  1. Upload a file big enough to get split and trigger a multipart upload (reproduced with a 380 MB file):
     const AWS = require('aws-sdk');
     const fs = require('fs');

     const s3 = new AWS.S3();

     s3.upload({
       Bucket: '...',
       Key: 'path_to_file',
       Body: fs.createReadStream('path_to_file')
     }, ...);
  2. Drop all outgoing packets to S3 (just targeting port 443 worked for me):
     iptables -I OUTPUT -p tcp --dport 443 -j DROP
  3. Wait for the retry process to get triggered a few times to get closer to its limit (maxRetries param)
  4. Re-enable network:
     iptables -D OUTPUT -p tcp --dport 443 -j DROP
  5. Observe that new retry attempts keep getting triggered and upload does not complete. The following errors were observed:
  • InvalidPart: One or more of the specified parts could not be found. The part may not have been uploaded, or the specified entity tag may not match the part's entity tag
  • NoSuchKey: The specified key does not exist.
  • NetworkingError: write EPIPE
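
To make the failure easier to observe while repeating the steps above, the upload can be wrapped so that per-part progress and the final error code get logged. This is only a minimal sketch, assuming the SDK v2 ManagedUpload object returned by s3.upload(); the helper name uploadWithLogging and its log/done parameters are hypothetical and not part of the original report:

```javascript
// Sketch only: uploadWithLogging is a hypothetical helper, not SDK API.
// It expects an AWS.S3 (SDK v2) client to be passed in.
function uploadWithLogging(s3, params, log, done) {
  // Calling upload() without a callback returns a ManagedUpload
  // that has not been started yet.
  const managed = s3.upload(params);

  // Emitted as data is sent; `part` is the part number being reported.
  managed.on('httpUploadProgress', (p) => {
    log(`part=${p.part} loaded=${p.loaded} total=${p.total}`);
  });

  // Start the upload and report the outcome.
  managed.send((err, data) => {
    if (err) {
      log(`failed: ${err.code}: ${err.message}`);
    } else {
      log(`done: ${data.Location}`);
    }
    done(err, data);
  });
  return managed;
}

// With the SDK installed, usage could look like:
//   const AWS = require('aws-sdk');
//   const fs = require('fs');
//   uploadWithLogging(new AWS.S3(), {
//     Bucket: '...',
//     Key: 'path_to_file',
//     Body: fs.createReadStream('path_to_file')
//   }, console.log, () => {});
```

Passing the client in as an argument keeps the helper independent of how the S3 instance is configured.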

Expected behavior The failing parts to be re-uploaded after conditions are good again (considering retry didn’t reach the maxRetries limit during the interruption).
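
For anyone experimenting with how close the retry count gets to maxRetries during the outage, the retry and timeout settings can be set explicitly on the client. The sketch below uses the SDK v2 config options maxRetries, retryDelayOptions.customBackoff and httpOptions; the numbers and the cappedBackoff helper are illustrative assumptions, and it is not known whether tuning them changes the behavior described here:

```javascript
// Sketch only: the numbers below are illustrative, not recommended values.
// cappedBackoff is a hypothetical helper matching the shape of
// retryDelayOptions.customBackoff in SDK v2: (retryCount) => delay in ms.
function cappedBackoff(retryCount) {
  const baseMs = 500;
  const maxMs = 20000;
  // Exponential growth, capped so that delays stay bounded.
  return Math.min(maxMs, baseMs * Math.pow(2, retryCount));
}

const s3Options = {
  maxRetries: 10, // maximum number of retries per request
  retryDelayOptions: { customBackoff: cappedBackoff },
  httpOptions: {
    connectTimeout: 5000, // ms allowed to establish the connection
    timeout: 120000       // ms of socket inactivity before timing out
  }
};

// With the SDK installed:
//   const AWS = require('aws-sdk');
//   const s3 = new AWS.S3(s3Options);
```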

As per documentation, I’m assuming multipart related failures should be handled automatically (as I’m sticking to the defaults):

leavePartsOnError (Boolean) — default: false — whether to abort the multipart upload if an error occurs. Set to true if you want to handle failures manually.

https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3/ManagedUpload.html#constructor-property
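
When debugging errors such as InvalidPart, one option is to pass { leavePartsOnError: true } as the second argument to s3.upload() so the multipart upload is not aborted on failure, and then check which parts S3 actually holds. A minimal sketch using the SDK v2 calls listMultipartUploads and listParts; the helper name listLeftoverParts is hypothetical, and it assumes only one upload is open for the key:

```javascript
// Sketch only: listLeftoverParts is a hypothetical helper, not SDK API.
// It expects an AWS.S3 (SDK v2) client; bucket and key are placeholders.
function listLeftoverParts(s3, bucket, key, done) {
  // Find multipart uploads that are still open for this key.
  s3.listMultipartUploads({ Bucket: bucket, Prefix: key }, (err, data) => {
    if (err) return done(err);
    const open = (data.Uploads || []).filter((u) => u.Key === key);
    if (open.length === 0) return done(null, []);

    // Assumption: only one upload is in flight, so inspect the first.
    const uploadId = open[0].UploadId;
    const params = { Bucket: bucket, Key: key, UploadId: uploadId };

    // Note: listParts is paginated; this reads only the first page.
    s3.listParts(params, (err2, res) => {
      if (err2) return done(err2);
      const parts = (res.Parts || []).map((p) => ({
        partNumber: p.PartNumber,
        etag: p.ETag,
        size: p.Size
      }));
      done(null, parts, uploadId);
    });
  });
}
```

Parts left behind this way continue to be stored until the upload is completed or aborted, so after inspecting them the upload would typically be cleaned up with s3.abortMultipartUpload({ Bucket, Key, UploadId }).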

Screenshots n/a

Additional context n/a

This question was also posted on Gitter: https://gitter.im/aws/aws-sdk-js?at=5e29dca7dc07667042dd79de

Any help is much appreciated, thanks!

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (10 by maintainers)

Most upvoted comments

@lzanolcastanheira Will close this issue for now, please open a new issue referring to this one in future and I can re-open it.

@ajredniwja Sounds good, please let me know whenever you have an update.

As a follow-up, I’ve tested the same scenario (bringing the connection down during a multipart upload) using Python/boto3 (upload_file()), and it always recovers and succeeds. I also tried a newer Node.js version (10) and observed the same errors.

Sure. Sometimes the process also aborts sooner even before getting to all the parts.

Hey @lzanolcastanheira,

I was able to reproduce this, though not every time I ran the code, and I did see some of the errors that you mentioned. I need more input from the team to find the root cause. Will get back to you as soon as I have an update.