aws-sdk-js-v3: ECONNRESET exceptions when running in Lambda environment
Describe the bug
```ts
import { S3 } from '@aws-sdk/client-s3';
import { Handler, Context, S3Event } from 'aws-lambda';

const s3 = new S3({});

export const handler: Handler = async (event: S3Event, context: Context) => {
  await s3.getObject({
    Bucket: event.Records[0].s3.bucket.name,
    Key: event.Records[0].s3.object.key,
  });
};
```
We have this very basic Lambda function that reads a file from S3 when a new file is uploaded (we actually consume the Body stream too; left that out for brevity). The function is called intermittently, meaning that sometimes we get a new Lambda container (i.e. a cold start) and sometimes the container is reused. When the container is reused, we sometimes see an ECONNRESET exception such as this one:
```
2020-05-20T16:50:28.107Z d7a43394-afad-4267-a4a4-5ad3633a1db8 ERROR Error: socket hang up
    at connResetException (internal/errors.js:608:14)
    at TLSSocket.socketOnEnd (_http_client.js:460:23)
    at TLSSocket.emit (events.js:322:22)
    at endReadableNT (_stream_readable.js:1187:12)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  code: 'ECONNRESET',
  '$metadata': { retries: 0, totalRetryDelay: 0 }
}
```
I’m pretty confident this is due to the keep-alive nature of the HTTPS connection. Lambda processes are frozen after they execute, and their host seems to terminate open sockets after ~10 minutes. The next time the S3 client tries to reuse the socket, the exception is thrown.
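If that diagnosis is right, one blunt way to sidestep it (a sketch of a possible mitigation, not a fix proposed in this thread) is to disable keep-alive on the client's request handler so every call opens a fresh socket, at the cost of a TLS handshake per request:

```ts
import { S3 } from '@aws-sdk/client-s3';
import { NodeHttpHandler } from '@aws-sdk/node-http-handler';
import { Agent } from 'https';

// Sketch: open a fresh TLS connection per request instead of reusing a
// keep-alive socket that the frozen Lambda host may have silently closed.
// Trades the intermittent ECONNRESET for per-request connection overhead.
const s3 = new S3({
  requestHandler: new NodeHttpHandler({
    httpsAgent: new Agent({ keepAlive: false }),
  }),
});
```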
We are running into similar issues with connections to our Aurora database, which also terminates connections intermittently with the same error message (see https://github.com/brianc/node-postgres/issues/2112). It’s an error we can easily recover from by reopening the socket, but aws-sdk-js-v3 seems to prefer throwing the error instead.
Is the issue in the browser/Node.js?
Node.js 12.x on AWS Lambda
About this issue
- State: closed
- Created 4 years ago
- Reactions: 20
- Comments: 35 (8 by maintainers)
I’ve managed to work around this using this configuration (updated for gamma):
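(The configuration snippet itself didn’t survive extraction; the sketch below reconstructs that kind of custom retry strategy using the gamma-era `@aws-sdk/middleware-retry` exports — the extra retryable error codes and the max-attempts value are assumptions, not the exact original.)

```ts
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { defaultRetryDecider, StandardRetryStrategy } from '@aws-sdk/middleware-retry';

// Treat low-level socket errors as retryable, on top of the SDK defaults.
const retryDecider = (error: any): boolean => {
  if (['ECONNRESET', 'EPIPE', 'ETIMEDOUT'].includes(error?.code)) {
    return true;
  }
  return defaultRetryDecider(error);
};

// Max attempts of 3 is an assumption; the provider must return a Promise.
const retryStrategy = new StandardRetryStrategy(() => Promise.resolve(3), {
  retryDecider,
});

const client = new DynamoDBClient({ retryStrategy });
```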
It would be nice if this was built in to defaultRetryDecider. Although, is there an argument for this being built in to the NodeHttpHandler, as this is a Node-specific error, and one where the handler should probably “just work”?

Info: AWS Lambda, Node.js 12.x, “@aws-sdk/client-dynamodb”: “^1.0.0-gamma.1”
Produces the following errors consistently when run at 90-second intervals: the first call works, and the second call after 90 seconds produces one of the following two errors (error logs are from CloudWatch). Works as expected when run at 1-minute intervals.
This issue is fixed in https://github.com/aws/aws-sdk-js-v3/pull/1693, and will be published in rc.7 on Thursday 11/19
Hi @rraziel, I’m currently looking into how JS SDK v2 handles this and will provide a fix in v3 accordingly.
The current behavior is undesirable, and the SDK should retry the error instead of asking the user to do it.
Using the fix from serverless-nextjs fixed it for me. This is not a permanent solution at all, as it will keep re-querying whenever the matched status code is returned.

TS implementation:
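(The TS implementation referenced above was lost in extraction; below is a minimal sketch of a manual retry wrapper in that spirit — the retryable codes, attempt count, and helper name `withRetries` are illustrative assumptions.)

```ts
// Hypothetical helper: retry an SDK call on low-level socket errors.
const RETRYABLE_CODES = new Set(['ECONNRESET', 'ETIMEDOUT', 'EPIPE']);

async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      lastError = err;
      if (!RETRYABLE_CODES.has(err?.code)) {
        throw err; // not a connection error; don't mask real failures
      }
    }
  }
  throw lastError;
}

// Usage: wrap the call that intermittently fails after a frozen container thaws.
// const object = await withRetries(() => s3.getObject({ Bucket, Key }));
```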
The retryStrategy prop is available in all clients AFAIK.

Hoping for an actual fix in the next RC.
So we are at release candidate 4 and this problem has not even been acknowledged 😞
Has anyone from AWS or a maintainer even commented on this issue? It seems like this should be a priority, given it happens in most use cases unless you rarely call your Lambdas.
I am testing 1.0.0-gamma.10 in production with logging over a custom retry strategy.

Issues are still happening in 1.0.0-gamma.6 😕

Clients in 1.0.0-gamma.3 now retry in case of transient errors. It doesn’t check for ECONNRESET, ETIMEDOUT, or EPIPE though.
@studds Thanks for the elegant solution. This is working perfectly for me now.