aws-sdk-js-v3: EMFILE (too many open files) still exists when making mass S3 calls in JS v3 @aws-sdk/client-s3
Checkboxes for prior research
- I’ve gone through Developer Guide and API reference
- I’ve checked AWS Forums and StackOverflow.
- I’ve searched for previous similar issues and didn’t find any solution.
Describe the bug
Running a Lambda script that processes a large number of artifacts from a CodePipeline into S3. The copying process is asynchronous to improve speed, and I am hitting descriptor limits within the Lambda.
Lambda has a hard limit of 1,024 file descriptors: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
This issue was previously raised against versions 3.40.0 through 3.52.0, and was described in this issue log/forum: https://github.com/aws/aws-sdk-js-v3/issues/3019
I have tried implementing the solutions suggested in that issue, but none have worked.
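For context, a generic sketch of one commonly suggested mitigation for this class of error, bounding the number of in-flight calls (the helper and chunk size below are illustrative, not taken from the linked issue verbatim):

```js
// Generic sketch: send commands in fixed-size batches so only `chunkSize`
// requests (and their descriptors) are open at any one time.
async function sendInBatches(client, commands, chunkSize = 50) {
  const results = [];
  for (let i = 0; i < commands.length; i += chunkSize) {
    const batch = commands.slice(i, i + chunkSize);
    results.push(...(await Promise.all(batch.map((cmd) => client.send(cmd)))));
  }
  return results;
}
```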
Stack Trace:
```
2023-01-19T00:18:48.903Z f9f83cda-da62-41dc-8e2a-240a3bc67b12 ERROR Invoke Error {
    "errorType": "SystemError",
    "errorMessage": "A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
    "code": "ERR_SYSTEM_ERROR",
    "info": {
        "errno": -24,
        "code": "EMFILE",
        "message": "too many open files",
        "syscall": "uv_os_homedir"
    },
    "errno": -24,
    "syscall": "uv_os_homedir",
    "stack": [
        "SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
        " at getHomeDir (/var/task/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/getHomeDir.js:14:29)",
        " at getCredentialsFilepath (/var/task/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/getCredentialsFilepath.js:7:128)",
        " at loadSharedConfigFiles (/var/task/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/loadSharedConfigFiles.js:11:76)",
        " at /var/task/node_modules/@aws-sdk/node-config-provider/dist-cjs/fromSharedConfigFiles.js:8:102",
        " at /var/task/node_modules/@aws-sdk/property-provider/dist-cjs/chain.js:11:28",
        " at runMicrotasks (<anonymous>)",
        " at processTicksAndRejections (internal/process/task_queues.js:95:5)",
        " at async coalesceProvider (/var/task/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:14:24)",
        " at async /var/task/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:26:28",
        " at async resolveParams (/var/task/node_modules/@aws-sdk/middleware-endpoint/dist-cjs/adaptors/getEndpointFromInstructions.js:29:40)"
    ]
}
```
SDK version number
@aws-sdk/client-s3@3.252.0
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
Node.js v14.x (Lambda-provided runtime)
Reproduction Steps
I can’t share my work’s code unfortunately, but snippets from https://github.com/aws/aws-sdk-js-v3/issues/3019 will reproduce the error. Essentially, I have a Lambda that downloads 5 or 6 artifacts from S3, unzips them in memory, and then uploads the results to S3 buckets located in every available AWS region. It uploads every file within the zip recursively, which in a couple of projects includes node_modules folders. This is heavily asynchronous to optimise performance.
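A minimal, illustrative sketch of that access pattern (the bucket names and file list are placeholders, not my actual work code):

```js
const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");

// One client per region, with every extracted file fanned out to every
// region under unbounded Promise.all concurrency. With thousands of files
// (e.g. node_modules), this quickly exhausts Lambda's 1,024 descriptors.
const regions = ["us-east-1", "us-west-2", "eu-west-1" /* ...every region */];

async function uploadEverywhere(files /* [{ key, body }] */) {
  await Promise.all(
    regions.flatMap((region) => {
      const client = new S3Client({ region });
      return files.map((file) =>
        client.send(
          new PutObjectCommand({
            Bucket: `artifact-bucket-${region}`, // placeholder bucket name
            Key: file.key,
            Body: file.body,
          })
        )
      );
    })
  );
}
```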
Observed Behavior
Part way through processing, the uploads fail and dump the stack trace shown above under "Stack Trace".
I tried implementing a number of the solutions offered in the issue linked above, but nothing changed; I am still hitting this limit.
I have also tried reducing my client socket timeouts and modifying the config file reader to skip reading, but I am still hitting this error.
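For reference, a sketch of the socket-timeout change I tried, using the NodeHttpHandler options exposed by @aws-sdk/node-http-handler (the timeout and pool values below are arbitrary):

```js
const { S3Client } = require("@aws-sdk/client-s3");
const { NodeHttpHandler } = require("@aws-sdk/node-http-handler");
const { Agent } = require("https");

// Tighter timeouts and a bounded, keep-alive connection pool, so idle
// sockets are released sooner and fewer descriptors are held at once.
const client = new S3Client({
  requestHandler: new NodeHttpHandler({
    connectionTimeout: 3000, // ms allowed to establish the TCP connection
    socketTimeout: 5000,     // ms of socket inactivity before aborting
    httpsAgent: new Agent({ keepAlive: true, maxSockets: 50 }),
  }),
});
```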
Expected Behavior
Ultimately, not to have this error, as I believe the SDK client is leaking descriptors. Failing that, some way to control or modify how the SDK handles its descriptors would be good.
Possible Solution
No response
Additional Information/Context
No response
About this issue
- State: closed
- Created a year ago
- Reactions: 5
- Comments: 36 (14 by maintainers)
We’ll aim to publish the fix with v3.405.0 on Friday, September 1st. But it can get delayed to v3.406.0 on Tuesday, September 5th, as the fix needs to be released in the @smithy repo and then picked up by the @aws-sdk clients. It’s also a long weekend in the United States.
Closing, as the specific issue which throws EMFILE errors on uv_os_homedir mentioned in the main issue description has been fixed. You need to have SDK >=3.378.0 with the lockfile updated, or >=3.405.0 (releasing on Fri, Sep 1). If you need a workaround for other EMFILE issues, we’ve provided three recommendations in https://github.com/aws/aws-sdk-js-v3/issues/4345#issuecomment-1699733113
Please create a new issue if you’re still getting EMFILE errors with the provided fix.
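One generic mitigation worth noting here (an illustrative sketch, not necessarily one of the three linked recommendations): pass region and credentials explicitly, so the default provider chain has less reason to read the shared config files seen in the stack trace above.

```js
const { S3Client } = require("@aws-sdk/client-s3");

// Explicit region + credentials reduce (though may not fully eliminate) the
// loadSharedConfigFiles / getHomeDir calls visible in the stack trace.
// In Lambda, these environment variables are provided by the runtime.
const client = new S3Client({
  region: process.env.AWS_REGION,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
    sessionToken: process.env.AWS_SESSION_TOKEN,
  },
});
```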
It works great, mate. Many thanks for your quick action!
Thanks for your quick action! I have deployed the new version to the testing environment, and my team will test it shortly. Will let you know how it goes 😃 Thanks again!
We’ve posted a fix at https://github.com/awslabs/smithy-typescript/pull/903
The home directory is tied to the effective user ID, as per the documentation and source code.
Docs: https://nodejs.org/api/os.html#oshomedir
Source code: https://github.com/nodejs/node/blob/a39b8a2a3e8414c8a54650bdfe4a46998282d88a/deps/uv/src/unix/core.c#L1193
Maybe we can cache the results of os.homedir() keyed by process.geteuid().
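A minimal sketch of that caching idea (illustrative names only; the shipped fix is in the smithy-typescript PR linked above):

```js
const os = require("os");

// Cache os.homedir() per effective UID so repeated credential/config file
// lookups don't re-enter the uv_os_homedir syscall under descriptor pressure.
const homeDirCache = new Map();

function getHomeDirCached() {
  // process.geteuid is only available on POSIX platforms.
  const euid = typeof process.geteuid === "function" ? process.geteuid() : -1;
  if (!homeDirCache.has(euid)) {
    homeDirCache.set(euid, os.homedir());
  }
  return homeDirCache.get(euid);
}
```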
Adding client.destroy() just before resolving my data seemed to fix the issue for me. This reflects the majority of my calls, and I am no longer getting this file error, or even the rate-limit issues that I had seen before. Hope this may help someone else.
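A minimal sketch of that pattern, with placeholder region, bucket, and key values:

```js
const { S3Client, GetObjectCommand } = require("@aws-sdk/client-s3");

async function getObject(bucket, key) {
  const client = new S3Client({ region: "us-east-1" }); // placeholder region
  try {
    return await client.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  } finally {
    // destroy() closes the connections held by the underlying request
    // handler, releasing their descriptors before the result is resolved.
    client.destroy();
  }
}
```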