nodejs-storage: Storage: lots of "socket hang up" errors
From @ovaris on September 21, 2017 8:31
Environment details
- OS:
- Node.js version: 8.5.0
- npm version: 5.3.0
- google-cloud-node/storage version: 1.2.1
I have a utility Node.js script that checks the existence of a few thousand files in Cloud Storage.
I run the script locally, so not in a Cloud environment.
I'm executing those checks (`bucket.file(fileName).exists()`) in batches of 20, so not all checks are fired concurrently.
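The batching described above can be sketched generically; `runInBatches` is a hypothetical helper, and with the real client `check` would be `name => bucket.file(name).exists()`:

```javascript
// Run an async check over items in fixed-size batches, so at most
// `batchSize` requests are in flight at any one time.
async function runInBatches(items, batchSize, check) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Await each batch before starting the next one.
    results.push(...await Promise.all(batch.map(check)));
  }
  return results;
}
```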
I'm seeing lots of these errors when trying to run the script:
```
{ Error: socket hang up
    at TLSSocket.onHangUp (_tls_wrap.js:1140:19)
    at Object.onceWrapper (events.js:314:30)
    at emitNone (events.js:110:20)
    at TLSSocket.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1059:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)
  code: 'ECONNRESET',
  path: null,
  host: 'accounts.google.com',
  port: 443,
  localAddress: undefined }
```

and these:

```
{ Error: read ECONNRESET
    at _errnoException (util.js:1026:11)
    at TLSWrap.onread (net.js:606:25)
  code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
```

aaand these:

```
{ Error: socket hang up
    at createHangUpError (_http_client.js:345:15)
    at TLSSocket.socketOnEnd (_http_client.js:437:23)
    at emitNone (events.js:110:20)
    at TLSSocket.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1059:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)
  code: 'ECONNRESET' }
```
I have added this fix (suggested in https://github.com/GoogleCloudPlatform/google-cloud-node/issues/2254):

```javascript
const gcs = storage();
// https://github.com/GoogleCloudPlatform/google-cloud-node/issues/2254
gcs.interceptors.push({
  request: function(reqOpts) {
    // Disable the keep-alive ("forever") agent so each request
    // gets a fresh socket instead of reusing a possibly stale one.
    reqOpts.forever = false;
    return reqOpts;
  }
});
```
I have tried to reduce the check batch size, but it didn’t have any effect.
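Since ECONNRESET on a stale keep-alive socket is usually transient, a common generic workaround (a sketch of my own, not the fix this thread eventually landed on) is to retry the failed call with backoff; `retryOnReset` and its parameters are hypothetical names:

```javascript
// Retry an async operation when it fails with a transient socket error.
async function retryOnReset(fn, attempts = 3, delayMs = 500) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      const transient =
        err.code === 'ECONNRESET' || err.message === 'socket hang up';
      if (!transient || i === attempts - 1) throw err;
      // Exponential backoff before the next attempt.
      await new Promise(r => setTimeout(r, delayMs * 2 ** i));
    }
  }
}
```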
Copied from original issue: GoogleCloudPlatform/google-cloud-node#2623
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 6
- Comments: 42 (18 by maintainers)
Commits related to this issue
- fix: remove timeout rule from streaming uploads (#365) Fixes https://github.com/googleapis/nodejs-storage/issues/27 — committed to googleapis/nodejs-common by stephenplusplus 5 years ago
- deps: update @google-cloud/common (#596) Fixes #27 This updates @google-cloud/common, which includes the `timeout: 0` fix for streaming file uploads. — committed to googleapis/nodejs-storage by stephenplusplus 5 years ago
Still getting this issue with the latest version. Any workaround?
Just updating anyone else still waiting: v2.3.3 still suffers from the `FetchError: network timeout` bug that I mentioned above. v2.1.0 is the latest version that even has a chance of working. v2.1.0 fails for me about 30% of the time, but the later versions fail 100% of the time.

I upgraded to @google-cloud/storage v2.2.0, and while I no longer see the ECONNRESET issue, I now get the following when writing files to Cloud Storage from Cloud Functions. It seems to happen 100% of the time, whereas the old ECONNRESET error was probably 50%. The files I'm writing are large-ish, around 2GB. I am reading a compressed `.tar.gz` file and writing out the individual entries to Cloud Storage. Is there some way to change the timeout settings? Make it wait longer before timing out? Any other ideas? I'm glad that I can now see (I think?) the actual error, instead of the inscrutable ECONNRESET, but I'm not sure how to deal with the fact that it occurs 100% of the time, making my previous "retry until it finally works" strategy worthless.
This is a stupid user error and can be closed. Of course the code above will fire ALL requests at the same time.
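A generic sketch of that mistake (hypothetical code, not the commenter's actual script): calling an async function inside `.map()` starts every request immediately, so grouping the resulting promises afterwards does nothing to limit concurrency.

```javascript
// Every call to check() starts the moment map() runs, so a list of
// 5000 files opens ~5000 concurrent requests, regardless of how the
// returned promises are later sliced, grouped, or awaited.
function checkAll(fileNames, check) {
  const promises = fileNames.map(name => check(name)); // all in flight already
  return Promise.all(promises);
}
```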
@kinwa91 @stephenplusplus this one has been going on for a while now, and I’m concerned about the whole downgrading to 2.1 thing. Can y’all prioritize an investigation for this tomorrow?
@micahwedemeyer @stephenplusplus yeah, I’m still seeing two issues on master (3e5a196) with all the latest dependencies:
- Streamed uploads are still being retried. These were supposed to be disabled by the earlier fix. See more info below.
- node-fetch's default 60 second timeout seems to mean that the entire request and response must be completed within 60 seconds. See https://github.com/bitinn/node-fetch/issues/446. I think this is a regression introduced by c2c1382a2d11d271c5ef8b58c263d72db88ca4d8 in nodejs-storage@2.2.0.
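The eventual fix (per the commit notes above) was to send `timeout: 0` for streaming uploads. A user-side sketch of the same idea, reusing the interceptor mechanism shown earlier; whether `timeout` is honored end-to-end by every code path is an assumption on my part:

```javascript
// Sketch: clear node-fetch's default 60s timeout on outgoing requests
// by setting timeout: 0 (0 means "no timeout" in node-fetch), mirroring
// the `timeout: 0` fix that landed in @google-cloud/common.
const disableTimeoutInterceptor = {
  request: function(reqOpts) {
    reqOpts.timeout = 0;
    return reqOpts;
  }
};
```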
Repro:
And add logging statements to node_modules/node-fetch/lib/index.js where the timeout is set and cleared (around line 1336):
We experience the issue with GET requests and the PR doesn’t seem to address this case.
I believe a fix has been found (thanks, @zbjornson!), and a PR has been sent here: https://github.com/googleapis/nodejs-common/pull/268
Here is the code producing the error:
The code runs on GCE:
It reliably times out after 60 seconds of streaming; only a single stream is open at a time.
Let me know if you need any further information.
I ran another test today with 7 files and they all successfully streamed to GCS. Great work.
Tomorrow came earlier than expected: v2.4.2 is out now! Please update and report back any lingering issues.
We are running in a k8s environment on GCP. We are getting the same issue from all pods and across different language stacks (Node and Ruby).