undici: Leak and very long hanging when fetch throws `ConnectTimeoutError(UND_ERR_CONNECT_TIMEOUT)`
Version
v21.0.0
Platform
Linux archhost 6.5.7-arch1-1 nodejs/node#1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000 x86_64 GNU/Linux
Subsystem
No response
What steps will reproduce the bug?
- pull https://github.com/nikelborm/fetch_timeout_poc
- enable wifi and connect to network with internet
- connect to wireguard vpn with sending all trafic through this vpn server
- disable wifi
- run
npm start
index.js
"use strict";
const startedAt = Date.now();
console.log('started fetch at: ', new Date(startedAt));
try {
await fetch('https://jsonplaceholder.typicode.com/todos/1');
} catch (error) {
console.log(error);
}
const finishedAt = Date.now();
console.log('finished fetch at: ', new Date(finishedAt))
console.log(`fetch took ${finishedAt - startedAt} ms`)
console.log('timestamp when script should finish: ', Date.now())
package.json
{
"name": "poc",
"version": "1.0.0",
"description": "",
"main": "index.js",
"type": "module",
"scripts": {
"start": "node index.js; node -e \"console.log('timestamp when script actually finished: ', Date.now())\""
}
}
How often does it reproduce? Is there a required condition?
It reproduces every time with described network configuration. No other condition. Same behavior in node v20.7.0
and v18.18.0
What is the expected behavior? Why is that the expected behavior?
Fetch waits 10 seconds, throws error UND_ERR_CONNECT_TIMEOUT
. Then script exits immediately after it completed. Difference between timestamp when script should finish
and timestamp when script actually finished
should be near zero.
What do you see instead?
Fetch is being executed as expected. It kills itself after about 10 seconds and then returns control back to the script. But the problem is that something in the background is still blocking node.js and does not allow process to exit. It just hangs waiting for something to be finished even if the script was completed.
started fetch at: 2023-10-21T00:37:09.239Z
TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11372:11)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async file:///home/nikel/projects/alarm/index.js:7:3 {
cause: _ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (node:internal/deps/undici/undici:6616:28)
at node:internal/deps/undici/undici:6574:50
at Immediate._onImmediate (node:internal/deps/undici/undici:6605:13)
at process.processImmediate (node:internal/timers:478:21) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}
finished fetch at: 2023-10-21T00:37:19.391Z
fetch took 10152 ms
timestamp when script should finish: 1697848639391
timestamp when script actually finished: 1697848749482
After all code in the script been executed it takes 110 seconds (1697848749482 - 1697848639391 = 110091
) to exit from process.
But enabling wifi back (restoring internet connection) during process’s hanging (after the moment fetch already failed and logged error) magically speeds up exit from process.
Additional information
Running completely offline
If we skip 2-3 steps and just run script offline we get other kind of error and expected behavior.
Following is being printed to console:
started fetch at: 2023-10-21T00:54:31.085Z
TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11372:11)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async file:///home/nikel/projects/alarm/index.js:7:3 {
cause: Error: getaddrinfo EAI_AGAIN jsonplaceholder.typicode.com
at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:118:26) {
errno: -3001,
code: 'EAI_AGAIN',
syscall: 'getaddrinfo',
hostname: 'jsonplaceholder.typicode.com'
}
}
finished fetch at: 2023-10-21T00:54:31.250Z
fetch took 165 ms
timestamp when script should finish: 1697849671250
timestamp when script actually finished: 1697849671310
Difference between timestamp when script should finish
and timestamp when script actually finished
is 60 ms what is fine.
Running online
If we skip 4 step and just run script online we get no errors and expected behavior.
Following is being printed to console:
started fetch at: 2023-10-21T00:58:19.347Z
finished fetch at: 2023-10-21T00:58:19.995Z
fetch took 648 ms
timestamp when script should finish: 1697849899995
timestamp when script actually finished: 1697849900162
Difference between timestamp when script should finish
and timestamp when script actually finished
is 167 ms what also seems fine.
Running using node-fetch with described network configuration
If we install node-fetch
with npm i node-fetch
and then import it in the beginning using import fetch from 'node-fetch';
it will replace original fetch and will give us different behavior:
started fetch at: 2023-10-21T01:48:05.984Z
FetchError: request to https://jsonplaceholder.typicode.com/todos/1 failed, reason: getaddrinfo EAI_AGAIN jsonplaceholder.typicode.com
at ClientRequest.<anonymous> (file:///home/nikel/projects/alarm/node_modules/node-fetch/src/index.js:108:11)
at ClientRequest.emit (node:events:517:28)
at TLSSocket.socketErrorListener (node:_http_client:501:9)
at TLSSocket.emit (node:events:517:28)
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
type: 'system',
errno: 'EAI_AGAIN',
code: 'EAI_AGAIN',
erroredSysCall: 'getaddrinfo'
}
finished fetch at: 2023-10-21T01:50:06.143Z
fetch took 120159 ms
timestamp when script should finish: 1697853006143
timestamp when script actually finished: 1697853006236
When fetch
from node-fetch
fails, it logs last few lines to console and script exits immediately. Difference between timestamp when script should finish
and timestamp when script actually finished
is 93 ms what also seems fine.
The interesting thing is that fetch
from node-fetch
throws the same error as error thrown by original fetch when vpn disabled and notebook completely offline.
fetch
from node-fetch
returns control after 120 seconds. Original fetch returns control after 10 seconds and then hangs in the background 110 seconds. Looks very suspicious (120 == 110 + 10
).
About this issue
- Original URL
- State: open
- Created 8 months ago
- Comments: 26 (24 by maintainers)
downgrading node version with nvm solved for me