undici: Multiple parallel long fetches with overridden, non-zero timeouts fail every time
Bug Description
When running multiple long-running fetches in parallel, some of them simply die, hitting a timeout that does not appear to be relevant given the overridden timeout values:
name: HeadersTimeoutError
message: Headers Timeout Error
stack: >-
HeadersTimeoutError: Headers Timeout Error
at onParserTimeout (/src/archive/undici/lib/client.js:908:28)
at callTimer (/src/archive/undici/node_modules/@sinonjs/fake-timers/src/fake-timers-src.js:743:24)
at doTickInner (/src/archive/undici/node_modules/@sinonjs/fake-timers/src/fake-timers-src.js:1311:29)
at doTick (/src/archive/undici/node_modules/@sinonjs/fake-timers/src/fake-timers-src.js:1392:20)
at Object.tick (/src/archive/undici/node_modules/@sinonjs/fake-timers/src/fake-timers-src.js:1400:20)
at Server.<anonymous> (/src/archive/undici/test/fetch/fetch-timeouts.js:32:11)
at Server.emit (node:events:513:28)
at parserOnIncoming (node:_http_server:1065:12)
at HTTPParser.parserOnHeadersComplete (node:_http_common:117:17)
code: UND_ERR_HEADERS_TIMEOUT
Reproducible By
Tweaking minutesPerRequest, numParallelRequests, and timeout in the test below produces some confusing results. When timeout is 0, everything works correctly, so I suspect a few timers are overlapping each other or being shared somehow.
To see the difference, set timeout to 0 and observe that the test passes; set it to 30 minutes and observe that it fails.
"use strict";
const { test } = require("tap");
const { fetch, Agent } = require("../..");
const { createServer } = require("http");
const FakeTimers = require("@sinonjs/fake-timers");
// Tweaking request length can have an impact too potentially (Setting to 1 seems to pass, and setting it to 10 min fails on my machine)
const minutesPerRequest = 10;
// On my machine setting this to 0 passes, and setting it to something very high fails with a UND_ERR_HEADERS_TIMEOUT
const timeout = 1000 * 60 * (minutesPerRequest + 5);
// const timeout = 0;
// Tweaking num requests to either 1 or multiple has different impacts
const numParallelRequests = 5;
test("Fetch very long parallel requests, timeout overridden", (t) => {
const msToDelay = 1000 * 60 * minutesPerRequest;
t.setTimeout(undefined);
t.plan(numParallelRequests);
const clock = FakeTimers.install();
t.teardown(clock.uninstall.bind(clock));
const server = createServer((req, res) => {
setTimeout(() => {
res.end("hello");
}, msToDelay);
clock.tick(msToDelay + 1);
});
t.teardown(server.close.bind(server));
server.listen(0, () => {
for (let i = 0; i < numParallelRequests; i++) {
fetch(`http://localhost:${server.address().port}`, {
path: "/",
method: "GET",
dispatcher: new Agent({
headersTimeout: timeout,
connectTimeout: timeout,
bodyTimeout: timeout,
}),
})
.then((response) => response.text())
.then((response) => {
t.equal("hello", response);
})
.catch((err) => {
// This should not happen, a timeout error should not occur
t.error(err);
});
}
clock.tick(msToDelay - 1);
});
});
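Assuming the file is saved as test/fetch/fetch-timeouts.js inside an undici checkout (the path shown in the stack trace above), it can be run with tap:

npx tap test/fetch/fetch-timeouts.js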
Expected Behavior
All of the fetches should succeed.
Logs & Screenshots
Environment
Node v18.13.0
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 22 (22 by maintainers)
Comments
Setting these in socket = tls.connect({}) doesn't work because we are upgrading/assigning the socket, but manually calling socket.setKeepAlive with them after connect() works, so I think this will still require a code change.
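A minimal sketch of that workaround on a plain TLS client socket; the host, port, and 60-second initial delay are illustrative values, not ones from this issue:

const tls = require("tls");

// Keep-alive options passed to tls.connect() are lost when the socket is
// upgraded/reassigned, so enable keep-alive after the handshake instead.
const socket = tls.connect({ host: "example.com", port: 443 });
socket.on("secureConnect", () => {
  // net.Socket#setKeepAlive(enable, initialDelayMs)
  socket.setKeepAlive(true, 60 * 1000);
});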
@xconverge would you mind opening a PR?
It should be a new parameter:
tcpKeepAliveInterval: number | null - Default: null
@xconverge you can go ahead and start a PR, and we can discuss the default value later. WDYT @ronag? Should this be enabled by default? It might actually help with some ECONNRESETs.
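Until such a parameter exists, a custom connector can apply the same workaround today. This is a sketch, not the proposed design: the 60-second interval and the agent wiring are assumptions, while Agent's connect-as-function option and buildConnector are existing undici APIs.

const { Agent, buildConnector } = require("undici");

// Wrap undici's default connector and enable TCP keep-alive on each socket
// once it has been established, mirroring the manual setKeepAlive approach
// described above.
const connector = buildConnector({ timeout: 10_000 });

const agent = new Agent({
  connect(opts, callback) {
    connector(opts, (err, socket) => {
      if (socket) socket.setKeepAlive(true, 60 * 1000);
      callback(err, socket);
    });
  },
});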
My experience with these kinds of issues tells me the fakeTimers problem is not really related.
What this looks like: a firewall is truncating the connection in a way that we do not recognize, generating an error that we ignore.