node-postgres: Curious behavior: `idleTimeoutMillis` works locally, but not on "serverless" environments
I was hesitant to write this issue since it might involve other things, not just `pg`, but I will give it a try.
So, everything works as expected on localhost: after I release a client to the Pool with `client.release()` and `idleTimeoutMillis` reaches its limit, my Pool correctly emits the `remove` event (signaling that it correctly destroyed the client and also the real connection to the Database). Also, I can use my localhost to connect to the same Database I'm using in Production (RDS Postgres) and it works as expected.
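A minimal sketch of that expected local behavior (any reachable database works; the log message is mine):

```js
import { Pool } from 'pg';

const pool = new Pool({ idleTimeoutMillis: 5000, max: 1 });

pool.on('remove', () => {
  // On localhost this fires ~5s after release(): the client is destroyed
  // and the physical connection to the database is closed.
  console.log('idle client removed');
});

const client = await pool.connect();
await client.query('SELECT 1;');
client.release();
```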
But on environments like Vercel (which uses AWS Lambdas under the hood), for some very curious reason, `idleTimeoutMillis` seems not to work as expected. It looks like after releasing a Client, it stays in the Pool as `idle` forever, never destroying the client/connection after `idleTimeoutMillis` is reached, and leaving a lot of `idle` connections hanging on RDS.
To make things stranger, if I force a client to close using `client.release(true)`, the `remove` event from the Pool is emitted and the connection is closed as expected… but forcing this for every client defeats the purpose of using a Pool in the first place.
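In code, that workaround is a one-line change in the release path:

```js
// Force-destroy the client instead of returning it to the Pool: 'remove'
// fires and the connection closes, but every subsequent query then pays
// the cost of a brand-new connection.
client.release(true);
```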
I don't know if the event loop behaves differently in this kind of environment and the internal `pg` timers never get to run, or something like that.
```js
import { Pool } from 'pg';
import retry from 'async-retry';

// ...

const configurations = {
  // ...
  connectionTimeoutMillis: 5000,
  idleTimeoutMillis: 5000,
  max: 1,
  allowExitOnIdle: true, // I've also tried with false
};

const pool = new Pool(configurations);

async function query(query, params) {
  let client;

  try {
    client = await tryToGetNewClientFromPool();
    return await client.query(query, params);
  } catch (error) {
    throw new ServiceError({
      // ...
    });
  } finally {
    if (client) {
      client.release();
    }
  }
}

// If all Database connections are maxed out, this function will
// keep retrying until it gets one Client from the Pool and returns
// the Promise with it.
async function tryToGetNewClientFromPool() {
  const clientFromPool = await retry(newClientFromPool, {
    retries: 15,
    minTimeout: 0,
    factor: 2,
  });

  return clientFromPool;

  async function newClientFromPool() {
    return await pool.connect();
  }
}

// ...
```
About this issue

- State: open
- Created: 2 years ago
- Reactions: 1
- Comments: 21 (7 by maintainers)
Final solution (already working in production) is to use the Pool as much as you can, but keep checking "opened connections" versus "available connections", and if they're too close, begin to `end()` the Pool in the `finally` block instead of just `release()`-ing the client. My repository is still private, but here's the full solution:
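Since the repository is private, only a minimal sketch of the idea is shown here; the `checkForTooManyConnections` helper, the `pg_stat_activity` query, and the threshold are assumptions for illustration, not the actual implementation:

```js
// Hypothetical sketch: compare connections opened on the server against
// the server's limit, and end() the Pool when they get too close.
async function checkForTooManyConnections(client) {
  const openedResult = await client.query(
    'SELECT count(*)::int AS opened FROM pg_stat_activity WHERE datname = current_database();'
  );
  const maxResult = await client.query('SHOW max_connections;');

  const opened = openedResult.rows[0].opened;
  const max = parseInt(maxResult.rows[0].max_connections, 10);

  return max - opened < 5; // hypothetical "too close" threshold
}

async function query(queryText, params) {
  let client;
  try {
    client = await tryToGetNewClientFromPool();
    return await client.query(queryText, params);
  } finally {
    if (client) {
      const tooManyConnections = await checkForTooManyConnections(client);
      client.release();
      if (tooManyConnections) {
        await pool.end(); // destroy idle clients and close their connections
      }
    }
  }
}
```

Note that `pool.end()` makes that Pool unusable, so a real implementation would also have to recreate the Pool before the next query.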
Final results:

- First strategy: open and close a connection for every single query (see the sketch after this list).
- Last strategy: manage the Pool based on opened vs. available connections.
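For reference, the first strategy is the simple pattern below; a minimal sketch, assuming a plain `pg` Client:

```js
import { Client } from 'pg';

// First strategy: a brand-new Client per query, ended immediately
// afterwards, so nothing is ever left idle on RDS.
async function query(queryText, params) {
  const client = new Client(/* connection config */);
  await client.connect();
  try {
    return await client.query(queryText, params);
  } finally {
    await client.end(); // always closes the physical connection
  }
}
```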
@filipedeschamps Thank you so much for investigating this issue. I had the same problem and am quite happy with the following solution:
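The exact snippet isn't quoted above; assuming a knex setup (tarn.js is the pool knex uses), a hypothetical configuration with the described effect could look like this, with the option values being my guesses:

```js
import knexFactory from 'knex';

// Hypothetical knex pool settings: push tarn.js's idle timeout and reap
// interval to the maximum 32-bit timer value (~24.8 days), which
// effectively disables idle-connection cleanup inside the lambda.
const knex = knexFactory({
  client: 'pg',
  connection: process.env.DATABASE_URL,
  pool: {
    min: 0,
    max: 1,
    idleTimeoutMillis: 2147483647,
    reapIntervalMillis: 2147483647,
  },
});
```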
This configuration effectively deactivates the reap cycle in tarn.js, meaning that idle connections are never cleaned up by the process, regardless of whether the lambda is hot or cold. Given the operational characteristics of lambdas, client-side cleanup does not work reliably, so disabling it seems sensible.
Additionally, I set `idle_session_timeout` in PostgreSQL to 30,000 milliseconds (30 seconds) to clean up idle connections on the server side (PostgreSQL Documentation). I also implemented graceful shutdown listeners in the lambda to ensure all remaining connections are cleaned up before the process is terminated, although this step seems optional since the server would clean these up after 30 seconds anyway. More details can be found in this Stack Overflow discussion.
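A minimal sketch of both pieces, assuming a `pg` Pool named `pool`; the database name and the `SIGTERM` listener are assumptions, and `idle_session_timeout` requires PostgreSQL 14+:

```js
// Server side (run once, e.g. from a migration): PostgreSQL itself will
// terminate sessions that stay idle for more than 30 seconds.
await pool.query("ALTER DATABASE my_database SET idle_session_timeout = '30s';");

// Lambda side: best-effort graceful shutdown before the process dies.
process.on('SIGTERM', async () => {
  await pool.end(); // close all remaining connections
  process.exit(0);
});
```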
A good option is also to look into RDS Proxy to get an external database connection pool.
Btw, there is no concurrent client sharing in this case (and the `poolUsageCount` would always be 0 at the end of @sehrope's handler), because AWS Lambda will only send a single request at a time to each instance.

My mistake, the connection timeout sets the `client.connection._ending` flag, not the `client._ending` one: https://github.com/brianc/node-postgres/blob/21ccd4f1b6e66774bbf24aecfccdbfe7c9b49238/packages/pg/lib/client.js#L103-L107 In this case, I don't think the connection timeout would result in that exact error message. If you don't use a pool, that error should only be triggered by using a client after calling `client.end()`. The connection timeout might instead result in `Connection terminated unexpectedly` errors.
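For illustration, a minimal sketch of the "client used after `end()`" case described above (the error text is from memory and may vary across `pg` versions):

```js
import { Client } from 'pg';

const client = new Client();
await client.connect();
await client.end();

// Rejects with something like "Client was closed and is not queryable".
// A connection timeout, by contrast, tends to surface as
// "Connection terminated unexpectedly".
await client.query('SELECT 1;');
```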