prisma: Prisma client not attempting to reconnect on "Can't reach database server"

Bug description

Is it possible for a prisma client to get “stuck” and not attempt to reconnect to a database server after failing to connect once?

I find that when my dev Aurora Serverless DB goes to sleep and a lambda fails to connect once, the client seems to refuse to connect even after the DB comes back up. I have to wait for the warm lambda to go away or do a deployment before it connects again. I can reliably fix the error by changing an environment variable and waiting for the new lambda to be deployed; it then connects to the newly awake DB fine.

"\nInvalid `prisma.user.findUnique()` invocation:\n\n\n  Can't reach database server at `platform-mish.cluster-cflui618uuxz.eu-west-1.rds.amazonaws.com`:`5432`\n\nPlease make sure your database server is running at `platform-mish.cluster-cflui618uuyz.eu-west-1.rds.amazonaws.com`:`5432`."

How to reproduce

Create a lambda function that connects to a DB. Turn the DB off, then attempt to run the lambda and connect. It will fail (as expected) with this error message.

Start the DB and then run the lambda again. It will fail immediately with the same error. Change an environment variable, wait about a minute (for it to re-deploy in your VPC), and run the lambda again. It will succeed.

Expected behavior

If the client has failed to connect, it should retry connecting, maybe after a couple of seconds. Failover scenarios should be considered as well.

Prisma information

Any query

Environment & setup

  • OS:
  • Database:
  • Node.js version:

Prisma Version

prisma                  : 3.2.1
@prisma/client          : 3.2.1
Current platform        : darwin
Query Engine (Node-API) : libquery-engine b71d8cb16c4ddc7e3e9821f42fd09b0f82d7934c (at ../../node_modules/@prisma/engines/libquery_engine-darwin.dylib.node)
Migration Engine        : migration-engine-cli b71d8cb16c4ddc7e3e9821f42fd09b0f82d7934c (at ../../node_modules/@prisma/engines/migration-engine-darwin)
Introspection Engine    : introspection-core b71d8cb16c4ddc7e3e9821f42fd09b0f82d7934c (at ../../node_modules/@prisma/engines/introspection-engine-darwin)
Format Binary           : prisma-fmt b71d8cb16c4ddc7e3e9821f42fd09b0f82d7934c (at ../../node_modules/@prisma/engines/prisma-fmt-darwin)
Default Engines Hash    : b71d8cb16c4ddc7e3e9821f42fd09b0f82d7934c
Studio                  : 0.435.0

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 11
  • Comments: 17 (11 by maintainers)

Most upvoted comments

https://github.com/prisma/prisma/pull/12066 fixes this.

You can test it out with prisma@dev, and it will be out in the next release. It would be great if you had the time to test this and let us know whether it is indeed fixed for you now. Thanks!

@janpio Apologies for the confusion. It’s the very same locally (with a normal postgres) and therefore not related to Lambda/Aurora. That just happens to be our production setup.

https://github.com/prisma/prisma/issues/9420 also reports this, independently of Lambda/Aurora.

To reproduce:

  1. Start a service with Prisma client while the database is shut down (throws ‘can’t reach database’ error as expected)
  2. Start the database and try again without restarting your service (still throws)

For some reason this does not happen if the first ever request was successful. To reproduce:

  1. Start a service with Prisma client while the database is running (works fine)
  2. Shut down the database and try again (throws ‘can’t reach database’ error as expected)
  3. Restart the database and try again (works fine)

I hope this makes more sense now.

Then this seems to be about the definition of “recover” I guess? I would definitely expect an app to be able to handle a “Can’t reach database server” by e.g. just trying again, or wait a bit and try again, and so on. If that is not the case, I think we have a bug we should confirm. And if I understood your issues correctly, that is what you are reporting, right?
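For illustration, "wait a bit and try again" at the application level could look roughly like the sketch below. `retryWithDelay`, `sleep`, and the default attempt count and delay are hypothetical names and values, not part of Prisma. The sketch only helps if a later attempt can actually succeed, which is exactly what this issue reports is not the case on affected versions.

```typescript
// Hypothetical helper, not a Prisma API: retries an async operation
// a fixed number of times with a fixed delay between attempts.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithDelay<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3, // assumed default
  delayMs: number = 2000, // assumed default ("a couple of seconds")
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts) {
        await sleep(delayMs);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A call site might wrap a query as `retryWithDelay(() => prisma.user.findUnique(args))`, where `args` stands for whatever query arguments the application already uses.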

Is there a way to implement reconnecting at least? Something like this?: prisma.on("error", error => prisma.$connect())
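One possible in-process workaround, sketched under assumptions, is to discard the client after a connection failure and create a new one for the next request. `RecreatingClient` and `isConnectionError` are hypothetical names; the only client method relied on is `$disconnect()`. Whether a fresh instance actually clears the stuck state on affected versions is an assumption: the thread only confirms that a fresh process does.

```typescript
// Hypothetical wrapper, not a Prisma API: holds one client instance and
// replaces it after a connection failure.
interface Disconnectable {
  $disconnect(): Promise<void>;
}

// Assumption: connection failures are recognised by the message quoted in this issue.
function isConnectionError(error: unknown): boolean {
  return error instanceof Error &&
    error.message.includes("Can't reach database server");
}

class RecreatingClient<C extends Disconnectable> {
  private client: C | undefined;
  private readonly factory: () => C;

  // In a real service, `factory` might be () => new PrismaClient().
  constructor(factory: () => C) {
    this.factory = factory;
  }

  async run<T>(query: (client: C) => Promise<T>): Promise<T> {
    if (this.client === undefined) {
      this.client = this.factory();
    }
    const client = this.client;
    try {
      return await query(client);
    } catch (error) {
      if (isConnectionError(error) && this.client === client) {
        // Drop the failed instance so the next call starts with a fresh one.
        this.client = undefined;
        await client.$disconnect().catch(() => undefined);
      }
      throw error;
    }
  }
}
```

The error is still rethrown, so the failing request fails as before; only later requests get a new client. Queries running concurrently on the discarded instance would be disconnected too, so this is a sketch rather than a production-ready pattern.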

We seem to be having the same issue here as well with a sleeping db causing a broken API. Only restarting solves the problem…

@janpio That is exactly my understanding of the issue, and it is the same issue I am experiencing. It is not about automatically trying to reconnect. If a connection attempt fails because the database is unavailable, a later attempt still fails even if the database is available by then.

The problem is not about performing automatic retries. The problem is that if one request comes in and gets the error, subsequent requests get the same error even when the database is back up.