drizzle-orm: [BUG]: drizzle-orm/planetscale-serverless reuses connection, leading to inconsistent issues
What version of drizzle-orm are you using?
0.29…3
What version of drizzle-kit are you using?
0.20.9
Describe the Bug
When using planetscale with drizzle, the planetscale connection is reused for subsequent queries to the db. This is an incorrect usage of the planetscale/database driver and can lead to issues with queries/returns in parallel transactions/queries. see: https://github.com/planetscale/database-js/issues/139 & https://github.com/mattrobenolt/ps-http-sim/issues/1
The issue was highlighted here: https://github.com/mattrobenolt/ps-http-sim/issues/7#issuecomment-1864636196
In my local testing: when using the planetscale/database driver directly, for each query a new connections+session is created when using via drizzle-orm/planetscale-serverless, a single connection is created and reused for each subsequent query until the connection is closed by the PS server
how this was discovered: A planetscale simulator was created by @mattrobenolt who also works at PS. the logs and errors returned by the simulator highlighted this discussion here: https://github.com/mattrobenolt/ps-http-sim/issues/7
how to test/replicate: use the planetscale simulator docker compose example and monitor logs: https://github.com/mattrobenolt/ps-http-sim run multiple drizzle queries in quick succession check docker logs for how many sessionIds were created
Run the same queries directly using @planetscale/database
recheck docker logs and count sessionIds
example log
planetscale-simulator-proxy | 13m32.708207828s |INFO| ok caller="ps-http-sim/main.go:252" method="Execute"
.........
session="lU3bPM37IYbc_7IR1p_nz" client_session=true
Expected behavior
a new PS connection to be created for each db query as per @planetscale/database default
Environment & setup
No response
About this issue
- Original URL
- State: closed
- Created 6 months ago
- Reactions: 1
- Comments: 18 (11 by maintainers)
Commits related to this issue
- planetscale-serverless: allow Client as a valid type I have no actually knowledge of TypeScript, but this should mitigate the issues in https://github.com/drizzle-team/drizzle-orm/issues/1743 Passin... — committed to mattrobenolt/drizzle-orm by mattrobenolt 6 months ago
- planetscale-serverless: allow Client as a valid type I have no actually knowledge of TypeScript, but this should mitigate the issues in https://github.com/drizzle-team/drizzle-orm/issues/1743 Passin... — committed to mattrobenolt/drizzle-orm by mattrobenolt 6 months ago
This is a lot of text, so bear with me.
So after a little bit of investigation, I think the Drizzle docs set us up for failure here and encourage misuse unintentionally.
Their example is this:
As seen on: https://orm.drizzle.team/docs/get-started-mysql
This right here is encouraging one global connection for the lifetime of your application, which then encourages no conscious behavior or when to create a new connection vs when to reuse a connection.
I’m not a JavaScript developer myself and don’t fully understanding all the contexts here, but having one global connection is going to be problematic for many reasons. If this is in a long lived application and this shared connection is used within say, some HTTP request handler, you’re inherently sharing state between discrete requests and whether intentionally or not, you’re going to be leaking state between different requests if the requests are happening in parallel, even if within a single request, you’re using it sequentially. The state is going to leak across different connections.
Take an example (ignore my ignorance in JavaScript patterns here:
This is very naive and innocent, but we can assume in a long running applicaiton,
handleHTTPRequestmight be running on multiple requests at the same time concurrently. So our code inside the handler is not, but the runtime is invoking it in parallel.In this example, this is technically fine. But let’s make it more complicated:
So here’s where things get interesting. This request handler translates into 4 sequential queries. Without this single handler, we are doing the correct thing, but the fact that this is sharing a global connection is where we have an issue. So if we think of what this is doing, this translates into these queries:
The BEGIN and COMMIT are implied as a part of our
db.transaction()but you could have done this manually all onawait db.execute("BEGIN"); await db.execute("COMMIT"), etc.Now, these 4 queries require to be done over the same connection, and the order matters because the transaction state is maintained on the connection.
So if this is happening in parallel, what happens if queries start getting interleaved?
Now this gets quite bad. We can kinda extrapolate into how this can be even worse since we’re crossing transactiosn wiht different data, COMMITs and ROLLBACK’s happening over the same transaction implicitly between discrete requests.
So personally, I think uses of Drizzle need to adopt a our Client model.
So something more like this:
So key distinction here,
clientis fine to be global and shared. Client doesn’t store any state. Whenever you want a new connection, usingdb()gets a fresh one. In our example, we can translate that to something like:This alone solves a cross-request contamination and state being intermingled between multiple discrete requests.
Similarly, this helps provide the building block to doing parallelism correctly as well.
now we are correctly executing these queries in parallel because each one gets a unique Connection and their state isn’t clobbering each other.
It’s probably worth noting that it’s the developer’s responsibility. to know when to create a unique Connection and when it can be reused.
If you’re intentionally sharing state, the most common case is transactions, you need to reuse the same connection. If you’re setting variables with
SET, switching database withUSE, etc, these all require the state to be persisted on the Connection or it’s lost. For example:Will not do anything useful since the INSERT won’t be done within the transaction.
is now correct.
Similarly it’s not possible to do concurrency within a transaction, such as:
This fundamentally does not work as you want and breaks a lot of guarantees around transactional safety. Operations within a transaction are required to be executed sequentially against the database. They ultimately are, but the order isn’t predictable and it’s potential for undefined behaviors.
I hope this helps, but in summary, I think we need to encourage more of a “connection factory” pattern, rather than a single global shared connection, and this would help mitigate a lot of misuse. So using a
db()type function, rather than aconst dbthat is reused over and over. A new connection in our driver is essentially free, so there should be no concern with doing this egregiously. A Connection is just an object with a unique internal State object. It does not translate to initiating a new TCP connection or anything like that.