prisma: Enums loaded as part of transaction after `UPDATE`, slowing down transaction.

Bug description

Running on Cockroach, we’re noticing times when a transaction gets created that leads to row contention and takes 1-2s to complete. Here’s the transaction:

SELECT public."GatherEvent".id FROM public."GatherEvent" WHERE public."GatherEvent".id = $1
UPDATE public."GatherEvent" SET progress = $1, "updatedAt" = $2 WHERE public."GatherEvent".id IN ($3,)
# I believe these are all enum lookups, a little surprising that's not cached? We're not changing the enums ever.
SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid FROM pg_catalog.pg_type AS t LEFT JOIN pg_catalog.pg_range AS r ON r.rngtypid = t.oid INNER JOIN pg_catalog.pg_namespace AS n ON t.typnamespace = n.oid WHERE t.oid = $1
SELECT enumlabel FROM pg_catalog.pg_enum WHERE enumtypid = $1 ORDER BY enumsortorder
SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid FROM pg_catalog.pg_type AS t LEFT JOIN pg_catalog.pg_range AS r ON r.rngtypid = t.oid INNER JOIN pg_catalog.pg_namespace AS n ON t.typnamespace = n.oid WHERE t.oid = $1
SELECT enumlabel FROM pg_catalog.pg_enum WHERE enumtypid = $1 ORDER BY enumsortorder
SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid FROM pg_catalog.pg_type AS t LEFT JOIN pg_catalog.pg_range AS r ON r.rngtypid = t.oid INNER JOIN pg_catalog.pg_namespace AS n ON t.typnamespace = n.oid WHERE t.oid = $1
SELECT enumlabel FROM pg_catalog.pg_enum WHERE enumtypid = $1 ORDER BY enumsortorder
SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid FROM pg_catalog.pg_type AS t LEFT JOIN pg_catalog.pg_range AS r ON r.rngtypid = t.oid INNER JOIN pg_catalog.pg_namespace AS n ON t.typnamespace = n.oid WHERE t.oid = $1
SELECT enumlabel FROM pg_catalog.pg_enum WHERE enumtypid = $1 ORDER BY enumsortorder
SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid FROM pg_catalog.pg_type AS t LEFT JOIN pg_catalog.pg_range AS r ON r.rngtypid = t.oid INNER JOIN pg_catalog.pg_namespace AS n ON t.typnamespace = n.oid WHERE t.oid = $1
SELECT enumlabel FROM pg_catalog.pg_enum WHERE enumtypid = $1 ORDER BY enumsortorder
# then one SELECT at the end from our table
SELECT public."GatherEvent".id, ... FROM public."GatherEvent" WHERE public."GatherEvent".id = $1 LIMIT $2 OFFSET $3

Our application code is extremely simple, there’s only one place where we’re doing an UPDATE:

  const event = await prismaDb.gatherEvent.update({ ...byId(eventId), data: patch });

which we then await in a controller, where that is the only interaction with prisma.

Our HTTP servers only create one Prisma client per instance as suggested in the docs, and reuses the client across all connections to that server instance.

I dug in with the Cockroach team and they’re fairly confident it’s not due to anything weird on their end.

So now I’m checking in here, wondering if it’s possible that Prisma would be taking that UPDATE query and lumping it into a transaction with aSELECT query by mistake?

How to reproduce

Unclear, we’re seeing it occur regularly in prod, but didn’t notice this behavior in testing.

Expected behavior

The UPDATE runs alone in a transaction.

Prisma information

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "cockroachdb"
  url      = env("DATABASE_URL")
}

model GatherEvent {
  id String @id @default(dbgenerated("replace(replace(encode(substr(gen_random_uuid()::BYTES, 0, 16), 'base64'::STRING), '/'::STRING, '_'::STRING), '+'::STRING, '-'::STRING)"))
  // ...
}

enum GatherEventCategory {
  // ...
}

// we have to combine all possible enum values into one since SQL doesn't support multiple enum types for a column
enum GatherEventAttendanceNumber {
  // ...
}

enum GatherEventPrivacy {
  // ...
}

enum GatherEventProfessionalCategory {
  // ...
}

enum GatherEventUseCase {
  // ...
}

Environment & setup

  • OS: Occurs presumably on linux instances, it’s the CRDB cloud servers
  • Database: Cockroach 22.1.5
  • Node.js version: 12.17.4

Prisma Version

prisma                  : 4.2.1
@prisma/client          : 4.2.1
Current platform        : darwin
Query Engine (Node-API) : libquery-engine 2920a97877e12e055c1333079b8d19cee7f33826 (at node_modules/@prisma/engines/libquery_engine-darwin.dylib.node)
Migration Engine        : migration-engine-cli 2920a97877e12e055c1333079b8d19cee7f33826 (at node_modules/@prisma/engines/migration-engine-darwin)
Introspection Engine    : introspection-core 2920a97877e12e055c1333079b8d19cee7f33826 (at node_modules/@prisma/engines/introspection-engine-darwin)
Format Binary           : prisma-fmt 2920a97877e12e055c1333079b8d19cee7f33826 (at node_modules/@prisma/engines/prisma-fmt-darwin)
Default Engines Hash    : 2920a97877e12e055c1333079b8d19cee7f33826
Studio                  : 0.469.0

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 16 (15 by maintainers)

Most upvoted comments

I haven’t dug into this issue recently, I imagine #5043 helps but it seems there’s still many heavy queries against for metadata about enums and types. They’re in the top 20-ish in terms of execution count…

Statements___Cockroach_Console

and in 4th and 11th place in terms of runtime execution (relative amount of compute time)

Statements___Cockroach_Console

Linking this other investigation that landed on the same issue: https://github.com/prisma/prisma/issues/11317#issuecomment-1065414426

It also looks like “pre-loading” the enums by grabbing a random record off the table fixes the issue. The problem was that the enum load when it comes from .update will occur after the UPDATE has created a write intent and locked the row, and then it sits there for ~3s loading enums one at a time, and so the whole transaction blocks. By doing a simple load before any other calls, the enums appear to be cached into the prisma and the subsequent .update call doesn’t require the enums to be loaded.

 const run = async () => {
+  await db.gatherEvent.findFirst(); // pre-load enums
   const { id } = await db.gatherEvent.create({ data: event });
 
   let updateCount = 0;
 
   const update = async () => {
     updateCount++;
     const start = Date.now();
     await db.gatherEvent.update({ where: { id }, data: { progress: updateCount } });
     console.log("done in", Date.now() - start);
   };
 
   update();
 };

Here’s what the transactions look like with that one line tweak:

image

Notice how the pg_catalog.pg_type call (the one calling for 420KB of memory) is now split away from the UPDATE transaction, and the slowest transaction is now 1.2 sec, down from 3.5-4.5 sec.

It looks like the enums are consistently what slows the transaction so much on the initial call. If I create the same table but without enums:

model EventWithoutEnum {
  id String @id @default(dbgenerated("replace(replace(encode(substr(gen_random_uuid()::BYTES, 0, 16), 'base64'::STRING), '/'::STRING, '_'::STRING), '+'::STRING, '-'::STRING)"))
  category String? // enum
  coverImage String? // url
  description String?
  draft Boolean @default(true)
  endDate DateTime
  hostId String
  hostName String?
  minutesEarlyToOpen Int?
  minutesLateToClose Int?
  name String
  numAttendees String?
  privacy String?
  professionalCategory String?
  progress Int @default(0) // 0 - 100
  spaceId String?
  startDate DateTime
  type String?

  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
}

and then run the script, it shows that the first .update still has a slower-than-normal transaction time (~1400ms vs ~900ms), but not nearly the same as the equivalent call but for a table with enums.

So I guess the immediate question then becomes: can we get confirmation that prisma is loading each enum separately over the wire? If so, is there a way to change that behavior?