prisma: Postgres Full-Text Search Index is not used

Bug description

Current FTS appear to not use defined GIN or GIST index

Recommended GIN index in doc is also not working

ERROR:  data type text has no default operator class for access method "gin"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.

As far as I know it should be something like this since body is of type String in schema

CREATE INDEX IF NOT EXISTS "post_body_index" ON "Posts" USING GIN (to_tsvector('english', "body"));
-- Or/And
CREATE INDEX IF NOT EXISTS "post_body_index" ON "Posts" USING GIST (to_tsvector('english', "body"));

And request should be something like this

SELECT "public"."Posts".*
FROM "public"."Posts" 
WHERE to_tsvector('english', "public"."Posts"."body") @@ to_tsquery('test | test');

english in to_tsvector could be another language so it should be dynamic from @prisma/client

There is also a concatenation problem when you do

const result = await prisma.posts.findMany({
  where: {
    status: 'Draft',
    OR: [
      { body: { search: 'cat & dog' }},
      { anotherField: { search: 'test | test' }}
    ]
  },
})

Where the body and anotherField get concatenated in sql

SELECT "public"."Posts".*
FROM "public"."Posts" 
WHERE to_tsvector("public"."Posts"."body"|| ' ' ||"public"."Posts"."anotherField") @@ to_tsquery($1)

instead of

SELECT "public"."Posts".*
FROM "public"."Posts" 
WHERE to_tsvector('english', "public"."Posts"."body" || "public"."Posts"."anotherField") @@ to_tsquery($1)

This problem also prevent using GIN/GIST index with two field

How to reproduce

See bug description

Expected behavior

No response

Prisma information

full-text search exemple in doc https://www.prisma.io/docs/concepts/components/prisma-client/full-text-search

Environment & setup

  • OS: Mac OS
  • Database: PostgreSQL
  • Node.js version: v16.7.0

Prisma Version

prisma                : 2.30.0
@prisma/client        : 2.30.0
Current platform      : darwin-arm64
Query Engine (Binary) : query-engine 60b19f4a1de4fe95741da371b4c44a92f4d1adcb (at node_modules/@prisma/engines/query-engine-darwin-arm64)
Migration Engine      : migration-engine-cli 60b19f4a1de4fe95741da371b4c44a92f4d1adcb (at node_modules/@prisma/engines/migration-engine-darwin-arm64)
Introspection Engine  : introspection-core 60b19f4a1de4fe95741da371b4c44a92f4d1adcb (at node_modules/@prisma/engines/introspection-engine-darwin-arm64)
Format Binary         : prisma-fmt 60b19f4a1de4fe95741da371b4c44a92f4d1adcb (at node_modules/@prisma/engines/prisma-fmt-darwin-arm64)
Default Engines Hash  : 60b19f4a1de4fe95741da371b4c44a92f4d1adcb
Studio                : 0.422.0
Preview Features      : orderByRelation, selectRelationCount, orderByAggregateGroup, filterJson, referentialActions, fullTextSearch

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 106
  • Comments: 45 (7 by maintainers)

Most upvoted comments

2 years

The workaround is pretty reasonable, do a manual query for the IDs, then use Prisma to convert the resulting list of IDs to hydrated objects – not that difficult, hopefully. Perhaps rather than bumping this issue to “complain”, ya’ll could work on the PR to fix it yourselves 😃

@floelhoeffel I think your analogy with view is quite good, maybe using a similar approach would make things easier.

  • Trusting devs to build their own unmanaged indexes in migration files (like people can write view, they can write complex indexes).
    • people are free to use different approaches for different needs (aggregation, normalization, transformation, …)
    • people are free to use native db extension if they want
    • keeping @@index simple for basic usage
    • it just need to be well documented
  • Provide a way for fulltextsearch query to use those indexes

Here is my idea of a "good enough" solution


Current Full Text Search

For the moment at runtime, we can do the following query:

where: {
    body: {
      search: 'cat | dog',
    },
}

Somewhere down the line, this search becomes the following SQL body @@ to_tsquery('cat | dog').

Problem

At the end most people in this thread are fine with the prisma api, they just want to be able to customize a bit the generated sql query (to use an index, another column or an extension). For better performance but also to get better results.

  • Like the case above, people just want to use to_tsquery('english', 'cat | dog') but have no way to do it
  • In my case, I end up using raw SQL only to have access to unaccent and lower and handle non english search (to_tsquery(unaccent('cat | dog'))) (Related to https://github.com/prisma/prisma/issues/6428#issuecomment-1425057106)
  • Or more advanced cases, simple to do ahead of time with a custom index but harder and slower to do at runtime:
    • Searching multiples fields at once (like a title field, stored as a multilanguage JSON object)
    • Using weight on queries setweight(to_tsvector(...), 'A')
    • Using custom SQL functions to_tsvector('english', normalize_string(title))

Everytime It’s just a small variation of what prisma already does. It’s really frustrating to write hundred of lines of big raw queries and typings, just because prisma doesn’t let us put ~10 characters at the right place in the SQL query 😄

Proposal

So my idea would be to either extend the usage of the existing @@fulltext or create a new keyword like @fulltextsearch in the schema, to explicit the kind of FTS query wanted.

Something along those lines could probably solve most cases:

# same behavior than now (similar to `@@fulltext([ body ])`)
body    String    @fulltextsearch # -> `body @@ to_tsquery(search)`

# using ts_vector config
body    String    @fulltextsearch(config: 'english') # -> `body @@ to_tsquery('english', search)`

# using another pre-computed column or index
body    String    @fulltextsearch(fields: [body_vec]) # -> `body_vec @@ to_tsquery(search)`
body_vec    Unsupported('tsvector')

# using native extensions or custom sql functions
body    String    @fulltextsearch(extensions: ['lower','unaccent']) # -> `body @@ to_tsquery(lower(unaccent(search)))`

# and mixing everything
body    String    @fulltextsearch(fields: [body_vec], config: 'english', extensions: ['unaccent'])
  # -> `body_vec @@ to_tsquery('english', unaccent(search))`

I guess this solution wouldn’t be too complicated to implement, almost no runtime modification required, and it would give us the flexibility needed to avoid raw sql. Not perfect but a big step forward compare to the current situation.

Hello everyone 👋

We are starting design work to improve Prisma’s support for Full Text Search (FTS)!

If you would like to help, please tell us about your needs via a short FTS user research survey.

Thank you!

It’s been 2 years and still no progress. While searching a table where I have 500,000 products, I have to use Elastic Search because you do not support index and it confuses our workflow a lot. We enjoy using Prisma, but it’s really interesting that such an important feature is still not supported.

Another postgres workaround I used few times, completely skipping the prisma full text search API for text field, using the standard Like/ILike. And trusting postgres with a gin index and trigram extension

title         String?

@@index([title(ops: raw("gin_trgm_ops"))], type: Gin, name: "title_idx")

You just need to squeeze in one of the migration

CREATE EXTENSION pg_trgm;

Pro

  • Really simple
  • From Prisma perspective, queries are normal and readable with simple where title: { contains: 'bob' }
  • Postgres is happy with a usable index for partial search
  • Fast on big table with <1ms queries search.

Cons

  • More limited feature wise, doesn’t replace full text search
  • gin index doesn’t always work for 1-2 characters search, because of trigram usage
  • only works with scalar types, for example if you have a JSON or text[] column, this solution doesn’t work

When can we expect this change by? This seems to be a blocking issue.

This issue should be main priority on the roadmap as full text search is useless without it. Bit wild it’s been 2 years.

@janpio thank you for letting me know I created a duplicate issue. I read over this one and did some more benchmarking, and I am not convinced this is such an ambiguous problem to solve. Currently, you’re directing users to create text columns. Some quick benchmarks show this does not scale, and simply using a tsvector column and ensuring the index is actually used results in a 1,000x speed up. Perhaps there are some edge cases, but if this speeds up a “simple” query by 1,000x I do not see any reason not to add support for it in Prisma. Some people’s queries can become quite complex, and rewriting a large query in raw SQL is not super tenable. By supporting the “tsvector” column type for Postgres the query can be sped up over one thousand times.

Test 1 (your currently documented approach) ~ 100ms

ALTER TABLE "products" ADD COLUMN  "name" TEXT;
CREATE INDEX products_name
   ON products USING GIN (to_tsvector('english', name));
select * from products where name @@ to_tsquery('appl:*')

Test 2 (actually using the index by passing “english” to to_tsquery) ~40ms

CREATE INDEX products_name
   ON products USING GIN (to_tsvector('english', name));
select * from products where name @@ to_tsquery('english', 'appl:*')

Takes ~40ms

Test 3 (actually using a precomputed tsvector): ~0.08ms

ALTER TABLE "products" ADD COLUMN  "name_vec" tsvector;
UPDATE products set name_vec = to_tsvoctor("english", name)
CREATE INDEX products_name on products USING GIN (name_vec);
select * from products where name @@ to_tsquery('english', 'appl:*')

Then you actually have a feature request, to support precomputed tsvector columns. That is of course a good request and you should create a new feature request for that.

(That currently the index is not used at all is still a bug of course and tracked here.)

Hello there,

Is there any progress on supporting full-text indexes on Postgres tables? I am very new to prisma and postgres. It’s important for my client to do full-text searches on their database.

I just wanted to be clear that, as of now I can create full-text indexes directly on the database without primsa but use the full-text search through prisma from my NestJS project?

Thanks for your help!

🤷 I basically just want it to not be super slow, however that happens 😆 I created the feature request 😃 thank you!

So… any chance of native GIN/GiST support coming to Prisma some time soon? Or is it naive to ask 😅

Support for the PostgreSQL index creation via Prisma Migrate is tracked in https://github.com/prisma/prisma/issues/10386

@@fulltext seem to be only for MySQL & MongoDB, unfortunately I am on PostgreSQL…

3.6.0 still concatenate in query when using two field thought, but I am not sure if that’s a problem or not

For folks looking for a well implemented workaround, I recommend having a look at implementation of folks at linen.dev

https://github.com/Linen-dev/linen.dev/blob/79e9acb670a444fa8190015e9257b7218f51d339/packages/database/prisma/schema.prisma#L36

@BradNut Custom column types are a good solution. Here’s a good thread on tsvector specifically

Yes it should, trigram includes exact match.

For exact match, a btree should be slightly faster, but the difference should be really minimal. So imho, no need to have two indexes on the same field, sounds even like a good way to confuse postgres and end up with query using the wrong index.

As usual for indexing, avoid early optimization and test with your real needs. You will quickly realize when data grows if indexes are working or not. Indexes are not free, they take space and slow down insert/update, to use with moderation 😄

P.s. to test just create 500k seed data entries in your db, and run your queries to see where is the bottleneck

The workaround is to let the query run slower due to not leveraging the index, or run a raw SQL query [making sure to either not select, or otherwise transform any unsupported field types to supported fields in the SQL]

I went the raw SQL route but I’m not happy with it. This workaround feels like less of an escape hatch in prisma and more of a massive hole in the ecosystem. Would really like to see this fixed.

Hey 👋

Any news on this one ?

So how is everyone here actually accomplishing this if the GitHub issue is still open after over 2 years? Are people just using MySQL since that is what is supported? Creating indexes manually? Or dropping use of Prisma altogether until that is fixed?

Another postgres workaround I used few times, completely skipping the prisma full text search API for text field, using the standard Like/ILike. And trusting postgres with a gin index and trigram extension

title         String?

@@index([title(ops: raw("gin_trgm_ops"))], type: Gin, name: "title_idx")

You just need to squeeze in one of the migration

CREATE EXTENSION pg_trgm;

Pro

  • Really simple
  • From Prisma perspective, queries are normal and readable with simple where title: { contains: 'bob' }
  • Postgres is happy with a usable index for partial search
  • Fast on big table with <1ms queries search.

Cons

  • More limited feature wise, doesn’t replace full text search
  • gin index doesn’t always work for 1-2 characters search, because of trigram usage
  • only works with scalar types, for example if you have a JSON or text[] column, this solution doesn’t work

I wonder will it work on equal search? or I need to make new index for this

@kefniark Thank you so much for this, this worked perfectly for my use case.

To expand further for anyone trying to use this, this workaround just uses the normal prisma contains methods which generate queries using LIKE in postgres. This does not allow using the fulltext search which usests_vector/ts_query queries generated by prisma’s fulltext search operator.

However, again, it’s frustrating that none of these text search index features are natively available in prisma after so long. It would be great to maybe have a page with at least some documented workarounds at least developers can use depending on what they’re trying to achieve, like the above.

Hello World! any updates on this issue? i am struggling with the same issue

Interesting. You would suggest to add an attribute @fulltextsearch to a field in a model to influence the query that is created when one tries to run a search query, and assume that the matching index would be migrated by the user?

Any updates ? Specially for including ts_vector etc.

I am also facing this issue, waiting for a fix. Unless this is fixed, full text search is not usable for us because we have large tables.