haystack: Operational error due to too many SQL variables while updating embeddings with DP retriever

Describe the bug

While following the guide on setting up the Dense Passage Retriever on a local machine, updating the document store's embeddings with document_store.update_embeddings(retriever) fails with OperationalError: too many SQL variables.

Error message

11/15/2020 21:55:10 - INFO - haystack.document_store.faiss -   Updating embeddings for 2497 docs...
Inferencing Samples: 100%|██████████| 157/157 [00:11<00:00, 14.11 Batches/s]
11/15/2020 21:55:25 - INFO - haystack.document_store.faiss -   Indexing embeddings and updating vectors_ids...
  0%|          | 0/1 [00:00<?, ?it/s]
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
/work/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1276                     self.dialect.do_execute(
-> 1277                         cursor, statement, parameters, context
   1278                     )
/work/lib/python3.6/site-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
    592     def do_execute(self, cursor, statement, parameters, context=None):
--> 593         cursor.execute(statement, parameters)
    594 

OperationalError: too many SQL variables

Expected behavior

The embeddings update completes without errors.

Additional context

The same command runs fine on Google Colab, but it fails on my local system with the same dev environment, and I am not sure why. I would be very grateful if someone could point me in the right direction.

System:

  • OS: Linux
  • GPU/CPU: V100 GPU
  • Haystack version (commit or version number): latest
  • DocumentStore: FAISS
  • Reader: FARM
  • Retriever: DPR

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

Is the issue fixed and can we close this @Graduo ?

OK 👌. I solved it by setting batch_size as lalitpagaria suggested: document_store.update_embeddings(retriever, batch_size=999). Thanks for all your patient assistance!
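For context on why batch_size=999 helps: SQLite builds older than 3.32.0 default to at most 999 bound variables per statement, so a single query with one "?" per document ID fails once more than 999 IDs are passed. The sketch below reproduces the batching idea with the standard-library sqlite3 module only. It is not Haystack's actual implementation; the table name document, its columns, and the helpers chunked, build_demo_db, and fetch_by_ids are hypothetical and exist only for illustration.

```python
import sqlite3
from typing import Iterator, List, Sequence, Tuple


def chunked(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def build_demo_db(n_docs: int) -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `document` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, text TEXT)")
    conn.executemany(
        "INSERT INTO document (id, text) VALUES (?, ?)",
        ((i, f"doc {i}") for i in range(n_docs)),
    )
    return conn


def fetch_by_ids(
    conn: sqlite3.Connection, ids: Sequence[int], batch_size: int = 999
) -> List[Tuple[int, str]]:
    """Fetch rows by ID, binding at most `batch_size` variables per statement."""
    rows: List[Tuple[int, str]] = []
    for batch in chunked(ids, batch_size):
        # One "?" placeholder per ID in this batch only.
        placeholders = ",".join("?" * len(batch))
        query = f"SELECT id, text FROM document WHERE id IN ({placeholders})"
        rows.extend(conn.execute(query, batch).fetchall())
    return rows
```

With the 2497 documents from the log above, fetch_by_ids(build_demo_db(2497), list(range(2497))) issues three statements (999, 999, and 499 IDs) instead of one statement with 2497 variables.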

@tholor and @lalitpagaria thanks for the help! I was poking around the SQL settings, and updating SQLite to 3.32 fixed this. Thanks!
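The SQLite upgrade likely helps because the default SQLITE_MAX_VARIABLE_NUMBER rose from 999 to 32766 in SQLite 3.32.0. A quick way to see which SQLite library a Python environment links against is sketched below. The helper default_variable_limit is hypothetical, and it only reports the upstream default: distribution builds can be compiled with a different limit, which may be one reason the same code behaves differently on Colab and on a local machine.

```python
import sqlite3


def default_variable_limit(version: str = sqlite3.sqlite_version) -> int:
    """Return the upstream default bound-variable limit for a SQLite version.

    Assumes a stock build; a custom SQLITE_MAX_VARIABLE_NUMBER set at
    compile time is not detected here.
    """
    parts = tuple(int(p) for p in version.split(".")[:3])
    return 32766 if parts >= (3, 32, 0) else 999


if __name__ == "__main__":
    # sqlite3.sqlite_version is the version of the linked SQLite library,
    # not the version of the Python sqlite3 module.
    print("SQLite library version:", sqlite3.sqlite_version)
    print("Upstream default variable limit:", default_variable_limit())
```

If the reported version is below 3.32.0, either upgrading SQLite or keeping batch_size at 999 or lower should avoid the error.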