langchain: SQLDatabaseChain - Questions/error

System Info

While trying to run example from https://python.langchain.com/docs/modules/chains/popular/sqlite I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[42], line 1
----> 1 db_chain.run("What is the expected assisted goals by Erling Haaland")

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:273, in Chain.run(self, callbacks, tags, *args, **kwargs)
    271     if len(args) != 1:
    272         raise ValueError("`run` supports only one positional argument.")
--> 273     return self(args[0], callbacks=callbacks, tags=tags)[_output_key]
    275 if kwargs and not args:
    276     return self(kwargs, callbacks=callbacks, tags=tags)[_output_key]

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:149, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, include_run_info)
    147 except (KeyboardInterrupt, Exception) as e:
    148     run_manager.on_chain_error(e)
--> 149     raise e
    150 run_manager.on_chain_end(outputs)
    151 final_outputs: Dict[str, Any] = self.prep_outputs(
    152     inputs, outputs, return_only_outputs
    153 )

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:143, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, include_run_info)
    137 run_manager = callback_manager.on_chain_start(
    138     dumpd(self),
    139     inputs,
    140 )
    141 try:
    142     outputs = (
--> 143         self._call(inputs, run_manager=run_manager)
    144         if new_arg_supported
    145         else self._call(inputs)
    146     )
    147 except (KeyboardInterrupt, Exception) as e:
    148     run_manager.on_chain_error(e)

File ~/.local/lib/python3.10/site-packages/langchain/chains/sql_database/base.py:105, in SQLDatabaseChain._call(self, inputs, run_manager)
    103 # If not present, then defaults to None which is all tables.
    104 table_names_to_use = inputs.get("table_names_to_use")
--> 105 table_info = self.database.get_table_info(table_names=table_names_to_use)
    106 llm_inputs = {
    107     "input": input_text,
    108     "top_k": str(self.top_k),
   (...)
    111     "stop": ["\nSQLResult:"],
    112 }
    113 intermediate_steps: List = []

File ~/.local/lib/python3.10/site-packages/langchain/sql_database.py:289, in SQLDatabase.get_table_info(self, table_names)
    287     table_info += f"\n{self._get_table_indexes(table)}\n"
    288 if self._sample_rows_in_table_info:
--> 289     table_info += f"\n{self._get_sample_rows(table)}\n"
    290 if has_extra_info:
    291     table_info += "*/"

File ~/.local/lib/python3.10/site-packages/langchain/sql_database.py:311, in SQLDatabase._get_sample_rows(self, table)
    308 try:
    309     # get the sample rows
    310     with self._engine.connect() as connection:
--> 311         sample_rows_result = connection.execute(command)  # type: ignore
    312         # shorten values in the sample rows
    313         sample_rows = list(
    314             map(lambda ls: [str(i)[:100] for i in ls], sample_rows_result)
    315         )

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1413, in Connection.execute(self, statement, parameters, execution_options)
   1411     raise exc.ObjectNotExecutableError(statement) from err
   1412 else:
-> 1413     return meth(
   1414         self,
   1415         distilled_parameters,
   1416         execution_options or NO_OPTIONS,
   1417     )

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:483, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
    481     if TYPE_CHECKING:
    482         assert isinstance(self, Executable)
--> 483     return connection._execute_clauseelement(
    484         self, distilled_params, execution_options
    485     )
    486 else:
    487     raise exc.ObjectNotExecutableError(self)

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1629, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
   1621 schema_translate_map = execution_options.get(
   1622     "schema_translate_map", None
   1623 )
   1625 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
   1626     "compiled_cache", self.engine._compiled_cache
   1627 )
-> 1629 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
   1630     dialect=dialect,
   1631     compiled_cache=compiled_cache,
   1632     column_keys=keys,
   1633     for_executemany=for_executemany,
   1634     schema_translate_map=schema_translate_map,
   1635     linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
   1636 )
   1637 ret = self._execute_context(
   1638     dialect,
   1639     dialect.execution_ctx_cls._init_compiled,
   (...)
   1647     cache_hit=cache_hit,
   1648 )
   1649 if has_events:

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:684, in ClauseElement._compile_w_cache(self, dialect, compiled_cache, column_keys, for_executemany, schema_translate_map, **kw)
    682 else:
    683     extracted_params = None
--> 684     compiled_sql = self._compiler(
    685         dialect,
    686         cache_key=elem_cache_key,
    687         column_keys=column_keys,
    688         for_executemany=for_executemany,
    689         schema_translate_map=schema_translate_map,
    690         **kw,
    691     )
    693     if not dialect._supports_statement_cache:
    694         cache_hit = dialect.NO_DIALECT_SUPPORT

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:288, in CompilerElement._compiler(self, dialect, **kw)
    286 if TYPE_CHECKING:
    287     assert isinstance(self, ClauseElement)
--> 288 return dialect.statement_compiler(dialect, self, **kw)

File ~/.local/lib/python3.10/site-packages/pybigquery/sqlalchemy_bigquery.py:137, in BigQueryCompiler.__init__(self, dialect, statement, column_keys, inline, **kwargs)
    135 if isinstance(statement, Column):
    136     kwargs['compile_kwargs'] = util.immutabledict({'include_table': False})
--> 137 super(BigQueryCompiler, self).__init__(dialect, statement, column_keys, inline, **kwargs)

TypeError: SQLCompiler.__init__() got multiple values for argument 'cache_key'

I use a VertexAI model (MODEL_TEXT_BISON_001) as the LLM.

Some essential library versions: langchain == 0.0.206 SQLAlchemy == 2.0.11 ipython == 8.12.1 python == 3.10.10 google-cloud-bigquery == 3.10.0 google-cloud-bigquery-storage == 2.16.2

Who can help?

@hwchase17

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

db = SQLDatabase.from_uri(f"bigquery://{project_id}/{dataset}") toolkit = SQLDatabaseToolkit(llm=llm, db=db) db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True) db_chain.run(“What is the sport that generates highest revenue”)`

Expected behavior

When running the db_chain, I expect to get an answer from the bigquery database

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 18

Most upvoted comments

🤖

It seems like you’re encountering a TypeError when trying to run a query using the SQLDatabaseChain with a BigQuery database. The error message indicates that there’s an issue with the SQLCompiler.__init__() method receiving multiple values for the argument ‘cache_key’.

This error might be due to a compatibility issue between the versions of SQLAlchemy and google-cloud-bigquery you’re using. The SQLAlchemy version you’re using is 2.0.11, which is quite recent, and there might be some changes in this version that are not yet fully compatible with google-cloud-bigquery.

As a first step, I would recommend trying to downgrade your SQLAlchemy version to 1.4.x and see if the issue persists. You can do this by running the following command in your terminal:

pip install SQLAlchemy==1.4.22

If downgrading SQLAlchemy doesn’t resolve the issue, you might also want to try downgrading google-cloud-bigquery to a version that is known to work well with SQLAlchemy 1.4.x, such as 2.6.2. You can do this by running:

pip install google-cloud-bigquery==2.6.2

Please try these steps and let me know if the issue persists. If it does, it would be helpful if you could provide more information about your environment, such as the operating system you’re using and the version of Python. This will help us better understand the issue and provide a more accurate solution.

For more information about the compatibility between SQLAlchemy and google-cloud-bigquery, you can refer to the SQLAlchemy Changelog and the google-cloud-bigquery Changelog.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

That was the exact problem!! The pybigquery was messing everything!

I’m facing the same problem. Using 1.7.0 version for “sqlalchemy-bigquery” is not working and still same persists.

Any alternate solution will be helpful.

Downgrading “sqlalchemy-bigquery” to 1.7.0 gets it to work, but with the latest version (1.8.0) the same problem occurs.

pip install sqlalchemy-bigquery==1.7.0

Answer generated by a 🤖

Answer

The error you’re encountering is due to a conflict between the versions of SQLAlchemy and google-cloud-bigquery libraries you’re using. The error message TypeError: SQLCompiler.__init__() got multiple values for argument 'cache_key' indicates that the cache_key argument is being passed multiple times when initializing the SQLCompiler, which is not allowed. This is likely due to changes in the SQLAlchemy library’s API in version 2.0.11 that are not compatible with the version of google-cloud-bigquery you’re using.

To resolve this issue, you can try downgrading your SQLAlchemy library to a version that is compatible with google-cloud-bigquery. For example, you can try downgrading to SQLAlchemy version 1.4.22. You can do this by running the following command in your terminal:

pip install SQLAlchemy==1.4.22

After downgrading SQLAlchemy, try running your script again. If the problem persists, you may need to adjust the versions of your other libraries to ensure compatibility. Please note that the version 1.4.22 is just an example, you might need to find the exact version that is compatible with your google-cloud-bigquery version. You can find this information in the google-cloud-bigquery’s documentation or you can try different versions until you find one that works. [docs/snippets/modules/chains/popular/sqlite.mdx, docs/extras/guides/evaluation/sql_qa_benchmarking_chinook.ipynb, tests/integration_tests/chains/test_sql_database.py, langchain/chains/sql_database/base.py, docs/extras/ecosystem/integrations/databricks/databricks.ipynb]

Sources


This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.