langchain: SQLDatabaseChain - Questions/error
System Info
While trying to run example from https://python.langchain.com/docs/modules/chains/popular/sqlite I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[42], line 1
----> 1 db_chain.run("What is the expected assisted goals by Erling Haaland")
File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:273, in Chain.run(self, callbacks, tags, *args, **kwargs)
271 if len(args) != 1:
272 raise ValueError("`run` supports only one positional argument.")
--> 273 return self(args[0], callbacks=callbacks, tags=tags)[_output_key]
275 if kwargs and not args:
276 return self(kwargs, callbacks=callbacks, tags=tags)[_output_key]
File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:149, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, include_run_info)
147 except (KeyboardInterrupt, Exception) as e:
148 run_manager.on_chain_error(e)
--> 149 raise e
150 run_manager.on_chain_end(outputs)
151 final_outputs: Dict[str, Any] = self.prep_outputs(
152 inputs, outputs, return_only_outputs
153 )
File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:143, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, include_run_info)
137 run_manager = callback_manager.on_chain_start(
138 dumpd(self),
139 inputs,
140 )
141 try:
142 outputs = (
--> 143 self._call(inputs, run_manager=run_manager)
144 if new_arg_supported
145 else self._call(inputs)
146 )
147 except (KeyboardInterrupt, Exception) as e:
148 run_manager.on_chain_error(e)
File ~/.local/lib/python3.10/site-packages/langchain/chains/sql_database/base.py:105, in SQLDatabaseChain._call(self, inputs, run_manager)
103 # If not present, then defaults to None which is all tables.
104 table_names_to_use = inputs.get("table_names_to_use")
--> 105 table_info = self.database.get_table_info(table_names=table_names_to_use)
106 llm_inputs = {
107 "input": input_text,
108 "top_k": str(self.top_k),
(...)
111 "stop": ["\nSQLResult:"],
112 }
113 intermediate_steps: List = []
File ~/.local/lib/python3.10/site-packages/langchain/sql_database.py:289, in SQLDatabase.get_table_info(self, table_names)
287 table_info += f"\n{self._get_table_indexes(table)}\n"
288 if self._sample_rows_in_table_info:
--> 289 table_info += f"\n{self._get_sample_rows(table)}\n"
290 if has_extra_info:
291 table_info += "*/"
File ~/.local/lib/python3.10/site-packages/langchain/sql_database.py:311, in SQLDatabase._get_sample_rows(self, table)
308 try:
309 # get the sample rows
310 with self._engine.connect() as connection:
--> 311 sample_rows_result = connection.execute(command) # type: ignore
312 # shorten values in the sample rows
313 sample_rows = list(
314 map(lambda ls: [str(i)[:100] for i in ls], sample_rows_result)
315 )
File /opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1413, in Connection.execute(self, statement, parameters, execution_options)
1411 raise exc.ObjectNotExecutableError(statement) from err
1412 else:
-> 1413 return meth(
1414 self,
1415 distilled_parameters,
1416 execution_options or NO_OPTIONS,
1417 )
File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:483, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
481 if TYPE_CHECKING:
482 assert isinstance(self, Executable)
--> 483 return connection._execute_clauseelement(
484 self, distilled_params, execution_options
485 )
486 else:
487 raise exc.ObjectNotExecutableError(self)
File /opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1629, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
1621 schema_translate_map = execution_options.get(
1622 "schema_translate_map", None
1623 )
1625 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
1626 "compiled_cache", self.engine._compiled_cache
1627 )
-> 1629 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
1630 dialect=dialect,
1631 compiled_cache=compiled_cache,
1632 column_keys=keys,
1633 for_executemany=for_executemany,
1634 schema_translate_map=schema_translate_map,
1635 linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
1636 )
1637 ret = self._execute_context(
1638 dialect,
1639 dialect.execution_ctx_cls._init_compiled,
(...)
1647 cache_hit=cache_hit,
1648 )
1649 if has_events:
File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:684, in ClauseElement._compile_w_cache(self, dialect, compiled_cache, column_keys, for_executemany, schema_translate_map, **kw)
682 else:
683 extracted_params = None
--> 684 compiled_sql = self._compiler(
685 dialect,
686 cache_key=elem_cache_key,
687 column_keys=column_keys,
688 for_executemany=for_executemany,
689 schema_translate_map=schema_translate_map,
690 **kw,
691 )
693 if not dialect._supports_statement_cache:
694 cache_hit = dialect.NO_DIALECT_SUPPORT
File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:288, in CompilerElement._compiler(self, dialect, **kw)
286 if TYPE_CHECKING:
287 assert isinstance(self, ClauseElement)
--> 288 return dialect.statement_compiler(dialect, self, **kw)
File ~/.local/lib/python3.10/site-packages/pybigquery/sqlalchemy_bigquery.py:137, in BigQueryCompiler.__init__(self, dialect, statement, column_keys, inline, **kwargs)
135 if isinstance(statement, Column):
136 kwargs['compile_kwargs'] = util.immutabledict({'include_table': False})
--> 137 super(BigQueryCompiler, self).__init__(dialect, statement, column_keys, inline, **kwargs)
TypeError: SQLCompiler.__init__() got multiple values for argument 'cache_key'
I use a VertexAI model (MODEL_TEXT_BISON_001) as the LLM.
Some essential library versions: langchain == 0.0.206 SQLAlchemy == 2.0.11 ipython == 8.12.1 python == 3.10.10 google-cloud-bigquery == 3.10.0 google-cloud-bigquery-storage == 2.16.2
Who can help?
Information
- The official example notebooks/scripts
- My own modified scripts
Related Components
- LLMs/Chat Models
- Embedding Models
- Prompts / Prompt Templates / Prompt Selectors
- Output Parsers
- Document Loaders
- Vector Stores / Retrievers
- Memory
- Agents / Agent Executors
- Tools / Toolkits
- Chains
- Callbacks/Tracing
- Async
Reproduction
db
= SQLDatabase.from_uri(f"bigquery://{project_id}/{dataset}")
toolkit = SQLDatabaseToolkit(llm=llm, db=db)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
db_chain.run(“What is the sport that generates highest revenue”)`
Expected behavior
When running the db_chain, I expect to get an answer from the bigquery database
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 1
- Comments: 18
🤖
It seems like you’re encountering a
TypeError
when trying to run a query using theSQLDatabaseChain
with a BigQuery database. The error message indicates that there’s an issue with theSQLCompiler.__init__()
method receiving multiple values for the argument ‘cache_key’.This error might be due to a compatibility issue between the versions of
SQLAlchemy
andgoogle-cloud-bigquery
you’re using. TheSQLAlchemy
version you’re using is2.0.11
, which is quite recent, and there might be some changes in this version that are not yet fully compatible withgoogle-cloud-bigquery
.As a first step, I would recommend trying to downgrade your
SQLAlchemy
version to1.4.x
and see if the issue persists. You can do this by running the following command in your terminal:If downgrading
SQLAlchemy
doesn’t resolve the issue, you might also want to try downgradinggoogle-cloud-bigquery
to a version that is known to work well withSQLAlchemy 1.4.x
, such as2.6.2
. You can do this by running:Please try these steps and let me know if the issue persists. If it does, it would be helpful if you could provide more information about your environment, such as the operating system you’re using and the version of Python. This will help us better understand the issue and provide a more accurate solution.
For more information about the compatibility between
SQLAlchemy
andgoogle-cloud-bigquery
, you can refer to the SQLAlchemy Changelog and the google-cloud-bigquery Changelog.I hope this helps! If you have any other questions or need further clarification, feel free to ask.
Sources
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
That was the exact problem!! The pybigquery was messing everything!
I’m facing the same problem. Using 1.7.0 version for “sqlalchemy-bigquery” is not working and still same persists.
Any alternate solution will be helpful.
Downgrading “sqlalchemy-bigquery” to 1.7.0 gets it to work, but with the latest version (1.8.0) the same problem occurs.
Answer generated by a 🤖
Answer
The error you’re encountering is due to a conflict between the versions of SQLAlchemy and google-cloud-bigquery libraries you’re using. The error message
TypeError: SQLCompiler.__init__() got multiple values for argument 'cache_key'
indicates that thecache_key
argument is being passed multiple times when initializing the SQLCompiler, which is not allowed. This is likely due to changes in the SQLAlchemy library’s API in version 2.0.11 that are not compatible with the version of google-cloud-bigquery you’re using.To resolve this issue, you can try downgrading your SQLAlchemy library to a version that is compatible with google-cloud-bigquery. For example, you can try downgrading to SQLAlchemy version 1.4.22. You can do this by running the following command in your terminal:
After downgrading SQLAlchemy, try running your script again. If the problem persists, you may need to adjust the versions of your other libraries to ensure compatibility. Please note that the version 1.4.22 is just an example, you might need to find the exact version that is compatible with your google-cloud-bigquery version. You can find this information in the google-cloud-bigquery’s documentation or you can try different versions until you find one that works. [docs/snippets/modules/chains/popular/sqlite.mdx, docs/extras/guides/evaluation/sql_qa_benchmarking_chinook.ipynb, tests/integration_tests/chains/test_sql_database.py, langchain/chains/sql_database/base.py, docs/extras/ecosystem/integrations/databricks/databricks.ipynb]
Sources
This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.