langchain: sqlalchemy.exc.InvalidRequestError: Table 'langchain_pg_collection' is already defined for this MetaData instance.

Issue you’d like to raise.

what is this issue, and how can i resolve it:

    os.environ["AZURE_OPENAI_API_KEY"] = AZURE_OPENAI_API_KEY
    os.environ["AZURE_OPENAI_ENDPOINT"] = AZURE_OPENAI_ENDPOINT
    os.environ["OPENAI_API_TYPE"] = "azure"
    os.environ["OPENAI_API_VERSION"] = "2023-05-15"
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

    embedding = OpenAIEmbeddings()

    COLLECTION_NAME = "network_team_documents"
    CONNECTION_STRING = PGVector.connection_string_from_db_params(
        driver=os.environ.get(DB_DRIVER, DB_DRIVER),
        host=os.environ.get(DB_HOST, DB_HOST),
        port=int(os.environ.get(DB_PORT, DB_PORT)),
        database=os.environ.get(DB_DB, DB_DB),
        user=os.environ.get(DB_USER, DB_USER),
        password=os.environ.get(DB_PASS, DB_PASS),
    )

    store = PGVector(
        collection_name=COLLECTION_NAME,
        connection_string=CONNECTION_STRING,
        embedding_function=embedding,
        extend_existing=True,
    )

    gpt4 = AzureChatOpenAI(
        azure_deployment="GPT4",
        openai_api_version="2023-05-15",
    )
    retriever = store.as_retriever(search_type="similarity", search_kwargs={"k": 10})

    qa_chain = RetrievalQA.from_chain_type(llm=gpt4,
                                           chain_type="stuff",
                                           retriever=retriever,
                                           return_source_documents=True)
    return qa_chain
Traceback (most recent call last):
  File "/opt/network_tool/chatbot/views.py", line 21, in chat
    chat_object = create_session()
  File "/opt/network_tool/chatbot/chatbot_functions.py", line 95, in create_session
    store = PGVector(
  File "/opt/klevernet_venv/lib/python3.10/site-packages/langchain_community/vectorstores/pgvector.py", line 199, in __init__
    self.__post_init__()
  File "/opt/klevernet_venv/lib/python3.10/site-packages/langchain_community/vectorstores/pgvector.py", line 207, in __post_init__
    EmbeddingStore, CollectionStore = _get_embedding_collection_store()
  File "/opt/klevernet_venv/lib/python3.10/site-packages/langchain_community/vectorstores/pgvector.py", line 66, in _get_embedding_collection_store
    class CollectionStore(BaseModel):
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/orm/decl_api.py", line 195, in __init__
    _as_declarative(reg, cls, dict_)
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/orm/decl_base.py", line 247, in _as_declarative
    return _MapperConfig.setup_mapping(registry, cls, dict_, None, {})
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/orm/decl_base.py", line 328, in setup_mapping
    return _ClassScanMapperConfig(
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/orm/decl_base.py", line 578, in __init__
    self._setup_table(table)
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/orm/decl_base.py", line 1729, in _setup_table
    table_cls(
  File "", line 2, in __new__
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
    return fn(*args, **kwargs)  # type: ignore[no-any-return]
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/sql/schema.py", line 436, in __new__
    return cls._new(*args, **kw)
  File "/opt/klevernet_venv/lib/python3.10/site-packages/sqlalchemy/sql/schema.py", line 468, in _new
    raise exc.InvalidRequestError(
sqlalchemy.exc.InvalidRequestError: Table 'langchain_pg_collection' is already defined for this MetaData instance.  Specify 'extend_existing=True' to redefine options and columns on an existing Table object.

Suggestion:

No response

About this issue

  • Original URL
  • State: closed
  • Created 7 months ago
  • Comments: 15 (8 by maintainers)

Commits related to this issue

Most upvoted comments

🤖

Hi @arezazadeh, nice to meet you! I’m Dosu, a friendly bot here to assist you with bugs, answer questions, and guide you in contributing to our repository. Feel free to ask me anything you’d like about our code. I’m here to make your life easier while we wait for a human maintainer to pop in. Looking forward to helping you with your issue!

The error you’re encountering, “sqlalchemy.exc.InvalidRequestError: Table ‘langchain_pg_collection’ is already defined for this MetaData instance”, is caused by SQLAlchemy trying to define a table that has already been defined in the same MetaData instance. This can occur if the code that defines the table is executed more than once in the same process.

In your code, the table ‘langchain_pg_collection’ is defined in the class ‘CollectionStore’. If this class is instantiated more than once in the same process, it will cause the error.

To resolve this issue, you can modify the code to ensure that the table definition is only executed once. One way to do this is to check if the table already exists before defining it. You can use the has_table method of the MetaData class in SQLAlchemy to check if a table already exists.

Here is an example of how you can modify the ‘CollectionStore’ class to check if the table already exists:

from sqlalchemy import MetaData

class CollectionStore(BaseModel):
    """Collection store."""

    metadata = MetaData()

    if not metadata.tables.get('langchain_pg_collection'):
        __tablename__ = "langchain_pg_collection"

        name = sqlalchemy.Column(sqlalchemy.String)
        cmetadata = sqlalchemy.Column(JSON)

        embeddings = relationship(
            "EmbeddingStore",
            back_populates="collection",
            passive_deletes=True,
        )

In this modified code, the metadata.tables.get('langchain_pg_collection') method is used to check if the ‘langchain_pg_collection’ table already exists. If it does not exist, the table is defined. If it does exist, the table definition is skipped. This ensures that the table is only defined once, preventing the “Table ‘langchain_pg_collection’ is already defined for this MetaData instance” error.

For more information, you can refer to the LangChain codebase and the SQLAlchemy documentation.

Please let me know if this resolves your issue or if you need further assistance.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

0.0.351

Instantiating multiple PGVectors with multiple collection_names may result in a software bug.

raise exc.InvalidRequestError( sqlalchemy.exc.InvalidRequestError: Multiple classes found for path “EmbeddingStore” in the registry of this declarative base. Please use a fully module-qualified path.

I added it in this file: /lib/python3.10/site-packages/langchain/vectorstores/_pgvector_data_models.py

class CollectionStore(BaseModel):
    """Collection store."""

    __tablename__ = "langchain_pg_collection"
    __table_args__ = {'extend_existing': True}

Thanks for the report folks! I’ll look into this today

Downgrading to 0.0.348 worked for me without adding the extend_existing argument!