langchain: Issue: Abstract Class Implementation problem in Retrievers
Issue you’d like to raise.
Hi everyone,
I’ve encountered an issue while trying to instantiate the ConversationalRetrievalChain in the Langchain library. It seems to be related to the abstract class BaseRetriever and the required method _get_relevant_documents.
Here’s my base code:
import os
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.embeddings.openai import OpenAIEmbeddings
from dotenv import load_dotenv
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
# Load environment variables
load_dotenv("info.env")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
# Initialize the FAISS vector database
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
index_filenames = ["index1", "index2"]
dbs = {f"category{i+1}": FAISS.load_local(filename, embeddings) for i, filename in enumerate(index_filenames)}
# Load chat model and question answering chain
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0, openai_api_key=OPENAI_API_KEY)
chain = load_qa_chain(llm, chain_type="map_reduce")
# Initialize conversation memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
# Set the role of the AI
role = "Assistant"
# Write the role of the AI
memory.save_context({"input": f"AI: As an {role}, I can help you with your questions based on the retrievers and my own data."}, {"output": ""})
# Initialize the ConversationalRetrievalChain with memory and custom retrievers
qa = ConversationalRetrievalChain.from_llm(llm, retriever=dbs, memory=memory)
# Pass in chat history using the memory
while True:
    user_input = input("User: ")
    # Add user input to the conversation history
    memory.save_context({"input": f"User: {user_input}"}, {"output": ""})
    # Check if the user wants to exit
    if user_input == "!exit":
        break
    # Run the chain on the user's query
    response = qa.run(user_input)
    # Print the response with a one-line space and role indication
    print("\n" + f"{role}:\n{response}" + "\n")
And here’s the error message I received:
pydantic.v1.error_wrappers.ValidationError: 1 validation error for ConversationalRetrievalChain
retriever
Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error)
It appears that the retriever I provided doesn’t fully implement the required methods. Could someone provide guidance on how to properly implement a retriever for ConversationalRetrievalChain and help update the code based on that?
Any help would be greatly appreciated. Thank you!
Suggestion:
No response
About this issue
- Original URL
- State: closed
- Created 9 months ago
- Reactions: 1
- Comments: 21 (3 by maintainers)
🤖
Hello,
The error message you’re seeing is due to the fact that the BaseRetriever class in LangChain is an abstract base class, and it requires any subclasses to implement the _get_relevant_documents method. This method is designed to retrieve documents that are relevant to a given query.
Based on the code you’ve shared, it seems you’re trying to use a dictionary of FAISS instances as the retriever for the ConversationalRetrievalChain. However, these FAISS instances are not subclasses of BaseRetriever and do not implement the _get_relevant_documents method, which is why you’re seeing this error.
To resolve this issue, you need to create a custom retriever that is a subclass of BaseRetriever and implements the _get_relevant_documents method. Here’s an example of how you can do this:
In this example, CustomRetriever is a subclass of BaseRetriever and it provides an implementation for the _get_relevant_documents method. This method uses the FAISS instances in the dbs dictionary to retrieve the relevant documents based on the provided query.
You can then use this custom retriever in your ConversationalRetrievalChain like this:
Please note that this is just a basic example and you may need to adjust the _get_relevant_documents method to suit your specific needs.
I hope this helps! If you have any further questions, please let me know.
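To make the pattern described in this reply concrete, here is a minimal sketch that runs without LangChain installed. BaseRetrieverSketch, FakeStore and MultiStoreRetriever are hypothetical stand-ins, not LangChain classes. In real code the class would subclass LangChain's BaseRetriever and accept the run_manager keyword argument. The sketch assumes each store exposes similarity_search(query, k), as FAISS vector stores do.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BaseRetrieverSketch(ABC):
    """Stand-in for an abstract retriever base class (not LangChain's)."""

    @abstractmethod
    def _get_relevant_documents(self, query: str) -> List[str]:
        """Return documents relevant to the query."""

    def get_relevant_documents(self, query: str) -> List[str]:
        # Public entry point delegates to the method subclasses must implement.
        return self._get_relevant_documents(query)


class FakeStore:
    """Hypothetical vector store exposing similarity_search(query, k)."""

    def __init__(self, texts: List[str]) -> None:
        self.texts = texts

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        # Naive substring match instead of a real embedding search.
        hits = [t for t in self.texts if query.lower() in t.lower()]
        return hits[:k]


class MultiStoreRetriever(BaseRetrieverSketch):
    """Queries every store in a dict and concatenates the results."""

    def __init__(self, dbs: Dict[str, FakeStore], k: int = 4) -> None:
        self.dbs = dbs
        self.k = k

    def _get_relevant_documents(self, query: str) -> List[str]:
        docs: List[str] = []
        for store in self.dbs.values():
            docs.extend(store.similarity_search(query, k=self.k))
        return docs


def try_instantiate_base() -> str:
    """Show the error raised when the abstract method is missing."""
    try:
        BaseRetrieverSketch()  # type: ignore[abstract]
    except TypeError as exc:
        return str(exc)
    return "no error"


dbs = {
    "category1": FakeStore(["FAISS stores vectors", "Retrievers return documents"]),
    "category2": FakeStore(["Chains call retrievers", "Memory keeps chat history"]),
}
retriever = MultiStoreRetriever(dbs)
results = retriever.get_relevant_documents("retrievers")
```

If a single index is enough, passing dbs["category1"].as_retriever() as the retriever argument avoids the custom class entirely, since as_retriever() wraps a vector store in a concrete retriever.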
Sources
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
🤖
Hello,
From the error message you provided, it seems like you’re trying to initialize the MultiRetrievalQAChain class directly with the retriever_infos parameter. However, the MultiRetrievalQAChain class doesn’t have a retriever_infos parameter in its constructor. Instead, it has a class method called from_retrievers that takes in a list of retrievers and an instance of BaseLanguageModel.
Here’s how you can use it:
This will create an instance of MultiRetrievalQAChain that uses the provided retrievers. The LLMRouterChain will decide which retriever to use based on the input. If the router doesn’t map the input to one of the retrievers, a ConversationChain with ChatOpenAI as the language model will be used as the default chain.
As for the ValueError you’re getting when running the chain, it’s because the run method of the MultiRetrievalQAChain class only supports one positional argument. You’re passing in two arguments: user_input and memory. If memory is a keyword argument, you should pass it as such:
I hope this helps! Let me know if you have any other questions.
🤖
Hello,
From the error message you provided, it seems like you’re trying to initialize the MultiRetrievalQAChain class directly with the retriever_infos parameter. However, the MultiRetrievalQAChain class doesn’t have a retriever_infos parameter in its constructor. Instead, it has router_chain, destination_chains, and default_chain parameters, which is why you’re getting a ValidationError from Pydantic.
To correctly initialize the MultiRetrievalQAChain class with a custom retriever, you should use the from_retrievers class method. This method requires a BaseLanguageModel instance, a list of dictionaries containing retriever information, and optionally a default retriever, default prompt, and default chain.
Each dictionary in the retriever_infos list should contain the following keys:
- name: The name of the retriever.
- description: A description of the retriever.
- retriever: An instance of the retriever.
- prompt (optional): A PromptTemplate instance for the retriever.
Here is an example of how to initialize the MultiRetrievalQAChain:
This will create a MultiRetrievalQAChain instance that uses the provided language model and retrievers. The chain will route inputs to the appropriate retriever based on the output of the language model.
I hope this helps! If you have any other questions, feel free to ask.
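The retriever_infos structure described in this reply can be built from a dictionary of vector stores. The helper below is only a sketch: build_retriever_infos and FakeStore are hypothetical names, and the only thing assumed of each store is an as_retriever() method, which LangChain vector stores such as FAISS provide.

```python
from typing import Any, Dict, List


def build_retriever_infos(
    stores: Dict[str, Any], descriptions: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Build one info dict per store with name, description, retriever keys."""
    infos = []
    for name, store in stores.items():
        infos.append(
            {
                "name": name,
                # The router reads the description to pick a destination.
                "description": descriptions.get(name, f"Documents from {name}"),
                "retriever": store.as_retriever(),
            }
        )
    return infos


class FakeStore:
    """Hypothetical stand-in so the sketch runs without LangChain."""

    def as_retriever(self) -> str:
        return "retriever-object"


retriever_infos = build_retriever_infos(
    {"category1": FakeStore(), "category2": FakeStore()},
    {"category1": "Good for questions about category 1"},
)

# With real FAISS stores, the list would then be passed along as:
# chain = MultiRetrievalQAChain.from_retrievers(llm, retriever_infos)
```

The optional prompt key is left out here. Specific descriptions matter because the router chooses a retriever from them.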
🤖
Hello,
From the error message you provided, it seems like you’re trying to initialize the MultiRetrievalQAChain class directly with the retriever_infos parameter. However, the MultiRetrievalQAChain class doesn’t have a retriever_infos parameter in its constructor. Instead, it has router_chain, destination_chains, and default_chain parameters, which is why you’re getting a ValidationError from Pydantic.
To correctly initialize the MultiRetrievalQAChain class with a custom retriever, you should use the from_retrievers class method. This method requires a BaseLanguageModel instance, a list of dictionaries containing retriever information, and optionally a default retriever, default prompt, and default chain.
Here is an example of how to initialize the MultiRetrievalQAChain:
This will create a MultiRetrievalQAChain instance that uses the provided language model and retrievers. The chain will route inputs to the appropriate retriever based on the output of the language model.
You can find more information about the MultiRetrievalQAChain class in the LangChain documentation and the source code.
I hope this helps! Let me know if you have any other questions.
🤖
Hello,
The error message you’re seeing is due to the Pydantic model validation failing for the CustomRetriever class. Pydantic is a data validation library that LangChain uses to ensure that the data being passed into classes and functions matches the expected types and structures.
In your CustomRetriever class, you’ve defined retriever_instances as a class variable, but it seems like Pydantic is expecting it to be an instance variable. This is likely because Pydantic validates instance variables, not class variables.
To fix this, you should move the retriever_instances variable into the __init__ method of your CustomRetriever class, like so:
This should resolve the ValidationError you’re seeing.
However, I noticed that you’re trying to use multiple FAISS retrievers with the ConversationalRetrievalChain. Based on the context provided, it seems like the ConversationalRetrievalChain class doesn’t directly support multiple retrievers. Instead, you might want to use the MultiRetrievalQAChain class, which is designed to handle multiple retrievers.
The MultiRetrievalQAChain class uses a retriever_infos argument, which is a list of dictionaries. Each dictionary should contain information about a retriever, including the retriever instance itself. Here’s an example of how you might structure this argument:
In this example, retriever1 and retriever2 would be your FAISS retriever instances, and prompt1 and prompt2 would be the corresponding prompts for each retriever.
I hope this helps! Let me know if you have any other questions.
🤖
Hello,
The error you’re encountering is indeed due to the fact that the BaseRetriever class in LangChain is an abstract base class, and it requires any subclasses to implement the _get_relevant_documents method.
To resolve this, you need to create a custom retriever that is a subclass of BaseRetriever and implements the _get_relevant_documents method. Here is an example of how you can create a custom retriever:
In this code, the CustomRetriever class inherits from BaseRetriever and overrides the _get_relevant_documents method. This method uses your existing retriever to get the documents and then sorts them by the “source” attribute in the metadata.
You can then use this custom retriever in your RetrievalQA chain like this:
Please note that the get_relevant_documents and aget_relevant_documents methods in the BaseRetriever class are now deprecated and the _get_relevant_documents and _aget_relevant_documents methods should be implemented instead. These methods also take an additional run_manager argument.
For more details, you can refer to these similar solved issues in the LangChain repository:
I hope this helps! If you have any other questions, feel free to ask.
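The wrapping retriever described in this reply, one that delegates to an existing retriever and sorts the results by the "source" metadata key, can be sketched roughly as follows. Doc, InnerRetriever and SortedRetriever are hypothetical stand-ins so the example runs without LangChain. A real implementation would subclass BaseRetriever and receive run_manager as a keyword argument.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Stand-in for a document with text and metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class InnerRetriever:
    """Hypothetical existing retriever returning documents in arbitrary order."""

    def get_relevant_documents(self, query: str) -> List[Doc]:
        return [
            Doc("second", {"source": "b.txt"}),
            Doc("first", {"source": "a.txt"}),
            Doc("no source"),
        ]


class SortedRetriever:
    """Delegates to an inner retriever, then orders results by source."""

    def __init__(self, inner: InnerRetriever) -> None:
        self.inner = inner

    def _get_relevant_documents(
        self, query: str, *, run_manager: Optional[Any] = None
    ) -> List[Doc]:
        docs = self.inner.get_relevant_documents(query)
        # Documents without a "source" key sort first (empty string).
        return sorted(docs, key=lambda d: d.metadata.get("source", ""))


sorted_docs = SortedRetriever(InnerRetriever())._get_relevant_documents("any query")
sources = [d.metadata.get("source") for d in sorted_docs]
```

Using metadata.get("source", "") rather than indexing directly keeps the sort from raising a KeyError when a document has no source recorded.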
Just do response = chain.run(user_input). Internally, the MultiRetrievalQAChain uses a ConversationChain and maintains memory of the conversation.
On Thu, Oct 5, 2023 at 1:25 PM murasz @.***> wrote: