langchain: Document Comparison toolkit is not working

System Info

First, thank you so much for your work on LangChain; it’s very good.

I am trying to compare two documents following the LangChain guide: https://python.langchain.com/docs/integrations/toolkits/document_comparison_toolkit

I used exactly the same code. I have one class that serves as the args_schema when creating the tools:

class DocumentInput(BaseModel):
    question: str = Field()

I created the tools and the agent:

tools.append(
    Tool(
        args_schema=DocumentInput,
        name=file_name,
        description=f"useful when you want to answer questions about {file_name}",
        func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever),
    )
)
agent = initialize_agent(
    agent=AgentType.OPENAI_FUNCTIONS,
    tools=tools,
    llm=llm,
    verbose=True,
)

And here I am getting the error:

“1 validation error for Tool\nargs_schema\n subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)”,

I then changed the args_schema class to:

from abc import ABC
from langchain.tools import BaseTool
from pydantic import Field

class DocumentInput(BaseTool, ABC):
    question: str = Field()

Now the error I am getting is:

(“Value not declarable with JSON Schema, field: name=‘_callbacks_List[langchain.callbacks.base.BaseCallbackHandler]’ type=BaseCallbackHandler required=True”,)

I only want to compare the content of two documents. Do you have a working example that compares two files? Maybe I am creating the tools incorrectly.

Who can help?

@yajunDai

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

I followed the guide at https://python.langchain.com/docs/integrations/toolkits/document_comparison_toolkit exactly to compare two documents, and I am getting this error:

“1 validation error for Tool\nargs_schema\n subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)”,

The args_schema class is :

class DocumentInput(BaseModel):
    question: str = Field()

I am trying to create the tools:

tools.append(
    Tool(
        args_schema=DocumentInput,
        name=file_name,
        description=f"useful when you want to answer questions about {file_name}",
        func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever),
    )
)

and here I am getting the error: “1 validation error for Tool\nargs_schema\n subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)”,

After that I do the following, but execution never reaches initialize_agent:

agent = initialize_agent(
    agent=AgentType.OPENAI_FUNCTIONS,
    tools=tools,
    llm=llm,
    verbose=True,
)

It’s exactly the guide for document comparison in LangChain.

Expected behavior

I expect to be able to compare the content of two documents without errors.

About this issue

  • State: open
  • Created 10 months ago
  • Comments: 15

Most upvoted comments

🤖

Hello,

Thank you for your detailed explanation of the issue you’re facing. It seems like you’re having trouble with the args_schema parameter in the Tool class.

The args_schema parameter is expected to be a Pydantic model class that is used to validate and parse the tool’s input arguments. In your case, you’ve defined DocumentInput as a subclass of BaseModel which is correct.

class DocumentInput(BaseModel):
    question: str = Field()

However, the error message you’re seeing suggests that the Tool class is not recognizing DocumentInput as a subclass of BaseModel. This could be due to a variety of reasons, such as an import error or a naming conflict.

As for the second error you’re seeing, it seems like there’s an issue with the BaseCallbackHandler class. Without more context, it’s hard to say what the exact issue is. However, it’s possible that there’s a problem with the way the BaseCallbackHandler class is being serialized to JSON.

Unfortunately, without more information, it’s hard to provide a more specific solution. Could you please provide more details about your environment? Specifically, it would be helpful to know:

  • The version of LangChain you’re using
  • The version of Python you’re using
  • Any other relevant packages or libraries you’re using

This information will help us better understand the issue and provide a more accurate solution.
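For anyone who wants to collect those details quickly, here is a minimal sketch that uses only the standard library (`importlib.metadata`, Python 3.8+). The helper names `installed_version`, `major_version`, and `environment_report` are made up for this example and are not part of LangChain or Pydantic; the package list is an assumption based on the packages mentioned in this thread.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Return the leading integer of a version string such as '2.4.2'."""
    return int(version.split(".")[0])


def environment_report(packages=("langchain", "pydantic", "pydantic-core")):
    """Collect the Python version and the versions of the given packages."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting the printed lines into a reply gives the same information as the list above. `major_version` is only a convenience for checking whether the installed Pydantic is a 1.x or 2.x release.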

Thank you for your patience and cooperation.


Encountering the same issue here. The examples in the documentation don’t work. If you start from a clean virtualenv, install langchain, and then run code from the documentation, it fails:

https://python.langchain.com/docs/modules/agents/tools/custom_tools

class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")


@tool("search", return_direct=True, args_schema=SearchInput)
def search_api(query: str) -> str:
    """Searches the API for the query."""
    return "Results"

throws the error:

pydantic.v1.error_wrappers.ValidationError: 1 validation error for StructuredTool
args_schema
  subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)

Reverting Pydantic back to a 1.x version as mentioned above fixes it:

pip install pydantic==1.10.13

I’m not sure if there are any downsides, but another solution for now might be to keep the newer Pydantic installed and import the v1 BaseModel instead:

from pydantic.v1 import BaseModel, Field                                                                                
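For anyone wondering why switching the import helps: the traceback above comes from pydantic.v1, which suggests the tool classes validate args_schema against the 1.x-style BaseModel. A schema built on the 2.x BaseModel would then fail the subclass check even though the class name is the same. Below is a rough, stdlib-only sketch of that mechanism. `V1BaseModel`, `V2BaseModel`, `check_args_schema`, and `is_accepted` are hypothetical stand-ins written for this illustration, not real Pydantic or LangChain names.

```python
# Stand-in classes for illustration only; these are NOT the real pydantic classes.
class V1BaseModel:
    """Plays the role of pydantic.v1.BaseModel."""


class V2BaseModel:
    """Plays the role of pydantic.BaseModel under Pydantic 2.x."""


class DocumentInputV2(V2BaseModel):
    """Schema declared with `from pydantic import BaseModel` on Pydantic 2.x."""


class DocumentInputV1(V1BaseModel):
    """Schema declared with `from pydantic.v1 import BaseModel`."""


def check_args_schema(schema, expected=V1BaseModel):
    """Hypothetical validator: reject anything that is not a subclass of `expected`."""
    if not (isinstance(schema, type) and issubclass(schema, expected)):
        raise TypeError(f"subclass of {expected.__name__} expected")
    return schema


def is_accepted(schema):
    """Return True if the hypothetical validator accepts `schema`."""
    try:
        check_args_schema(schema)
    except TypeError:
        return False
    return True


print(is_accepted(DocumentInputV1))  # True
print(is_accepted(DocumentInputV2))  # False
```

The two schema classes look identical to a reader, but only the one derived from the base class the validator knows about passes. That would explain why both fixes reported in this thread work: either the installed Pydantic is 1.x, or the schema is explicitly built on the v1 base class.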

I hit the same problem and solved it by running !pip install pydantic==1.10.13.

By the way, my previous pydantic version was 2.3.0, which did not work.

Same issue for me: pydantic.v1.error_wrappers.ValidationError: 1 validation error for Tool args_schema subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)

I also had the problem until today. For me the problem was the new pydantic==2.4.2. I downgraded to pydantic==1.10.12 and it works for me with langchain==0.0.304.

Same issue for me with Python 3.10, pydantic 2.3.0, pydantic-core 2.6.3, and langchain 0.0.281.