langchain: SelfQueryRetriever gives error for some queries

System Info

LangChain version: 0.0.245
Model: vicuna-13b-v1.5

Who can help?

No response

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

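# Assumes persist_directory, embeddings, CHROMA_SETTINGS, llm, PROMPT and query are defined elsewhere.
from langchain.chains import RetrievalQA
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.vectorstores import Chroma

db = Chroma(persist_directory=persist_directory, embedding_function=embeddings, client_settings=CHROMA_SETTINGS)

# Metadata fields the LLM may filter on when it builds the structured query.
metadata_field_info = [
    AttributeInfo(
        name="lesson",
        description="Lesson Number of Book",
        type="integer",
    )
]

document_content_description = "English Books"
retriever = SelfQueryRetriever.from_llm(
    llm, db, document_content_description, metadata_field_info, verbose=True
)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT},
)

res = qa(query)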

[Image: one of the documents]

 File "/home/roger/Documents/GitHub/RGRgithub/roger/testllm/main.py", line 122, in qanda
    res = qa(query)
          ^^^^^^^^^
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/chains/base.py", line 258, in __call__
    raise e
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/chains/base.py", line 252, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/chains/retrieval_qa/base.py", line 130, in _call
    docs = self._get_docs(question, run_manager=_run_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/chains/retrieval_qa/base.py", line 210, in _get_docs
    return self.retriever.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/schema/retriever.py", line 193, in get_relevant_documents
    raise e
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/schema/retriever.py", line 186, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/retrievers/self_query/base.py", line 100, in _get_relevant_documents
    self.llm_chain.predict_and_parse(
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/chains/llm.py", line 282, in predict_and_parse
    return self.prompt.output_parser.parse(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Documents/GitHub/RGRgithub/roger/testgenv/lib/python3.11/site-packages/langchain/chains/query_constructor/base.py", line 52, in parse
    raise OutputParserException(
langchain.schema.output_parser.OutputParserException: Parsing text
```json
{
    "query": "phoneme",
    "filter": "eq(lesson, 1)"
}
```
 raised following error:
Unexpected token Token('COMMA', ',') at line 1, column 10.
Expected one of: 
	* LPAR
Previous tokens: [Token('CNAME', 'lesson')]

Expected behavior

  • The error occurs when using SelfQueryRetriever with RetrievalQA.
  • It appears only for some queries.
  • I traced it to langchain/chains/llm.py, line 282, in predict_and_parse:
result = self.predict(callbacks=callbacks, **kwargs)
if self.prompt.output_parser is not None:
    return self.prompt.output_parser.parse(result)
else:
    return result
  • The error occurs when result is the following (Q: what is a phoneme): '```json\n{\n "query": "phoneme",\n "filter": "eq(lesson, 1)"\n}\n```'

  • It does not occur when result is the following (Q: What is a phoneme): '```json\n{\n "query": "phoneme",\n "filter": "eq(\\"lesson\\", 1)"\n}\n```'

  • I can’t change the version right now.

About this issue

  • State: open
  • Created 10 months ago
  • Comments: 17

Most upvoted comments

🤖

Hello,

Thank you for the detailed report. The failure happens when the output parser, called from the predict_and_parse method in llm.py, parses the text returned by the model.

The error message says the parser expected a left parenthesis ‘(’ but found a comma ‘,’. In the filter grammar a bare identifier is read as a function name, so in 'eq(lesson, 1)' the parser treats ‘lesson’ as the start of a nested function call and fails when the next token is a comma instead of ‘(’.
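
To make that token sequence concrete, here is a minimal sketch of a parser for a simplified version of the filter language. It is a stand-in written for illustration, not LangChain's actual Lark grammar: the names tokenize and parse_filter are hypothetical, and it only knows function calls, numbers and quoted strings (no booleans, lists, or escaped quotes).

```python
import re

# Simplified stand-in for the filter grammar (an assumption, not LangChain's code):
# a program is one function call; an argument is a nested call or a literal.
TOKEN_RE = re.compile(r"""
      (?P<CNAME>[A-Za-z_]\w*)
    | (?P<NUMBER>-?\d+(?:\.\d+)?)
    | (?P<STRING>"[^"]*"|'[^']*')
    | (?P<LPAR>\()
    | (?P<RPAR>\))
    | (?P<COMMA>,)
""", re.VERBOSE)


def tokenize(text):
    """Turn the filter text into (kind, value) pairs, ending with a $END marker."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"Unexpected character {text[pos]!r} at column {pos + 1}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    tokens.append(("$END", ""))
    return tokens


def _parse_expr(tokens, i):
    kind, value = tokens[i]
    if kind == "CNAME":          # a bare identifier can only start a function call
        return _parse_call(tokens, i)
    if kind == "NUMBER":
        return (float(value) if "." in value else int(value)), i + 1
    if kind == "STRING":
        return value[1:-1], i + 1
    raise ValueError(f"Unexpected token {kind}; expected a value or a function call")


def _parse_call(tokens, i):
    kind, name = tokens[i]
    if kind != "CNAME":
        raise ValueError(f"Unexpected token {kind}; expected a function name")
    if tokens[i + 1][0] != "LPAR":
        raise ValueError(
            f"Unexpected token {tokens[i + 1][0]} after CNAME {name!r}; expected LPAR"
        )
    i += 2
    args = []
    if tokens[i][0] != "RPAR":
        while True:
            arg, i = _parse_expr(tokens, i)
            args.append(arg)
            if tokens[i][0] != "COMMA":
                break
            i += 1
    if tokens[i][0] != "RPAR":
        raise ValueError(f"Unexpected token {tokens[i][0]}; expected RPAR")
    return (name, args), i + 1


def parse_filter(text):
    """Parse a whole filter expression; anything after the first call is an error."""
    tokens = tokenize(text)
    tree, i = _parse_call(tokens, 0)
    if tokens[i][0] != "$END":
        raise ValueError(f"Unexpected token {tokens[i][0]} {tokens[i][1]!r}; expected $END")
    return tree
```

With this sketch, parse_filter('eq("lesson", 1)') returns ('eq', ['lesson', 1]), while parse_filter('eq(lesson, 1)') raises "Unexpected token COMMA after CNAME 'lesson'; expected LPAR", which mirrors the reported error.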

The difference between the two results is that in the second one ‘lesson’ is wrapped in escaped double quotes, so the parser reads it as a string argument to ‘eq’ rather than as a function name. This is why the second result does not raise an error.
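
If upgrading is not an option, one possible workaround is to normalize the model output before it reaches the parser. The sketch below is an assumption-laden illustration rather than a LangChain feature: quote_bare_attributes and fix_llm_output are hypothetical helpers, attribute_names is the list of names you declared in metadata_field_info, and where to hook this in depends on the installed version (not shown here).

```python
import json
import re

# Quoted strings are matched first so that text inside them is never rewritten.
_STRING = r'"(?:[^"\\]|\\.)*"|\'[^\']*\''


def quote_bare_attributes(filter_text, attribute_names):
    """Wrap known attribute names in double quotes when used as bare arguments."""
    if not attribute_names:
        return filter_text
    names = "|".join(
        re.escape(n) for n in sorted(attribute_names, key=len, reverse=True)
    )
    # A bare argument sits directly after "(" or "," and before "," or ")".
    bare = rf"(?<=[(,])(\s*)({names})(\s*)(?=[,)])"
    pattern = re.compile(_STRING + "|" + bare)

    def repl(m):
        if m.group(2) is None:      # matched a quoted string: leave it untouched
            return m.group(0)
        return f'{m.group(1)}"{m.group(2)}"{m.group(3)}'

    return pattern.sub(repl, filter_text)


def fix_llm_output(raw, attribute_names):
    """Rewrite the "filter" field inside the model's (optionally fenced) JSON reply."""
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return raw
    payload = json.loads(match.group(0))
    if isinstance(payload, dict) and isinstance(payload.get("filter"), str):
        payload["filter"] = quote_bare_attributes(payload["filter"], attribute_names)
    return raw[:match.start()] + json.dumps(payload, indent=4) + raw[match.end():]
```

For example, quote_bare_attributes('eq(lesson, 1)', ['lesson']) returns 'eq("lesson", 1)', and a filter that is already quoted passes through unchanged. Because json.dumps re-escapes the quotes, fix_llm_output can be applied to the raw fenced reply directly.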

As for why this is happening, it’s hard to say without more information. It could be a bug in the predict_and_parse method or in the OutputParser class, or it could be an issue with the specific language model you’re using.

One thing you could try is to update your LangChain version to the latest one (v0.0.266) and see if the issue persists. There might have been fixes or improvements made in the newer versions that could resolve your issue.

If updating LangChain does not solve the problem, it would be helpful if you could provide more information about the specific queries that are causing the error, as well as the exact version of the language model you’re using. This would help in identifying the root cause of the issue.

I hope this helps. Please let me know if you have any other questions or if there’s anything else I can do to assist you.

Using v0.0.352, I get the same error. I am using SelfQueryRetriever with ChromaTranslator.

langchain_core.exceptions.OutputParserException: Parsing text

{
    "query": "chicken",
    "filter": "eq(\"category\", \"food\") and contain(content, \"chicken\")"
}

raised following error:
Unexpected token Token('CNAME', 'and') at line 1, column 24.
Expected one of:
	* $END
Previous tokens: [Token('RPAR', ')')]
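
In this second report the parser expected the input to end after the first function call, because the filter language writes logical operators as function calls, such as and(a, b), rather than as infix keywords. A rough sketch of a pre-processing step that rewrites infix and/or into that form is below. The helper names are hypothetical, this is not part of LangChain, and it does not handle parenthesized groups like (a and b) or c. The bare content argument in that filter would still need quoting separately.

```python
import re


def split_top_level(text, keyword):
    """Split on a bare keyword (e.g. 'and') outside parentheses and quotes."""
    pattern = re.compile(rf"\s+{keyword}\s+", re.IGNORECASE)
    parts, depth, quote, start, i = [], 0, None, 0, 0
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == "\\":
                i += 1                  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and ch.isspace():
            m = pattern.match(text, i)
            if m:
                parts.append(text[start:i])
                start = i = m.end()
                continue
        i += 1
    parts.append(text[start:])
    return [p.strip() for p in parts]


def infix_to_prefix(text):
    """Rewrite 'a and b or c' as 'or(and(a, b), c)'; single calls pass through."""
    for op in ("or", "and"):            # 'or' binds more loosely, so split on it first
        parts = split_top_level(text, op)
        if len(parts) > 1:
            return f"{op}({', '.join(infix_to_prefix(p) for p in parts)})"
    return text.strip()
```

Applied to the filter above, infix_to_prefix('eq("category", "food") and contain(content, "chicken")') returns 'and(eq("category", "food"), contain(content, "chicken"))'.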