langroid: Question: 'history is longer than the max chat context' error

Hi, I'm trying to build a simple RAG script to load a PDF file (~8 pages, so not very large, but maybe I'm wrong). At the first question I ask, I get the error: The message history is longer than the max chat context length allowed, and we have run out of messages to drop.

Code:

import langroid as lr
from langroid.agent.special.doc_chat_agent import DocChatAgentConfig

llm = lr.language_models.OpenAIGPTConfig(
    api_base="http://localhost:8000/v1",
    chat_model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    completion_model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    use_chat_for_completion=False,
    max_output_tokens=4096,
    temperature=0.2,
)
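
Note that max_output_tokens is reserved out of the model's context window, so 4096 output tokens leave only half of an 8192-token window for the system message, retrieved passages, and chat history. A minimal sketch of a more conservative config, assuming the server exposes an 8192-token window (chat_context_length is the OpenAIGPTConfig field that declares the window size):

# Sketch (not from the original issue): reserve less of the window for
# output and declare the context window size explicitly.
llm = lr.language_models.OpenAIGPTConfig(
    api_base="http://localhost:8000/v1",
    chat_model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    chat_context_length=8192,  # match what the server actually serves
    max_output_tokens=512,     # leaves ~7.5k tokens for prompt + history
    temperature=0.2,
)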

config = DocChatAgentConfig(
    default_paths=[],
    max_context_tokens=8192,
    conversation_mode=True,
    llm=llm,
    relevance_extractor_config=lr.agent.special.RelevanceExtractorAgentConfig(
        llm=llm
    ),
    vecdb=lr.vector_store.QdrantDBConfig(
        collection_name="test1",
        replace_collection=True,
        embedding=hf_embed_config,  # defined elsewhere; see the sketch below
    ),
    doc_paths=[
        "./test1/2312.17238.pdf"
    ],
)
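
hf_embed_config is not defined in the snippet above; a minimal sketch of what it might look like, assuming a local sentence-transformers embedding model (the model name here is illustrative):

# Hypothetical definition of hf_embed_config (not shown in the original issue).
from langroid.embedding_models.models import SentenceTransformerEmbeddingsConfig

hf_embed_config = SentenceTransformerEmbeddingsConfig(
    model_type="sentence-transformer",
    model_name="BAAI/bge-large-en-v1.5",  # illustrative choice of HF model
)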

I tried changing some variables (max_context_tokens, max_output_tokens, …) but it had no effect, even with max_context_tokens at 32000.
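
A rough illustration (simplified, not langroid's actual implementation) of the budget behind the error message: the tokens available for history are roughly the context length minus the tokens reserved for output, so raising max_output_tokens shrinks the history budget instead of growing it.

# Simplified sketch of the context budget the error message refers to.
chat_context_length = 8192  # total window the model/server can hold
max_output_tokens = 4096    # reserved for the model's reply
history_budget = chat_context_length - max_output_tokens
print(history_budget)  # 4096 tokens left for system msg + retrieved chunks + history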

Did I forget something, or am I doing something wrong in the doc loading? Or is my PDF too large?

Thanks for your work 👍

About this issue

  • State: open
  • Created 6 months ago
  • Comments: 18 (10 by maintainers)

Most upvoted comments

Can you check with other documents and/or a wider variety of questions and see if you're still not getting a response? You can also try the bare chat mode (i.e. not document Q/A), as suggested in the comments in the script, to make sure your local LLM setup is working.
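
A minimal sketch of that bare-chat-mode sanity check, reusing the llm config from the question (no retrieval involved):

# Sanity-check the local LLM setup without any RAG machinery.
import langroid as lr

agent = lr.ChatAgent(lr.ChatAgentConfig(llm=llm))
task = lr.Task(agent)
task.run()  # interactive chat loop; if this fails, the problem is the LLM setup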

And finally, if you can put together a reproducible example (or give me a specific doc and a specific question), I can see whether I can reproduce this issue.

OK, I've set up a working example using ollama to run your RAG example with mistral:7b-instruct-v0.2-q4_K_M; see this example script in the langroid-examples repo: https://github.com/langroid/langroid-examples/blob/main/examples/docqa/rag-local-simple.py
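
For reference, pointing langroid at an ollama-served model is a small change to the LLM config; a sketch, assuming langroid's "ollama/" model-name convention and that ollama is running locally:

# Sketch: langroid accepts "ollama/<model>" as the chat_model name.
llm = lr.language_models.OpenAIGPTConfig(
    chat_model="ollama/mistral:7b-instruct-v0.2-q4_K_M",
    chat_context_length=8192,  # set to match how the model is served
    max_output_tokens=256,
)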

This works for me on an M1 Mac. Here’s a sample run:

[screenshot of a sample run]