langroid: Question: 'history is longer than the max chat context' error
Hi,
I'm trying to build a simple RAG script that loads a PDF file (~8 pages, so not very large, but maybe I'm wrong about that). On the first question I ask, I get this error:

The message history is longer than the max chat context length allowed, and we have run out of messages to drop.
Code:
```python
import langroid as lr
from langroid.agent.special.doc_chat_agent import DocChatAgentConfig

llm = lr.language_models.OpenAIGPTConfig(
    api_base="http://localhost:8000/v1",  # local OpenAI-compatible server
    chat_model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    completion_model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    use_chat_for_completion=False,
    max_output_tokens=4096,
    temperature=0.2,
)

config = DocChatAgentConfig(
    default_paths=[],
    max_context_tokens=8192,
    conversation_mode=True,
    llm=llm,
    relevance_extractor_config=lr.agent.special.RelevanceExtractorAgentConfig(
        llm=llm,
    ),
    vecdb=lr.vector_store.QdrantDBConfig(
        collection_name="test1",
        replace_collection=True,
        embedding=hf_embed_config,  # embedding config defined elsewhere (sketch below)
    ),
    doc_paths=["./test1/2312.17238.pdf"],
)
```
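(`hf_embed_config` isn't shown in the snippet; a plausible definition, assuming langroid's sentence-transformer embeddings config — the model name here is purely illustrative, not taken from the issue:)

```python
from langroid.embedding_models.models import SentenceTransformerEmbeddingsConfig

# Illustrative only: the actual hf_embed_config is not shown in the issue.
hf_embed_config = SentenceTransformerEmbeddingsConfig(
    model_type="sentence-transformer",
    model_name="BAAI/bge-large-en-v1.5",  # hypothetical embedding model choice
)
```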
I tried changing some variables (max_context_tokens, max_output_tokens, …) without any effect, even with max_context_tokens set to 32000.
Did I forget something, or am I doing something wrong in the doc loading? Or is my PDF too large?
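For what it's worth, here's the back-of-envelope arithmetic as I understand it (assuming the output reservation is carved out of the context window — my reading, not confirmed langroid internals):

```python
# Rough token budget: if max_output_tokens is reserved out of the context
# window, only the remainder is available for the prompt side.
context_window = 8192        # max_context_tokens in the config above
reserved_for_output = 4096   # max_output_tokens in the config above
prompt_budget = context_window - reserved_for_output
print(prompt_budget)  # 4096 tokens for system prompt + history + retrieved chunks
```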
Thanks for your work 👍
About this issue
- State: open
- Created 6 months ago
- Comments: 18 (10 by maintainers)
Can you check with other documents and/or a wider variety of questions and see if you're still not getting a response? You can also try the bare chat mode (i.e. not document Q/A), as suggested in the comments in the script, to ensure that your local LLM setup is working; a minimal version of such a check is sketched below.
And finally, if you can make a reproducible example (or give me a specific doc and a specific question), I can see if I can reproduce this issue.
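A minimal sketch of the bare-chat sanity check (no retrieval involved; it reuses the llm config from the snippet above):

```python
import langroid as lr

# Same locally served model as in the original snippet
llm = lr.language_models.OpenAIGPTConfig(
    api_base="http://localhost:8000/v1",
    chat_model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
)

# Plain ChatAgent: no vector store, no document Q/A
agent = lr.ChatAgent(lr.ChatAgentConfig(llm=llm))
response = agent.llm_response("Say hello in one sentence.")
print(response.content)  # a sensible reply means the LLM setup itself is fine
```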
OK, I've set up a working example using ollama to run your RAG example with mistral:7b-instruct-v0.2-q4_K_M; see this example script in the langroid-examples repo: https://github.com/langroid/langroid-examples/blob/main/examples/docqa/rag-local-simple.py
This works for me on an M1 Mac. The gist of the ollama-based LLM config is sketched below, followed by a sample run.
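(A minimal sketch: the `ollama/` chat_model prefix is langroid's convention for addressing local ollama models, but the exact field values here are assumptions — see the linked script for the authoritative version.)

```python
import langroid as lr

# LLM config pointing at a model served locally by ollama
llm = lr.language_models.OpenAIGPTConfig(
    chat_model="ollama/mistral:7b-instruct-v0.2-q4_K_M",
    chat_context_length=8192,  # assumed cap; match your model's context window
    max_output_tokens=512,     # keep modest so history + retrieved chunks fit
)
# Drop this llm into the DocChatAgentConfig from the original snippet.
```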