gpt4all: ERROR: The prompt size exceeds the context window size and cannot be processed.
Issue you’d like to raise.
I’m getting the following error:
ERROR: The prompt size exceeds the context window size and cannot be processed. GPT-J ERROR: The prompt is 9884 tokens and the context window is 2048!
You can reproduce with the following code:

```python
from gpt4all import GPT4All

gpt = GPT4All("ggml-gpt4all-j-v1.3-groovy", "./models/")
messages = []
text = "HERE A LONG BLOCK OF CONTENT (MORE THAN 2K CHARACTERS)"
messages.append({"role": "user", "content": "summarize: " + text})
gpt.chat_completion(messages=messages, streaming=True)
```
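A common workaround (not from this thread) is to split the input into chunks that fit inside the window and summarize each chunk separately, then summarize the partial summaries. Below is a minimal sketch; the `chunk_text` helper and the ~4-characters-per-token ratio are assumptions for illustration, not the model's real tokenizer:

```python
# Workaround sketch: split a long text into chunks that should fit a
# 2048-token window, leaving headroom for the instruction and the reply.
# The ~4 characters-per-token ratio is a heuristic, not the real tokenizer.

def chunk_text(text, max_tokens=1500, chars_per_token=4):
    """Split text into pieces of at most max_tokens (estimated) tokens."""
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

long_text = "word " * 5000  # stand-in for a long document
chunks = chunk_text(long_text)

# Each chunk can then go through its own chat_completion call, and the
# partial summaries can be summarized again (map-reduce style), e.g.:
# for chunk in chunks:
#     gpt.chat_completion([{"role": "user", "content": "summarize: " + chunk}])
```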
I noticed that when the model is initialised, the `n_ctx` value is set to 2048:
```
Found model file.
gptj_model_load: loading model from '.../models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048   <<<<<<<<<<<<<<<<<< HERE
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 5401.45 MB
gptj_model_load: kv self size  =  896.00 MB
gptj_model_load: ................................... done
gptj_model_load: model size =  3609.38 MB / num tensors = 285
```
but it is not possible to change this value. I tried passing the `n_ctx` parameter to the `chat_completion` method, but it doesn't work:

```python
gpt.chat_completion(messages=messages, streaming=True, n_ctx=4096)
```

I get the same error:

ERROR: The prompt size exceeds the context window size and cannot be processed. GPT-J ERROR: The prompt is 9980 tokens and the context window is 2048!
Any idea?
Suggestion:
Is the `n_ctx` value hardcoded in the model itself, or is it something that can be specified when loading the model?
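For these legacy GGML GPT-J files, `n_ctx` appears to live in the file header's hyperparameters, which would explain why the loader prints it. The sketch below assumes the header is a 4-byte magic word followed by seven little-endian int32 hparams in the order the load log prints them; that layout is inferred, not confirmed by this thread, so it parses a synthetic header rather than a real file:

```python
import struct

# Assumed legacy GGML GPT-J header layout: 4-byte magic, then seven
# little-endian int32 hyperparameters in the order the loader prints them.
HPARAM_NAMES = ["n_vocab", "n_ctx", "n_embd", "n_head", "n_layer", "n_rot", "f16"]

def read_hparams(data: bytes) -> dict:
    """Parse the (assumed) header of a GGML GPT-J file from raw bytes."""
    magic, = struct.unpack_from("<i", data, 0)
    values = struct.unpack_from("<7i", data, 4)
    return {"magic": magic, **dict(zip(HPARAM_NAMES, values))}

# Synthetic header matching the values in the load log above.
header = struct.pack("<8i", 0x67676D6C, 50400, 2048, 4096, 16, 28, 64, 2)
hparams = read_hparams(header)
```

If the layout holds, patching the `n_ctx` bytes in the file would change what the loader reads, but the model's learned positional range would still limit usable context.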
Having a character/token limit on the prompt input is very limiting, especially when you try to provide long context to improve the output, or to build a plugin that browses the web, and so on.
Any help would be much appreciated.
Thanks!
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 13
- Comments: 20 (1 by maintainers)
Note, I'm pretty new to all this myself, so I might not arrive at the right answer. What I can see when I look at some of the code is: the main interface seems to be gpt4all-backend/llmodel_c.h and gpt4all-backend/llmodel_c.cpp, and it defines the struct `llmodel_prompt_context`, which is passed into `llmodel_prompt()`. That interface, or rather, that struct, includes the `n_ctx` field. In the logic itself, context data is first passed to an individual backend, then a prompt is run, then the values are passed back out again. See gpt4all-backend/llmodel_c.cpp. So I guess a backend could do whatever it wants with the context.
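From Python, a C struct like that can be mirrored with `ctypes`. The sketch below is an illustrative subset: only the `n_ctx` field is confirmed by the discussion here, and the other field names and types are guesses at typical sampling parameters, so they may not match the real header:

```python
import ctypes

# Illustrative ctypes mirror of llmodel_prompt_context. Only n_ctx is
# confirmed by the discussion above; the other fields are assumptions
# about typical sampling parameters and may not match the real header.
class LLModelPromptContext(ctypes.Structure):
    _fields_ = [
        ("n_past", ctypes.c_int32),     # assumed: tokens already evaluated
        ("n_ctx", ctypes.c_int32),      # context window size (confirmed field)
        ("n_predict", ctypes.c_int32),  # assumed: max tokens to generate
        ("top_k", ctypes.c_int32),      # assumed sampling parameter
        ("top_p", ctypes.c_float),      # assumed sampling parameter
        ("temp", ctypes.c_float),       # assumed sampling parameter
    ]

ctx = LLModelPromptContext(n_ctx=2048)
```

A struct like this is what would be populated on the Python side and passed into `llmodel_prompt()` through the C interface; any field layout mismatch against the real header would silently corrupt the values, which is why the exact `_fields_` order matters.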
There are preset, i.e. default, values in the individual backends. For example, for groovy, which is a GPT-J model, the values are initialised in the struct `gptj_hparams` in gpt4all-backend/gptj.cpp. The error you're seeing is produced in `GPTJ::prompt()`. Here, it looks like the `n_ctx` that arrives from the frontend is not used; instead, the value comes from the model itself: https://github.com/nomic-ai/gpt4all/blob/8204c2eb806aeab055b7a7fae4b4adc02e34ef41/gpt4all-backend/gptj.cpp#L920 As such, setting the value yourself won't really matter.

> ERROR: The prompt size exceeds the context window size and cannot be processed.
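That behavior can be illustrated with a small Python mock; all names here are invented for illustration, and this is not the real backend code:

```python
# Toy illustration of why passing n_ctx from the frontend has no effect:
# the backend reads the window size from the loaded model's hparams and
# discards whatever the caller supplied.

class FakeGPTJBackend:
    def __init__(self):
        self.model_n_ctx = 2048  # baked into the model's hparams at load time

    def prompt(self, n_tokens, requested_n_ctx):
        # Mirrors the described behavior: the caller's value is ignored.
        effective_n_ctx = self.model_n_ctx
        if n_tokens > effective_n_ctx:
            return (f"ERROR: The prompt is {n_tokens} tokens and the "
                    f"context window is {effective_n_ctx}!")
        return "ok"

backend = FakeGPTJBackend()
result = backend.prompt(n_tokens=9884, requested_n_ctx=4096)
```

Even though the caller asked for 4096, the check runs against the model's own 2048, reproducing the original error message.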
I realized this after posting yesterday. You are correct, of course, and I did post against the PrivateGPT project. I would imagine, however, that the hardcoded values were never addressed in the version of the project I am using, so I'm in a way experiencing the same behavior. If so, I may download the GPT4All project, build it, and start experimenting with values of my own, per your suggestion.