gpt4all: ERROR: The prompt size exceeds the context window size and cannot be processed.

Issue you’d like to raise.

I’m getting the following error:

ERROR: The prompt size exceeds the context window size and cannot be processed. GPT-J ERROR: The prompt is 9884 tokens and the context window is 2048!

You can reproduce with the following code:

from gpt4all import GPT4All

gpt = GPT4All("ggml-gpt4all-j-v1.3-groovy", "./models/")
messages = []

text = "HERE A LONG BLOCK OF CONTENT (MORE THAN 2K CHARACTERS)"

messages.append({"role": "user", "content": "summarize: " + text})

gpt.chat_completion(messages=messages, streaming=True)
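Since the model's window is fixed at 2048 tokens, one pre-flight workaround is to estimate the prompt's token count before calling chat_completion and truncate the input if it would overflow. The sketch below is not part of the gpt4all API; it uses the common rough heuristic of ~4 characters per token, which only approximates what the model's real tokenizer would count:

```python
# A minimal pre-flight check (illustrative sketch, not the gpt4all API):
# estimate tokens with a ~4-characters-per-token heuristic and truncate
# the text so the prompt stays under the model's 2048-token window.

def truncate_to_window(text: str, n_ctx: int = 2048, reserve: int = 256,
                       chars_per_token: int = 4) -> str:
    """Keep a rough (n_ctx - reserve)-token budget for the prompt.

    `reserve` leaves room for the chat template and the generated reply.
    The 4-chars-per-token ratio is only an approximation; the real count
    depends on the model's tokenizer.
    """
    max_chars = (n_ctx - reserve) * chars_per_token
    return text if len(text) <= max_chars else text[:max_chars]

text = "x" * 20000                # stand-in for a long document
safe = truncate_to_window(text)
print(len(safe))                  # 7168 chars, roughly 1792 tokens
```

This loses whatever falls past the cutoff, so it is only suitable when the tail of the document is expendable.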

I noticed that when the model is initialised, the n_ctx value is set to 2048:

Found model file.
gptj_model_load: loading model from '.../models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048 <<<<<<<<<<<<<<<<<< HERE
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 5401.45 MB
gptj_model_load: kv self size  =  896.00 MB
gptj_model_load: ................................... done
gptj_model_load: model size =  3609.38 MB / num tensors = 285

but it does not seem possible to change this value. I tried passing the n_ctx parameter to the chat_completion method, but it doesn’t work:

gpt.chat_completion(messages=messages, streaming=True, n_ctx=4096)

I get the same error:

ERROR: The prompt size exceeds the context window size and cannot be processed. GPT-J ERROR: The prompt is 9980 tokens and the context window is 2048!

Any ideas?

Suggestion:

Is the n_ctx value hardcoded in the model itself, or is it something that can be specified when loading the model?

Having a character/token limit on the prompt input is very limiting, especially when you try to provide long context to improve the output, or to build a plugin to browse the web, and so on.
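Until n_ctx is configurable, a common workaround for summarization specifically is a map-reduce approach: split the document into window-sized chunks, summarize each chunk, then summarize the concatenated summaries. Below is a hedged sketch of the splitting step only; the chunk size is in characters (using the rough ~4 chars/token rule), and the gpt4all call it would feed is shown as a comment rather than executed:

```python
# Map-reduce-style summarization helper (a workaround sketch, not part of
# the gpt4all API): split text into pieces that each fit well inside a
# 2048-token window, using ~4 characters per token as an approximation.

def chunk_text(text: str, max_chars: int = 6000) -> list[str]:
    """Split text into pieces no longer than max_chars, on word boundaries."""
    words, chunks, current = text.split(), [], ""
    for word in words:
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = current + " " + word if current else word
    if current:
        chunks.append(current)
    return chunks

# Each chunk would then be summarized separately, e.g.:
# for chunk in chunk_text(long_text):
#     gpt.chat_completion([{"role": "user",
#                           "content": "summarize: " + chunk}])

print(len(chunk_text("word " * 5000)))  # 5 chunks, each under 6000 chars
```

A second pass over the per-chunk summaries then produces the final summary. This trades some quality (the model never sees the whole document at once) for staying inside the fixed window.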

Any help would be much appreciated.

Thanks!

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 13
  • Comments: 20 (1 by maintainers)

Most upvoted comments

Note: I’m pretty new to all this myself, so I might not arrive at the right answer. What I can see when I look at some of the code is: the main interface seems to be gpt4all-backend/llmodel_c.h & gpt4all-backend/llmodel_c.cpp, and it defines the struct llmodel_prompt_context, which is passed into llmodel_prompt(). That interface, or rather that struct, includes the n_ctx field.

In the logic itself, context data is first passed to an individual backend, then a prompt is run, then the values are passed back out again. See gpt4all-backend/llmodel_c.cpp. So I guess a backend could do whatever it wants with the context.

There are preset, i.e. default, values in the individual backends. E.g. for groovy, which is a GPT-J model, its values are initialised in the struct gptj_hparams in gpt4all-backend/gptj.cpp.

The error you’re seeing is produced in GPTJ::prompt(). There, it looks like the n_ctx that arrives from the frontend is not used; instead, the value comes from the model itself: https://github.com/nomic-ai/gpt4all/blob/8204c2eb806aeab055b7a7fae4b4adc02e34ef41/gpt4all-backend/gptj.cpp#L920 As such, setting the value yourself won’t really matter.
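In other words, the observed behavior can be reduced to something like the following illustrative Python sketch (this is not the actual C++ code in gptj.cpp, just a model of what the error reported above implies):

```python
# Illustrative sketch of the observed behavior: the n_ctx supplied by the
# caller is never consulted; the prompt length is checked against the value
# baked into the model file's hyperparameters.

MODEL_N_CTX = 2048  # loaded from the model file's hparams at init time

def check_prompt(n_prompt_tokens: int, requested_n_ctx: int) -> str:
    effective_n_ctx = MODEL_N_CTX  # requested_n_ctx is ignored
    if n_prompt_tokens > effective_n_ctx:
        return (f"ERROR: The prompt is {n_prompt_tokens} tokens and "
                f"the context window is {effective_n_ctx}!")
    return "ok"

print(check_prompt(9884, requested_n_ctx=4096))
```

This matches the report above: asking for n_ctx=4096 at the frontend still yields the "context window is 2048" error, because the check uses the model's own value.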

ERROR: The prompt size exceeds the context window size and cannot be processed.

I realized this after posting yesterday. You are correct, of course, and I did post against the PrivateGPT project. I would imagine, however, that the hardcoded values were never addressed in the version of the project I am using, and in a way I’m experiencing the same behavior. If so, I may download the GPT4All project, build it, and start experimenting with values of my own, per your suggestion.