h2ogpt: Searching documents gets "stuck" for a moment when using an "unsupported" LLM.

I’m using ChatGLM3-6b as the LLM. It works normally in pure LLM mode. In document query mode, however, searching the documents takes a long time (I believe something is blocking h2oGPT).

Using RelSources as the Subset here, this is the stdout when I click Submit:

prompt: <|system|>
Sys</s>
<|user|>
Hello there</s>
<|assistant|>

The model 'ChatGLMForConditionalGeneration' is not supported for . Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'LlamaForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'FuyuForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'MptForCausalLM', 'MusicgenForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PersimmonForCausalLM', 'PhiForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'RwkvForCausalLM', 'Speech2Text2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'WhisperForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM'].
prompt: <|system|>
Sys</s>
<|user|>
Hello there</s>
<|assistant|>

sim_search in 0.09814214706420898
prompt: <|system|>
Sys</s>
<|user|>
Pay attention and remember the information below, which will help to answer the question or imperative after the context ends.
"""

"""
According to only the information in the document sources provided within the context above: Hello there</s>
<|assistant|>

It then gets stuck here for a moment, around 20 or 30 seconds, before the stdout continues:

prompt: <|system|>
Sys</s>
<|user|>
Hello there</s>
<|assistant|>

The model 'ChatGLMForConditionalGeneration' is not supported for . Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'LlamaForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'FuyuForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'MptForCausalLM', 'MusicgenForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PersimmonForCausalLM', 'PhiForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'RwkvForCausalLM', 'Speech2Text2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'WhisperForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM'].
Distance: min: 0.62374347448349 max: 0.5229793787002563 mean: 0.6232390850782394 median: 0.6266895830631256
query: Hello there
answer: ****ACTUAL ANSWER WAS GENERATED HERE****

Despite the “not supported” message, it still generates the expected results; it just gets stuck in the middle for some time. However, if I use zephyr-7B-beta (or another “supported” LLM), the “not supported” message does not appear and the document query is instant.
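
To narrow down where the 20 to 30 seconds go, one option is to time each stage separately, in the same spirit as the "sim_search in ..." line in the log above. The following is only a minimal sketch: `timed`, `similarity_search` and `build_prompt` are hypothetical names used for illustration, not h2oGPT functions, and the two stand-in stages would need to be replaced by the real calls being investigated.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations, keyed by stage name.
timings = {}


@contextmanager
def timed(stage):
    """Record and print how long the wrapped block takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start
        print(f"{stage} in {timings[stage]:.3f}s")


# Hypothetical stand-ins for the real stages (assumptions, not h2oGPT code).
def similarity_search(query):
    return []


def build_prompt(query, docs):
    return f"context: {docs}\nquestion: {query}"


with timed("sim_search"):
    docs = similarity_search("Hello there")

with timed("build_prompt"):
    prompt = build_prompt("Hello there", docs)
```

Whichever stage dominates `timings` is where the stall happens, which would help confirm whether the delay sits between the similarity search and the final generation.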

About this issue

State: closed
Created 6 months ago
Comments: 15

Commits related to this issue

For Issue #1276 — committed to h2oai/h2ogpt by pseudotensor 6 months ago

Most upvoted comments

With the above work-around to some odd transformers bug, it’s no longer slow for general use. Thanks for your persistence. I’m not sure what the bug is in transformers that leads to such a simple thing being so slow.

python generate.py --base_model=THUDM/chatglm3-6b --prompt_type=custom --prompt_dict="{'PreInstruct': '<|user|>', 'PreResponse': '<|observation|>', 'chat_sep': '\n', 'chat_turn_sep': '\n', 'humanstr': '<|user|>', 'botstr': '<|observation|>', 'terminate_response': ['<|user|>']}"

pseudotensor on Jan 11, 2024
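
The nested quoting in the command above is easy to get wrong when typed by hand. As a sketch, the same arguments can be assembled in Python and printed as a shell-safe command. The dictionary values are copied from the comment; the assumption here (labeled as such) is that the `--prompt_dict` value is read back as a Python literal, which is what the round-trip check below verifies on the Python side only.

```python
import ast
import shlex

# Values copied from the work-around command; "\n" is a real newline here.
prompt_dict = {
    "PreInstruct": "<|user|>",
    "PreResponse": "<|observation|>",
    "chat_sep": "\n",
    "chat_turn_sep": "\n",
    "humanstr": "<|user|>",
    "botstr": "<|observation|>",
    "terminate_response": ["<|user|>"],
}

# repr() renders the dict as a Python literal, with newlines escaped as \n.
args = [
    "python",
    "generate.py",
    "--base_model=THUDM/chatglm3-6b",
    "--prompt_type=custom",
    f"--prompt_dict={prompt_dict!r}",
]

# Round-trip check: the flag value parses back to the same dictionary.
flag_value = args[-1].split("=", 1)[1]
assert ast.literal_eval(flag_value) == prompt_dict

# Shell-quoted form that can be pasted into a terminal.
command = shlex.join(args)
print(command)
```

Passing `args` as a list to `subprocess.run` would avoid shell quoting entirely; `shlex.join` is only needed when the command is to be copied into a shell.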