langchain: AI Prefix and Human Prefix not correctly reflected in the prompt
```python
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.prompts import (ChatPromptTemplate, HumanMessagePromptTemplate,
                               MessagesPlaceholder, SystemMessagePromptTemplate)

system_template = "You are a helpful assistant."  # not defined in the original snippet

prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(template=system_template),
    MessagesPlaceholder(variable_name="history"),
    HumanMessagePromptTemplate.from_template("{input}"),
])
llm = ChatOpenAI(temperature=0.9)
memory = ConversationBufferMemory(return_messages=True, ai_prefix="SpongebobSquarePants", human_prefix="Bob")
conversation = ConversationChain(memory=memory, prompt=prompt, llm=llm, verbose=True)
```
A prompt built with `ChatPromptTemplate.from_messages` will later go through the method `get_buffer_string`, called in `to_string()` of the class `ChatPromptValue` (in `chat.py` under Prompts). That formatting does not take the new `ai_prefix` or `human_prefix` into account.

How can I change that? Thanks
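A minimal sketch of the mismatch (assuming the old `langchain.schema` import path): `get_buffer_string` does accept `human_prefix` and `ai_prefix` arguments, but `ChatPromptValue.to_string()` calls it with the defaults, so the prefixes configured on the memory never reach the rendered prompt.

```python
from langchain.schema import AIMessage, HumanMessage, get_buffer_string

history = [HumanMessage(content="Hi there"), AIMessage(content="Ahoy!")]

# What ChatPromptValue.to_string() effectively produces (default prefixes):
print(get_buffer_string(history))
# Human: Hi there
# AI: Ahoy!

# What would be produced if the memory's prefixes were forwarded:
print(get_buffer_string(history, human_prefix="Bob", ai_prefix="SpongebobSquarePants"))
# Bob: Hi there
# SpongebobSquarePants: Ahoy!
```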
About this issue
- State: closed
- Created a year ago
- Reactions: 11
- Comments: 19 (1 by maintainers)
I have found that to enable a good multi-turn conversation with Llama2-chat based models, one needs to modify the `get_buffer_string` function by overriding it with a custom one. Specifically, the list of `AIMessage`, `SystemMessage`, and `HumanMessage` messages must be converted not only by prefixing the role like `Human:` or `AI:`, but rather by wrapping it with model-specific tags. In the Llama2-chat case the template looks like the sketch below (beware of all the spaces and newlines between tags; they are important).
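For reference, the widely documented Llama 2 chat layout looks roughly like this, written here as a Python constant; the `{placeholder}` names are illustrative, not LangChain variables:

```python
# Commonly documented Llama 2 chat prompt layout; note the significant
# spaces and newlines around the tags.
LLAMA2_CHAT_TEMPLATE = (
    "<s>[INST] <<SYS>>\n"
    "{system_prompt}\n"
    "<</SYS>>\n\n"
    "{user_msg_1} [/INST] {model_answer_1} </s>"
    "<s>[INST] {user_msg_2} [/INST]"
)
```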
Several resources confirm this necessity.
What are the best approaches to enable this wrapping of history messages based on specific model templates? One possibility, as far as I understand, is to subclass the specific memory class, like `ConversationSummaryBufferMemory`, and override the `save_context` method with model-specific functions. Alternatively, a simpler option that can be extended to all subclasses of `BaseChatMemory` is to add, as a base-class member, an optional callable `get_buffer_string` through which the user transforms the list of messages in the context in whatever way they want (see the sketch below).

I second what @CarloNicolini said - it needs to be properly templated. I tried to use langchain today with a model that expects the ChatML format and found out that this is simply not possible, since it's hardcoded in various places for the Llama2 format 😩
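A minimal sketch of the "optional callable" idea under the old LangChain memory API; `TemplatedBufferMemory`, `message_formatter`, `llama2_format`, and `chatml_format` are hypothetical names, not part of LangChain:

```python
from typing import Callable, List

from langchain.memory import ConversationBufferMemory
from langchain.schema import AIMessage, BaseMessage, HumanMessage, SystemMessage

def llama2_format(messages: List[BaseMessage]) -> str:
    """Wrap history in Llama 2 tags instead of 'Human:'/'AI:' prefixes."""
    parts = []
    for m in messages:
        if isinstance(m, SystemMessage):
            parts.append(f"<<SYS>>\n{m.content}\n<</SYS>>\n\n")
        elif isinstance(m, HumanMessage):
            parts.append(f"<s>[INST] {m.content} [/INST]")
        elif isinstance(m, AIMessage):
            parts.append(f" {m.content} </s>")
    return "".join(parts)

def chatml_format(messages: List[BaseMessage]) -> str:
    """Render history in ChatML (<|im_start|>role ... <|im_end|>)."""
    roles = {SystemMessage: "system", HumanMessage: "user", AIMessage: "assistant"}
    return "".join(
        f"<|im_start|>{roles[type(m)]}\n{m.content}<|im_end|>\n" for m in messages
    )

class TemplatedBufferMemory(ConversationBufferMemory):
    """Hypothetical memory whose string rendering is a user-supplied callable."""
    message_formatter: Callable[[List[BaseMessage]], str] = llama2_format

    @property
    def buffer(self):
        if self.return_messages:
            return self.chat_memory.messages
        # Replace the hardcoded get_buffer_string call with the formatter.
        return self.message_formatter(self.chat_memory.messages)
```

Usage would then be something like `TemplatedBufferMemory(message_formatter=chatml_format)` for a ChatML model.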
https://github.com/hwchase17/langchain/issues/3234 is potentially related. This whole feature needs to be uniform across all these classes. It’s no good this way.