vocode-python: [EPD-458] Openai completions stopping
After making minimal changes to the chat.py example to tailor it for a golf-booking chatbot flow, the OpenAI completions stop consistently.
The behavior below is reproducible every time. When the conversation reaches this point, the system responds "Thank you for letting me know."
but never sends the follow-up sentence asking the user the next question.
AI: Hello, I'm Tom from the golf course. How may I help you?
Human: hey i want to book comp
Human: DEBUG:__main__:Responding to transcription
AI: Sure, I can help you with that.
AI: Are you a member of our club?
yep
Human: DEBUG:__main__:Responding to transcription
AI: Great!
AI: Could you please provide me with your member number?
12345
Human: DEBUG:__main__:Responding to transcription
AI: Thank you for providing your member number.
AI: May I have your name, please?
cam
Human: DEBUG:__main__:Responding to transcription
AI: Thank you, Cam.
AI: How many players will be participating in the competition?
2
Human: DEBUG:__main__:Responding to transcription
AI: Thank you for letting me know.
Human: DEBUG:__main__:Responding to transcription
ERROR:asyncio:Unclosed connection
client_connection: Connection<ConnectionKey(host='api.openai.com', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=None)>
The response stops there. If I then send an empty message to the system, I receive the asyncio error above and the conversation continues as normal.
EDIT: I should clarify that the completion stops at "Thank you for letting me know."
100% of the time, but the asyncio error only occurs occasionally.
About this issue
- State: closed
- Created a year ago
- Comments: 25 (6 by maintainers)
Commits related to this issue
- Don't split bot messages in transcript (Fixes #319) (#337) * Initial fix * Fix after merge * Add message_ids everywhere * Revert past commits * Add format_openai_chat_messages_from_transc... — committed to vocodedev/vocode-python by HHousen a year ago
I have identified the problem. Vocode splits the OpenAI response into sentences in order to synthesize them as quickly as possible. After each sentence is spoken, Vocode appends that utterance to the transcript associated with the ChatGPT Agent. As a result, OpenAI's response ends up in the transcript split apart by sentence. So, when the user sends another message and the transcript is reformatted and sent back to OpenAI to generate the next reply, the previous assistant message arrives as several separate one-sentence assistant messages.
For example, when recreating @cammoore54's example with `temperature=0` and the `gpt-3.5-turbo-16k-0613` model, this is what is sent to the OpenAI API when the user says "yep":

[screenshot: the sentence-split request Vocode actually sends]

This is what should be sent (what you would put into the OpenAI playground):

[screenshot: the properly merged request]

This difference is the source of the problem. If the previous chat history contains only one-sentence assistant responses, then future assistant messages will also be only one sentence.
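The screenshots from the original comment did not survive, but the two request shapes can be reconstructed from the transcript above. Here is a minimal sketch (hypothetical message contents, plain dicts in the shape the OpenAI chat API expects): `split_messages` is roughly what Vocode was sending, and merging consecutive same-role messages yields what should be sent.

```python
# Hypothetical reconstruction of the sentence-split transcript that
# Vocode sends back to the OpenAI API (one entry per spoken sentence).
split_messages = [
    {"role": "assistant", "content": "Hello, I'm Tom from the golf course."},
    {"role": "assistant", "content": "How may I help you?"},
    {"role": "user", "content": "hey i want to book comp"},
    {"role": "assistant", "content": "Sure, I can help you with that."},
    {"role": "assistant", "content": "Are you a member of our club?"},
    {"role": "user", "content": "yep"},
]


def merge_consecutive_messages(messages):
    """Merge consecutive messages from the same role into a single
    message, joining their contents with a space."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += " " + msg["content"]
        else:
            merged.append(dict(msg))
    return merged


# What should be sent instead: one message per conversational turn.
for msg in merge_consecutive_messages(split_messages):
    print(msg["role"], "->", msg["content"])
```

With history shaped like the merged list, the model sees full multi-sentence assistant turns and keeps producing them; with the split list, it imitates the one-sentence pattern.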
Good catch finding this bug! The messages should definitely not be split apart when they are sent back to the OpenAI API.
Here are the differences from the OpenAI playground (both with `temperature=0` and the `gpt-3.5-turbo-16k-0613` model):

[screenshot: formatted properly, the second sentence is generated]

[screenshot: formatted how Vocode currently does it, only one sentence is generated]
So, it seems that the OpenAI playground and the OpenAI Python library produce exactly the same response (testing with `temperature=0`). Also, setting the `stream` option or using the async vs. the sync API makes no difference. The OpenAI Python issue https://github.com/openai/openai-python/issues/555 is probably not an issue after all: depending on how the messages are formatted, the second sentence simply is not generated.

We are currently working on a fix for this! Thanks 😄!
Thanks @bjquinn.
I have tested in isolation with an async implementation using the code below, and I get the desired responses 100% of the time (the same as the playground). Therefore the problem must lie in Vocode's implementation.
@ajar98 @Kian1354 Do you have the capacity to look into this? I am happy to support, but I am still familiarising myself with the codebase.