langchainjs: Streaming causes LLM to always start answers with a rephrased version of the question

Describe the issue

When I enable streaming on the OpenAI model, it causes all answers to begin with a rephrased version of the question.

Example:

Question from user: “How far is the sun?”

Answer streamed from handleLLMNewToken: “What is the distance from Earth to the sun? I don’t know.”

Note that the answer streamed token by token through handleLLMNewToken differs from the response returned by await chain.call. The latter returns { text: " I don't know." }, which is the desired behavior. The problem is that this value is only available once the call has finished, so it can’t be streamed. As far as I know, the streaming has to happen from inside handleLLMNewToken, like this:

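// Assumed import paths for langchain ^0.0.51; adjust them to your installed version.
import { OpenAI } from "langchain/llms/openai";
import { CallbackManager } from "langchain/callbacks";

// `res` is the HTTP response object of the enclosing API route handler.
const sendData = (data: string) => {
  res.write(`data: ${data}\n\n`); // one server-sent event per call
};

const model = new OpenAI({
  openAIApiKey: process.env.OPENAI_API_KEY,
  streaming: true,
  callbackManager: CallbackManager.fromHandlers({
    // Called for every token produced by any LLM call that uses this model.
    async handleLLMNewToken(token: string) {
      console.log('handleLLMNewToken', token);
      sendData(JSON.stringify({ data: token })); // stream each token
    },
    async handleLLMStart(llm: any, prompts: string[]) {
      console.log('handleLLMStart');
    },
  }),
});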
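A small sketch of why each token is wrapped in JSON before being written as a server-sent event: tokens often begin with a space or contain newlines, and JSON encoding keeps the whole payload on a single `data:` line. The helper names `toSseFrame` and `fromSseFrame` are hypothetical and not part of langchain or Next.js; only the `data: ...\n\n` framing is taken from the snippet above.

```typescript
// Hypothetical helpers for illustration; not part of any library.

// Wrap one token in the same `data: <json>\n\n` frame that sendData writes.
function toSseFrame(token: string): string {
  // JSON.stringify escapes newlines inside the token, so the frame
  // always contains exactly one `data:` line.
  return `data: ${JSON.stringify({ data: token })}\n\n`;
}

// Client-side counterpart: recover the token from a single frame.
function fromSseFrame(frame: string): string {
  const payload = frame.replace(/^data: /, "").trimEnd();
  return (JSON.parse(payload) as { data: string }).data;
}
```

On the client, concatenating the decoded tokens in arrival order reproduces the streamed text, including leading spaces such as the one in " I don't know."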

Environment

"langchain": "^0.0.51", "next": "13.3.0"

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 12
  • Comments: 18 (2 by maintainers)

Most upvoted comments

The immediate fix for this issue is to do the following:

// construct your chain as before
const chain = ConversationalRetrievalQAChain.fromLLM(new ChatOpenAI({streaming: true, ...}), ...)
// after creating the chain, override the LLM in the inner `questionGeneratorChain`
// with a non-streaming model so that its tokens are not sent to handleLLMNewToken
chain.questionGeneratorChain.llm = new ChatOpenAI()

// use the chain

We’re working on a better solution.

I’m having this issue as well after updating to 0.0.61. It looks like handleLLMNewToken, when passed in the OpenAI LLM callbacks, is receiving the tokens of the “standalone question”, not the final result.
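
To make the cause and the workaround concrete, here is a minimal simulation that does not use langchain at all. `FakeLLM`, `runChain` and `demo` are hypothetical stand-ins, and the completions are canned strings taken from the example in the issue. It only assumes what the thread describes: the chain first asks an LLM to rephrase the question, then asks an LLM to answer, and a token handler attached to a model fires for every call made through that model.

```typescript
// Hypothetical stand-ins for illustration only; none of these are langchain APIs.
type TokenHandler = (token: string) => void;

class FakeLLM {
  private readonly onToken?: TokenHandler;

  constructor(onToken?: TokenHandler) {
    this.onToken = onToken;
  }

  // Returns the canned completion, emitting it chunk by chunk if a handler is set.
  call(completion: string): string {
    for (const token of completion.split(/(?<=\s)/)) {
      this.onToken?.(token);
    }
    return completion;
  }
}

// Two-step chain: step 1 rephrases the question, step 2 produces the answer.
function runChain(questionLLM: FakeLLM, answerLLM: FakeLLM): string {
  questionLLM.call("What is the distance from Earth to the sun?");
  return answerLLM.call(" I don't know.");
}

function demo(): { shared: string; separate: string; result: string } {
  // Case 1: one streaming model shared by both steps (the reported behavior).
  let shared = "";
  const streaming = new FakeLLM((t) => { shared += t; });
  const result = runChain(streaming, streaming);

  // Case 2: a model without a handler for the question step (the workaround).
  let separate = "";
  const answerOnly = new FakeLLM((t) => { separate += t; });
  runChain(new FakeLLM(), answerOnly);

  return { shared, separate, result };
}
```

In the shared case the handler sees the rephrased question followed by the answer, while the value returned by the chain is only the answer. Giving the question step its own model without a handler leaves only the answer tokens in the stream, which is what overriding `questionGeneratorChain.llm` is meant to achieve.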