
Stream response chunks ChatGPT

I would like to print the response coming from the ChatGPT API in chunks, without waiting for the full message. I found a solution here, but it doesn't seem to work on PythonAnywhere (I also updated Flask to the latest version, 3.0): https://dev.to/jethrolarson/streaming-chatgpt-api-responses-with-python-and-javascript-22d0

Python:

import openai
from flask import Flask, Response, request, stream_template

app = Flask(__name__)


def send_messages(messages):
    openai.api_key = "MYKEY"
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
        stream=True,
    )


@app.route('/chat', methods=['GET', 'POST'])
def chat():
    if request.method == 'POST':
        messages = request.json['messages']

        # Generator that yields each token of the streamed completion
        def event_stream(messages):
            for line in send_messages(messages=messages):
                text = line.choices[0].delta.get('content', '')
                if len(text):
                    yield text

        return Response(event_stream(messages), mimetype='text/event-stream')
    else:
        return stream_template('chat.html')
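
For reference, the streaming generator can also be exercised directly, without Flask in the loop. A minimal sketch using the same pre-1.0 openai ChatCompletion API as above (the prompt is just an example):

# Print each token as it arrives instead of waiting for the full reply
for chunk in send_messages([{"role": "user", "content": "Hello"}]):
    print(chunk.choices[0].delta.get("content", ""), end="", flush=True)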

HTML:

<!DOCTYPE html>
<html>
  <head>
    <title>Chat</title>
  </head>
  <body>
    <h1>Chat</h1>
    <form id="chat-form">
      <label for="message">Message:</label>
      <input type="text" id="message" name="message">
      <button type="submit">Send</button>
    </form>
    <div id="chat-log"></div>
    <script>
      const form = document.querySelector("#chat-form");
      const chatlog = document.querySelector("#chat-log");

      form.addEventListener("submit", async (event) => {
        event.preventDefault();

        // Get the user's message from the form
        const message = form.elements.message.value;

        // Send a request to the Flask server with the user's message
        const response = await fetch("/chat", {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
          },
          body: JSON.stringify({ messages: [{ role: "user", content: message }] }),
        });

        // Create a new TextDecoder to decode the streamed response text
        const decoder = new TextDecoder();

        // Set up a reader on the response body stream
        const reader = response.body.getReader();
        let chunks = "";

        // Read the response stream chunk by chunk and append it to the
        // chat log; { stream: true } keeps multi-byte characters intact
        // across chunk boundaries
        while (true) {
          const { done, value } = await reader.read();
          if (done) break;
          chunks += decoder.decode(value, { stream: true });
          console.log(chunks);
          chatlog.innerHTML = chunks;
        }
      });
    </script>
  </body>
</html>
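
One way to check whether the delay comes from the server or the browser is to hit the endpoint with a small streaming client. A sketch using requests (the URL is hypothetical; point it at your own web app):

import requests

# Stream the /chat endpoint and print chunks as they arrive; if this also
# blocks until the end, the buffering happens server-side or at the proxy.
resp = requests.post(
    "https://yourusername.pythonanywhere.com/chat",  # hypothetical URL
    json={"messages": [{"role": "user", "content": "Hello"}]},
    stream=True,  # don't read the whole body up front
)
for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
    print(chunk, end="", flush=True)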

Could you tell us more about how it's not working on PythonAnywhere? Do you see any errors in the error or server log of your web app?

Hi, sorry for the late answer. There are no errors in the error or server logs related to this. My problem is that the frontend receives the answer all at once instead of as a continuous stream of data, so for long answers the user has to wait about 70-80 seconds for the full reply instead of getting feedback within a few seconds. I was asking whether this is related to PythonAnywhere and whether anyone has experienced the same problem in the past.

Have a look at this answer on Stack Overflow.
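
If the proxy in front of the app is buffering the response (a common cause of this exact symptom on hosted platforms), one approach often suggested is to disable buffering per response. A minimal sketch, assuming an nginx-style proxy that honours the X-Accel-Buffering header (not guaranteed on every stack):

from flask import Response

def unbuffered_stream(generator):
    # Ask an nginx-style proxy not to buffer this response; without this,
    # the proxy may collect the whole stream and deliver it in one piece.
    response = Response(generator, mimetype='text/event-stream')
    response.headers['X-Accel-Buffering'] = 'no'
    response.headers['Cache-Control'] = 'no-cache'
    return response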