Stream AI chats using Django in 5 minutes (OpenAI and Anthropic) 💧

I'll show you the fastest way to send LLM completions from Anthropic or OpenAI to the browser in real time using Django.

No third-party packages needed:

  • We'll use Django's inbuilt server-sent events with StreamingHttpResponse, plus minimal vanilla JavaScript in a simple template.

Streaming completions from LLMs to the browser immediately:

  • We'll show results to users as soon as they are generated, rather than waiting for the entire completion, using Django's server-sent events (SSE).
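
To make the wire format concrete: a server-sent event stream is plain text in which each message is one or more `data:` lines followed by a blank line. Below is a minimal sketch of that framing in Python. `format_sse` and `fake_token_stream` are hypothetical names used only for illustration, and the hard-coded tokens stand in for an LLM's output.

```python
from typing import Iterator


def format_sse(data: str) -> str:
    """Frame a string as one server-sent event.

    Each line of the payload gets its own `data:` prefix, and a blank
    line terminates the event.
    """
    lines = data.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"


def fake_token_stream() -> Iterator[str]:
    """Hypothetical stand-in for an LLM: yields a few tokens, one event each."""
    for token in ["Roses", " are", " red"]:
        yield format_sse(token)


if __name__ == "__main__":
    for event in fake_token_stream():
        print(repr(event))
```

In a Django view, a generator of this shape can be handed to StreamingHttpResponse with content_type="text/event-stream".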

Here's how our finished app will look:

This technique is surprisingly simple. We'll aim to do this in under 5 minutes.
Let's begin 🚀

Set up your Django app

  • Run this in the terminal:
pip install --upgrade django python-dotenv anthropic openai
django-admin startproject core .
python manage.py startapp sim
  • Add our app sim to the INSTALLED_APPS in

Add your environment variables

Create a file called .env at "core/.env" and add the below to it. We'll use this to store our environment variables, which we won't commit to version control.

  • Add your API keys to the .env file. You can get your API keys from the Anthropic and OpenAI websites.
  • Add the following to the top of your file to load your environment variables from the .env file:
from pathlib import Path
import os
from dotenv import load_dotenv

load_dotenv()  # Load the variables from the .env file into the environment.
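
By default, the Anthropic and OpenAI Python clients read their keys from the `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` environment variables. As a quick sanity check, a sketch like the one below could report any that are missing. `missing_api_keys` is a hypothetical helper, not part of either SDK.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required keys that are unset or empty in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("All API keys found.")
```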


Create your Django view to stream the LLM completions to the browser

  • Add the following code to sim/
from django.http import HttpResponse, StreamingHttpResponse
from django.shortcuts import render
from openai import OpenAI
from typing import Iterator
from anthropic import Anthropic

def index(request) -> HttpResponse:
    return render(request, 'index.html')

def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    """
    This func returns a streaming response that will be used to stream the completion back to the client.

    We specify the LLM stream that we want to use in the `is_using` variable.
    You could easily modify this to choose the LLM in your request.
    """
    is_using = "anthropic"  # or "openai"

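    def complete_with_anthropic() -> Iterator[str]:
        """Stream an Anthropic completion back to the client."""
        anthropic_client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment.
        with anthropic_client.messages.stream(
                max_tokens=1024,  # Assumed limit; adjust to taste.
                system="You turn anything that I say into a funny, jolly, rhyming poem. "
                       "Add emojis occasionally.",
                messages=[
                    {"role": "user",
                     "content": user_prompt
                     }
                ],
                model="claude-3-opus-20240229",  # Assumed model name; use any model your account can access.
        ) as stream: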
            for text in stream.text_stream:
                content = text
                if content is not None:
                    # We tidy the content for showing in the browser:
                    content = content.replace("\n", "<br>")
                    content = content.replace(",", ", ")
                    content = content.replace(".", ". ")

                    yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

                print(text, end="", flush=True)

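    def complete_with_openai() -> Iterator[str]:
        """Stream an OpenAI completion back to the client."""
        openai_client = OpenAI()  # Reads OPENAI_API_KEY from the environment.
        stream = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",  # Assumed model name; use any model your account can access.
            messages=[
                {
                    "role": "system",
                    "content": "You turn anything that I say into a funny, jolly, rhyming poem. "
                               "Add emojis occasionally.",
                },
                {"role": "user",
                 "content": user_prompt
                 },
            ],
            stream=True,  # Ask for chunks as they are generated.
        )
        for chunk in stream: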
            content = chunk.choices[0].delta.content
            if content is not None:
                # We tidy the content for showing in the browser:
                content = content.replace("\n", "<br>")
                content = content.replace(",", ", ")
                content = content.replace(".", ". ")

                yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    # We select our chosen completion func.
    if is_using == "openai":
        completion_func = complete_with_openai
    elif is_using == "anthropic":
        completion_func = complete_with_anthropic
    else:
        raise ValueError(f"Unknown completion service: {is_using}")

    return StreamingHttpResponse(completion_func(), content_type="text/event-stream")
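
The docstring above notes that the LLM could be chosen per request instead of hard-coding `is_using`. One way to sketch that is to read a query parameter and validate it against the known providers. The `provider` parameter name and the `choose_provider` helper are assumptions for illustration, not part of the original view.

```python
from typing import Mapping

SUPPORTED_PROVIDERS = ("anthropic", "openai")
DEFAULT_PROVIDER = "anthropic"


def choose_provider(query_params: Mapping[str, str]) -> str:
    """Pick the completion service named in the request's query parameters.

    Falls back to DEFAULT_PROVIDER when no `provider` parameter is given,
    and raises ValueError for names we do not support.
    """
    provider = query_params.get("provider", DEFAULT_PROVIDER).strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unknown completion service: {provider}")
    return provider
```

Inside the view, `is_using = choose_provider(request.GET)` could then replace the hard-coded assignment, so that a request ending in `?provider=openai` switches services.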

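  • Once your URLs are configured (see "Update your urls" below) and the server is running, check the stream by visiting a URL such as http://localhost:8000/generate-completion/hello in your browser.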

You should see the completions streaming in real-time like in the below video:

Create your Django template to display the LLM results to the user in the browser

  • Create a new folder at sim/templates
  • Add a new file called index.html to it, containing the following code:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Stream LLM completion with Django</title>
    <style>
        .container {
            display: flex;
            flex-direction: column;
            align-items: center;
            text-align: center;
            font-family: Arial, sans-serif;
        }
        .heading {
            font-size: 24px;
            margin-bottom: 20px;
        }
        .btn {
            background-color: #ffcccc;
            color: black;
            padding: 10px 20px;
            border: none;
            border-radius: 20px;
            cursor: pointer;
        }
        .btn:hover {
            background-color: #ff9999;
        }
        #prompt-input {
            width: 80%;
            padding: 10px;
            border-radius: 5px;
            border: 1px solid #ccc;
            margin-bottom: 15px;
        }
        #completion-text {
            border-radius: 5px;
            width: 80%;
            overflow-y: scroll;
        }
    </style>
</head>
<body>
<div class="container">
    <p class="heading">Stream data from an LLM</p>
    <div id="completion-text"></div>
    <input id="prompt-input" type="text" placeholder="Enter your text" style="" required>
    <!-- Placeholder label; change it as you like. -->
    <button class="btn" style="" onclick="startSSE()">Generate</button>
</div>

<script>
    let eventSource;
    const sseData = document.getElementById('completion-text');
    const promptInput = document.getElementById('prompt-input');

    function startSSE() {
        const prompt = promptInput.value;
        if (!prompt) {
            alert("Please enter a prompt");
            return;
        }
        const urlEncoded = encodeURIComponent(prompt);
        const url = `generate-completion/${urlEncoded}`;

        eventSource = new EventSource(url);

        eventSource.onopen = () => {
            console.log("Connection to server opened");
        };

        eventSource.onmessage = event => {
            console.log("event.data = ", event.data);
            sseData.innerHTML += event.data;
        };

        // EventSource reconnects automatically when the server ends the stream,
        // so we close it on error to avoid requesting the completion again.
        eventSource.onerror = () => {
            eventSource.close();
        };
    }
</script>
</body>
</html>
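
The template percent-encodes the prompt with `encodeURIComponent` before placing it in the URL path. For readers more at home in Python, the round trip looks roughly like this with the standard library's `urllib.parse`. `build_stream_url` is a hypothetical helper, and `quote(..., safe="")` only approximates `encodeURIComponent` (the two differ on a few punctuation characters such as `!` and `'`).

```python
from urllib.parse import quote, unquote


def build_stream_url(prompt: str) -> str:
    """Build the relative URL the browser requests for a given prompt."""
    # safe="" makes quote() escape "/" too, as encodeURIComponent does.
    return f"generate-completion/{quote(prompt, safe='')}"


if __name__ == "__main__":
    url = build_stream_url("cats & dogs?")
    print(url)
    # Decoding the last path segment recovers the original prompt.
    print(unquote(url.split("/", 1)[1]))
```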

Update your urls

  • In core/, add the following code:
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('sim.urls')),
]
  • Create a file at sim/, add the following code:
from django.urls import path

from . import views

urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<str:user_prompt>', views.generate_completion, name='generate-completion'),
]
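
Django's built-in `str` path converter matches any non-empty string that contains no `/`. The sketch below imitates the `generate-completion/<str:user_prompt>` route with a plain regular expression to show which paths would match. `match_route` is a hypothetical helper for illustration, not how Django resolves URLs internally.

```python
import re
from typing import Optional

# Approximates path('generate-completion/<str:user_prompt>', ...):
# the str converter accepts one or more characters other than "/".
ROUTE = re.compile(r"generate-completion/(?P<user_prompt>[^/]+)")


def match_route(path: str) -> Optional[str]:
    """Return the captured prompt if `path` matches the route, else None."""
    match = ROUTE.fullmatch(path)
    return match.group("user_prompt") if match else None
```

A prompt containing a slash would therefore not reach the view through this route. Sending the prompt as a query parameter is one possible alternative if that matters for your app.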

Run your Django app

python manage.py runserver
  • Visit http://localhost:8000/ in your browser to see the completions streaming in real-time.
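
If you'd like to check the stream outside the browser, a small script can read the events directly. The sketch below parses `data:` lines from any iterable of text lines and, when run as a script, reads them from the local dev server. It assumes the server is running on port 8000 and uses `hello` as an example prompt. `iter_sse_data` is a hypothetical helper.

```python
from typing import Iterable, Iterator
from urllib.request import urlopen


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each server-sent event found in `lines`.

    Consecutive `data:` lines are joined with newlines; a blank line
    ends the current event.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)
            buffer = []


if __name__ == "__main__":
    # Assumes `python manage.py runserver` is serving on localhost:8000.
    with urlopen("http://localhost:8000/generate-completion/hello") as response:
        text_lines = (raw.decode("utf-8") for raw in response)
        for payload in iter_sse_data(text_lines):
            print(payload, end="", flush=True)
```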

Complete - you can now stream your LLM completions to the browser using Django ✅

Congrats. You've successfully set up a Django app to stream LLM completions to the browser in real-time, using Django's inbuilt server-sent events.

You've added a new technique to your programming toolbelt 🙂

If you'd like to see another simple guide on server-sent events (SSE) and Django, check out my guide: The simplest way to add server sent events to Django.

