I'll show you the fastest way to send LLM completions from Anthropic or OpenAI to the browser in real-time using Django.
No third-party packages needed:
- We'll use Django's inbuilt StreamingHttpResponse to send server-sent events, plus minimal vanilla JavaScript in a simple template.
Streaming completions from LLMs to the browser immediately:
- We'll show results to users as soon as they are generated, rather than waiting for the entire completion, using Django's server-sent events (SSE).
Here's how our finished app will look:
This technique is surprisingly simple. We'll aim to do this in under 5 minutes.
Let's begin 🚀
Set up your Django app
- Run this in the terminal:
pip install --upgrade django python-dotenv anthropic openai
django-admin startproject core .
python manage.py startapp sim
- Add our app sim to INSTALLED_APPS in settings.py:
# settings.py
INSTALLED_APPS = [
    'sim',
    ...
]
Add your environment variables
Create a file called .env at "core/.env". We'll use this to store our environment variables, which we won't commit to version control (see the .gitignore tip below).
- Add your API keys to the .env file. You can get them from the Anthropic and OpenAI websites.
ANTHROPIC_API_KEY=<your_anthropic_api_key>
OPENAI_API_KEY=<your_openai_api_key>
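Since this file holds secrets, make sure it never reaches version control. If you're using git, for example, a one-line addition to your .gitignore at the project root does the job:

# .gitignore
# Ignore every .env file in the project
.env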
- Add the following to the top of your settings.py file to load your environment variables from the .env file:
from pathlib import Path
import os
from dotenv import load_dotenv
load_dotenv()
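You don't need to reference the keys anywhere else in settings.py: once load_dotenv() has run, the Anthropic() and OpenAI() clients pick up ANTHROPIC_API_KEY and OPENAI_API_KEY from the environment automatically. If you'd rather be explicit, a minimal sketch could look like this (the variable names below are just illustrative, not required by either SDK):

# Optional: read the keys explicitly, e.g. to pass them to the clients yourself.
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

You would then construct the clients with Anthropic(api_key=ANTHROPIC_API_KEY) and OpenAI(api_key=OPENAI_API_KEY).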
Create your Django view to stream the LLM completions to the browser
- Add the following code to sim/views.py:
from django.http import HttpResponse, StreamingHttpResponse
from django.shortcuts import render
from openai import OpenAI
from typing import Iterator
from anthropic import Anthropic


def index(request) -> HttpResponse:
    return render(request, 'index.html')


def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    """
    Return a streaming response that streams the completion back to the client.
    We specify the LLM stream that we want to use in the `is_using` variable.
    You could easily modify this to choose the LLM in your request.
    """
    is_using = "anthropic"  # or "openai"

    def complete_with_anthropic() -> Iterator[str]:
        """
        Stream an Anthropic completion back to the client.
        Docs: https://docs.anthropic.com/claude/reference/messages-streaming
        """
        anthropic_client = Anthropic()
        with anthropic_client.messages.stream(
            max_tokens=1024,
            system="You turn anything that I say into a funny, jolly, rhyming poem. "
                   "Add emojis occasionally.",
            messages=[
                {"role": "user", "content": user_prompt},
            ],
            model="claude-3-opus-20240229",
        ) as stream:
            for text in stream.text_stream:
                print(text, end="", flush=True)  # Echo the raw chunk to the server console.
                content = text
                if content is not None:
                    # We tidy the content for showing in the browser:
                    content = content.replace("\n", "<br>")
                    content = content.replace(",", ", ")
                    content = content.replace(".", ". ")
                    yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    def complete_with_openai() -> Iterator[str]:
        """
        Stream an OpenAI completion back to the client.
        Docs: https://platform.openai.com/docs/api-reference/streaming
        """
        openai_client = OpenAI()
        stream = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You turn anything that I say into a funny, jolly, rhyming poem. "
                               "Add emojis occasionally.",
                },
                {"role": "user", "content": user_prompt},
            ],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content is not None:
                # We tidy the content for showing in the browser:
                content = content.replace("\n", "<br>")
                content = content.replace(",", ", ")
                content = content.replace(".", ". ")
                yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    # We select our chosen completion func.
    if is_using == "openai":
        completion_func = complete_with_openai
    elif is_using == "anthropic":
        completion_func = complete_with_anthropic
    else:
        raise ValueError(f"Unknown completion service: {is_using}")

    return StreamingHttpResponse(completion_func(), content_type="text/event-stream")
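A quick note on the format: the browser's EventSource API expects each message to be prefixed with "data: " and terminated by a blank line, which is exactly what the f"data: {content}\n\n" lines above produce. If you'd like, you could also factor the repeated tidy-and-wrap logic into a small helper; this is just an optional sketch (the sse_chunk name is mine, not part of the code above):

def sse_chunk(content: str) -> str:
    """Tidy a chunk of LLM text and wrap it in the server-sent-events wire format."""
    content = content.replace("\n", "<br>")
    content = content.replace(",", ", ")
    content = content.replace(".", ". ")
    return f"data: {content}\n\n"  # "data: ..." plus a blank line = one SSE message

Both completion functions could then simply yield sse_chunk(content).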
- Once your URLs are wired up (see "Update your URLs" below) and the dev server is running, you can check the stream directly by visiting http://localhost:8000/generate-completion/hello in your browser.
You should see the completions streaming in real-time like in the below video:
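If you prefer the terminal and have curl installed, the same check works from the command line; -N turns off curl's output buffering so the chunks appear as they arrive:

curl -N http://localhost:8000/generate-completion/hello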
Create your Django template to display the LLM results to the user in the browser
- Create a new folder at sim/templates, add a new file in it called index.html, and add the following code:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Stream LLM completion with Django</title>
    <style>
        .container {
            display: flex;
            flex-direction: column;
            align-items: center;
            text-align: center;
            font-family: Arial, sans-serif;
        }
        .heading {
            font-size: 24px;
            margin-bottom: 20px;
        }
        .btn {
            background-color: #ffcccc;
            color: black;
            padding: 10px 20px;
            border: none;
            border-radius: 20px;
            cursor: pointer;
        }
        .btn:hover {
            background-color: #ff9999;
        }
        #prompt-input {
            width: 80%;
            padding: 10px;
            border-radius: 5px;
            border: 1px solid #ccc;
            margin-bottom: 15px;
        }
        #completion-text {
            border-radius: 5px;
            width: 80%;
            overflow-y: scroll;
        }
    </style>
</head>
<body>
<div class="container">
    <p class="heading">Stream data from an LLM</p>
    <div id="completion-text"></div>
    <br>
    <input id="prompt-input" type="text" placeholder="Enter your text" required>
    <button class="btn" onclick="startSSE()">
        Generate
    </button>
</div>
<script>
    let eventSource;
    const sseData = document.getElementById('completion-text');
    const promptInput = document.getElementById('prompt-input');

    function startSSE() {
        const prompt = promptInput.value;
        if (!prompt) {
            alert("Please enter a prompt");
            return;
        }
        // Close any previous stream and clear the old completion before starting a new one.
        if (eventSource) {
            eventSource.close();
        }
        sseData.innerHTML = "";

        const urlEncoded = encodeURIComponent(prompt);
        const url = `generate-completion/${urlEncoded}`;
        eventSource = new EventSource(url);
        eventSource.onopen = () => {
            console.log("Connection to server opened");
        };
        eventSource.onmessage = event => {
            console.log("event.data = ", event.data);
            sseData.innerHTML += event.data;
        };
        // When the server finishes (or the connection drops), close the stream -
        // otherwise EventSource reconnects automatically and re-runs the completion.
        eventSource.onerror = () => {
            eventSource.close();
        };
    }
</script>
</body>
</html>
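One thing worth knowing: we don't need to register the templates folder anywhere, because the default TEMPLATES setting that startproject generates already has APP_DIRS enabled, which makes Django look inside each installed app's templates/ directory. For reference, the relevant part of settings.py (trimmed, already present in a default project) looks like this:

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'APP_DIRS': True,  # Lets Django find sim/templates/index.html
        # ... the rest of the default OPTIONS are unchanged
    },
]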
💡 Side note: here's a video of me using my product, Photon Designer, to generate the above HTML 💡
-> Let's get back to building 🚀
Update your URLs
- In core/urls.py, add the following code:
from django.contrib import admin
from django.urls import path, include
urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('sim.urls')),
]
- Create a file at sim/urls.py and add the following code:
from django.urls import path
from . import views
urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<str:user_prompt>', views.generate_completion, name='generate-completion'),
]
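To make the round trip concrete: the template's encodeURIComponent call percent-encodes the prompt before it goes into the URL path, and Django decodes it again before calling the view. Here's a small stand-alone illustration (not part of the app) of the same encoding in Python:

from urllib.parse import quote

prompt = "a cat in a hat"
print(quote(prompt, safe=""))  # -> a%20cat%20in%20a%20hat
# The browser therefore requests /generate-completion/a%20cat%20in%20a%20hat,
# and the view receives user_prompt="a cat in a hat".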
Run your Django app
python manage.py runserver
- Visit http://localhost:8000/ in your browser to see the completions streaming in real-time.
Complete - you can now stream your LLM completions to the browser using Django ✅
Congrats. You've successfully set up a Django app that streams LLM completions to the browser in real-time, using Django's inbuilt StreamingHttpResponse and server-sent events.
You've added a new technique to your programming toolbelt 🙂
If you'd like to see another simple guide on server-sent events (SSE) and Django, check out my guide: The simplest way to add server sent events to Django.
Build your Django frontend even faster
Probably like you, I want to release high-quality products and turn my Django product ideas into reality as soon as possible.
That's why I built Photon Designer - an entirely visual editor for building Django frontend at the speed that light hits your eyes. Photon Designer outputs neat, clean