Here's a common mistake to kill your Django app's performance: putting a query in a for-loop.
This simple page of text becomes extremely slow for your users:
I saw this being done by professionals at one of the first startups I worked at in.
It's easy to do as a Django beginner because it feels like you're writing normal python code.
But it's one to avoid with Django. I'll show you how.
The common name for this problem, which affects all databases, is the N+1 problem.
I've made an optional video guide (featuring me 🏇🏿) here, describing the problem and solution:
Problem: Killing your Django app by accidentally running 1001 database queries
Here's our troublesome view in views.py
:
from django.shortcuts import render
from sim.models import Movie, ProductionCompany
def movies(request):
movies = Movie.objects.all()
companies_releasing_movies_2020 = []
for movie in movies:
if movie.release_date.year >= 2020:
companies_releasing_movies_2020.append(movie.production_company)
return render(request, 'movies.html', {'movies': movies, 'companies_releasing_movies_2020': companies_releasing_movies_2020})
And here we are in our models. See that each movie has a foreign key to a production company.
from django.db import models
class ProductionCompany(models.Model):
name = models.CharField(max_length=255)
country = models.CharField(max_length=255)
established_date = models.DateField()
def __str__(self):
return self.name
class Movie(models.Model):
title = models.CharField(max_length=255)
director = models.CharField(max_length=255)
release_date = models.DateField()
production_company = models.ForeignKey(ProductionCompany, on_delete=models.CASCADE)
def __str__(self):
return self.title
A page with this code will run extremely slowly. This might appear pythonic, but it's bad for Django.
Why does this make my Django app so slow?
The reason: for each object in your loop, you make a database query.
This means that for each row in your database, you make a database call.
I've loaded 1000 fake movies into my database. This means, in the view above, we do 1 database query to load all the movies, plus 1000 queries to iterate over each movie.
Our relational database is extremely fast for collecting data in a few queries. But having a large number of queries will make your app extremely slow, and lock up your app for other users.
As mentioned, the common name for this is the N+1 problem.
How to you fix this for-loop database query problem with the Django ORM?
There are two main ways using the Django ORM. Both reduce the number of database queries you do.
1. Use a filter at the database level
from django.shortcuts import render
from sim.models import Movie, ProductionCompany
def movies(request):
movies = Movie.objects.all()
companies_releasing_movies_2020 = ProductionCompany.objects.filter(
movie__release_date__year_gte=2020
).distinct()
return render(
request, 'movies.html',
{'movies': movies, 'companies_releasing_movies_2020': companies_releasing_movies_2020}
)
This means that just do 2 queries: one to get all movies and one to get all the production companies who released movies with a release date on or after 2020.
You can see there are two queries here:
2. Join your database tables with select_related
from django.shortcuts import render
from sim.models import Movie, ProductionCompany
def movies(request):
movies = Movie.objects.select_related('production_company').all()
companies_releasing_movies_2020 = []
for movie in movies:
if movie.release_date.year >= 2020:
companies_releasing_movies_2020.append(movie.production_company)
return render(
request, 'movies.html',
{'movies': movies, 'companies_releasing_movies_2020': companies_releasing_movies_2020}
)
This Django ORM command updates the database query to attaches the production company to the movie (Called a 'JOIN' operation in SQL).
This reduces our database queries to just 1, getting all the movies and their production companies in one go.
Congrats and my recommendation
Congrats 🎉 You know how to avoid killing your Django app's speed by avoiding putting db queries in for loops.
Regarding what technique to normally use, even though there is an extra query, I recommend generally using filter
. Reasons:
-
If you're new to Django, using
filter
teaches you how to use the database. This will be useful for when you've got more complicated database operations in the future. -
select_related
is more of an optimization technique, whereas usingfilter
is a fundamental technique for choosing the data that you want from the database (usingfilter
with the Django ORM calls theWHERE
SQL command underneath).
P.S Want to build Django frontend faster?
Probably like you, I want to get my Django frontend out fast as possible (preferably instantly).
So, I'm building Photon Designer. Photon Designer lets me produce Django frontend visually and extremely quickly - like a painter sweeping his brush across the page 🖌️
If you found this guide helpful, you can see Photon Designer here .