Django migrations and your database
First thing first. Django’s migrations are fantastic. It’s really hard to switch to any other ORM due in large part to Django’s migrations. However, they are a little difficult to wrap your head around as a beginner. If this might be you, don’t worry! Every Django developer has been in your position multiple times and we all will be again at some point in the future.
A high level overview of migrations
First we need to agree on some truths. There are three places where our data model is
represented. The “data model” being the real-world objects you’re defining in code.
A common one is the django.contrib.auth.models.User
model. The three places it’s
represented in your project are:
- Models
- Database schema1
- Migration models
The first is the most obvious place we interact with our data model. It’s the models we create in our code.
The second is the tables and columns in our database that correspond to our models. If things go according to plan, this should perfectly match our models.
But we’re here because it doesn’t always go according to plan. This is partly due to a lack of awareness around migration models. Migration models are snapshots of our project’s models. The Django migration system can calculate what your entire project’s models looked like given a set of migrations2.
Why do we need migration models
We need these migration models because each migration file will translate into a set of SQL instructions that will update your database schema. These instructions need to start with an understanding of what the database schema looks like. It makes the assumption that the schema matches that which would have been created by previous migrations.
Why can’t we rely on the database schema?
Maybe your thinking, Tim I know you’re not a regular Django contributor, but in your best estimation why don’t migrations just check the dang database? First, great question reader. The justifications I can come up with are:
- This is easier.
- This is predictable.
- These migrations are easier to maintain.
If we needed to inspect the database, there are an infinite number of combinations of schema
that could exist. How do we know what all needs to change? For example, if we want a model
with a char field color
, but the underlying table has two different fields not defined
on the model named color_int
(integer) and color_hex
(char), how could we reasonably
determine what to do. Maybe we need to rename color_hex
. Or drop both color_int
and
color_hex
. Or keep both and add a new column. And there are scenarios in which any of those
three are desirable outcomes.
Migrations that resolve to specific SQL no matter what are more predictable which is a huge
benefit. Let’s assume we have an existing table, but it has a different schema between our
test environment and production. In testing it has an existing column in the database that
only needs to have a NOT NULL
constraint applied. On production there is no column at all.
This means when we run this migration on test, we will be running an entirely different SQL
migration than we would on production. It’s very hard to test a deployment if you’re not
executing the same code.
Finally, these migrations are easier to maintain because they don’t require a database to determine the existing state. We can imagine what the database would look like at any point. This eliminates the need to have a database running to test every aspect of the migration system. That reduces complexity, requires less energy and makes people more efficient.
Cripes you write a lot
Yes, apparently I do reader. Apparently I do.
Just tell me how they work!
Sheesh, reader. I’m getting to it.
You define your Django models to indicate what structure you’d like to store your data. The migration files are created to include instructions on how to change the database to match the schema needed to support your models. Each migration file is an incremental set of instructions and builds off previous migrations. The last thing I want mention is that the act of creating/updated/deleting the migration files do not impact your database directly because they are instructions.
If you’re curious about how all this works in the background, Markus Holtermann gave a great talk at DjangoCon Europe 2020 on this topic.
Now that we have our general overview, let’s dive into the specifics. Django has two main operations for dealing with migrations, but there are two more that can be helpful.
makemigrations
Short version: This creates a file with operations to manipulate your database. The operations are determined by comparing the latest migration’s model state (from your earlier migrations) to your current project’s model state. The operations will be determined by the differences between those two states.
Note: These files should be committed into source control!
migrate
Short version: This will execute the operations in the migration files that are being run. You can specify which migrations to target and it will run all migrations leading up to it (migrations have dependencies). This is very important. This will ONLY execute the SQL that the migration’s operations generate. This SQL won’t change if your models change. If you create a migration, then change your models you need to create a new migration or update/delete the old migration ( this is dangerous).
showmigrations
Short version: This tells you which migrations you have applied by checking the table django_migrations
in your database. This does not tell you what state your database is in compared to your migrations.
It simply tells you whether a migration has run or not.
It will not change whether the database considers the migration to have been applied or not.
sqlmigrate
Short version: This will print the SQL that be executed for a specific migration. You can use
--backwards
for the reverse migration.
Migrations and timing
Migrations are a very difficult thing. I’m not being humble, nice or over-exaggerating for effect when I say that every Django developer has run into migration problems and will encounter them again in the future.
Don’t feel bad if you’re reading this article because you’re facing a thorny database issue. Keep focused, stay calm, make backups and test your ideas. You’ll get there.
But back to timing. Timing is very important when it comes to migrations. This is because you are dealing with three separate entities. The models, migration files and the database. Any of these can be changed at any time. And only one of those changes can actually be detected.
- Changing models
- Regardless of which migrations have been run, a new migration will be created.
- The database isn’t impacted directly by changing models, only via migrations.
- Changing migrations
- A new migration will be created if the resulting migration model state is different than the
existing project model state. Unless you’re doing something complicated, changing your migration
should cause
makemigrations
to want to create a new migration. - Changes to a migration will only be used when the migrations is applied or unapplied. For example, if you’ve applied a migration and then change the migration the database will not have that change. However, when you try to reverse the migration, the resulting SQL will take those changes into account. This can be a source of problems.
- A new migration will be created if the resulting migration model state is different than the
existing project model state. Unless you’re doing something complicated, changing your migration
should cause
- Changing database schema
- This will result in no change to the models or migrations. In fact, these changes will go unnoticed until the application executes SQL that tries to use something that doesn’t exist.
- The migrations can’t tell that the database has changed. Any migrations that are applied will execute their SQL, potentially compounding problems.
That’s neat, but how about an example
Neat, I learn best by example too dear reader.
Let’s work with the models from the Django tutorial. We’ll be using SQLite for these examples since it’s what is used by default.
from django.db import models
class Question(models.Model):
question_text = models.CharField(max_length=200)
pub_date = models.DateTimeField("date published")
class Choice(models.Model):
question = models.ForeignKey(Question, on_delete=models.CASCADE)
choice_text = models.CharField(max_length=200)
votes = models.IntegerField(default=0)
Our initial migration will look like:
from django.db import migrations, models
import django.db.models.deletion
class Migration(migrations.Migration):
initial = True
dependencies = []
operations = [
migrations.CreateModel(
name='Question',
fields=[
('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('question_text', models.CharField(max_length=200)),
('pub_date', models.DateTimeField(verbose_name='date published')),
],
),
migrations.CreateModel(
name='Choice',
fields=[
('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('choice_text', models.CharField(max_length=200)),
('votes', models.IntegerField(default=0)),
('question', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='myapp.question')),
],
),
]
Now if we run python manage.py sqlmigrate polls 0001
we’ll get the following SQL output (I added line breaks for readability).
BEGIN;
--
-- Create model Question
--
CREATE TABLE "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"question_text" varchar(200) NOT NULL,
"pub_date" datetime NOT NULL);
--
-- Create model Choice
--
CREATE TABLE "polls_choice" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"choice_text" varchar(200) NOT NULL,
"votes" integer NOT NULL,
"question_id" bigint NOT NULL REFERENCES "polls_question" ("id") DEFERRABLE INITIALLY DEFERRED);
CREATE INDEX "polls_choice_question_id_c5b4b260" ON "polls_choice" ("question_id");
COMMIT;
Give that the eye test. Compare the SQL, migration, and model. We can see that they are all in sync.
Now before we run migrate
, let’s take a look at the database (you may need to install
sqlite3
on your system). Run the following:
python manage.py dbshell
.schema polls_question
Use CTRL+C
a few times to exit the SQLite shell.
You should not have seen any output. Now let’s run migrate
, then inspect the table again.
python manage.py migrate
python manage.py dbshell
.schema polls_question
The following should be in the output:
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"question_text" varchar(200) NOT NULL,
"pub_date" datetime NOT NULL);
We’ve now applied the migration to our database and have confirmed that the table matches what the migration specified.
Cool, but it’s time to make some moves. I’m not a fan of using question_text
when we know we’re in
the Question
model. Let’s rename that field to text
.
class Question(models.Model):
text = models.CharField(max_length=200)
pub_date = models.DateTimeField("date published")
At this current moment migration polls/migrations/0001_initial.py
is still the same and our database
still matches what we found. Re-run the SQL commands from earlier to check.
Now let’s create the second migration via python manage.py makemigrations polls
. You should get a
prompt:
Was question.question_text renamed to question.text (a CharField)? [y/N]
Enter y
and press Enter
.
This generates 0002__rename_question_text_question_text.py
:
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
('polls', '0001_initial'),
]
operations = [
migrations.RenameField(
model_name='question',
old_name='question_text',
new_name='text',
),
]
Cool, that looks good.
Now let’s check what SQL this will generate.
python manage.py sqlmigrate polls 0002
BEGIN;
--
-- Rename field question_text on question to text
--
ALTER TABLE "polls_question" RENAME COLUMN "question_text" TO "text";
COMMIT;
This makes sense. The operation we that exists in our migration file is RenameField
and
the generated SQL is updating the table polls_question
to name a column. As a reminder,
a column on a table will map to the fields of models in the Django project.
What we’re going to do next is to run migrate
to apply 0002__rename_question_text_question_text
.
Then we’re going to check what the table looks like from a SQLite shell. Before we get to that
step I’d like you to think about what you expect it to generate. Recall that our previous
inspection gave the following output:
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"question_text" varchar(200) NOT NULL,
"pub_date" datetime NOT NULL);
Now that you have an idea in mind, let’s run our commands.
python manage.py migrate
python manage.py dbshell
.schema polls_question
Did you expect question_text
to turn into text
? Good work!
Here’s the output from when I ran it.
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"text" varchar(200) NOT NULL,
"pub_date" datetime NOT NULL);
Example intermission
What I want you to take away from this is that any given moment, our Django projects models are a representation of our data model. Similarly, at any given moment our database has a specific version of our data model. The migrations are what are used to change the database to reflect our models, but are individually singular operations performed on the database.
Let me try explaining the three layers one more time.
- Models: what we want our data model to look like right now.
- Database: what our data model actually is.
- Migrations: instructions to change our database to reflect our models.
Example of what can go wrong
We’re going to do something similar to the previous example, but this time we’re going to purposely get into some trouble. The purpose of this is to highlight the importance of retaining applied migrations and to not change the database schema manually.
This will build upon our previous example, so if you haven’t gone through that process please do so.
Alright currently our Question
model has the field text
. Let’s say we want to
make the reasonable choice to keep that model inline with the rest of the Django
tutorial’s design where that field is called question_text
. There are a few options,
the best one is to use python manage.py migrate polls 0001
because migration
0002__rename_question_text_question_text
is what renamed our column from
question_text
to text
. Unapplying 0002
would simply change the column back
to question_text
. From there we could delete the migration file and everything
would be perfect. But I said we’re going to get into some trouble, so
let’s get after it!
Let’s change our database schema manually. Let’s open up a SQLite shell and rename
that column. Start with python manage.py dbshell
.
-- Change the table schema, renaming text to question_text
ALTER TABLE "polls_question" RENAME COLUMN "text" TO "question_text";
-- Let's confirm this change
.schema polls_question
You should see the following output:
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"pub_date" datetime NOT NULL,
"question_text" varchar(200) NOT NULL);
Excellent! Our database is now back to where we want it.
But wait! Our Question
model still shows text
. Let’s update that too.
class Question(models.Model):
question_text = models.CharField(max_length=200)
pub_date = models.DateTimeField("date published")
Wonderful! Oh, but now when we run a management command we’re getting a warning. Running
migrate
should generate the following message:
Operations to perform:
Apply all migrations: admin, auth, contenttypes, polls, sessions
Running migrations:
No migrations to apply.
Your models in app(s): 'polls' have changes that are not yet reflected in a migration, and so won't be applied.
Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.
Well that’s odd. We changed our model to match our database. Now it wants us to
create a new migration? The computer is pretty smart and Django does a lot of
amazing stuff for us, it’s probably right here. Let’s run manage.py makemigrations
and again answer y
to the rename question.
Now let’s run manage.py migrate
as the warning told us to.
Ack! A wild OperationalError
appeared!3
example$ ./manage.py migrate
Operations to perform:
Apply all migrations: admin, auth, contenttypes, polls, sessions
Running migrations:
Applying polls.0003_rename_text_question_question_text...Traceback (most recent call last):
File "example/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
File "example/venv/lib/python3.10/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
return super().execute(query, params)
sqlite3.OperationalError: no such column: "text"
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "example/./manage.py", line 22, in <module>
main()
File "example/./manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "example/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
utility.execute()
File "example/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 436, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "example/venv/lib/python3.10/site-packages/django/core/management/base.py", line 412, in run_from_argv
self.execute(*args, **cmd_options)
File "example/venv/lib/python3.10/site-packages/django/core/management/base.py", line 458, in execute
output = self.handle(*args, **options)
File "example/venv/lib/python3.10/site-packages/django/core/management/base.py", line 106, in wrapper
res = handle_func(*args, **kwargs)
File "example/venv/lib/python3.10/site-packages/django/core/management/commands/migrate.py", line 356, in handle
post_migrate_state = executor.migrate(
File "example/venv/lib/python3.10/site-packages/django/db/migrations/executor.py", line 135, in migrate
state = self._migrate_all_forwards(
File "example/venv/lib/python3.10/site-packages/django/db/migrations/executor.py", line 167, in _migrate_all_forwards
state = self.apply_migration(
File "example/venv/lib/python3.10/site-packages/django/db/migrations/executor.py", line 252, in apply_migration
state = migration.apply(state, schema_editor)
File "example/venv/lib/python3.10/site-packages/django/db/migrations/migration.py", line 132, in apply
operation.database_forwards(
File "example/venv/lib/python3.10/site-packages/django/db/migrations/operations/fields.py", line 303, in database_forwards
schema_editor.alter_field(
File "example/venv/lib/python3.10/site-packages/django/db/backends/sqlite3/schema.py", line 173, in alter_field
super().alter_field(model, old_field, new_field, strict=strict)
File "example/venv/lib/python3.10/site-packages/django/db/backends/base/schema.py", line 830, in alter_field
self._alter_field(
File "example/venv/lib/python3.10/site-packages/django/db/backends/sqlite3/schema.py", line 455, in _alter_field
return self.execute(
File "example/venv/lib/python3.10/site-packages/django/db/backends/base/schema.py", line 201, in execute
cursor.execute(sql, params)
File "example/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 102, in execute
return super().execute(sql, params)
File "example/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
File "example/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
File "example/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 84, in _execute
with self.db.wrap_database_errors:
File "example/venv/lib/python3.10/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "example/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
File "example/venv/lib/python3.10/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
return super().execute(query, params)
django.db.utils.OperationalError: no such column: "text"
Your error may look slightly different, but the error message should be the same,
django.db.utils.OperationalError: no such column: "text"
. Why is this happening?
I recommend that you run manage.py sqlmigrate polls 0003
to see what SQL that
migration is running. You’ll be able to see the boneheaded steps I’m about to take
and laugh at my ridiculousness.
Proposed solution: What if we migrate backwards to polls.0001_initial
? That way
our migrations would indicate we’re back to where we started. Run manage.py sqlmigrate polls 0001
Eesh. Did you get another OperationalError
too? This is getting worrisome. Django
told us to create this new migration, but now
we can’t migrate forwards or backwards! Welp, drastic times call for drastic measures.
Let’s delete all our migrations except for the one we know we need. Delete the following files:
polls/migrations/0002_rename_question_text_question_text.py
polls/migrations/0003_rename_text_question_question_text.py
Few, everything is right in the world again. Let’s just double check to make sure we’re safe.
Run python manage.py migrate
example$ ./manage.py migrate
Operations to perform:
Apply all migrations: admin, auth, contenttypes, polls, sessions
Running migrations:
No migrations to apply.
Looks good. Run python manage.py makemigrations
example$ ./manage.py makemigrations
No changes detected
Beautiful. Let’s be super sure now and make sure that we’ve only run this migration.
Run python manage.py showmigrations
example$ ./manage.py showmigrations
admin
[X] 0001_initial
[X] 0002_logentry_remove_auto_add
[X] 0003_logentry_add_action_flag_choices
auth
[X] 0001_initial
[X] 0002_alter_permission_name_max_length
[X] 0003_alter_user_email_max_length
[X] 0004_alter_user_username_opts
[X] 0005_alter_user_last_login_null
[X] 0006_require_contenttypes_0002
[X] 0007_alter_validators_add_error_messages
[X] 0008_alter_user_username_max_length
[X] 0009_alter_user_last_name_max_length
[X] 0010_alter_group_name_max_length
[X] 0011_update_proxy_permissions
[X] 0012_alter_user_first_name_max_length
contenttypes
[X] 0001_initial
[X] 0002_remove_content_type_name
polls
[X] 0001_initial
sessions
[X] 0001_initial
Hooray! We made it! There are lot of migrations in there, but polls
indicates there
is only one migration applied, 0001_initial
.
Celebrate and rejoice folks. You ignored all the warnings about manipulating the database manually and got out unscathed. Absolute fantastic work!
As you probably expected here’s where I say, “Not so fast!”. And indeed,
Not so fast!
We were wrong earlier. We do want to use text
rather than question_text
. Those
extra 9 characters are going to crowd our IDEs and use up too much storage space in our
git repository. We must rename that field for good back to Question.text
.
Take a moment here. Can you predict where things will break?
Have your idea in mind? Cool, let’s start with the model change.
class Question(models.Model):
text = models.CharField(max_length=200)
pub_date = models.DateTimeField("date published")
Then make our migration. Answer y
to the rename field question.
example$ ./manage.py makemigrations
Was question.question_text renamed to question.text (a CharField)? [y/N] y
Migrations for 'polls':
polls/migrations/0002_rename_question_text_question_text.py
- Rename field question_text on question to text
Now run manage.py migrate
example$ ./manage.py migrate
Operations to perform:
Apply all migrations: admin, auth, contenttypes, polls, sessions
Running migrations:
No migrations to apply.
Umm… No. We just created a migration. What do you mean “No migrations to apply”? Excuse me computer, you’re wrong!
Alright, let’s not lose our heads. Let’s see what’s going on. Run manage.py showmigrations polls
/example$ ./manage.py showmigrations polls
polls
[X] 0001_initial
[X] 0002_rename_question_text_question_text
Welp there goes the computer lying to us again.
On a serious note, the computer never lies to us. It’s trying to communicate its own reality to us. It has a fixed number of ways to communicate so it’s on us to try to understand it. The best way to do that is to get the computer to tell us what its reality is.
The best way forward here is to have the computer tell us when that migration was applied
with the --verbosity 2
I don’t expect a beginner to know that Django supports this feature, but it’s super helpful
in this scenario4.
Run manage.py showmigrations polls --verbosity 2
example$ ./manage.py showmigrations polls --verbosity 2
polls
[X] 0001_initial (applied at 2023-05-17 13:20:01)
[X] 0002_rename_question_text_question_text (applied at 2023-05-17 13:34:06)
Your dates should be different. However, if you look at the current time, you’ll see that
0002_rename_question_text_question_text
was applied much earlier than when you
created it right now. You can confirm this by opening the migration and looking at the
header of the file. It should have some text such as
# Generated by Django 4.2.1 on <today's date> <nearby time>
This means that Django thinks that the migration was run before the migration
file was created. This is because Django tracks what migrations are run in a table
that lists the migrations and the timestamps of their application. When we run
migrate
those rows are inserted or removed. But when we manipulated the database
manually, we weren’t able to migrate forwards or backwards. We had deleted the migration
files from our project, thinking that removed them entirely. Unfortunately it didn’t,
because Django had stored a record of applying a migration with that name in the database.
Let’s confirm that we see 0002_rename_question_text_question_text
in the
django_migrations
table, but not 0003_rename_text_question_question_text
.
Run manage.py dbshell
sqlite> select * from django_migrations where app = 'polls';
-- The following output should be similar
-- 18|polls|0001_initial|2023-05-17 13:20:01.405385
-- 20|polls|0002_rename_question_text_question_text|2023-05-17 13:34:06.055783
As we expected, there are two rows for the polls
app. One of which is for
0002_rename_question_text_question_text
. And we don’t see 0003_rename_text_question_question_text
because it was never applied to the database. We kept getting an error. Since it
wasn’t applied a row wasn’t inserted.
Now, if our new migration had a slightly different name, we would not run into
this error. But because the 0002
migration has the same name as the previous
0002
migration, it’s considered to have already run.
The goal here was to show you how manipulating the database manually and deleting migrations can both lead to trouble. Next we’ll cover how to get out of trouble.
Getting out of trouble
The best way to get out of trouble is to avoid it in the first place. Every other option I present is burdened with problematic consequences.
Option 1: Create a new database
If you’re operating in a non-production environment where you can recreate your data such as your local environment or a staging environment, I recommend creating a new database.
I am not going to include the specific instructions for creating a new database since they can be found in an assortment of places on the internet.
Pros:
- Your database ends up perfectly inline with your migrations
- It’s a straightforward operation
Cons:
- You need to recreate any existing data
Option 2: Update your database to match your migrations
If creating a new database isn’t reasonable, then you’re really stuck with manually manipulating your database until it’s correct. This sounds like you’re digging a deeper hole, but sometimes you have to go deeper into the crevasse to escape. While this is a tremendously bad idea in general, you can take some precautions to make it less awful.
First, make back-up of the current database. For sqlite, it’s as simple as copying the
database file. For PostgreSQL, there’s pg_dump
.
Second, do a restore of that backup into a testing/staging/local environment. Work on your solution here. Do not, I repeat, do not make changes to your production environment without testing them elsewhere first.
These two precautions give you the ability to try a number of different things without making your production environment worse. It also allows you to have a final SQL script that you was tested with an exact copy of your production database.
The other thing to do is to create a new database and migrate that forwards. While you aren’t going to use that one, you can use it in a comparison to find the differences in schema. I have not done this before, but here’s a Stack Overflow answer for diffing schemas in PostgreSQL.
As you find the differences, you can write SQL statements to update the bad schema to reflect what the expected, good schema should be. This is going to be dependent on your exact problem.
Let’s apply this process to our problem from earlier.
Step one, let’s make two copies of our database:
cp db.sqlite3 bad.sqlite3
cp db.sqlite3 bad_backup.sqlite3
Next let’s try to compare the differences between these. You may need to install sqlite3-tools5.
sqldiff bad.sqlite3 db.sqlite3 --schema
This should have no output. This is because we’re comparing copies of the database which should be the same.
Let’s create a new, known to be good database. With SQLite this is as easy as deleting
the existing database file and letting the migrate
command create a new file.
rm db.sqlite3
python manage.py migrate
This should create a new db.sqlite3 file. Let’s diff the two databases.
sqldiff bad.sqlite3 db.sqlite3 --schema
This should product the following output:
DROP TABLE polls_question; -- due to schema mismatch
CREATE TABLE "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"text" varchar(200) NOT NULL,
"pub_date" datetime NOT NULL);
This sort of makes sense. It’s trying to tell us there’s a schema difference on the polls_question
table. It’s not telling us what is exactly different. We need to inspect these tables manually.
So run the following two scripts and let’s compare their outputs:
First sqlite3 db.sqlite3
:
.schema polls_question
-- Should generate the following output
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"text" varchar(200) NOT NULL,
"pub_date" datetime NOT NULL);
Now sqlite3 bad.sqlite3
:
.schema polls_question
-- Should generate the following output
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"pub_date" datetime NOT NULL,
"question_text" varchar(200) NOT NULL);
Looking at these two SQL statements, is easier to see that the difference is
between text
and question_text
.
Now that we know that the column needs to be renamed. Let’s run the following
SQL once we open a shell, sqlite3 bad.sqlite3
.
ALTER TABLE "polls_question"
RENAME COLUMN "question_text" TO "text";
Now let’s check the schema of the table to confirm.
.schema polls_question
-- Should generate the following output
CREATE TABLE IF NOT EXISTS "polls_question" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"pub_date" datetime NOT NULL,
"text" varchar(200) NOT NULL);
Excellent! Our bad database now matches our ideal version. What’s interesting is that
sqldiff --schema
still considers these tables to be different because the column order
is different. It’s fair to be worried about that, but since Django created the table
with question_text
as the last column when we renamed the field much, much earlier
I think we’re fine.
Are we done yet? Ye-No!
We still need to check our migrations out. So open up a sqlite shell, sqlite3 bad.sqlite3
SELECT * FROM django_migrations WHERE app='polls';
-- My output:
-- 18|polls|0001_initial|2023-05-17 13:20:01.405385
-- 20|polls|0002_rename_question_text_question_text|2023-05-17 13:34:06.055783
The reason we are checking the table and now using showmigrations polls --verbosity 2
is because
showmigrations
will only tell us about the migrations that we have in our Django
project. We want to know what migrations our database thinks have been applied.
Now let’s run python manage.py showmigrations polls
to confirm that our Django app is happy.
polls
[X] 0001_initial
[X] 0002_rename_question_text_question_text
At this point the crisis has been averted, good work! You can write the script to make the change to your production application, test it once more, then run it in production and save the day.
Django Migrations - TLDR edition
This was a very long post on the intricacies of managing migrations with Django. It’s easy to get into trouble with them. Everyone does eventually. But it’s not a terrible problem if you know the tools and strategies available to you.
Toolkit:
manage.py makemigrations
: Creates the migration filesmanage.py migrate
: Runs the migrations (changes the database)manage.py showmigrations --verbosity 2
: Tells you what migrations have been applied and whenmanage.py sqlmigrate <app> <migration>
: Tells you what SQL will run for a migrationmanage.py dbshell
: Opens up a database SQL shell (we usedsqlite3
directly)
Strategies:
- Make a backup of your database before changing it
- Commit your migrations
- Before changing/deleting a migration file, make sure it’s unapplied in every environment first
- When
migrate
fails, investigate the current database schema and the generated SQL from the migration - Use SQL diffing tools to compare the schema with a known good database
Conclusion
That’s it! Congratulations, you scrolled to the end of this behemoth of a page. I hope you learned something. I know I did6. If you have some feedback, a question or want to connect reach out! You can find me on the Fediverse, Django Discord server or via email.
-
Schema means the tables, columns, indexes, basically any object that exists in the database that isn’t a row of data. ↩
-
The migrations need to be in order and respect their dependencies on each other. You can’t calculate the model structure from a random collection of migrations. ↩
-
Some nervous freelancer thought it would help debugging. Nerd. ↩
-
sudo apt-get install sqlite3-tools
on Ubuntu ↩ -
I didn’t know about
sqldiff
. I now have a new tool in my belt! ↩