Programmatically populate data while running Django migrations
In this article, I will explain how to programmatically populate data in a database during a migration. This technique is particularly useful when we want to add a new field to a model where the value of the new field can be determined by the other fields of the model.
Let me illustrate with an example:
Suppose we have a model called Product that has a field called serial_number. We are asked to implement a new mandatory field for this model that holds the name of the product supplier. We are told that the supplier name can be deduced from the serial number: if a serial number starts with AC, the product is supplied by ACME Corporation; if it starts with BR, it is supplied by Brand Corporation; otherwise, it is manufactured in-house.
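Before touching Django, the rule itself can be written down as a plain Python function. This is only a sketch to make the rule concrete: the function name supplier_for and the sample serial numbers are hypothetical, and the returned strings are simply the supplier names used in this article.

```python
def supplier_for(serial_number: str) -> str:
    """Deduce the supplier name from a product serial number."""
    if serial_number.startswith("AC"):
        return "ACME Corporation"
    if serial_number.startswith("BR"):
        return "Brand Corporation"
    # Anything else is treated as manufactured in-house.
    return "In-house"


# Hypothetical serial numbers, one per rule.
for serial in ("AC-1001", "BR-2002", "ZX-3003"):
    print(serial, "->", supplier_for(serial))
```

Note that the prefix check is case-sensitive, which is an assumption about how serial numbers are stored.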
So we go ahead and add the supplier field to our model:
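A minimal sketch of what such a model might look like in models.py. The constant names ACME, BRAND, and INHOUSE are the ones referred to later in this article; their stored values, the field lengths, the SUPPLIER_CHOICES name, and the choice of INHOUSE as the default are assumptions made for illustration.

```python
from django.db import models


class Product(models.Model):
    # Supplier constants (stored values are assumed for this sketch).
    ACME = "ACME"
    BRAND = "BRAND"
    INHOUSE = "INHOUSE"
    SUPPLIER_CHOICES = [
        (ACME, "ACME Corporation"),
        (BRAND, "Brand Corporation"),
        (INHOUSE, "In-house"),
    ]

    serial_number = models.CharField(max_length=50)

    # The new mandatory field; the default applies to products created later.
    supplier = models.CharField(
        max_length=20,
        choices=SUPPLIER_CHOICES,
        default=INHOUSE,
    )
```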
We have set a default value for the supplier field. This makes sense for future products, but not for the ones we already have in our database. We know how to determine the supplier of a product, so how do we dynamically set it for each existing product during the migration?
Enter RunPython!
RunPython is a special operation that can run custom Python code during a migration. To use it, we need to add it to the end of the operations list in the automatically generated migration script:
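A sketch of what the edited migration might look like, assuming the app is named myapp and the model matches the description above. The dependency name 0001_initial, the field lengths, and the stored choice values are hypothetical; in a real project the AddField operation is whatever makemigrations generated, and only the import, the populate_supplier function, and the final RunPython entry are added by hand.

```python
from django.db import migrations, models

# Imported only for the ACME, BRAND, and INHOUSE constants. The alias keeps it
# distinct from the historical model fetched inside populate_supplier.
from myapp.models import Product as CurrentProduct


def populate_supplier(apps, schema_editor):
    # Historical model: reflects the schema at this point in the migration history.
    Product = apps.get_model("myapp", "Product")
    for product in Product.objects.all():
        if product.serial_number.startswith("AC"):
            product.supplier = CurrentProduct.ACME
        elif product.serial_number.startswith("BR"):
            product.supplier = CurrentProduct.BRAND
        else:
            product.supplier = CurrentProduct.INHOUSE
        # Write only the column we changed.
        product.save(update_fields=["supplier"])


class Migration(migrations.Migration):

    dependencies = [
        ("myapp", "0001_initial"),  # hypothetical previous migration
    ]

    operations = [
        # Generated by makemigrations: adds the column with its default value.
        migrations.AddField(
            model_name="product",
            name="supplier",
            field=models.CharField(
                choices=[
                    ("ACME", "ACME Corporation"),
                    ("BRAND", "Brand Corporation"),
                    ("INHOUSE", "In-house"),
                ],
                default="INHOUSE",
                max_length=20,
            ),
        ),
        # Added by hand: fill in the new column for the existing rows.
        migrations.RunPython(
            populate_supplier,
            reverse_code=migrations.RunPython.noop,
        ),
    ]
```

Saving row by row keeps the sketch easy to follow; for a large table, filtered update() calls on the queryset would likely be faster.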
The RunPython operation takes two callable arguments: code is called during a forward migration, and reverse_code during a reverse migration. In our case, we don’t need to do anything when reversing the migration; Django can simply drop the column as far as we are concerned. That is why we pass RunPython.noop as reverse_code. If we left reverse_code out entirely, the migration could not be reversed at all.
All the magic happens in populate_supplier. We retrieve all the products from the database, check their serial numbers, and set the supplier according to the rules.
Here we need to be very careful about which definition of the Product model we use, and for what purpose. To query the products, we should not use the Product model from models.py: it always reflects the latest code and may contain fields that do not yet exist in the database at the point where this migration runs. Instead, we need to get the model from the versioned app registry using apps.get_model(), which returns the model as it stood at this point in the migration history.
Also note that to access the class attributes ACME, BRAND, and INHOUSE, we needed to import the Product model from models.py, since these attributes are not available on the historical Product model returned by apps.get_model(). We imported myapp.models.Product under an alias to distinguish it from the Product model returned by apps.get_model().
Now we can run our migration, and the supplier field will be populated dynamically for each product based on its serial number. ✌️