I’ve recently come across the database schema migration toolkit that builts on top of the awesome SQLAlchemy ORM library named SQLAlchemy-Migrate. Since I’m working mostly on Pylons projects with extensive schema written using SQL Elixir, I was wondering if I could make SQLAlchemy-Migrate to work with my Elixir models. I’ve managed to make it work perfectly. But before I tell you how, some quick review of the tools and libraries for those of you new to the world of SQLAlchemy.
SQLAlchemy is an Object-Relational Mapper, ORM in short, that maps Python object classes to database objects, i.e.: tables, records, foreign-keys and all the other related stuff. Since SQLAlchemy is an ORM, you don’t have to write sometimes complicated SQL code for querying database. Instead you are working with Python objects and methods. SQLAlchemy adds overhead to SQL query execution times performance, but this tradeoff most of the time is worth it. Trust me. And besides, SQLAlchemy makes your project database engine independent, at least in theory.
Elixir is a declarative-style layer on top of SQLAlchemy that utilizes so called ActiveRecord design pattern. It allows you to use some shortcut methods known mostly from Rails ORM. Elixir is a library that fully interfaces SQLAlchemy and wraps some of the methods and declarations into one Entity. SQLAlchemy as of version 0.6 has declarative layer extension, but it doesn’t have those ActiveRecord methods. Using Elixir is easier than bare SQLAlchemy, but I should warn you, that without understanding SQLAlchemy first you’d be really lost. Elixir is much harder to debug when you stumble upon some kind of non-trivial error, since it’s a wrapper on top of SQLAlchemy.
However Elixir, with some caution and understanding on what you’re doing, can be really satisfactory to work with. Currently I’m working on advanced content management system and I can assure you that it’s model is really complicated. Without Elixir I would be lost long time ago, but since it’s very clean and transparent, just a quick glance at a class based on Elixir Entity gives you complete picture of what’s going on. I very much like it and would recommend using it, especially because you can mix Elixir and SQLAlchemy without any problems provided you know what you’re doing. But enough of this.
So. SQLAlchemy-Migrate. What is it? As I’ve said at the beginning it’s a database schema versioning toolkit built on top of SQLAlchemy. It doesn’t support Elixir out of the box, but with a little trickery you can make it work. What is database schema versioning? Well. It’s a record of changes in database structure between some arbitrary “points in time” called versions. Let’s suppose you have a table in Elixir model defined like this:
class Page(Entity):
id = Field(Integer, primary_key = True)
title = Field(Unicode(100))
content = Field(UnicodeText, required = True)
This is fairly basic Elixir entity that really works. That’s all. You just run setup_all() method from Elixir and you can use this entity in your code querying, updating, inserting and making changes in database. Let’s assume that you use this model and it behaves really well, but when you develop your application further, you find it rather limiting.
For example you wish you could add a new field, so the entity would automatically update creation date and time on inserting new Page record to database. You can alter your model’s Entity like this:
class Page(Entity):
id = Field(Integer, primary_key = True)
title = Field(Unicode(100))
content = Field(UnicodeText, required = True)
ctime = Field(DateTime, required = True, default = datetime.datetime.now())
It will work without any problems, but… Not on existing database. You would either have to issue SQL query directly on the database to bring it up to date with your defined model (i.e. add new column ctime) OR in case of Pylons run paster setup-app which will drop your tables and recreate database from scratch. This is acceptable for new project or development databases, but not on testing (to some degree) or especially production ones, since you would loose all your data.
Here comes SQLAlchemy-Migrate. It’s an extension and toolset built on and working with SQLAlchemy models. It’s intended for schema migration, which is exactly resolving our problem changing and keeping up to date our database table’s structure with our application’s model. Migrate would create repository of changes for your application and create additional table in the database called migrate_version. After creating the repository your database would be at version zero. Official documentation states that this version should represent empty database, in our case empty model.
I’ll digress here a little bit. I’m mostly agreeing with this approach, because it ensures that you’ll build your model from the scratch and then use Migrate to upgrade it to current or desired schema version. There are however some cases when I find it unacceptable and not worth the work. One such case is the CMS I’m currently working on. It’s almost three years worth of work, containing nearly hundred tables, numerous Entity mix-ins, very deep relations of ManyToOne, OneToMany, ManyToMany and other, each table and model class has multitude of methods, decorators and much much more. Rewriting this model according to official documentation guidelines would be really hard work giving virtually no benefits.
What I wanted was to have an established baseline database model that could be extended for this CMS and changes could be committed to the baseline version and further. I need such kind of arrangement, because current version of this application with baseline model is deployed in production for few clients, that I have an agreement with for upgrade, when new versions of the software would be available. This way I could write migration scripts and upgrade their system instances to the latest version, without all the tedious work on writing alternation and schema change scripts in pure SQL, dumping current data and readjusting it to new schema.
SQLAlchemy-Migrate allows me to do this, because each change is registered sequentially as new version of the schema. For each change I write a script with two methods: upgrade and downgrade. First one brings changes up to date and the second drops those changes if you would like to return to previous version of the database schema.
Basically setting a baseline model is a good idea. Official Migrate docs state that database should be empty, but they don’t restrict it such as version zero of migration repository must be empty. Probably it’s related to the problems that might occur when working in teams and you would need to quickly bootstrap new application instance, but I won’t digress here. You can find much information on this topic in the Internet.
The only problem might be that with baseline model must be always in current version of migrate repository. Using your application models IS NOT recommended. There are many problems that can arise from such arrangement and that’s why, according to the Zen Of Python: “Explicit is better than implicit” which means that you have to keep two similar models – one in your application and one in the repository.
Yes. That’s somehow doubling the work, but when you setup a baseline just like me, you can simply track only changes in your migration repository model, updating application model accordingly. Shouldn’t be much work and it’s a good tradeoff for a comfort of having tested and verified database structure upgrade path. And most of those problems that can arise for keeping in sync your two models could be mitigated to some degree by using Mercurial or some other version control system.
Phew. Let’s get back to our problem. How to make SQLAlchemy-Migrate work with Elixir? Let’s suppose that you’d be interested in adding migration facilities to your existing project. I’d assume that you installed sqlalchemy-migrate via pip or easy_install. If so, you’re ready to go.
This is how I’ve setup my migrate repository. First step is to create a migration repository. I have assumed that it will reside under my project’s root directory. The project is under VirtualEnv control and it’s main directory is a subdirectory within the VirtualEnv root. So from my main project directory I have issued:
migrate create migration "Project Repo"
What this basically does is to create a migration subdirectory in the main project’s directory that will hold the sqlalchemy-migrate repository named “Project Repo”. After successful repository creation you must put your database under control. If you’re using Pylons, check out development.ini option sqlalchemy.url, which will give you necessary credentials for your database. To put your database under migrate version control you have to issue:
migrate version_control postgres://login:password@host/database
This will initialize your database for the version control with migrate repository version zero. You also can create a support script which will hold the credentials for your database, so you wouldn’t have to pass those to the migrate script. It’s a convenience script which wraps around migrate. We’ll call it migrate.py and it can be automatically created by typing in this command:
migrate manage manage.py --repository=migration --url=postgres://login:password@host/database
This will create migrate.py script in your project’s root directory, for repository residing under migration subdirectory and with credentials specified by –url option. Note that I’m using PostgreSQL engine, but you can use whatever engine that SQLAlchemy supports. Note however that SQLite doesn’t support full ALTER statements in SQL, so you might not get results that you’ve expected when using sqlalchemy-migrate. Now you can use:
python manage.py
as a shortcut for your migration repository administration. You can issue:
python manage.py version
to check at what version your migration repo is at and you can also use:
python manage.py db_version
to see what version of the migration model your database is currently at.
Ok. So we’re finally getting to the juicy part. We have our migration repository setup and ready to use. Let’s establish that our baseline model would be the first version of Page Elixir Entity presented in this post (this without ctime field). We want to create a script that will allow us to create a script for updating our baseline model with this option. Let’s say that we would name this change “Add ctime DateTime field to Page”. We can issue:
python manage.py script “Add ctime DateTime field to Page”
This will create a script named migration/versions/001_Add_ctime_DateTime_field_to_Page.py in the migration/versions subdirectory. It will look like this:
from sqlalchemy import *
from migrate import *
def upgrade(migrate_engine):
# Upgrade operations go here. Don't create your own engine; bind migrate_engine
# to your metadata
pass
def downgrade(migrate_engine):
# Operations to reverse the above upgrade go here.
pass
There are two functions: upgrade – which brings database up to date to current version of the script (in this case version 1) and downgrade – which will rollback all the changes created by this version’s changeset (given of course you write the method code).
Since we’re using Elixir and not pure SQLAlchemy, it needs additional setup for each change script. Also having baseline model as our version zero for the migration, we must make sure that we won’t introduce conflicts with our model. If you’re probably familiar with SQLAlchemy, you know that Metadata object instance holds database metadata including database engine bindings. As you can see upgrade and downgrade methods take migrate_engine as their parameter. This is the engine specified by –url option on the migrate script, in our case manage.py which will feed the engine automagically. You have to import additional modules in our script:
from sqlalchemy import *
from migrate import *
from elixir import *
import elixir
This will import Elixir and it’s namespace, so we can use Entity, Field and it’s definitions directly, but will also allow us to utilize elixir namespace. This would be needed for engine binding to Elixir’s metadata, since we don’t want to use pure SQLAlchemy metadata. We want to update Entity (and database table) Page accordingly. To do this we should declare Page class which inherits from Entity like this:
import datetime
class Page(Entity):
using_options(tablename = 'page', autoload = True)
ctime = Field(DateTime, required = True, default = datetime.datetime.now())
This sets up Elixir Entity named Page, which will take our baseline model and add additional field named ctime. This however doesn’t do anything, except loading our Page model from database and declaring additional table field. Take a look at using_options. It has two parameters:
- tablename – which holds a value of page; and
- autoload – which is set to True.
What those options do? Option tablename is specifying table name in the database this model will resolve to. This is important, because sometimes when using ManyToMany, Elixir creates distinct names. Second autoload set to True, states that Elixir should load table definition from the database in a way our previously defined fields would be available to us, namely: id, content and title. Defining ctime would cause addition, not substitution of the table fields and in our model.
Note: You could also forget about “autoload = True” in the using_options if you had specified at the top of the script, just right after imports a directive:
options_defaults["autosetup"] = True
But since, we want to have control over our model, I didn’t use it. If you think it’s safe (not many relations), go ahead and use it. Just know what you’re doing. This Entity basically creates a model foundation for upgrade and downgrade functions. So let’s get into them. First we’ll take an upgrade:
def upgrade(migrate_engine):
# Upgrade operations go here. Don't create your own engine; bind migrate_engine
# to your metadata
elixir.metadata.bind = migrate_engine
Page.table.c.ctime.create()
Wait a second, you would say. Everywhere you’ve looked tutorials bound metadata to global metadata variable which was instantiated as metadata = MetaData(), even those describing Elixir usage with migrate. We’ll. Unfortunately they are all wrong.You have to bind your database engine to Elixir metadata and not the pure SQLAlchemy metadata. This way we don’t need any global MetaData instance and we will be sure that there are no problems with engine_dialect not found errors. That’s basically the magic trick. You could also bind metadata directly to the Elixir Entity table, but that’s too much fuss. If you need that kind of granularity, you probably wouldn’t be reading this tutorial after all. 😉
To actually create a new column for the table, we have to actually point to the table’s column definition and call .create() method on it, so that migrate would create this column with SQL’s ALTER statement. Just remember that SQLite has some limitations. PostgreSQL and MySQL should be perfectly fine. Ok. This is pretty straightforward. Without delay, we’ll do something analogous with downgrade function:
def downgrade(migrate_engine):
# Operations to reverse the above upgrade go here.
elixir.metadata.bind = migrate_engine
Page.table.c.ctime.drop()
This causes the column definition to be dropped from database. You can now test our repository change script typing at your command prompt:
python manage.py test
And you should see:
Upgrading...
done
Downgrading...
done
Success
This will confirm that migration upgrade to version 1 and downgrade to version 0 is working fine. I could end just here, because that’s almost all you need to know, but unfortunately migrate is a little bitchy with Elixir, when using imported model Entities for using in relations. And with Elixir relations also. I have eaten my teeth on it, but I’ve finally managed to get it up and running perfectly.
Let’s suppose that we want to create another Entity named User, which for clarity would be defined as:
class User(Entity):
id = Field(Integer, primary_key = True)
login = Field(Unicode(30), required = True, unique = True)
and it would reside somewhere else, like project_directory.model.entities subdirectory. We want to create a ManyToOne Elixir relation in our Page model with the field name of created_by. Migrate needs a little trickery for this to work. So, let’s suppose that instead of ctime field, we want this created_by relation. We must edit our Page entity accordingly:
from project.model.entities import User
class Page(Entity):
using_options(tablename = 'page', autoload = True)
created_by = ManyToOne('User')
This will initialize our model instance of Page with relation to User entity. However, migrate does not know anything about ManyToOne relation, so we would have to fallback to our old, good SQLAlchemy within upgrade and downgrade functions. Since Elixir uses Integer fields for ManyToOne relations with the name constructed as: <RELATION_FIELD_NAME>_<PRIMARY_KEY_FIELD>, given RELATION_FIELD_NAME = created_by, and PRIMARY_KEY_FIELD = id, we would end up with created_by_id field of Integer type. We have to create a relation manually within the upgrade function like this:
def upgrade(migrate_engine):
# Upgrade operations go here. Don't create your own engine; bind migrate_engine
# to your metadata
elixir.metadata.bind = migrate_engine
clmn = Column('created_by_id', Integer)
clmn.create(Page.table)
cbid_fk = ForeignKeyConstraint([Page.table.c.created_by_id], [User.table.c.id])
cbid_fk.create()
Index('ix_page_created_by_id', Page.table.c.created_by_id).create()
This will add created_by_id Integer field that will be used for relation. Note that we have to define this relation manually. cbid_fk variable would create ForeignKey Constraint on this Integer field which relates to User.id field and adds it to database table definition. Index will create index on this field to speed up lookup. This is how a full ManyToOne relation from Elixir looks under SQLAlchemy. Determining names for ForeignKey and Index should be pretty straightforward for you.
By analogy we define downgrade function. This time however we only need to drop this Integer field. Migrate will take care of removing ForeignKey and Index for us. That’s how this function looks like:
def downgrade(migrate_engine):
# Operations to reverse the above upgrade go here.
elixir.metadata.bind = migrate_engine
Page.table.c.created_by_id.drop()
That’s how you can use Elixir with SQLAlchemy-Migrate using also relations. Perhaps this should be designed better on behalf of Migrate, but this package is pretty awesome and I commend it’s developers with a really useful and easy to use package, that’s really well documented.
Feel free to post comments if you have suggestions for improvement or perhaps better way to force Migrate to work in accord with Elixir. That’s all folks for now. Hopefully you found this useful. Cheers.