strapyourself.in and flouri.sh

Duplicate Migrations in Rails

November 6th, 2007

Why we need duplicate migrations

Have you ever been working on a large project, and had people check in migrations with the same numbers? It has happened to me at least 10 times in the last year. In each case, the situation is recoverable, but it sometimes requires a lot of manual rolling back of specific migrations, possibly on several machines. Then you have to renumber all the migrations after the conflict, of course.

An even worse situation is when a project is branched and remerged. For example, you might want to branch out several complicated features from trunk for a few weeks, then bring them back when complete. Assuming you create 2 feature branches (for adding profiles and friends to your users), you could end up with something like this:

  • 036_modify_users_to_include_first_name.rb
  • 037_create_profiles.rb
  • 037_create_friendships.rb
  • 037_fix_a_bug.rb
  • 038_add_timestamps_to_friendships.rb
  • 038_modify_accounts_to_limit_length.rb
  • 039_modify_users_to_include_gender.rb

In the above situation, the person merging the two branches has a difficult job ahead. Everyone working on the project is probably on revision 37 (profiles branch), 38 (friends branch), or 39 (trunk). The safe way to proceed with traditional Rails migrations is to force every machine to migrate down to 36. No new migrations can be added while the existing ones are renumbered to range from 036 to 042. Finally, all users can update from trunk and run rake db:migrate. Of course, people often forget to migrate down, and end up stuck in the middle of a sequence of migrations that has been renumbered (I am so tired of reversing migrations by hand).

Solution: Allowing duplicate migration version numbers

In the above example, the 3 migrations numbered 37 are not dependent on each other in any way. Because migrations with duplicate version numbers were necessarily developed independently, they very rarely depend on one another. For this reason, we believe that it is usually safe to create a "partial ordering" of migrations rather than an exact ordering. In this partial ordering (which can be represented as a lattice), migrations with the same version number can be run in an arbitrary order:

[Figure: migration lattice]

Since all of the dependencies in the above lattice flow downward, we can satisfy the partial ordering by running the migrations in sorted order, sorting first by version number and then alphabetically by class name. This only works if we can assume that a newly added migration depends only on migrations with smaller version numbers.
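
To make that ordering concrete, here is a minimal sketch in plain Ruby. It is not taken from the plugin: sort_migration_files is a hypothetical helper, and it sorts on the part of the filename after the version number as a stand-in for the class name.

```ruby
# Hypothetical helper (not part of the plugin): orders migration filenames
# by numeric version first, then by the rest of the filename, so that
# migrations sharing a version number always run in the same order.
def sort_migration_files(filenames)
  filenames.sort_by do |filename|
    version, name = File.basename(filename, ".rb").split("_", 2)
    [version.to_i, name.to_s]
  end
end

files = [
  "037_fix_a_bug.rb",
  "039_modify_users_to_include_gender.rb",
  "037_create_profiles.rb",
  "036_modify_users_to_include_first_name.rb",
  "038_modify_accounts_to_limit_length.rb",
  "037_create_friendships.rb",
  "038_add_timestamps_to_friendships.rb"
]

puts sort_migration_files(files)
```

The three 037 migrations end up next to each other, after 036 and before both 038 migrations, which is all the partial ordering asks for.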

How the plugin works

The traditional Rails schema_info table stores only a single version number, which is not enough information to keep track of which migrations have been run. We need to adopt a new schema format, which we place in a new schema_infos table:

[Figure: schema_infos schema]

In this new schema, every record represents a migration that has been run. By traversing this table, we can get an accurate picture of the state of the system, and decide which migration to run next.

If we want to migrate to version 10, for example, we create a sorted listing of all migrations with version numbers up to and including 10 (there may be several numbered 10). Then we traverse that list in order, running "up" on each migration that has not been run previously and inserting a record into schema_infos. Finally, we create a list of migrations with version numbers larger than 10, and run "down" on those that have been run, in reverse order, removing their entries from schema_infos.
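
The same procedure can be sketched without a database. The Migration struct, the plan_migrations helper and the in-memory set standing in for schema_infos are all assumptions made for illustration; the plugin's own code works against the real table.

```ruby
require "set"

# Hypothetical stand-in for a migration class: a version number plus a name.
Migration = Struct.new(:version, :name)

# Returns [ups, downs] for a target version.
# all_migrations: every migration found on disk.
# applied: Set of migration names, standing in for the rows of schema_infos.
def plan_migrations(all_migrations, applied, target_version)
  ordered = all_migrations.sort_by { |m| [m.version, m.name] }
  before, after = ordered.partition { |m| m.version <= target_version }

  # "up" applies to migrations at or below the target that have not run yet.
  ups = before.reject { |m| applied.include?(m.name) }
  # "down" applies, in reverse order, to migrations above the target that have run.
  downs = after.reverse.select { |m| applied.include?(m.name) }
  [ups, downs]
end

migrations = [
  Migration.new(37, "CreateProfiles"),
  Migration.new(36, "ModifyUsersToIncludeFirstName"),
  Migration.new(38, "AddTimestampsToFriendships"),
  Migration.new(37, "CreateFriendships"),
  Migration.new(38, "ModifyAccountsToLimitLength")
]
applied = Set["ModifyUsersToIncludeFirstName", "CreateFriendships", "AddTimestampsToFriendships"]

ups, downs = plan_migrations(migrations, applied, 37)
puts "up:   #{ups.map(&:name).join(', ')}"
puts "down: #{downs.map(&:name).join(', ')}"
```

Here a machine that was on the friends branch and wants to end up at version 37 only needs to run "up" on CreateProfiles and "down" on AddTimestampsToFriendships.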

A little under the hood

Below is the main migrate function. It does exactly what is discussed in the previous section:

def migrate_with_duplicates
  # Migrations up to the target version: run "up" on any not yet recorded in schema_infos.
  migration_classes_before(@target_version).each do |(version, migration_class)|
    next if schema_information_contains?(migration_class)
    ActiveRecord::Base.logger.info "Migrating up #{migration_class} (#{version})"
    migration_class.migrate(:up)
    insert_schema_information(migration_class)
  end

  # Migrations past the target version: run "down" on any that are recorded as run.
  migration_classes_after(@target_version).each do |(version, migration_class)|
    next unless schema_information_contains?(migration_class)
    ActiveRecord::Base.logger.info "Migrating down #{migration_class} (#{version})"
    migration_class.migrate(:down)
    remove_schema_information(migration_class)
  end
end

What would be even better...

I've always wanted to write a migration system based on partial orderings where dependencies are explicit, and version numbers are history. Such a system would work nicely on top of the new schema_infos table format. The tricky part would be how to state the dependencies without forcing the migration author to work too hard.
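
As a rough idea of what explicit dependencies could look like, here is a sketch that uses Ruby's standard TSort module to turn declared dependencies into a run order. The MigrationGraph class and the dependency hash are hypothetical; nothing like this exists in the plugin, and it sidesteps the hard part, which is how the author would declare the dependencies in the first place.

```ruby
require "tsort"

# Hypothetical registry: maps each migration name to the names it depends on.
class MigrationGraph
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  # TSort hook: yield every node in the graph.
  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  # TSort hook: yield the migrations that must run before the given one.
  def tsort_each_child(name, &block)
    @dependencies.fetch(name).each(&block)
  end
end

graph = MigrationGraph.new(
  "CreateUsers" => [],
  "CreateProfiles" => ["CreateUsers"],
  "CreateFriendships" => ["CreateUsers"],
  "AddTimestampsToFriendships" => ["CreateFriendships"]
)

# tsort lists dependencies before the migrations that need them,
# and raises TSort::Cyclic if the declared dependencies form a loop.
run_order = graph.tsort
puts run_order
```

With dependencies stated this way, version numbers are no longer needed to decide the order: any order that TSort produces respects the partial ordering.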

Download

From the ELC plugin repository: http://wush.net/svn/public/plugins/duplicate_migrations

To install:
./script/plugin install -x http://wush.net/svn/public/plugins/duplicate_migrations
(installing automatically creates the schema_infos table and populates it, but does NOT delete your old schema_info table... don't panic!)

Originally posted on ELC's blog
