ruby! food! kids! … and other fun from terry heath
RSS icon Email icon Home icon
  • Guidification

    Posted on October 3rd, 2010 terry 4 comments

    Introduction

    About 4 months ago, I had a conversation with a coworker. It went something like this:

    J: “I noticed we’re using integers for our IDs. Why don’t we use GUIDs?”
    Me: “We don’t have multiple sources of data creation, and incrementing IDs are easier to keep up with. I’ll let you know if that turns out to be wrong though.”

    About 3 weeks ago, I had this conversation:

    Me: “Shit. We need GUIDs.”

    We’re working on being able to replicate datasets and clone them into existing environments, and incrementing IDs allow for collisions, which would be bad news bears.

    There’s a few hiccups I ran into, a few gems that needed to be patched, and a few unexpected benefits that have come from guidifying our app.

    Before I get too deep into it, here’s the plugin we’re using: BMorearty/usesguid, and this is on Rails 2.3.9, so I’m not sure all the other things that will come up when we get to Rails 3.

    The Migration

    The first step in all of this was to convert the IDs to GUIDs. I talked with JBogard if he’d done anything similar, and he said they had a script that would look at the MSSQL metadata and figure out what columns to update from that. The script wouldn’t really work for us since we use MySQL, but it’s a good approach.

    Using that idea, I used ActiveRecord’s reflections and column to figure out what columns to change. The process itself was something like:

    1. Remove any indexes that could cause problems as you’re killing FK columns.
    2. Find every table with a PK ID item, and add a GUID column (can be done very fast using MySQL’s select uuid())
    3. Use ActiveRecord’s reflections to find tables looking at other tables, and add a GUID column for the FK and populate it with the other table’s GUID
    4. Delete the ID columns
    5. Add indexes back to your keys (don’t forget this step!)

    Here’s how it looked:

    I took out most of the project-specific stuff, but kept in the parent_id stuff since that’s most likely useful to someone else.

    Hiccups

    Anywhere we were writing explicit SQL, IDs had to be quoted. Unit tests caught all of these issues, thankfully.

    Anywhere we were passing non-quoted IDs in Javascript also had to be quoted, because a1232-444ds isn’t a valid expression.

    Anywhere we were checking to see if an ID was valid, we were doing it by saying id.to_i != 0. That still works, 9/36ths of the time. If your guid starts with a non-zero integer, it’ll work, otherwise, no good. We now use a regex to tell if something is a guid.

    Because some plugins (below) will have to use usesguid, I had to change the plugin load order to put usesguid first.

    We had some routes that would look for an integer-only-regex match to find :id parameters, so we had to change that to a guid-finding regex.

    Library Changes

    acts_as_list

    Had to vendor the gem to quote explicit SQL relying on IDs.

    acts_as_archive

    Had to vendor the gem to quote explicit SQL relying on IDs.

    acts_as_solr_reloaded

    I had to change acts_as_solr_reloaded’s PK to be a string instead of an integer and add usesguid to dynamic_attributes and locals.

    acts_as_audited

    Needed usesguid on audit.rb

    delayed_job

    DJ does a very simple regex check to see if something is a valid job, and if not, bails. Unfortunately, it just looks for Model:integer. This fails with guids, obviously, so I had to change the regex in performable_method to look for a GUID instead. See it here.

    usesguid

    I really like the 36 character string GUIDs. I don’t have a good reason for it, but I modified the usesguid library to not use .to_s22, because the 22 character ones look odd. It might be just me who cares about it, but it’s worth noting that if you’re sending GUIDs to other systems, there’s a chance they’ll see the 22 character string and tell you the GUID is invalid. So, there’s that.

    Unexpected Benefits

    I recently wrote a migration that imported 131,000 new records into an existing table, and set up associations along the way and did a little bit of geocoding magic. The migration took about 45 minutes. After doing all of that hard work, I mysqldump‘d those tables and changed the migration to just load the generated data. The migration now takes <5sec.

     

    4 responses to “Guidification” RSS icon