Posted on April 23rd, 2011
Update: David Genord pointed out a *huge* bug in my code, where jobs will never be retried; his fix has replaced the existing gist.
Our workers (DelayedJob) leak memory. Not heinously fast, but enough that monit bounces them fairly often. I tried out perftools.rb on one of our longer running jobs, reindexing contacts. I cut the job down to indexing only 25 contacts, and profiled objects instantiated. To index 25 contacts, which took about 25 seconds, the app instantiated 3.1 million objects.
Just in case the title is misinterpreted – DJ is not leaking memory. My code is.
With a little bit of memoization and a few short-circuits, I got it down to 800 thousand. That was nice, and it sped things up by an order of magnitude, but the jobs were still leaking memory.
So, I decided to steal a cool concept from the next async processor I really want to work with: Resque. I changed the worker to fork and wait for every job it performs. This means that there’s an overhead added to every worker of about 200MB, but that’s nothing compared to how bad things got if all the workers started sucking up huge chunks of memory at the same time.
One caveat: I tried to look around for someone who’d already done this and just incorporate those changes, but it turns out googling “delayed job fork” doesn’t reveal much. If there’s an existing project that’s already trying to do this for delayed job, please let me know.
The code changes around this are pretty simple: just changing the run method and adding a hook for handling fork-reconnection stuff (it looks like DJ has some hooks built for this, but I didn’t see them in my local copy).
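The gist with the actual patch didn’t survive in this copy, but the shape of the change is roughly this — a reconstruction of mine, with assumed method names, not David Genord’s fixed code:

```ruby
# Sketch of a fork-per-job worker loop (my reconstruction; names assumed).
# The parent forks for each job and waits for the child, so whatever
# memory the job leaks is reclaimed when the child process exits.
class ForkingWorker
  def run(job)
    pid = fork do
      after_fork        # re-establish connections in the child
      job.perform
      exit!(0)          # hard exit: skip at_exit hooks in the child
    end
    _, status = Process.wait2(pid)
    status.success?
  end

  # hook for fork-reconnection work, e.g. reconnecting ActiveRecord
  def after_fork
  end
end
```

The `exit!` matters: a plain `exit` in the child would run at_exit handlers (and flush shared state) that belong to the parent.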
Before this patch, if I ran 1 worker locally, after about half an hour my machine ran out of RAM (it has 8GB). If I ran 4, my computer was unusable. After the patch, I can run 4 workers locally with no perceived performance hit. Money shot:
Posted on October 6th, 2010
There are about 4 hours in the day where the caffeine vs fatigue level is just right, and I get a lot done. Unfortunately, a lot of the stuff I work on requires a lot of computer effort too, which at first led me to create multiple directories to work in.
After the environment variable thing I got working, everything was a lot more siloed, which was great, but I still had to either explicitly append “; say 'done'” to long-running commands or just check on them spontaneously.
I don’t like having my headphones on all the time, so I wanted to hook Growl into this.
Also, I wanted to release a gem named caffeine, but someone already has a gem named caffeine, and since I’d already made it work locally, I didn’t care enough to re-package it and figure out a new name. And meth has all the wrong implications. Color me lazy.
Anyway, I ended up adding an at_exit block to my rake and spec executables, and put the following script in my path. It’s resulted in my computer telling me verbally and with sticky notes what’s going on. Who needs a secretary when you have this?
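The script itself isn’t in this copy; my reconstruction of the idea looks something like this — `say` and `growlnotify` are the stock OS X and Growl command-line tools, everything else is an assumption:

```ruby
# Sketch (reconstructed, not the original script): added to the rake and
# spec executables so the machine reports the result out loud and as a
# Growl sticky note when the command finishes.
def notify(message)
  system("say", message)                      # verbal
  system("growlnotify", "-s", "-m", message)  # sticky note (-s = sticky)
end

at_exit do
  notify($!.nil? ? "done" : "failed")
end
```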
Posted on July 2nd, 2010
Posted on May 24th, 2010
One of the really good things I learned from some guy, we’ll call Donkey, is to have everyone write down deployment dependencies in a file, so if someone goes on vacation or whatever during a deploy, you at least have an idea of what you should do.
I put it in /README, and separate everything by sprints. It looks something like this:
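The example itself didn’t survive in this copy; it was something along these lines (the entries here are invented for illustration):

```
Sprint 12
- rake db:migrate
- add the new memcache host to the production config

Sprint 13
- restart the search daemon after deploy
- update the crontab on web1
```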
Even though everyone tries to remember to put things in there, there’s still a single point of failure, when the idiot who deploys it forgets to check the readme.
So, to prevent future self-indictments, I put a task that runs at the end of our cap script that prints the current release notes.
Ideally all of these are handled with a comprehensive deploy script, but that’s not always feasible/doable. Here’s the cap task, and if you make it the last thing that runs, you’ll get a friendly reminder at the end of every deploy for things you need to do.
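The task itself is missing from this copy, but the heart of it is just grabbing the last sprint section out of the README and printing it. A sketch (README contents invented for illustration; in the real setup this logic sits in a cap task wired up with `after "deploy", ...` so it’s the last thing that runs):

```ruby
# Sketch of the release-notes logic (my reconstruction, not the actual task).
def current_release_notes(readme)
  # the README is organized by sprints; the current one is the last section
  readme.split(/^(?=Sprint )/).last
end

readme = <<~README
  Sprint 12
  - rake db:migrate

  Sprint 13
  - restart the search daemon
  - update the crontab on web1
README

puts current_release_notes(readme)
```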
Posted on April 20th, 2010
Chrome view source
I ran into an incredibly confusing bug last week where I had some fields named user[field_x], but they were being posted to my Rails server as user[field_y]. I started to blame all sorts of crazy things (becoming superstitious, not knowing wtf could be happening).
I pulled my field-level identity map (another blog post). I wrote some Rack middleware to intercept the parameters before Rails’s ParamsParser could touch it and make sure they were good. Fail. Nothing worked. After 2 hours of fighting with it, unable to reproduce in tests or anything, I realized I wasn’t viewing the source of the content I was actually viewing.
Chrome sends another, new request to the server when you say view source. This means that you’re not viewing the source from the page you’re viewing. If the pages are static or if there’s no way server state affects your page, then you’re good. Otherwise you might lose 3 hours.
This bug was painful enough that I’ve dropped Chrome as my default browser and am back on Firefox. I hope that by reading this you’ll think of it when you do view source on Chrome and nothing makes sense.
I learned about git-stash over the weekend. It allows you to stash your changes on a stack, make some other changes to the codebase, do what you want with them, and then pull those stashed changes back out.
Example use case: I branch before everything I work on, but eventually I have to merge stuff back in. So I branch to make feature X, get it written, then merge it back into HEAD. Unfortunately, tests are failing. I’m not going to push my changes yet, but something needs to be changed for another developer who wants something to run locally. git stash --keep-index; do changes; git push; git stash pop. Done and done, and no broken builds!
A developer at our office was working with some relative time methods. These are inherently tricky to test, because if “next week” means “next business week”, you can’t just willy-nilly add 7 days to a day and do tests. Same with “this week”, etc.
Instead of doing a bunch of complicated date math that would arguably make the tests just as scary as the code, we found Timecop, which lets you freeze time for your tests. You can say, “I want Date.today to evaluate to 20Apr2010”, and it will. This allowed for significantly simpler tests and saved a ton of time.
Avoid toggle actions
Even on some of the more popular sites on the internets, there are a lot of AJAX actions that don’t handle network failures well. Backpack (at least on my iPhone) just keeps the in-progress icon going forever. We’re not doing anything super-critical with AJAX on our app, but we’d like to at least handle timeouts or failures gracefully.
The simple solution is to check for the HTTP response code and act appropriately (2xx? Yay! 4xx? BOO!). We decided to do this, but realized that it’s possible that your request makes it through to the server, then your network fails, and you get a timeout. What do we say happened on the client side?
If they’re single-direction actions (e.g., marking something done), then you can just say “Something went wrong,” and not worry about changing it back or anything. If it’s a toggle, though, you can’t make any guarantees. “Something went wrong,” sure, but should the user try again? What if it only went through for the first 3 items?
As such, we’re no longer writing toggle_x actions. Now we’ll handle that on the frontend, and have negative_x and positive_x methods.
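The distinction in plain-object terms (names here are mine, for illustration): the paired actions are idempotent, so retrying one after an ambiguous failure is always safe, while retrying a toggle can silently undo the change you meant to make.

```ruby
# Sketch: two idempotent actions vs one toggle.
class Item
  attr_reader :done

  def initialize
    @done = false
  end

  def positive_done!   # idempotent: retry as many times as you like
    @done = true
  end

  def negative_done!   # idempotent
    @done = false
  end

  def toggle_done!     # NOT idempotent: a retry after a hidden success
    @done = !@done     # flips the state right back
  end
end

item = Item.new
2.times { item.positive_done! }   # retried after a timeout: still done
2.times { item.toggle_done! }     # retried after a timeout: back where it started
```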
Posted on April 20th, 2010
Last week I talked with a guy who noted that the scheduling patch I’d blogged about for DJ was insufficient for scheduling something every day at 8AM. This is because running a job and then scheduling the job again after a success pushes the job out a little bit each time (e.g., in a week or so it might be running at 10AM).
I thought about trying to make a run_at class method that would let you specify it, like “run_at '8am'”, but that quickly fell through when I realized '8am' is ambiguous – 8AM every day? Every Wednesday? Every 4th week in July?
I fall in line with Brandon Keepers when I say that I don’t want to re-implement cron, because (imo) cron is one of the best scheduling tools out there already, and it comes packed onto every Unix distribution that ends with x.
Instead, I changed the run_every method to accept a block, so you can pick the next run time yourself if you’d like, or just keep using the same old (8.hours) syntax.
Think something like this:
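The embedded gist is gone from this copy; here’s a minimal sketch of the semantics described above — run_every stores either a fixed interval or a block that computes the next run time. The module and method bodies are my assumptions, not the actual patch (which lives in the repo linked below):

```ruby
require "date"

# Sketch (reconstructed): run_every with an interval or a block.
module Scheduling
  def run_every(interval = nil, &block)
    @next_run_rule = block || lambda { |last_run| last_run + interval }
  end

  def next_run_after(last_run)
    @next_run_rule.call(last_run)
  end
end

class NightlyReport
  extend Scheduling

  # always come back at 8AM the next day, so reschedule-after-success
  # can't drift later and later
  run_every do |last_run|
    d = last_run.to_date.next_day
    Time.new(d.year, d.month, d.day, 8, 0, 0)
  end
end
```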
If you like it, here’s the GitHub Repo.
Posted on March 21st, 2010
Most people who’ve worked with me know that I really love my editor, Vim. I use it for pretty much everything. I don’t have a super customized setup, but I have a few plugins, and some of them got to be incompatible as they matured, so I’ve stuck with the less mature, super awesome ones.
Posted on March 11th, 2010
Posted on March 3rd, 2010
Yesterday I wrote PeepingTom, a RubyGem that lets you write a script in PT’s DSL to monitor different servers.
I was happy with it, but wanted to add a verify method that let you pass in a regex to see if things were (somewhat) rendering as expected. Unfortunately, this made the DSL a little harder to maintain.
Up until that point, everything was a single method, and nothing was chained together. My first solution was to change verify and ping to be methods on Site, which would make code look like:
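The original snippet is gone; a reconstruction of that chained-method version (the HTTP fetch is stubbed out here so the sketch runs anywhere — the real gem does a GET):

```ruby
# Sketch (reconstructed): verify and ping as methods on Site,
# so checks chain like ordinary Ruby.
class Site
  attr_reader :url, :up, :matched

  def initialize(url)
    @url = url
  end

  def ping
    @body = fetch       # the real thing does an HTTP GET here
    @up = !@body.nil?
    self                # return self so calls chain
  end

  def verify(pattern)
    @matched = !!(@body =~ pattern)
    self
  end

  private

  # stand-in for the HTTP call, purely for illustration
  def fetch
    "hello from #{@url}"
  end
end

Site.new("http://example.com").ping.verify(/hello/)
```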
Unfortunately, now things look like Ruby, and not like my oh-so-fun PeepingTom. After talking with my boss about it, we decided it should look something like:
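Reconstructing the target syntax (the gist is gone; the stub bodies below are mine, purely to show how Ruby reads the line):

```ruby
# Stub definitions only to demonstrate the parse; the real methods
# live in the gem.
def with(pattern)
  pattern
end

def site(pattern)
  [:site, pattern]     # stand-in for the real site lookup
end

def verify(target)
  [:verify, target]    # stand-in for the real verification
end

verify site with(/hello/)   # read right-to-left: verify(site(with(/hello/)))
```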
This poses a little problem, because I already had a site method. Also, just a sidenote, Ruby parses these things right-to-left, so that statement is equivalent to verify(site(with(/hello/))).
So the first thing I needed was a with method. It turns out this is really easy to write, and isn’t even necessary, but it makes the DSL a lot more readable:
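Reconstructed from the description above, the whole method is a pass-through — it adds nothing mechanically, it’s only there so the DSL reads like English:

```ruby
# `with` just hands back whatever it's given.
def with(pattern)
  pattern
end
```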
But then I needed to do something with my site method. The existing site method creates and registers a new site with PeepingTom. Watch out, here come the aliases:
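The gist with the aliases didn’t survive; here’s my guess at the shape of it (names and structure assumed): the registering version of site is kept around under an alias, and a bare site inside a watch block becomes a proxy for the site currently being checked.

```ruby
# Reconstruction of the alias dance (assumed names, not the real gist).
Site = Struct.new(:url)

def site(url)
  @sites ||= []
  @sites << Site.new(url)
  @sites.last
end

# keep the registering behavior under another name...
alias register_site site

# ...so `site` can be redefined as a proxy inside watch blocks
def watch(url, &block)
  @current_site = register_site(url)
  instance_eval(&block)
end

def site
  @current_site   # the site you expect to be iterating over
end
```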
Note that this means new watch blocks need to get rid of the |site| part. That’s now a method that proxies things out to the site that you expect to be iterating over. It makes it look a little more like English, too. Win-win.
I’m pretty happy with PeepingTom now; I think the only addition in the next few days will be a Campfire channel.
Posted on March 2nd, 2010
I did it as a good way to send out pics from our trip, and because jQuery makes things like that ridiculously fun.
Anyway, I had the source closed, because there were API keys and things on it. I decided tonight to buckle down and read about how to permanently delete things from a repository (hopefully successfully!), and open sourced it.