ruby! rails! kids! oh my! … and other fun from terry heath
  • Scheduling Jobs with DelayedJob

    Posted on February 19th, 2010 terry 3 comments

    For a new project at work, I was tasked with reevaluating our async worker system. Previously we’d been using beanstalkd with a jobs model that used AASM for persistence and kept track of when something should be scheduled (the scheduling itself was done by a rake task). After looking at the different options available, I decided DelayedJob was a better way to go, because it involved fewer dependencies (no more cron-rake work, no more beanstalkd) and had the persistence built in, instead of relying on our custom code.

    One shortcoming, though, seemed to be scheduling recurring jobs. After thinking about different ways of keeping track of which jobs should be scheduled, I figured it’d make the most sense for jobs to just reschedule themselves before execution.

    That said, I didn’t want to paste “reschedule myself in 24 hours” boilerplate into every class that had a perform() method, because that makes it harder to maintain and to grok which jobs are scheduled, and it’s not nearly as readable.

    Enter ScheduledJob, a mixin that gives you a run_every method on your class.

    With this, you can create a class, include ScheduledJob, and then just say “run_every 24.hours” or whatever.
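
    Here is a minimal sketch of how such a mixin and a job using it could look. It is not the code originally embedded in this post: the Rescheduling module, schedule_next, ScheduledJob.queue, InMemoryQueue, and NightlyCleanup are all names made up for illustration, and intervals are plain seconds so it runs without Rails or the delayed_job gem.

```ruby
# Hypothetical sketch, not the original code from this post.
# Intervals are plain seconds so it runs without Rails; with ActiveSupport
# loaded you could write `run_every 24.hours` instead.
module ScheduledJob
  class << self
    # Assumption: any object responding to enqueue(job, run_at). In a real
    # app this would be a thin wrapper around Delayed::Job.enqueue.
    attr_accessor :queue
  end

  # Prepended to the job class so it wraps the job's own perform method.
  module Rescheduling
    def perform
      self.class.schedule_next # queue the next run before doing the work
      super
    end
  end

  module ClassMethods
    # With an argument, sets the interval in seconds; without one, reads it.
    def run_every(interval = nil)
      @run_interval = interval if interval
      @run_interval
    end

    # Enqueue a fresh instance to run one interval after `now`.
    def schedule_next(now = Time.now)
      ScheduledJob.queue.enqueue(new, now + run_every)
    end
  end

  def self.included(base)
    base.extend(ClassMethods)
    base.prepend(Rescheduling)
  end
end

# In-memory stand-in for the DelayedJob table, only for this example.
class InMemoryQueue
  attr_reader :entries

  def initialize
    @entries = []
  end

  def enqueue(job, run_at)
    @entries << [job, run_at]
  end
end

class NightlyCleanup
  include ScheduledJob
  run_every 24 * 60 * 60 # 24 hours

  def perform
    :cleaned # the real work would go here
  end
end

ScheduledJob.queue = InMemoryQueue.new
NightlyCleanup.new.perform # does the work and queues tomorrow's run
```

    In a real app the queue object would hand the job to Delayed::Job.enqueue along with the run_at time (the exact arguments depend on the delayed_job version), and something still has to enqueue the very first run, for example a one-off call to NightlyCleanup.schedule_next. Rescheduling before the work runs means a job that raises still has its next run queued.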

    Hopefully this is helpful. I’ll need to write some tests, and see if the DJ guys want to use it on GitHub.

     

    3 responses to “Scheduling Jobs with DelayedJob”

    • I don’t think delayed job is a great tool for recurring jobs (I personally just like cron), but this is one of the best solutions I’ve seen for adding scheduled jobs. If you want to add tests and send a pull request, I’ll pull it into the collectiveidea fork.

    • FYI: The GitHub guys have already moved away from DJ because MySQL is terrible at multi-user locks. (It might run better on Postgres.) Using a SQL database as a queue is never an optimal solution, and using it as cron is even less efficient.

      http://github.com/blog/542-introducing-resque

    • I don’t think anyone’s going to disagree that MySQL isn’t as good at queueing and locking as a dedicated queue, but for lots of applications that only do a few things asynchronously, it’s good enough and doesn’t have any added potential headaches.

      I think “optimal” is subjective here: if I can get something coded, out the door, and reasonably stable (and battle-tested, to boot) in a short amount of time, and then work on other, more important problems, that’s optimal. If we end up having a really good problem, like too much stuff going through the queue and MySQL locks causing trouble, I’m happy to solve it down the road with the added infrastructure and whatnot.