Tag: ruby

Reliable rake task execution

My News Sniffer project needs to regularly do some back-end stuff like checking a bunch of RSS feeds and downloading web pages. I do this with some rake tasks, which I call using the cron daemon. Recently I’ve been having problems where some tasks take a bit longer than usual to complete and end up running in parallel. This slows things down, which means more tasks end up running in parallel and then my little virtual machine eventually falls on its face under memory pressure.

I could implement some locking in my application, but it’s always good to avoid as much new code as possible so, in the good old *NIX fashion, I cobbled together a short bash script taking advantage of existing tools. It executes the given rake task in the given Rails root using the Debian/Ubuntu tool start-stop-daemon (provided by the dpkg package, which is therefore always installed). start-stop-daemon uses a pid file to keep track of the rake program for the given task, so it will never run a second concurrent instance of rake for this task. Cron just keeps trying to run it every 5 minutes or whatever, but only one instance of each task ever runs at a time.
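
For illustration, here is roughly what that wrapper does, rendered in Ruby rather than bash. This is a sketch and not the script itself: the rake path, the pid file directory and the method name guarded_rake_command are all assumptions made up for the example. It only builds the start-stop-daemon command line, using one pid file per task so that different tasks can still overlap each other.

```ruby
# Sketch only: builds a start-stop-daemon command line for one rake task.
# RAKE_BIN, PID_DIR and guarded_rake_command are assumptions for illustration.
RAKE_BIN = "/usr/bin/rake" # assumed location of rake
PID_DIR  = "/tmp"          # assumed pid file directory

# Returns the argv array that runs `task` from `rails_root` under
# start-stop-daemon, guarded by a pid file named after the task.
def guarded_rake_command(rails_root, task)
  # Replace characters such as ":" so the task name is safe in a file name.
  safe_name = task.gsub(/[^\w.-]/, "_")
  pidfile   = File.join(PID_DIR, "rake-#{safe_name}.pid")
  [
    "start-stop-daemon", "--start", "--quiet",
    "--pidfile", pidfile, "--make-pidfile", # record the pid before running
    "--chdir", rails_root,                  # run from the Rails root
    "--startas", RAKE_BIN,                  # the program to start
    "--", task                              # everything after -- goes to rake
  ]
end
```

A cron-driven script could then call something like system(*guarded_rake_command("/path/to/app", "some:task")). If the process recorded in the pid file is still alive, start-stop-daemon should decline to start another one and exit non-zero.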

Segfault in Ruby Ferret query parser

Whilst working with the Ruby text search engine library Ferret, I came across a segfault in the query parser. It had already been reported and fixed, but I realised it can lead to a denial of service.

If you use Ferret anywhere that allows users to execute queries, those users can crash the Ruby process with a specially crafted query.  This was quite serious for a number of my sites (not to mention slowing development of a current app) so I applied the fix to the released 0.11.4 source and repackaged it as

Obviously this isn’t in any way official, but it works for me and I’m sharing here for anyone else affected. Gem, tgz and zip here and just the patch available here (derived from the author’s changeset to trunk).

The patch is against the release source, as the subversion repository seems to be down atm (I got the changeset from the web-based subversion viewer).

Get upgrading!

local and remote subversion repositories with Capistrano 2

Peeking at the code of the upcoming Capistrano 2, I noticed you can define different scm variables for remote and local use, which is something I need (I was looking at the code in the hope it could do this :)

So, say I have my code stored in a subversion repository on my local disk, say file:///project/trunk. That’s fine for when Capistrano is querying the latest revision, but the remote servers need to use the repository url svn+ssh://mymachine/project/trunk.

Without modifying the code, this was impossible with Capistrano v1. With Capistrano v2, you can prefix any scm configuration variable with local_ and it will be used for local operations:

set :repository, "svn+ssh://mymachine/project/trunk" 
set :local_repository, "file:///project/trunk"

Leeds Ruby on Rails Talk

I’m talking about Ruby on Rails at the West Yorkshire Linux User Group on Monday 11th June 2007. I’ll be covering what Rails is, how it works, and how you use it. Starts at 1900hrs at the E.C Stoner (snigger) Building at the University of Leeds. There follows a talk about Sun’s ZFS file system by Tom Hall, then we retire to The Victoria Hotel pub for some real ale and whatnot.

I’ll be the tall one with the curly hair… stood at the front… talking about Ruby on Rails.

Directions and stuff to be found on the WYLUG website.

News Sniffer, Ferret and Rails

I’ve been working on my News Sniffer project for the last few days, finishing up a two month experiment with using the Ruby Lucene implementation, Ferret, to index news articles and comments.  More info on the News Sniffer blog.  The project spanned two months due to some instability in the newer versions of Ferret, but the author responded to the bug reports and managed to fix all the problems so I decided to deploy.

Ferret offers huge improvements over the original MySQL full-text search method, and I’m looking forward to adding some fancy keyword statistics graphs in the future – perhaps showing censorship patterns in bbc comments with certain keywords.

Because News Sniffer is distributed across a number of servers, I used DRb (distributed Ruby) to allow them all to update one central Ferret index.  DRb seems to work very well generally, and is amazingly simple to use, but I ran into a few problems with recycled objects and invalid references whilst using Ferret across it, apparently due to the garbage collector on the service side collecting things still in use on the client side.  I think I eliminated most of them but they still crop up once in a while – I’ll be looking into this further.
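
To give a flavour of the DRb setup, here is a minimal sketch. It is not the News Sniffer code: CentralIndex is a made-up stand-in so the example runs without the ferret gem, and the localhost URI with port 0 is just for the demo. The search method returns plain arrays of ids, which DRb copies by value, so the client never holds a reference to a server-side object that could be garbage collected. That is one way to sidestep the recycled object problem, assuming the diagnosis above is right.

```ruby
require "drb/drb"

# Hypothetical stand-in for a Ferret index so the sketch is self-contained.
# A real setup would wrap a Ferret index here instead.
class CentralIndex
  def initialize
    @docs = []
    @lock = Mutex.new # DRb handles each remote call in its own thread
  end

  # Add a document, given as a plain Hash (marshalled by value).
  def add(doc)
    @lock.synchronize { @docs << doc }
    true
  end

  # Return the ids of documents whose text contains the term.
  # An array of integers is marshalled by value, not passed by reference.
  def search(term)
    @lock.synchronize do
      @docs.select { |d| d[:text].include?(term) }.map { |d| d[:id] }
    end
  end
end

# Server side: port 0 asks the OS for any free port (demo assumption).
server = DRb.start_service("druby://localhost:0", CentralIndex.new)

# Client side: normally a different machine, pointed at the server's URI.
index = DRbObject.new_with_uri(server.uri)
index.add({ id: 1, text: "bbc comment about censorship" })
index.add({ id: 2, text: "rss feed item" })
puts index.search("censorship").inspect # prints [1]
```

In a real deployment the server would run as its own long-lived process and each worker would connect to its URI. Anything passed by reference instead (objects that include DRbUndumped, or that cannot be marshalled) needs something on the server to keep hold of it for as long as a client might use it.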


Daemontools and Ruby on Rails

Dan J Bernstein’s (djb) daemontools is a set of programs to help you manage Unix services. It provides a flexible, secure and convenient way of starting, stopping and sending signals to background processes. Combined with his ucspi-tcp tools, it can be used as an awesome replacement for inetd (it’s most often used in this way to run qmail, a secure and high-performance MTA). It can be fiddly to set up and has a bit of a steep learning curve but I already use daemontools for various other stuff, so it was just natural for me to use it for Ruby on Rails deployment.


Active Resource not in Rails 1.2!

Whilst planning some changes to my News Sniffer project, I thought I’d have a play with Active Resource.

Currently, all the forum and news article downloading and scraping happens on a different machine to the web server. It has a VPN connection to the database and memcache servers, but I’d like to integrate the Ferret text indexing system for better searching capabilities. To centralise Ferret, I have three options:

  1. regularly reindex new content from the database on the web server;
  2. DRb a Ferret Object;
  3. or use ActiveResource to access the models via the web service.

DRb-ing a Ferret Object would be quite elegant, but using ActiveResource would also replace the need for a database and memcache connection (and I could do much better fragment caching actually).

Anyway, I searched high and low for some docs – lots of blog entries about how great it is, but no real API docs. When I searched through the Rails code and found nothing either, I got suspicious. Finally I found a couple of blog entries stating that ActiveResource was dropped from Rails 1.2. It seems to be planned for Rails 2.0. Not sure how I missed this. I guess my search-fu is lacking.

I’ll be investigating other options. I’d much prefer not to build a SOAP or XMLRPC interface. Ugh.

Can I Compost This?

Louisa and I are announcing our latest project: Compost This. It’s a directory of items with information about their compostability.

For example Tea and Flour can be put on the compost heap, but Bindweed and Walnuts cannot. And it’s not always a good idea to put Orange peel on there either.

For geeks: I wrote Compost This using Ruby on Rails, which is one of the best web frameworks I’ve used, and I’m really starting to love the Ruby language too. I’ll release the code soon as an example.

I changed his life through webdev

A couple of months ago I was having an IM conversation with an old web developer work friend and asked him if he’d played with Ruby on Rails yet. He told me not and I enquired as to whether he’d been living in a mud hut within a rain forest for the last year. He told me not. I pointed him in the right direction and he said he’d take a look sometime.

Today, after no further conversation, I got this message from him:

(16:14:48) Sid: Hey John!! Just wanted to say thanks for introducing me to Ruby on Rails.. I’ve picked up on it and its changed my life. Now I’m working a contract for the government and dating a hot american chick from new york. btw – like the photo. its class.
(16:16:16) Sid logged out.

He used to be a Coldfusion developer. After finding Ruby on Rails he must look back on Coldfusion and laugh up hard matter from his lower intestine.

Anybody else want to comment on how I’ve changed their life? If you only met me once and had to spend the rest of your life avoiding me it still counts.