Visualising the Ruby Global VM Lock

I’m working on Ruby bindings for Ceph’s RADOS client library – it’s the first native C Ruby extension I’ve written so I’m learning lots of new things.

I’m keen to ensure my extension releases Ruby’s Global VM Lock (GVL) wherever it’s waiting on IO, so that other threads can do work, and I’ve written a few simple test scripts to prove to myself it’s working correctly. The result is a textual visualisation of how releasing the GVL can improve the behaviour of threads in Ruby.

For example, I just added basic read and write support to my library so you can read and write objects stored in a Ceph RADOS cluster. My first pass was written without releasing the GVL – it just blocks waiting until Ceph has completed the read or write.

My test script starts three threads: one doing rados write operations in a loop and outputting a “w” to STDOUT when they succeed, one doing rados read operations and writing an “r”, and one just doing some CPU work in Ruby and writing a “.”

This is the output from the script before I added GVL releases:

As you can see, it’s almost as if Ruby is switching round-robin style between the threads, waiting for each one to complete one iteration. In some cases, the cpu worker doesn’t get a look in for several read and write iterations!

So then I extracted the blocking parts out to a separate function and called them using Ruby 1.9’s rb_thread_blocking_region function, which releases the GVL, and then reran my test script:

As you can see, the thread doing CPU work in Ruby gets considerably more work done when the GVL is released. Those network-based IO operations can block for quite some time.

It’s exactly what is expected, but it’s neat to see it in action so clearly.
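
A rough sketch of a test script along those lines is below. It is not the script used above: fake_io is a hypothetical stand-in that sleeps instead of calling rados, and since sleep always releases the GVL, this only imitates the behaviour after the change. The iteration counts and timings are arbitrary assumptions.

```ruby
require 'thread' # Queue lives here on older Rubies; harmless on newer ones

# Hypothetical stand-in for a rados read or write. sleep releases the
# GVL while it waits, much like an extension that releases it around IO.
def fake_io(seconds = 0.01)
  sleep seconds
end

# Run one writer, one reader and one CPU worker for a fixed number of
# iterations each, and return the marks in the order they were recorded.
def run_workers(iterations)
  marks = Queue.new
  threads = []
  threads << Thread.new { iterations.times { fake_io; marks << "w" } }
  threads << Thread.new { iterations.times { fake_io; marks << "r" } }
  threads << Thread.new do
    iterations.times do
      10_000.times { |i| i * i } # some CPU work in plain Ruby
      marks << "."
    end
  end
  threads.each(&:join)
  Array.new(marks.size) { marks.pop }.join
end

puts run_workers(20)
```

Collecting the marks in a Queue rather than printing from each thread keeps the recorded order intact and makes the result easy to inspect afterwards.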

The code for the library is here on GitHub, but it’s under heavy development at the moment and is in no way complete – I’ve only pushed it out so early so I can write this blog post. And this is the commit showing just where I made the read/write operations release the GVL.

Beautiful command-line interface design talk

I spoke about writing beautiful command-line interfaces at Scottish Ruby Conference back in June and they’ve published the video, which is freely available for viewing now.

The slides are available here in pdf format (if you’re interested, they were made using emacs org mode and beamer).

There were loads of great talks recorded so check out the videos of them all here on the schedule.

Testing XML with rspec, xpath and libxml

I’m currently working with the virtualization API libvirt which uses XML to represent virtual machines and I’m generating this XML using Ruby.  I’m using rspec to test my code and wanted to test that my output was as I expected.  I started out with rspec-hpricot-matchers which worked fine until I started testing slightly more complex xml, which hpricot wasn’t handling well.

So I wrote a have_xml matcher using the rspec dsl which uses the libxml library to do the testing.  It’s so simple it’s not really worthy of a gem, so here it is (licensed under public domain).  The text check is optional and, to be honest, doesn’t belong here really.  It should be a separate matcher.

require 'libxml'

# Passes when the xpath matches at least one node and, if text is
# given, every matching node has exactly that content.
Spec::Matchers.define :have_xml do |xpath, text|
  match do |body|
    parser = LibXML::XML::Parser.string body
    doc = parser.parse
    nodes = doc.find(xpath)
    nodes.empty?.should be_false
    if text
      nodes.each do |node|
        node.content.should == text
      end
    end
    true # reaching this line means every check above passed
  end

  failure_message_for_should do |body|
    "expected to find xml tag #{xpath} in:\n#{body}"
  end

  failure_message_for_should_not do |body|
    "expected not to find xml tag #{xpath} in:\n#{body}"
  end

  description do
    "have xml tag #{xpath}"
  end
end

So, add that somewhere (usually spec/spec_helper.rb) and use it like this:

it "should include the xen_machine_id" do
  @xml.should have_xml('/domain/name', 'bb-example-001')
end

it "should include the network devices" do
  @xml.should have_xml "/domain/devices/interface[1]/ip[@address='']"
  @xml.should have_xml "/domain/devices/interface[1]/mac[@address='aa:00:01:02:03:04']"
  @xml.should have_xml "/domain/devices/interface[1]/script[@path='/etc/xen/scripts/vif-bridge']"
  @xml.should have_xml "/domain/devices/interface[1]/source[@bridge='inetbr']"
end
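
To try the same check outside rspec, here is a small sketch of the matching logic as a plain method. Note the swap: it uses REXML, which ships with Ruby, rather than libxml, purely so it runs without extra gems. The xml_matches? helper and the sample document are made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical helper mirroring the matcher's logic outside rspec.
# Returns true when xpath finds at least one node and, if text is
# given, every matching node's text equals it.
def xml_matches?(body, xpath, text = nil)
  doc = REXML::Document.new(body)
  nodes = REXML::XPath.match(doc, xpath)
  return false if nodes.empty?
  text.nil? || nodes.all? { |node| node.text == text }
end

# Made-up sample document, loosely shaped like a libvirt domain.
xml = <<-XML
<domain>
  <name>bb-example-001</name>
  <devices>
    <interface>
      <mac address='aa:00:01:02:03:04'/>
    </interface>
  </devices>
</domain>
XML

puts xml_matches?(xml, '/domain/name', 'bb-example-001')
puts xml_matches?(xml, "/domain/devices/interface[1]/mac[@address='aa:00:01:02:03:04']")
puts xml_matches?(xml, '/domain/uuid')
```

Returning a plain boolean, rather than calling should inside the check, keeps the helper usable anywhere and leaves the failure message to the caller.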

Xapian Fu: Full Text Indexing in Ruby

Xapian is an Open Source Search Engine Library written in C++. It has Ruby bindings, but they’re generated with SWIG, so they basically just mirror the C++ bindings – not very Ruby-like (and pretty ugly).

Being a self-confessed full text indexing nerd and a Ruby-lover, I wrote Xapian Fu: a library to provide access to Xapian that is more in line with “The Ruby Way”.

I started writing Xapian Fu exactly a year ago today but left it for a couple of months, then restarted work on it on the train on the way back from the 2009 Scotland on Rails conference.  Development was test driven, so it’s got an extensive test suite (using rspec).  Documentation is in rdoc and is quite detailed.  As of the latest version, it supports Ruby 1.9 too.

Xapian Fu basically gives you a Hash interface to Xapian – so you get a persistent Hash with full text indexing built in (and ACID transactions!).


For example, create a database called example.db, put three documents into it and search them and print the results:

  require 'xapian-fu'
  include XapianFu
  db = XapianDb.new(:dir => 'example.db', :create => true,
                    :store => [:title, :year])
  db << { :title => 'Brokeback Mountain', :year => 2005 }
  db << { :title => 'Cold Mountain', :year => 2004 }
  db << { :title => 'Yes Man', :year => 2008 }
  db.search("mountain").each do |match|
    puts match.values[:title]
  end

There are of course a whole bunch more examples in the documentation.

Ruby’s case statement uses ===

I’ve not found this stated clearly enough elsewhere so I’m doing so myself.

Ruby’s case statement calls the === method on the argument of each when clause, passing it the value given to case.

So, this example:

case my_number
  when 6883
end

Will execute 6883 === my_number

This is all fine and dandy, because the === method on a Fixnum instance does what you’d expect in this scenario.

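However, the === method on the Fixnum class itself does something different. It comes from Module and behaves like is_a? with the arguments swapped: Fixnum === x is true when x.is_a?(Fixnum) is true.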

That is cute, because it allows you to do this:

case my_number
  when Fixnum
    "Easy to memorize"
  when Bignum
    "Hard to memorize"
end

But it won’t work as you might expect in this scenario:

my_type = Fixnum
case my_type
  when Fixnum
    "Fixed number"
end

This won’t work because Fixnum === Fixnum returns false: the Fixnum class is an instance of Class, not an instance of Fixnum.
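
The three behaviours can be checked directly. One assumption to flag: this sketch uses Integer instead of Fixnum, because Ruby 2.4 merged Fixnum and Bignum into Integer and Ruby 3.2 removed the old constants. The === semantics are the same.

```ruby
# On an Integer instance, === compares values.
value_match = (6883 === 6883)

# On a class, === comes from Module and asks "is this an instance of me?"
instance_match = (Integer === 6883)

# A class is an instance of Class, not of itself.
class_match = (Integer === Integer)
class_of_class = (Class === Integer)

puts value_match    # true
puts instance_match # true
puts class_match    # false
puts class_of_class # true
```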

My workaround for this is to convert it to a string first. Not sure if that’s the best solution, but it works for me(tm).

my_type = Fixnum
case my_type.to_s
  when "Fixnum"
    "Fixed number"
end
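
Another option, offered as a sketch rather than the one right answer, is a case with no subject, where each when clause is an ordinary boolean test and == compares the classes themselves. The describe_type helper is made up for illustration, and it uses Integer since Fixnum no longer exists on current Rubies.

```ruby
# Hypothetical helper: describe a class rather than an instance.
# With no subject after `case`, each `when` is just a truth test,
# so === never gets involved.
def describe_type(type)
  case
  when type == Integer then "Whole number"
  when type == String  then "Text"
  else "Something else"
  end
end

puts describe_type(Integer)
puts describe_type(String)
puts describe_type(Float)
```

This avoids depending on how a class prints itself, at the cost of repeating the variable name in every clause.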

Song In Code: Ramones, I wanna be sedated

Just the first verse:

go = Proc.new { sleep 24.hours }
self.wants :sedation
begin ; nil ; end
case go ; where "no" ; nil ; end
self.wants :sedation
self.get '/airport'
self.put '/airport/plane'
before self.insane? do
  3.times { hurry! }
end
return if self.can_control? :fingers
return if self.can_control? :brain
5.times { "no" }

I recorded myself singing it, which is kinda stupid tbh.

I used mencoder to convert this to something Youtube found tasty. Like this:

mencoder -ss 15 -endpos 1:18 -vf pp=al:f,scale=480:360 -oac mp3lame -ovc lavc -lavcopts vcodec=libx264:mbd=1:vbitrate=2000 MOV01362.MPG -o MOV01362.x264

Also, pimp for another Geek/Ukelele project: Ukepedia, all 3 million Wikipedia articles one song at a time

My NWRUG Ferret Talk

I did a short talk on Ferret, the Ruby “Information Retrieval Library”, at the North West Ruby Users Group last Thursday.  We had a bit of a theme too, with Will Jessop speaking about Sphinx and Asa Calow speaking about Solr.

I got to have a bit of a nosey around the Manchester BBC building too – though I was worried I’d open the wrong door and end up on TV. Didn’t fancy having to apologise to Jeremy Paxman.

Brightbox also sponsored some pizza, and gave away t-shirts and stickers like candy (there was no candy though).

My slides are available here, and contain a little example file system indexer. I made my slides with webby and S6 if you’re interested.

Euruko Ruby Conference 2008 in Prague

I’m in Prague with Brightbox for the Euruko Ruby Conference 2008 from tomorrow evening until Monday morning. I’ll post photos to the Brightbox Flickr photostream as we go along.  If anyone wants to meet up for a drink, email me at john at johnleach dotty co dotty uk.

UPDATE: Photos here.

Leeds Ruby Thing #2, Thursday 6th March

The Leeds offshoot of the North West Ruby User Group is meeting again this Thursday, 6th March, 7:00 PM – 11:00 PM.  This time at Mr. Foley’s Cask Ale House, on The Headrow (formerly Dr. Okells).

Expect unstructured discussion of Ruby, Ruby on Rails and other random stuff plus nice people, great beer and coffee and geeky tshirts.

The balcony back room of Mr Foley’s has been booked.  Announce that you’re coming on the upcoming page.

Oh, and we now have a website: http://leedsrubything.org/

Leeds Ruby Thing, Victoria Hotel 7th Feb 2008.

Some of the people of the North West Ruby User Group (who usually meet in Manchester) have organised the first little Leeds get together.  No real name yet, so it’s the Leeds Ruby Thing for now.

No clear plan yet either, but expect unstructured discussion of Ruby and Ruby on Rails at least.

Thursday 7th February 2008 at 7pm in the Victoria Hotel pub. All welcome!

More details here: http://upcoming.yahoo.com/event/423116

North West Ruby User Group Talk: Building Brightbox

Oh, btw, I’m doing a talk tomorrow at the North West Ruby User Group in Manchester about how we do the Ruby on Rails hosting at Brightbox.

I’ll be talking about SANs, Centos, Ubuntu, Xen, Apache, Lighty, NGINX, MySQL and other goodies. Heck, I might even mention Ruby, which would be nice considering it’s a Ruby user group.

My business partner Jeremy will be nattering about the business side and various other things.

Update: A couple of photos here and here.

Rubinius multiple instances, one process

Rubinius has support (as of today!) for running multiple instances of it’s VM within one process, each VM on it’s own *native* thread, each VM running many ruby green threads. Each VM has it’s own heap and so each VM could load different apps that wouldn’t interfere with each other. We have plans for a mod_rubinius for apache that takes full advantage of this feature. Stay tuned ;)

Ezra Zygmuntowicz in a comment on Ruby Inside.

Very interesting stuff. Why bother making Rails thread safe when you have an awesome Ruby VM such as Rubinius? I’d like to see Mongrel (or FastCGI! Bring back FastCGI!) make use of this somehow, running multiple Rails instances itself in one process and distributing requests between them. I’m interested in knowing how it’d deal with memory leaks in external libraries though (the kind rmagick suffers from).

Still, you then lose finer-grained access to most of the nice UNIX process management stuff, like limiting memory usage with ulimits, but nobody seems to be using that for Ruby deployment anyway. It’s all fiddling around with Monit and such instead (why always with the steps backward!).