Visualising the Ruby Global VM Lock

I’m working on Ruby bindings for Ceph’s RADOS client library – it’s the first native C Ruby extension I’ve written so I’m learning lots of new things.

I’m keen to ensure my extension releases Ruby’s Global VM Lock (GVL) wherever it’s waiting on IO, so that other threads can do work and I’ve written a few simple test scripts to prove to myself it’s working correctly. The result is a textual visualisation of how releasing the GVL can improve the behaviour of threads in Ruby.

For example, I just added basic read and write support to my library so you can read and write objects stored in a Ceph RADOS cluster. My first pass was written without releasing the GVL – it just blocks waiting until Ceph has completed the read or write.

My test script starts three threads, one doing rados write operations in a loop and outputting a “w” to STDOUT when they succeed, one doing rados read operations and writing a “r” and one just doing some cpu work in Ruby and writing a “.”

This is the output from the script before I added GVL releases:

As you can see, it’s almost as if Ruby is switching round-robin style between the threads, waiting for each one to complete one iteration. In some cases, the cpu worker doesn’t get a look in for several read and write iterations!

So then I extracted the blocking parts out to a separate function and called them using Ruby 1.9’s rb_thread_blocking_region function, which releases the GVL, and then reran my test script:

As you can see, the thread doing CPU work in Ruby gets considerably more work done when the GVL is released. Those network-based IO operations can block for quite some time.

It’s exactly what is expected, but it’s neat to see it in action so clearly.

The code for the library is here on github, but but it’s under heavy development at the moment and is in no way complete – I’ve only pushed it out so early so I can write this blog. And this is commit showing just where I made the read/write operations release the gvl.

Beautiful command-line interface design talk

I spoke about writing beautiful command-line interfaces at Scottish Ruby Conference back in June and they’ve published the video, which is freely available for viewing now.

The slides are available here in pdf format (if you’re interested, they were made using emacs org mode and beamer.

There were loads of great talks recorded so check out the videos of them all here on the schedule.

Testing XML with rspec, xpath and libxml

I’m currently working with the virtualization API libvirt which uses XML to represent virtual machines and I’m generating this XML using Ruby.  I’m using rspec to test my code and wanted to test that my output was as I expected.  I started out with rspec-hpricot-matchers which worked fine until I started testing slightly more complex xml, which hpricot wasn’t handling well.

So I wrote a have_xml matcher using the rspec dsl which uses the libxml library to do the testing.  It’s so simple it’s not really worthy of a gem, so here it is (licensed under public domain).  The text check is optional and, to be honest, doesn’t belong here really.  It should be a separate matcher.

require 'libxml'

Spec::Matchers.define :have_xml do |xpath, text|
  match do |body|
    parser = LibXML::XML::Parser.string body
    doc = parser.parse
    nodes = doc.find(xpath)
    nodes.empty?.should be_false
    if text
      nodes.each do |node|
        node.content.should == text

  failure_message_for_should do |body|
    "expected to find xml tag #{xpath} in:\n#{body}"

  failure_message_for_should_not do |response|
    "expected not to find xml tag #{xpath} in:\n#{body}"

  description do
    "have xml tag #{xpath}"

So, add that somewhere (usually spec/spec_helper.rb) and use it like this:

it "should include the xen_machine_id" do
  @xml.should have_xml('/domain/name', 'bb-example-001')

it "should include the network devices" do
  @xml.should have_xml "/domain/devices/interface[1]/ip[@address='']"
  @xml.should have_xml "/domain/devices/interface[1]/mac[@address='aa:00:01:02:03:04']"
  @xml.should have_xml "/domain/devices/interface[1]/script[@path='/etc/xen/scripts/vif-bridge']"
  @xml.should have_xml "/domain/devices/interface[1]/source[@bridge='inetbr']"

Xapian Fu: Full Text Indexing in Ruby

Xapian is an Open Source Search Engine Library written in C++. It has Ruby bindings, but they’re generated with SWIG, so they basically just mirror the C++ bindings – not very Ruby-like (and pretty ugly).

Being a self-confessed full text indexing nerd and a Ruby-lover, I wrote Xapian Fu: a library to provide access to Xapian that is more in line with “The Ruby Way”.

I started writing Xapian Fu exactly a year ago today but left it for a couple of months, then restarted work on it on the train on the way back from the 2009 Scotland on Rails conference.  Development was test driven, so it’s got an extensive test suite (using rspec).  Documentation is in rdoc and is quite detailed.  As of the latest version, it supports Ruby 1.9 too.

Xapian Fu basically gives you a Hash interface to Xapian – so you get a persistent Hash with full text indexing built in (and ACID transactions!).


For example, create a database called example.db, put three documents into it and search them and print the results:

  require 'xapian-fu'
  include XapianFu
  db = => 'example.db', :create => true,
                    :store => [:title, :year])
  db << { :title => 'Brokeback Mountain', :year => 2005 }
  db << { :title => 'Cold Mountain', :year => 2004 }
  db << { :title => 'Yes Man', :year => 2008 }
  db.flush"mountain").each do |match|
    puts match.values[:title]

There are of course a whole bunch more examples in the documentation.
Continue reading Xapian Fu: Full Text Indexing in Ruby

Ruby’s case statement uses ===

I’ve not found this stated clearly enough elsewhere so I’m doing so myself.

Ruby’s case statement calls the === method on the argument to each of the when statements

So, this example:

case my_number
  when 6883

Will execute 6883 === my_number

This is all fine and dandy, because the === method on a Fixnum instance does what you’d expect in this scenario.

However, the === method on the Fixnum class does something different. It’s an alias of is_a?

That is cute, because it allows you to do this:

case my_number
  when Fixnum
    "Easy to memorize"
  when Bignum
    "Hard to memorize"

But it won’t work as you might expect in this scenario:

my_type = Fixnum
case my_type
  when Fixnum
    "Fixed number"

This won’t work because Fixnum === Fixnum returns false because the Fixnum class is not an instance of Fixnum.

My workaround for this is to convert it to a string first. Not sure if that’s the best solution, but it works for me(tm).

my_type = Fixnum
case my_type.to_s
  when "Fixnum"
    "Fixed number"