<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>John Leach's Blog</title>
	<atom:link href="http://johnleach.co.uk/words/feed" rel="self" type="application/rss+xml" />
	<link>http://johnleach.co.uk/words</link>
	<description>Stuff I think, see and do</description>
	<lastBuildDate>Fri, 18 Jun 2010 22:57:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>LVM snapshot performance</title>
		<link>http://johnleach.co.uk/words/archives/2010/06/18/613/lvm-snapshot-performance</link>
		<comments>http://johnleach.co.uk/words/archives/2010/06/18/613/lvm-snapshot-performance#comments</comments>
		<pubDate>Fri, 18 Jun 2010 22:57:41 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[btrfs]]></category>
		<category><![CDATA[device-mapper]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[lvm]]></category>
		<category><![CDATA[snapshot]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[zfs]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=613</guid>
		<description><![CDATA[The Linux Logical Volume Manager (LVM) supports creating snapshots of logical volumes (LV) using the device mapper. Device mapper implements snapshots using a copy on write system, so whenever you write to either the source LV or the new snapshot LV, a copy is made first. So a write to a normal LV is just [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux)">Linux Logical Volume Manager</a> (LVM) supports creating snapshots of logical volumes (LV) using the device mapper. Device mapper implements snapshots using a copy on write system, so whenever you write to either the source LV or the new snapshot LV, a copy is made first.</p>
<p>So a write to a normal LV is just a write, but a write to a snapshotted LV (or an LV snapshot) involves reading the original data, writing it elsewhere and then writing some metadata about it all.</p>
<p>This quite obviously impacts performance, and due to device mapper having a very basic implementation, it is particularly bad.  My tests show <em>synchronous sequential writes to a snapshotted LV are around 90% slower than writes to a normal LV</em>.</p>
<p><span id="more-613"></span><br />
Once copied and written, writes to the same chunk are only 15% slower.  Reads are super fast, only a 5% speed impact.</p>
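<p>To make the difference between a first write and a repeat write concrete, here is a toy Ruby model of copy-on-write at chunk granularity. It is only a sketch of the idea described above, not how device mapper is actually implemented: the <code>ToySnapshot</code> class, its method names and the I/O counters are all made up for illustration.</p>

```ruby
# Toy model of a snapshotted volume. All names here are hypothetical;
# it only counts the I/O operations that the text describes.
class ToySnapshot
  attr_reader :reads, :writes

  def initialize(chunks)
    @origin = chunks.dup   # current contents of the source LV
    @cow    = {}           # chunk index => original data (the snapshot store)
    @reads  = 0
    @writes = 0
  end

  # Write to the source LV while the snapshot exists.
  def write(index, data)
    unless @cow.key?(index)
      @reads += 1                   # read the original chunk
      @cow[index] = @origin[index]
      @writes += 2                  # write the copy, then the metadata
    end
    @origin[index] = data
    @writes += 1                    # the write we actually wanted
  end

  # What the snapshot sees: the preserved copy if one exists.
  def snapshot_read(index)
    @cow.fetch(index, @origin[index])
  end
end

vol = ToySnapshot.new(%w[a b c])
vol.write(0, "x")                   # first write to chunk 0
first = [vol.reads, vol.writes]
vol.write(0, "y")                   # repeat write to the same chunk
second = [vol.reads, vol.writes]
```

<p>In this model the first write to a chunk costs one read and three writes, while a repeat write to the same chunk costs a single write, which is the shape of the slowdown measured above.</p>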
<p>Still, not many usage patterns involve huge full speed sequential writes to a filesystem, so LVM is still useful in most circumstances.</p>
<p>I did some tests to see how writes to one snapshotted LV impacted the performance of writes to a completely separate normal LV. Does a snapshotted LV ruin the performance of all your other LVs? Yes, especially if you&#8217;re using the cfq disk scheduler. Switching to the deadline scheduler made things considerably better for the normal LV (but slowed writes to the snapshotted LV a little further).</p>
<p>I did these tests on a 12 disk hardware RAID10 system. The test is a synthetic benchmark so I urge you to do your own tests, but it&#8217;s safe to say that <em>device mapper does not implement clever snapshotting like btrfs or zfs &#8211; don&#8217;t expect great performance from it.</em></p>
<h3>Improving LVM Snapshot performance</h3>
<p>There are a few ways to improve performance of LVM snapshots.  The most obvious one is the chunk size, which can be tweaked when creating the snapshot.  This controls the size of the data that will be copied and written on write operations.  The best setting will depend on lots of stuff, such as your RAID stripe size and your usage patterns.</p>
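<p>As a rough illustration of why the chunk size matters, this Ruby sketch works out how much data would be copied for a first-time write, assuming every chunk the write touches is copied in full and none of them has been copied already. The <code>bytes_copied</code> helper is hypothetical and ignores metadata writes.</p>

```ruby
# Hypothetical helper: how many bytes get copied for a first-time write
# of `length` bytes at `offset`, given a snapshot chunk size (all in bytes).
def bytes_copied(offset, length, chunk_size)
  first_chunk = offset / chunk_size
  last_chunk  = (offset + length - 1) / chunk_size
  (last_chunk - first_chunk + 1) * chunk_size
end

KIB = 1024
small_chunks = bytes_copied(0, 4 * KIB, 4 * KIB)          # 4 KiB write, 4 KiB chunks
large_chunks = bytes_copied(0, 4 * KIB, 512 * KIB)        # same write, 512 KiB chunks
straddling   = bytes_copied(60 * KIB, 8 * KIB, 64 * KIB)  # write crossing a chunk boundary
```

<p>Under these assumptions a small write with a large chunk size copies far more data than was written, while a large sequential write would be split into fewer copy operations, so the right value depends on your workload.</p>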
<p>There is an <a href="http://lkml.org/lkml/2008/9/17/40">as-yet uncommitted patch</a> that improves snapshot write performance a bit by being a bit clever about the disk queuing, but it&#8217;s still slow.</p>
<p>Also, device mapper supports non-persistent snapshots (i.e. lost after reboot), which should avoid having to write the change metadata to disk (which will save a lot of seeks and writes) but LVM doesn&#8217;t seem to support creating these yet.</p>
<p>Putting the snapshot device on a separate disk would help too &#8211; I&#8217;m not sure it&#8217;s possible with LVM, but device mapper does support it.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2010/06/18/613/lvm-snapshot-performance/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Testing XML with rspec, xpath and libxml</title>
		<link>http://johnleach.co.uk/words/archives/2010/04/06/585/testing-xml-with-rspec-xpath-and-libxml</link>
		<comments>http://johnleach.co.uk/words/archives/2010/04/06/585/testing-xml-with-rspec-xpath-and-libxml#comments</comments>
		<pubDate>Tue, 06 Apr 2010 12:28:46 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[rspec]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[spec]]></category>
		<category><![CDATA[tdd]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[xpath]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=585</guid>
		<description><![CDATA[I&#8217;m currently working with the virtualization API libvirt which uses XML to represent virtual machines and I&#8217;m generating this XML using Ruby.  I&#8217;m using rspec to test my code and wanted to test that my output was as I expected.  I started out with rspec-hpricot-matchers which worked fine until I started testing slightly more complex [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m currently working with the virtualization API <a href="http://libvirt.org/">libvirt</a> which uses XML to represent virtual machines and I&#8217;m generating this XML using Ruby.  I&#8217;m using <a href="http://rspec.info/">rspec</a> to test my code and wanted to test that my output was as I expected.  I started out with <a href="http://github.com/fnando/rspec-hpricot-matchers">rspec-hpricot-matchers</a> which worked fine until I started testing slightly more complex xml, which hpricot wasn&#8217;t handling well.</p>
<p>So I wrote a have_xml matcher using the rspec dsl which uses the <a href="http://libxml.rubyforge.org/">libxml</a> library to do the testing.  It&#8217;s so simple it&#8217;s not really worthy of a gem, so here it is (licensed under public domain).  The text check is optional and, to be honest, <a href="http://blog.thecodewhisperer.com/post/398226883/rspec-have-tag-spec-matcher-and-nokogiri">doesn&#8217;t belong here really</a>.  It should be a separate matcher.</p>
<pre><code>
require 'libxml'

Spec::Matchers.define :have_xml do |xpath, text|
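  match do |body|
    parser = LibXML::XML::Parser.string body
    doc = parser.parse
    nodes = doc.find(xpath)
    # Return a boolean rather than raising, so should_not works too
    if nodes.empty?
      false
    elsif text
      # every matched node must have the expected content
      nodes.all? { |node| node.content == text }
    else
      true
    end
  end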

  failure_message_for_should do |body|
    "expected to find xml tag #{xpath} in:\n#{body}"
  end

  failure_message_for_should_not do |body|
    "expected not to find xml tag #{xpath} in:\n#{body}"
  end

  description do
    "have xml tag #{xpath}"
  end
end
</code></pre>
<p>So, add that somewhere (usually spec/spec_helper.rb) and use it like this:</p>
<pre><code>
it "should include the xen_machine_id" do
  @xml.should have_xml('/domain/name', 'bb-example-001')
end

it "should include the network devices" do
  @xml.should have_xml "/domain/devices/interface[1]/ip[@address='1.2.3.4']"
  @xml.should have_xml "/domain/devices/interface[1]/mac[@address='aa:00:01:02:03:04']"
  @xml.should have_xml "/domain/devices/interface[1]/script[@path='/etc/xen/scripts/vif-bridge']"
  @xml.should have_xml "/domain/devices/interface[1]/source[@bridge='inetbr']"
end
</code></pre>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2010/04/06/585/testing-xml-with-rspec-xpath-and-libxml/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Chat Roulette: Eye Vagina</title>
		<link>http://johnleach.co.uk/words/archives/2010/03/10/561/chat-roulette-eye-vagina</link>
		<comments>http://johnleach.co.uk/words/archives/2010/03/10/561/chat-roulette-eye-vagina#comments</comments>
		<pubDate>Wed, 10 Mar 2010 21:46:52 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[chat roulette]]></category>
		<category><![CDATA[chatroulette]]></category>
		<category><![CDATA[eye]]></category>
		<category><![CDATA[flash]]></category>
		<category><![CDATA[prank]]></category>
		<category><![CDATA[vagina]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=561</guid>
		<description><![CDATA[Chat Roulette is a web site that hooks you up to a random person. It streams their webcam video and audio to you, and yours to them.  When you&#8217;re done, you click next and get another random person. That&#8217;s the whole thing.  It&#8217;s fun, for a short period of time. Anyway, whilst holding my webcam [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.chatroulette.com/">Chat Roulette</a> is a web site that hooks you up to a random person. It streams their webcam video and audio to you, and yours to them.  When you&#8217;re done, you click next and get another random person. That&#8217;s the whole thing.  It&#8217;s fun, for a short period of time.</p>
<p>Anyway, whilst holding my webcam to different parts of my body (if you ever use my webcam, wash your hands) I discovered that my eye, on its side, with the right lighting, and right shadows, and bad focus, through a webcam&#8230; looks kinda, possibly, a bit like girl bits.</p>
<p>It&#8217;s probably fair to say that, for a large proportion of the random strangers on Chat Roulette, the &#8220;Next&#8221; button is usually clicked in the hope of seeing a girl flashing some part of her body.</p>
<p>Combine these two seemingly unconnected facts together, and you get some of the reactions you see in my Eye Vagina video!  The music is &#8220;My Vagina&#8221; by NOFX. I edited out roughly 300 people jerking off.  The vid has had more than half a million hits on YouTube. I&#8217;m expecting my share of their fat advertising profits any day now.</p>
<p>I recorded it using <a href="http://recordmydesktop.sourceforge.net">recordmydesktop</a> and edited it using <a href="http://www.pitivi.org/">Pitivi</a> (which actually had some very annoying audio sync problems I had to jump through hoops to avoid, which was a shame).</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/Bq6xjTyw7zM&amp;hl=en_GB&amp;fs=1&amp;rel=0" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/Bq6xjTyw7zM&amp;hl=en_GB&amp;fs=1&amp;rel=0" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2010/03/10/561/chat-roulette-eye-vagina/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Advertising and ad blocking</title>
		<link>http://johnleach.co.uk/words/archives/2010/03/07/497/advertising-and-ad-blocking</link>
		<comments>http://johnleach.co.uk/words/archives/2010/03/07/497/advertising-and-ad-blocking#comments</comments>
		<pubDate>Sun, 07 Mar 2010 16:19:10 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[ad-blocking]]></category>
		<category><![CDATA[adblock]]></category>
		<category><![CDATA[adblocking]]></category>
		<category><![CDATA[advertising]]></category>
		<category><![CDATA[externalities]]></category>
		<category><![CDATA[money]]></category>
		<category><![CDATA[pollution]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=497</guid>
		<description><![CDATA[I&#8217;ve thought about advertising and ad-blockers a lot over the years, and the debate is getting some attention right now starting with a recent Ars Technica article, so I thought I&#8217;d put down some of my own thoughts on it. Funding your content through advertising is hugely inefficient. Of the people who visit your site, [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve thought about advertising and ad-blockers a lot over the years, and the debate is getting some attention right now starting with a recent <a href="http://arstechnica.com/business/news/2010/03/why-ad-blocking-is-devastating-to-the-sites-you-love.ars">Ars Technica article</a>, so I thought I&#8217;d put down some of my own thoughts on it.</p>
<p>Funding your content through advertising is hugely inefficient. Of the people who visit your site, usually only a tiny proportion click on (or notice) an advert, and only a tiny proportion of those then spends any money.  So a tiny, tiny proportion of your visitors give any money to your advertisers. So money filters down this system in tiny margins.  Then, at the bottom of the system, a tiny amount of the profits from the income covers the cost of advertising.  Then this money moves back up the system to you, usually via your advertising agent who takes a nice cut (I&#8217;ve heard Google pass as little as one twelfth onto the publisher in some cases).</p>
<p>And this doesn&#8217;t consider the costs of the advertiser choosing and designing the ad or the tonnes of bandwidth and gatrillions of CPU cycles used to serve the actual adverts.</p>
<p>It also does not consider externalities, such as pollution. <strong>Advertising is mind pollution.</strong> Advertising is designed to affect the behaviour of people for the benefit of the advertiser.  Why would anyone willingly expose themselves to something designed to steal their attention?</p>
<p>You might argue that advertising creates value &#8211; some viewers choose to buy when otherwise they wouldn&#8217;t have. But what of the huge proportion of people who just had their attention stolen? No value was created there.</p>
<p>Because not everyone is suckered in by it, advertising squanders billions of hours of attention every day to produce nothing.<br />
<span id="more-497"></span></p>
<h3>Begging</h3>
<p>Walking down a street in town I might get approached by a beggar who needs money to eat.  In the UK we have the <a href="http://www.statutelaw.gov.uk/content.aspx?LegType=All+Primary&amp;PageNumber=94&amp;NavFrom=2&amp;parentActiveTextDocId=1029462&amp;ActiveTextDocId=1029462&amp;filesize=54565">Vagrancy Act of 1824</a> which prevents these people from <em>harassing</em> me. They can be punished by detention, hard labour and whipping apparently.</p>
<p>Advertising is just a company harassing me for money to eat.  They&#8217;re just better funded.  I believe we should be able to detain and whip their advertising departments.</p>
<h3>Unobtrusive payment</h3>
<p>But seriously, advertising is a broken method of paying for stuff.  If we could unobtrusively pay for content on the Internet, I&#8217;m sure enough people would do so to more than cover costs of production.</p>
<p>We need good, unobtrusive payment methods and a change in culture to pay for good content would follow.  We can work on changing the culture now though: support sites that are ad-free (or have ad-free subscriptions).  I pay for a couple of ad-free subscriptions myself &#8211; be the change you want to see in the world.</p>
<p>I look forward to the day when a site with advertising is a clear signal that nobody would pay for it otherwise. My ad-blocker could then just block the whole site to save my wasting my attention.</p>
<h3>Full Disclosure</h3>
<p>My company does a (very small) amount of advertising, though I&#8217;m not involved directly in it. I don&#8217;t know how well it performs. We also sponsor conferences and events, which is of course advertising.  I also help run a couple of web sites that make money from advertising.</p>
<p>From a viewer&#8217;s perspective, I hate advertising. From a publisher&#8217;s perspective, I can make a few quid from it and kinda just hope the pollution isn&#8217;t so bad (we do not allow annoying flashing adverts, and block ads from particularly evil corporations whenever we can but frankly, I&#8217;m mostly getting by on cognitive dissonance).  I&#8217;ve not thought about it until now, but from an advertiser&#8217;s perspective, it&#8217;s of course nice to get new customers (though I&#8217;m not sure of the &#8220;quality&#8221; of the custom we get via advertising &#8211; I&#8217;m now interested in investigating this further).</p>
<p>I&#8217;m in no way dependent on any of my income from advertising, so it&#8217;s hard to speak from these perspectives.</p>
<h3>Update: Poor people</h3>
<p>I&#8217;m basically saying that because adverts are not well targeted, the majority of advert views are wasted. They&#8217;re mind pollution.</p>
<p>But in order for adverts to get more accurate, the ad companies need to collect personal information about us: what we do online, what we like etc.  So we&#8217;re supposed to hand over our privacy, just so we can ethically view &#8220;free&#8221; stuff on the Internet?</p>
<p>Let&#8217;s suppose advertising becomes perfectly targeted. Every advert you see is something you really can&#8217;t do without and something you can afford. Wouldn&#8217;t this mean you buy everything you get shown? Wouldn&#8217;t this mean you&#8217;d run out of money?</p>
<p>Is it unethical for poor people to view ad-supported online content if they can&#8217;t afford anything being advertised? However well targeted the ads are, they have no money to spend so it&#8217;s completely fruitless.  Perhaps ad supported websites should ban public library Internet addresses &#8211; poor people are reading for free!</p>
<h3>Discussions Elsewhere</h3>
<ul>
<li><a href="http://blog.mozilla.com/rob-sayre/2010/03/06/why-ad-blockers-work/">Rob Sayre: Why adblockers work</a></li>
<li><a href="http://briancarper.net/blog/advertising-is-devastating-to-my-well-being">Brian Carper: Advertising is devastating to my well being</a></li>
<li><a href="http://news.ycombinator.com/item?id=1173582">Comments on Hacker News</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2010/03/07/497/advertising-and-ad-blocking/feed</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Xapian Fu: Full Text Indexing in Ruby</title>
		<link>http://johnleach.co.uk/words/archives/2010/01/31/445/xapian-fu-full-text-indexing-in-ruby</link>
		<comments>http://johnleach.co.uk/words/archives/2010/01/31/445/xapian-fu-full-text-indexing-in-ruby#comments</comments>
		<pubDate>Sun, 31 Jan 2010 23:28:48 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[active record]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[full text indexing]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[stemming]]></category>
		<category><![CDATA[stopping]]></category>
		<category><![CDATA[the ruby way]]></category>
		<category><![CDATA[xapian]]></category>
		<category><![CDATA[xapian-fu]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=445</guid>
		<description><![CDATA[Xapian is an Open Source Search Engine Library written in C++. It has Ruby bindings, but they&#8217;re generated with SWIG, so they basically just mirror the C++ bindings &#8211; not very Ruby-like (and pretty ugly). Being a self-confessed full text indexing nerd and a Ruby-lover, I wrote Xapian Fu: a library to provide access to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://xapian.org/">Xapian</a> is an Open Source Search Engine Library written in C++.  It has <a href="http://xapian.org/docs/bindings/ruby/">Ruby bindings</a>, but they&#8217;re generated with <a href="http://www.swig.org/">SWIG</a>, so they basically just mirror the C++ bindings &#8211; not very Ruby-like (and pretty ugly).</p>
<p>Being a self-confessed full text indexing nerd and a Ruby-lover, I wrote <a href="http://github.com/johnl/xapian-fu">Xapian Fu</a>: a library to provide access to Xapian that is more in line with &#8220;The Ruby Way&#8221;.</p>
<p>I started writing Xapian Fu exactly a year ago today but left it for a couple of months, then restarted work on it on the train on the way back from the 2009 <a href="http://scottishrubyconference.com/">Scotland on Rails</a> conference.  Development was <a href="http://en.wikipedia.org/wiki/Test-driven_development">test driven</a>, so it&#8217;s got an extensive test suite (using <a href="http://rspec.info/">rspec</a>).  <a href="http://rdoc.info/projects/johnl/xapian-fu">Documentation is in rdoc</a> and is quite detailed.  As of the latest version, it supports Ruby 1.9 too.</p>
<p>Xapian Fu basically gives you a Hash interface to Xapian &#8211; so <em>you get a persistent Hash with full text indexing built in</em> (and ACID transactions!).</p>
<h3>Example</h3>
<p>For example, create a database called example.db, put three documents into it and search them and print the results:</p>
<pre><code>  require 'xapian-fu'
  include XapianFu
  db = XapianDb.new(:dir =&gt; 'example.db', :create =&gt; true,
                    :store =&gt; [:title, :year])
  db &lt;&lt; { :title =&gt; 'Brokeback Mountain', :year =&gt; 2005 }
  db &lt;&lt; { :title =&gt; 'Cold Mountain', :year =&gt; 2004 }
  db &lt;&lt; { :title =&gt; 'Yes Man', :year =&gt; 2008 }
  db.flush
  db.search("mountain").each do |match|
    puts match.values[:title]
  end</code></pre>
<p>There are of course a whole bunch more examples in <a href="http://rdoc.info/projects/johnl/xapian-fu">the documentation</a>.<br />
<span id="more-445"></span></p>
<h3>Schema-less</h3>
<p>The hard work of full text indexing and storage is of course done by the Xapian library, but I have added a couple of useful features.  One in particular is the ability to use symbols (or strings) as field names. Xapian has no real concept of fields, but you can store arbitrary data that it calls values in a numbered slot alongside each document.  Instead of making you deal with field numbers, Xapian Fu uses a hash function to convert your field names into numbers.  This means <em>Xapian Fu is schema-less - </em>you can add or omit fields whenever you like.  It&#8217;s useful to define fields when opening databases so that Xapian Fu can recognise them in searches or to give Xapian Fu some clues on the type of data you&#8217;ll be using, but it&#8217;s not necessary.</p>
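<p>The field-name-to-number idea can be sketched in a few lines of Ruby. This is an illustration only: <code>slot_for</code> is a hypothetical helper, and CRC32 is an assumption picked because it is in the standard library, not necessarily the hash function Xapian Fu itself uses.</p>

```ruby
require 'zlib'

# Hypothetical sketch: derive a numeric value slot from a field name.
# CRC32 is an assumption chosen for illustration, not necessarily the
# hash function Xapian Fu uses.
def slot_for(field_name)
  Zlib.crc32(field_name.to_s)
end

title_slot  = slot_for(:title)    # symbol field name
string_slot = slot_for("title")   # the same field given as a string
```

<p>Because the number is derived from the name alone, nothing about the fields has to be declared or stored up front. The trade-off with any scheme like this is that two different names could, in principle, hash to the same number.</p>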
<h3>Efficient storage of fields for ordering</h3>
<p>If you tell Xapian Fu what type of data you&#8217;ll be storing in your fields, it can store them more efficiently.  For example, if you don&#8217;t specify the type, integers will be converted to strings as is, so 354,441,945,266,899 becomes &#8220;354441945266899&#8221; &#8211; that&#8217;s fifteen bytes!  When you tell Xapian Fu that your field is going to be an Integer, it will store them in double precision floating point format which is 8 bytes and can represent up to about 16 decimal digits.  Also, it&#8217;s stored in big-endian format, so Xapian can still use the field for ordering results. XapianFu will store Time objects like this too, so again, it&#8217;s size efficient and can be used for ordering results.</p>
<p>Since Xapian Fu now knows what type the field is, it can convert it back when you access it too, so you get an Integer or a Time object (rather than a String, which is how Xapian represents it internally). It currently supports Integer, Fixnum, Bignum (up to a certain size), Float, Time and Date.  You can add your own types easily by decorating your instances with special methods.</p>
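<p>Here is a small Ruby sketch of the storage trick, using <code>Array#pack</code> with the <code>G</code> directive (a big-endian double). It shows the general idea only, and the real serialisation inside Xapian Fu may differ. The helper names <code>pack_number</code> and <code>unpack_number</code> are made up for this example.</p>

```ruby
# Sketch only: pack a number as a big-endian double ("G" in Array#pack).
# This shows the idea; Xapian Fu's real serialisation may differ.
def pack_number(n)
  [n.to_f].pack("G")
end

def unpack_number(bytes)
  bytes.unpack("G").first
end

big    = 354_441_945_266_899
packed = pack_number(big)

as_text_size   = big.to_s.bytesize   # size as a decimal string
as_double_size = packed.bytesize     # size as a packed double

# For non-negative numbers, comparing the packed bytes gives the same
# order as comparing the numbers themselves.
numbers        = [2008, 5, 354_441_945_266_899, 2004, 0]
sorted_by_byte = numbers.sort_by { |n| pack_number(n) }
```

<p>The byte-wise ordering only lines up for non-negative numbers in this naive form. Negative numbers have the sign bit set, so they would sort after the positive ones and need extra handling.</p>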
<h3>Stemming and stopping</h3>
<p>Xapian has <a href="http://xapian.org/docs/stemming.html">stemming</a> support for loads of languages (via the <a href="http://snowball.tartarus.org">Snowball</a> library), but no <a href="http://en.wikipedia.org/wiki/Stop_words">stop word</a> lists.  Xapian Fu uses the appropriate stemmer when you specify a language for your document or database and comes with stop word lists for 13 languages (also automatically used).  This means Xapian doesn&#8217;t have to index these common stop words, so you get faster indexing and search times, a smaller database and more relevant search results.</p>
<p>Xapian Fu also knows that your searches won&#8217;t work right unless you stem them too! It automatically stems queries using the database language.  At the moment this falls down a bit if your database holds documents in different languages, but it can be disabled, and proper support wouldn&#8217;t be too difficult to add.</p>
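<p>The stopping half of this is easy to picture in plain Ruby. The sketch below is not Xapian Fu code: the stop word list is a tiny made-up sample, <code>indexable_terms</code> is a hypothetical helper, and stemming is left out entirely.</p>

```ruby
require 'set'

# Hypothetical, tiny English stop word list for illustration only;
# real stop word lists are much longer.
STOP_WORDS = Set.new(%w[a an and is of the to])

# Split text into lowercase words and drop the stop words.
def indexable_terms(text)
  text.downcase.scan(/[a-z0-9]+/).reject { |word| STOP_WORDS.include?(word) }
end

terms = indexable_terms("The Return of the King")
```

<p>Only the remaining terms would need to be indexed, which is where the smaller database and faster indexing come from.</p>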
<h3>Will Paginate support</h3>
<p><a href="http://wiki.github.com/mislav/will_paginate/">will_paginate</a> is a pagination library for ActiveRecord (and other DB abstraction layers).  It has helpers for drawing page list interfaces.  Xapian Fu supports will_paginate by using the same method names in result sets (such as :current_page and :total_pages).  You can pass a Xapian Fu result set to the will_paginate helpers and you&#8217;ll get lovely page list interfaces (you need to handle accepting the parameters in your action and setting up the next search of course!)</p>
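<p>The approach is plain duck typing: a result set that answers the method names the helpers ask for can be handed to them. The sketch below is hypothetical and is not the Xapian Fu result set. Only <code>current_page</code> and <code>total_pages</code> are named above; the class name, constructor and remaining methods are assumptions for illustration.</p>

```ruby
# Hypothetical result set that exposes pagination-style method names.
# The class and its constructor are made up for illustration.
class PagedResults
  attr_reader :current_page, :per_page, :total_entries

  def initialize(total_entries, page: 1, per_page: 10)
    @total_entries = total_entries
    @current_page  = page
    @per_page      = per_page
  end

  # Number of pages needed to show every entry.
  def total_pages
    (total_entries.to_f / per_page).ceil
  end

  # nil when there is no earlier page.
  def previous_page
    current_page > 1 ? current_page - 1 : nil
  end

  # nil when there is no later page.
  def next_page
    current_page < total_pages ? current_page + 1 : nil
  end
end

results = PagedResults.new(45, page: 5, per_page: 10)
```

<p>Anything that responds to these methods can be treated as a page of results, whatever class it happens to be.</p>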
<h3>Active Record support</h3>
<p>Xapian Fu does not yet have an Active Record plugin (I&#8217;ll add one soon) but as Xapian Fu uses the :id field as the Xapian primary key by default, it&#8217;s trivial to use it in your Rails app right now. See the &#8220;ActiveRecord Integration&#8221; section of the README for code examples.  In this case, you probably don&#8217;t need to actually store any data in the Xapian database, just the index information (and the :id field of course, but that&#8217;s stored by default) &#8211; so you get a smaller database (though you still need to store fields that you want to group by (collapse) or order results with).</p>
<h3>Multi-master replicated full text indexing service</h3>
<p>Xapian Fu doesn&#8217;t do this. I&#8217;m designing something that might though :)</p>
<h3>Getting Xapian Fu</h3>
<p>It&#8217;s available in gem form from Rubyforge/Gemcutter.  The code is <a href="http://github.com/johnl/xapian-fu">on github here</a>.  You&#8217;ll need the Xapian Ruby bindings installed &#8211; on Debian/Ubuntu this is the libxapian-ruby1.8 package.  The gem named <a href="http://gemcutter.org/gems/xapian">xapian</a> claims to provide Xapian and the Ruby bindings but it failed for me on 64-bit.  The gem named <a href="http://gemcutter.org/gems/xapian-full">xapian-full</a> claims to provide the Ruby bindings without Xapian (you&#8217;ll obviously need to build and install Xapian yourself) but I&#8217;ve not used that either.  RPMs, source files and other downloads are listed on the <a href="http://xapian.org/download">Xapian downloads page</a>.</p>
<p>There is also this <a href="http://johnleach.co.uk/documents/xapian-fu/">weird kinda splash page</a> I made, in some kind of attempt to host <em>something</em> about Xapian Fu on my own domain. Not really sure what real purpose it serves.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2010/01/31/445/xapian-fu-full-text-indexing-in-ruby/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Ruby&#8217;s case statement uses ===</title>
		<link>http://johnleach.co.uk/words/archives/2009/08/30/402/rubys-case-statement-uses</link>
		<comments>http://johnleach.co.uk/words/archives/2009/08/30/402/rubys-case-statement-uses#comments</comments>
		<pubDate>Sun, 30 Aug 2009 19:48:16 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[case]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[conditional]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[switch]]></category>
		<category><![CDATA[when]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=402</guid>
		<description><![CDATA[I&#8217;ve not found this stated clearly enough elsewhere so I&#8217;m doing so myself. Ruby&#8217;s case statement calls the === method on the argument to each of the when statements So, this example: case my_number when 6883 :prime end Will execute 6883 === my_number This is all fine and dandy, because the === method on a [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve not found this stated clearly enough elsewhere so I&#8217;m doing so myself.</p>
<p><strong>Ruby&#8217;s case statement calls the <code>===</code> method on the argument to each of the when statements</strong></p>
<p>So, this example:</p>
<pre><code>case my_number
  when 6883
    :prime
end
</code></pre>
<p>Will execute <code>6883 === my_number</code></p>
<p>This is all fine and dandy, because the <code>===</code> method on a Fixnum instance does what you&#8217;d expect in this scenario.</p>
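<p>However, the <code>===</code> method on the Fixnum <em>class</em> does something different.  It&#8217;s effectively <code>is_a?</code> with the receiver and argument swapped: <code>Fixnum === x</code> is true when <code>x.is_a?(Fixnum)</code> is true.</p>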
<p>That is cute, because it allows you to do this:</p>
<pre><code>case my_number
  when Fixnum
    "Easy to memorize"
  when Bignum
    "Hard to memorize"
  end
</code></pre>
<p>But it won&#8217;t work as you might expect in this scenario:</p>
<pre><code>my_type = Fixnum
case my_type
  when Fixnum
    "Fixed number"
end
</code></pre>
<p>This won&#8217;t work: <code>Fixnum === Fixnum</code> returns <code>false</code>, because the <code>Fixnum</code> class is not itself an instance of <code>Fixnum</code>.</p>
<p>My workaround for this is to convert it to a string first. Not sure if that&#8217;s the best solution, but it works for me(tm).</p>
<pre><code>my_type = Fixnum
case my_type.to_s
  when "Fixnum"
    "Fixed number"
end
</code></pre>
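<p>Another option, sketched here, is to put a lambda in the <code>when</code> clause and compare with <code>==</code> explicitly. This relies on <code>Proc#===</code> calling the lambda (Ruby 1.9 and later). The example uses <code>Integer</code> rather than <code>Fixnum</code>, because later Ruby versions merged <code>Fixnum</code> and <code>Bignum</code> into <code>Integer</code>. The <code>describe_type</code> method is a made-up name for illustration.</p>

```ruby
# Sketch: match on a class object itself rather than on an instance.
# A lambda in a `when` clause is invoked via Proc#===, so we can
# compare with == explicitly.
def describe_type(type)
  case type
  when ->(t) { t == Integer } then "Whole number"
  when ->(t) { t == Float }   then "Floating point number"
  else "Something else"
  end
end

instance_match = (Integer === 42)        # 42 is an instance of Integer
class_match    = (Integer === Integer)   # Integer is a Class, not an Integer
```

<p>This avoids the string conversion, at the cost of slightly noisier <code>when</code> clauses.</p>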
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/08/30/402/rubys-case-statement-uses/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Song In Code: Ramones, I wanna be sedated</title>
		<link>http://johnleach.co.uk/words/archives/2009/08/21/394/song-in-code-ramones-i-wanna-be-sedated</link>
		<comments>http://johnleach.co.uk/words/archives/2009/08/21/394/song-in-code-ramones-i-wanna-be-sedated#comments</comments>
		<pubDate>Fri, 21 Aug 2009 14:40:31 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Music]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[ramones]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[sing]]></category>
		<category><![CDATA[song]]></category>
		<category><![CDATA[songsincode]]></category>
		<category><![CDATA[ukelele]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=394</guid>
		<description><![CDATA[Just the first verse: go = Proc.new { sleep 24.hours } self.wants :sedatation begin ; nil ; end case go ; where "no" ; nil ; end self.wants :sedatation self.get '/airport' self.put '/airport/plane' before self.insane? do 3.times { hurry! } end return if self.can_control? :fingers return if self.can_control? :brain 5.times { "no" } I recorded [...]]]></description>
			<content:encoded><![CDATA[<p>Just the first verse:</p>
<pre><code>go = Proc.new { sleep 24.hours }
self.wants :sedatation
begin ; nil ; end
case go ; where "no" ; nil ; end
self.wants :sedatation
self.get '/airport'
self.put '/airport/plane'
before self.insane? do
  3.times { hurry! }
end
return if self.can_control? :fingers
return if self.can_control? :brain
5.times { "no" }
</code></pre>
<p>I recorded me singing it, which is kinda stupid tbh.</p>
<p><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/C4j8XAQPoAs&#038;hl=en&#038;fs=1&#038;rel=0"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/C4j8XAQPoAs&#038;hl=en&#038;fs=1&#038;rel=0" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<p>I used mencoder to convert this to something YouTube found tasty. Like this:</p>
<p><code>mencoder -ss 15 -endpos 1:18 -vf pp=al:f,scale=480:360 -oac mp3lame -ovc lavc  -lavcopts vcodec=libx264:mbd=1:vbitrate=2000 MOV01362.MPG -o MOV01362.x264<br />
</code></p>
<p>Also, pimp for another Geek/Ukelele project: <a href="http://www.ukepedia.com">Ukepedia, all 3 million Wikipedia articles one song at a time</a></p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/08/21/394/song-in-code-ramones-i-wanna-be-sedated/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Boron Fights Grass</title>
		<link>http://johnleach.co.uk/words/archives/2009/07/05/389/boron-fight-grass</link>
		<comments>http://johnleach.co.uk/words/archives/2009/07/05/389/boron-fight-grass#comments</comments>
		<pubDate>Sun, 05 Jul 2009 15:56:59 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Photoblog]]></category>
		<category><![CDATA[Photography]]></category>
		<category><![CDATA[armley]]></category>
		<category><![CDATA[bite]]></category>
		<category><![CDATA[boron]]></category>
		<category><![CDATA[cats]]></category>
		<category><![CDATA[claws]]></category>
		<category><![CDATA[fight]]></category>
		<category><![CDATA[grass]]></category>
		<category><![CDATA[leeds]]></category>
		<category><![CDATA[sun]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=389</guid>
		<description><![CDATA[]]></description>
			<content:encoded><![CDATA[<p><a href="http://johnleach.co.uk/photography/selections/photoblog/090605-boron-fight-grass.JPG?info"><img src="http://johnleach.co.uk/photography/selections/photoblog/090605-boron-fight-grass.JPG?preview" alt="Boron Fights Grass" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/07/05/389/boron-fight-grass/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Netfilter Conntrack Memory Usage</title>
		<link>http://johnleach.co.uk/words/archives/2009/06/17/372/netfilter-conntrack-memory-usage</link>
		<comments>http://johnleach.co.uk/words/archives/2009/06/17/372/netfilter-conntrack-memory-usage#comments</comments>
		<pubDate>Wed, 17 Jun 2009 10:56:57 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[conntrack]]></category>
		<category><![CDATA[firewall]]></category>
		<category><![CDATA[iptables]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[limit]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[max]]></category>
		<category><![CDATA[netfilter]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[ram]]></category>
		<category><![CDATA[slab]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=372</guid>
		<description><![CDATA[On a busy Linux Netfilter-based firewall, you usually need to up the maximum number of allowed tracked connections (or new connections will be denied and you&#8217;ll see log messages from the kernel like this: nf_conntrack: table full, dropping packet). More connections will use more RAM, but how much?  We don&#8217;t want to overcommit, as the [...]]]></description>
			<content:encoded><![CDATA[<p>On a busy Linux Netfilter-based firewall, you usually need to up the maximum number of allowed tracked connections (or new connections will be denied and you&#8217;ll see log messages from the kernel like this: <code>nf_conntrack: table full, dropping packet</code>).</p>
<p>More connections will use more RAM, but how much?  We don&#8217;t want to overcommit, as the connection tracker uses unswappable memory and things will blow up. If we set aside 512MB for connection tracking, how many concurrent connections can we track?</p>
<p>There is some <a href="http://www.wallfire.org/misc/netfilter_conntrack_perf.txt">Netfilter documentation on wallfire.org</a>, but it&#8217;s quite old. How can we be sure it&#8217;s still correct without completely understanding the Netfilter code?  Does it account for real life constraints such as page size, or is it just derived from looking at the code? A running Linux kernel gives us all the info we need through its <code>slabinfo</code> proc file.<br />
<span id="more-372"></span><br />
We can peek at how the kernel is using RAM using the proc file <code>/proc/slabinfo</code> and clear this up.  The <code>nf_conntrack</code> entry from here tells us, on one particular firewall, that there are 26,702 active entries (or objects), that each object is 304 bytes in size and 13 of them fit in each slab (and that each slab is 1 kernel page).  So we know that conntrack entries take up 304 bytes each.  But if we&#8217;re going to be accurate, then we have to account for the overhead of the kernel page size.</p>
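<p>As a rough sketch of pulling those numbers out, here is some Ruby that splits up the leading columns of a slabinfo line. The sample line is an assumption: it is built from the figures above, the total object count (26728) is a placeholder, and a real line carries further tunables and slabdata columns that are ignored here. On a live box you would read the line from <code>/proc/slabinfo</code> instead, which on recent kernels is typically readable only by root.</p>
<pre><code># Hypothetical sample line; only the leading columns are shown
sample = "nf_conntrack 26702 26728 304 13 1"

# Columns: name, active objects, total objects, object size in bytes,
# objects per slab, pages per slab
name, active, total, size, per_slab, pages = sample.split.first(6)

entry = {
  name: name,
  active_objects: active.to_i,
  total_objects: total.to_i,
  object_size: size.to_i,
  objects_per_slab: per_slab.to_i,
  pages_per_slab: pages.to_i
}

puts "#{entry[:name]}: #{entry[:active_objects]} active objects of #{entry[:object_size]} bytes"
</code></pre>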
<p>The Linux kernel uses a <a href="http://en.wikipedia.org/wiki/Slab_allocation">slab memory allocator</a>, so rather than allocating 304 bytes every time a conntrack entry is needed, they are allocated in &#8220;slabs&#8221; of one or more kernel pages, which reduces memory fragmentation and improves performance.  When they&#8217;re done with, they&#8217;re not immediately freed &#8211; instead the memory is reused the next time another object of the same type is needed.</p>
<p>In the kernel we&#8217;re using, the page size is 4096 bytes. As slabinfo told us, 13 nf_conntrack objects fit in each slab and each slab takes up 1 page. 13 objects of 304 bytes is 3952 bytes in total, which leaves 144 bytes of waste per slab.  So every 13 objects we waste 144 bytes, or about 11 bytes per object. So <strong>an <code>nf_conntrack</code> object consumes about 315 bytes</strong> on this box (4096 / 13), giving us just over 1.7 million entries for our 512MB.</p>
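<p>The same sums as a few lines of Ruby, if you want to plug in the numbers from your own box. A minimal sketch: the variable names are made up, the inputs are the ones quoted above, and it assumes whole slabs are the only overhead worth counting.</p>
<pre><code>page_size = 4096              # bytes, from getconf PAGESIZE
object_size = 304             # bytes, from slabinfo
objects_per_slab = 13
pages_per_slab = 1
budget = 512 * 1024 * 1024    # the 512MB we set aside

slab_bytes = page_size * pages_per_slab
waste_per_slab = slab_bytes - objects_per_slab * object_size
effective_size = slab_bytes.to_f / objects_per_slab
max_entries = (budget / slab_bytes) * objects_per_slab

puts "waste per slab: #{waste_per_slab} bytes"
puts "effective object size: #{effective_size.round(2)} bytes"
puts "entries in budget: #{max_entries}"
</code></pre>
<p>On the figures above that works out at 144 bytes wasted per slab and 1,703,936 entries, counting whole slabs only.</p>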
<p>You can get your kernel&#8217;s page size with the command: <code>getconf PAGESIZE</code>.  The slabtop program, installed on most modern GNU/Linuxes, shows the info from <code>/proc/slabinfo</code> in a pretty table and lets you sort the values.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/06/17/372/netfilter-conntrack-memory-usage/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>My Ukepedia Talk at Barcamp Leeds 2009</title>
		<link>http://johnleach.co.uk/words/archives/2009/06/05/374/my-ukepedia-talk-at-barcamp-leeds-2009</link>
		<comments>http://johnleach.co.uk/words/archives/2009/06/05/374/my-ukepedia-talk-at-barcamp-leeds-2009#comments</comments>
		<pubDate>Fri, 05 Jun 2009 09:25:19 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[barcamp]]></category>
		<category><![CDATA[bcleeds09]]></category>
		<category><![CDATA[leeds]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[otitis media]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[sing]]></category>
		<category><![CDATA[song]]></category>
		<category><![CDATA[ukelele]]></category>
		<category><![CDATA[ukepedia]]></category>
		<category><![CDATA[wikipedia]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=374</guid>
		<description><![CDATA[Tim Dobson very kindly recorded and uploaded my talk on the Ukepedia at Barcamp Leeds last Saturday. For those of you with short attention spans, I finally get started with the talk at about 2mins 30, and start singing the first article, Otitis Media, at about 7mins.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.tdobson.net/">Tim Dobson</a> very kindly recorded and uploaded my talk on the <a href="http://www.ukepedia.com/">Ukepedia</a> at Barcamp Leeds last Saturday.</p>
<p>For those of you with short attention spans, I finally get started with the talk at about 2mins 30, and start singing the first article, Otitis Media, at about 7mins.</p>
<p><embed src="http://blip.tv/play/AYGHjx8A" type="application/x-shockwave-flash" width="450" height="344" allowscriptaccess="always" allowfullscreen="true"></embed></p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/06/05/374/my-ukepedia-talk-at-barcamp-leeds-2009/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Live this Saturday at the Packhorse in Leeds, The Gillroyd Parade</title>
		<link>http://johnleach.co.uk/words/archives/2009/05/12/367/live-this-saturday-at-the-packhorse-in-leeds-the-gillroyd-parade</link>
		<comments>http://johnleach.co.uk/words/archives/2009/05/12/367/live-this-saturday-at-the-packhorse-in-leeds-the-gillroyd-parade#comments</comments>
		<pubDate>Tue, 12 May 2009 12:57:26 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Music]]></category>
		<category><![CDATA[Personal]]></category>
		<category><![CDATA[folk]]></category>
		<category><![CDATA[gospel]]></category>
		<category><![CDATA[guitar]]></category>
		<category><![CDATA[harmonica]]></category>
		<category><![CDATA[sci-fi]]></category>
		<category><![CDATA[theremin]]></category>
		<category><![CDATA[ukelele]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=367</guid>
		<description><![CDATA[My band, The Gillroyd Parade, are hosting an evening of acoustic music at the Packhorse Pub this Saturday (7pm to 11pm, 16th May). Supported by Ukelele Bitch Slap. Do come along, it&#8217;d be just dandy to see you.  Full poster here.]]></description>
			<content:encoded><![CDATA[<p>My band, <a href="http://www.thegillroydparade.org.uk/">The Gillroyd Parade</a>, are hosting an evening of acoustic music at the <a href="http://www.beerintheevening.com/pubs/s/30/3046/Packhorse/Woodhouse">Packhorse Pub</a> this Saturday (7pm to 11pm, 16th May). Supported by <a href="http://www.myspace.com/ukulelebitchslap">Ukelele Bitch Slap</a>. Do come along, it&#8217;d be just dandy to see you.  <a href="http://www.thegillroydparade.org.uk/packhorse-may-2009.jpg">Full poster here.</a></p>
<p><img class="alignnone size-full wp-image-369" title="The Gillroyd Parade" src="http://johnleach.co.uk/words/wp-content/uploads/2009/05/gillroyd-parade-raygun.jpg" alt="The Gillroyd Parade" width="450" height="411" /></p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/05/12/367/live-this-saturday-at-the-packhorse-in-leeds-the-gillroyd-parade/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>April Fool: A man in Jalawla walked into a bar&#8230;</title>
		<link>http://johnleach.co.uk/words/archives/2009/04/01/364/april-fool-a-man-in-jalawla-walked-into-a-bar</link>
		<comments>http://johnleach.co.uk/words/archives/2009/04/01/364/april-fool-a-man-in-jalawla-walked-into-a-bar#comments</comments>
		<pubDate>Wed, 01 Apr 2009 10:51:19 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Politics]]></category>
		<category><![CDATA[al-qaeda]]></category>
		<category><![CDATA[bbc]]></category>
		<category><![CDATA[bomb]]></category>
		<category><![CDATA[iraq]]></category>
		<category><![CDATA[journalism]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[medialens]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[propaganda]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=364</guid>
		<description><![CDATA[Medialens spotted that the BBC attributed a bomb attack on Monday in Iraq to &#8220;al-Qaeda&#8221;, with apparently little evidence.  They wrote to the BBC&#8217;s &#8220;man in Baghdad&#8221;, Hugh Sykes, and asked him &#8220;what is the evidence that al-Qaeda, rather than some other insurgent group, were behind the attacks&#8221;?. Hugh&#8217;s answer genuinely made me think this [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.medialens.org/">Medialens</a> spotted that the BBC <a href="http://news.bbc.co.uk/1/hi/world/middle_east/7959918.stm">attributed a bomb attack on Monday in Iraq to &#8220;al-Qaeda&#8221;,</a> with apparently little evidence.  They wrote to the BBC&#8217;s &#8220;man in Baghdad&#8221;, Hugh Sykes, and asked him &#8220;what is the evidence that al-Qaeda, rather than some other insurgent group, were behind the attacks&#8221;?.</p>
<p>Hugh&#8217;s answer genuinely made me think this was an early April Fool&#8217;s joke. In fact I&#8217;m still not sure Medialens aren&#8217;t making me look like an idiot:</p>
<blockquote><p>No proof, but circumstantial evidence and reasonable presumption of AQI [al-Qaeda in Iraq] involvement &#8211; very much their modus operandum. Suicide attacks are their signature method, and this was a dramatic detonation suggesting a lot of explosive &#8211; again, very AQI.</p>
<p>And&#8230;who else would do this?</p>
<p>So, process of elimination, history of AQI attacks in Diyala etc.</p>
<p>And the logic of it Sunni Arab vs Iraqi Kurds. As a man in Jalawla told Reuters:</p>
<p><em>&#8220;Al-Qaida is targeting the Kurds because it believes that<br />
we are involved in the political process and collaborating<br />
with the Americans.&#8221;</em></p></blockquote>
<p>This blows my mind. &#8220;very AQI&#8221; and &#8220;a man in Jalawla told Reuters&#8221;. &#8220;Who else would do this?&#8221;</p>
<p>As Medialens point out, the BBC claim they are &#8220;committed to evidence-based journalism&#8221;. Except they pick and choose when their commitment applies, such as when they refused to report the use of banned weapons by US forces in their November 2004 assault on Fallujah.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/04/01/364/april-fool-a-man-in-jalawla-walked-into-a-bar/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My NWRUG Ferret Talk</title>
		<link>http://johnleach.co.uk/words/archives/2009/03/24/362/my-nwrug-ferret-talk</link>
		<comments>http://johnleach.co.uk/words/archives/2009/03/24/362/my-nwrug-ferret-talk#comments</comments>
		<pubDate>Tue, 24 Mar 2009 17:12:07 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Ruby on Rails]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[ferret]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[inverse index]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[sphinx]]></category>
		<category><![CDATA[talk]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=362</guid>
		<description><![CDATA[I did a short talk on Ferret, the Ruby &#8220;Information Retrieval Library&#8221;, at the North West Ruby Users Group last Thursday.  We had a bit of a theme too, with Will Jessop speaking about Sphinx and Asa Calow speaking about Solr. I got to have a bit of a nosey around the Manchester BBC building [...]]]></description>
			<content:encoded><![CDATA[<p>I did a short talk on <a href="http://ferret.davebalmain.com/">Ferret</a>, the Ruby &#8220;Information Retrieval Library&#8221;, at the <a href="http://nwrug.org/events/march09/">North West Ruby Users Group</a> last Thursday.  We had a bit of a theme too, with Will Jessop speaking about Sphinx and Asa Calow speaking about Solr.</p>
<p>I got to have a bit of a nosey around the Manchester BBC building too &#8211; though I was worried I&#8217;d open the wrong door and end up on TV. Didn&#8217;t fancy having to apologise to Jeremy Paxman.</p>
<p><a href="http://www.brightbox.co.uk">Brightbox</a> also sponsored some pizza, and gave away t-shirts and stickers like candy (there was no candy though).</p>
<p>My <a href="http://johnleach.co.uk/documents/talks/090319-ruby-ferret-nwrug/">slides are available here</a>, and contain a little example file system indexer. I made my slides with <a href="http://webby.rubyforge.org/">webby</a> and <a href="http://github.com/geraldb/s6/tree/master">S6</a> if you&#8217;re interested.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/03/24/362/my-nwrug-ferret-talk/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Women in Technology</title>
		<link>http://johnleach.co.uk/words/archives/2009/03/16/350/women-in-technology</link>
		<comments>http://johnleach.co.uk/words/archives/2009/03/16/350/women-in-technology#comments</comments>
		<pubDate>Mon, 16 Mar 2009 23:07:56 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[event]]></category>
		<category><![CDATA[forward-ladies]]></category>
		<category><![CDATA[geek-girl]]></category>
		<category><![CDATA[geeks]]></category>
		<category><![CDATA[geekup]]></category>
		<category><![CDATA[leeds]]></category>
		<category><![CDATA[social]]></category>
		<category><![CDATA[women]]></category>
		<category><![CDATA[women on the web]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=350</guid>
		<description><![CDATA[Dom kicked up a women in technology debate again recently.  I&#8217;ve seen a few responses, from one chap who thinks women have achieved equality already to a woman who doesn&#8217;t think girls&#8217; brains are generally good for &#8220;programming&#8221; &#8211; and someone else who thinks there isn&#8217;t a problem as long as you&#8217;re thick skinned enough [...]]]></description>
			<content:encoded><![CDATA[<p>Dom <a href="http://www.thehodge.co.uk/random-musings/rants/women-in-technology-events.php">kicked up</a> a women in technology debate again recently.  I&#8217;ve seen a few responses, from one chap who thinks women have achieved equality already to a woman who doesn&#8217;t think girls&#8217; brains are generally good for &#8220;programming&#8221; &#8211; and someone else who thinks there isn&#8217;t a problem as long as you&#8217;re thick skinned enough to put up with a sexually hostile workplace.</p>
<p>The main gripe appears to be with &#8220;women only&#8221; conferences, such as the <a href="http://womenontheweb.wordpress.com/">Women on the Web conference</a>, organised by a group called <a href="http://www.forwardladies.com/">Forward Ladies</a>, or the <a href="http://londongirlgeekdinners.co.uk/about-us/">Geek Girl dinners</a>.</p>
<p>I think a fair summary of his opinion, and that of some other commenters, is that these &#8220;women-only&#8221; events don&#8217;t help the effort to get more women involved in technology, and are in many ways comparable to positive discrimination.</p>
<h3><span id="more-350"></span>Women Friendly</h3>
<p>The way I see these events is more &#8220;women-friendly&#8221;, rather than &#8220;women-only&#8221;.  With Geek Girl dinners, this is explicit, as men can attend at the invite of a woman. A simple, but generally effective heuristic to select for friendliness to women.</p>
<p>With Women on the Web, it&#8217;s less clear, but <strong>nowhere does it state men are not welcome</strong>.  The same goes for the Forward Ladies membership terms and conditions. I&#8217;m not sure how Dom knows this event is women only (I&#8217;ve emailed Forward Ladies for more info).</p>
<p>Until I&#8217;m shown otherwise, I&#8217;m assuming the Women on the Web conference is women-friendly, not women-only.  Of course, if you browse the site you see photos of women, and all the past speakers appear to be women, and this might not be that inviting to men &#8211; but how inviting do you think <a title="Info-security conference" href="http://www.infosec.co.uk">Infosec</a> is to women, with photos of rooms full of men, and women in skimpy clothes giving out leaflets? Perhaps all the speakers are women because they just don&#8217;t get many men interested &#8211; maybe men just  need to be thicker skinned and ask to be involved.</p>
<p>As an amusing side note, after checking the Infosec url was correct I caught sight of this year&#8217;s branding &#8211; hundreds of what I assume are just people, but they happen to be using the pretty well established symbol for &#8220;men&#8221; (let&#8217;s not get into that though).</p>
<p><a href="http://www.infosec.co.uk"><img class="alignnone size-full wp-image-353" title="Infosec conference logo" src="http://johnleach.co.uk/words/wp-content/uploads/2009/03/infosec-men-logo.png" alt="Infosec conference logo" width="384" height="141" /></a></p>
<h3>Demand</h3>
<p>You only have to <a href="http://www.flickr.com/photos/imran/sets/72157606723254317/">see the turnout of women</a> at Leeds Geek Girl dinners compared to the turnout of women at Leeds Geekup to know there is a demand for these women-friendly events. For whatever reasons, some women are more likely to go to an explicitly women-friendly geek event than another random geek event. Of course women should be encouraged to come to all events, but not all events are equally friendly to women, and it&#8217;s often difficult to assess how friendly they are from outside.</p>
<h3>Social Groups</h3>
<p>Women are a social group  &#8211; they&#8217;ll often share some common experiences and outlooks due to their sex &#8211; due to society&#8217;s treatment of them as a group of people. This goes for men too. And gay people. And geeks. Etc.  When you see women in this way, you&#8217;d expect some of them to organise and attend these types of events. Most social groups do.</p>
<p>Geek Girl Dinners just have a convenient way of selecting women-friendly people: women, or men invited by women. Of course, not all women are women friendly  &#8211; that is just a stereotype, but it seems to work for them.</p>
<p>Largely though, I think if you name things and brand things right, you get the people you&#8217;re hoping for. Maybe that&#8217;s why Infosec is a sausage-fest.</p>
<h4>Update</h4>
<p>Forward Ladies has confirmed that men <em>are</em> welcome to Women on the Web, and are welcome as members of Forward Ladies too. They also run 50/50 events &#8220;to which men are specifically invited&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/03/16/350/women-in-technology/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Leeds Market Big Wigs</title>
		<link>http://johnleach.co.uk/words/archives/2009/03/07/348/leeds-market-big-wigs</link>
		<comments>http://johnleach.co.uk/words/archives/2009/03/07/348/leeds-market-big-wigs#comments</comments>
		<pubDate>Sat, 07 Mar 2009 10:48:52 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Photoblog]]></category>
		<category><![CDATA[Photography]]></category>
		<category><![CDATA[heads]]></category>
		<category><![CDATA[leeds]]></category>
		<category><![CDATA[mannequins]]></category>
		<category><![CDATA[market]]></category>
		<category><![CDATA[photos]]></category>
		<category><![CDATA[wigs]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=348</guid>
		<description><![CDATA[More Leeds Market photos here on my Flickr profile.]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" title="Big Wigs" src="http://farm4.static.flickr.com/3333/3333500433_3b4b7315e7.jpg" alt="" width="500" height="333" /></p>
<p>More Leeds Market photos <a href="http://www.flickr.com/photos/johnleach/sets/72157614816420291/">here on my Flickr profile</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/03/07/348/leeds-market-big-wigs/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Techietubbies live video podcast</title>
		<link>http://johnleach.co.uk/words/archives/2009/02/16/340/techietubbies-live-video-podcast</link>
		<comments>http://johnleach.co.uk/words/archives/2009/02/16/340/techietubbies-live-video-podcast#comments</comments>
		<pubDate>Mon, 16 Feb 2009 16:46:58 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=340</guid>
		<description><![CDATA[I&#8217;m joining Dom and Rahoul tonight on a live video broadcast of their Techietubbies podcast thing. From the site: &#8220;Techietubbies is a weekly podcast covering a multitude of subjects, from a round up of the week&#8217;s tech news, live callers, competitions, questions and answers&#8230; and beer :)&#8221; Though I&#8217;m driving, so no tech news for [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m joining Dom and Rahoul tonight on a live video broadcast of their <a href="http://www.techietubbies.co.uk/podcast/">Techietubbies</a> podcast thing.</p>
<p>From the site:</p>
<blockquote><p>&#8220;Techietubbies is a weekly podcast covering a multitude of subjects, from a round up of the week&#8217;s tech news, live callers, competitions, questions and answers&#8230; and beer :)&#8221;</p></blockquote>
<p>Though I&#8217;m driving, so no tech news for me. I think it&#8217;s recorded if you can&#8217;t see the live thing.  It&#8217;ll be <a href="http://www.ustream.tv/techietubbies">broadcast live here via ustream.tv</a></p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/02/16/340/techietubbies-live-video-podcast/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>My native language</title>
		<link>http://johnleach.co.uk/words/archives/2009/02/16/335/my-native-language</link>
		<comments>http://johnleach.co.uk/words/archives/2009/02/16/335/my-native-language#comments</comments>
		<pubDate>Mon, 16 Feb 2009 14:36:06 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[brains]]></category>
		<category><![CDATA[concious]]></category>
		<category><![CDATA[geek]]></category>
		<category><![CDATA[intuition]]></category>
		<category><![CDATA[neuroscience]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[psychology]]></category>
		<category><![CDATA[subconcious.]]></category>
		<category><![CDATA[thinking]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=335</guid>
		<description><![CDATA[I&#8217;m currently reading Nudge, by Richard H. Thaler and Cass R. Sunstein. It says many psychologists and neuroscientists agree that we humans have two general types of thinking, intuitive and rational. Also known as automatic and reflective.  When dodging a ball thrown at you, getting nervous when your aeroplane hits turbulence or smiling when you [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-full wp-image-337" title="Severed head" src="http://johnleach.co.uk/words/wp-content/uploads/2009/02/severed-head.png" alt="Severed head" width="165" height="337" /> I&#8217;m currently reading <a href="http://www.nudges.org/">Nudge</a>, by Richard H. Thaler and Cass R. Sunstein. It says many psychologists and neuroscientists agree that we humans have two general types of thinking, intuitive and rational. Also known as automatic and reflective.  When dodging a ball thrown at you, getting nervous when your aeroplane hits turbulence or smiling when you see a cute cat, the automatic system is working.  When doing some mathematics, or writing a blog post, you (mostly) use reflective.  Speaking native, or &#8220;first&#8221; languages uses the automatic.  Speaking a second language usually uses reflective.</p>
<p>I realised that having tinkered with computers heavily almost my entire life, a lot of my &#8220;computer skills&#8221; have shifted into the intuitive, automatic systems.  I obviously (hopefully) use the rational systems a great deal, but underlying it is definitely intuition &#8211; the gut feeling of where to go next to solve the problem.  I regularly come up with seemingly random avenues of investigation that lead to gold and I couldn&#8217;t say with any certainty why I thought of them.  I&#8217;m assuming this is the same for most computer geeks (and chess geeks, cooking geeks, music geeks etc. :).  It&#8217;s become a native language for us.</p>
<p>I don&#8217;t think the average rational system can easily deal with very complex problems.  It&#8217;s great for some more-linear concentrated work or planning, but for big stuff with lots of parts &#8211; hard work.  I think I usually research and &#8220;pre-process&#8221; a bunch of material around a problem using my rational system, then my automatic system gets to work mulling over the bigger picture.  Then when I&#8217;m making rational decisions about it, I&#8217;m heavily informed by the intuition. Or sometimes just when I&#8217;m showering.</p>
<p>Anyway, not sure where I was going with this other than an &#8220;aren&#8217;t I great&#8221; blog post. The summary would be, don&#8217;t rely on your rational systems so much. Give the intuitive some good mulling time. And shower regularly.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/02/16/335/my-native-language/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Gravedigg: What will die next?</title>
		<link>http://johnleach.co.uk/words/archives/2009/02/14/331/gravedigg-what-will-die-next</link>
		<comments>http://johnleach.co.uk/words/archives/2009/02/14/331/gravedigg-what-will-die-next#comments</comments>
		<pubDate>Sat, 14 Feb 2009 13:45:56 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[bust]]></category>
		<category><![CDATA[death]]></category>
		<category><![CDATA[digg]]></category>
		<category><![CDATA[failing]]></category>
		<category><![CDATA[vote]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=331</guid>
		<description><![CDATA[Gravedigg is like Digg, but rather than voting for pictures of cute cats or top ten lists of stuff, you vote on what you think will die or fail next.  Companies, celebrities, technologies&#8230; whatever.  So maybe you think the Perl programming language is on it&#8217;s way out very soon, or that Iceland is on its [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" title="Grave Digg Gravestone" src="http://www.gravedigg.co.uk/themes/gravedigg/images/grave-64.png" alt="" width="64" height="70" /><a href="http://www.gravedigg.co.uk">Gravedigg</a> is like Digg, but rather than voting for pictures of cute cats or top ten lists of stuff, you vote on what you think will die or fail next.  Companies, celebrities, technologies&#8230; whatever.  So maybe you think the <a href="http://www.gravedigg.co.uk/candidates/16-perl-programming-language">Perl programming language</a> is on its way out very soon, or that <a href="http://www.gravedigg.co.uk/candidates/10-iceland">Iceland is on its last legs</a> or that <a href="http://www.gravedigg.co.uk/candidates/18-steve-jobs">Steve Jobs is boned</a>.</p>
<p><a href="http://www.louisaparry.co.uk">Louisa</a> and I put this together in just a few days, me coding and Louisa designing. Was a fun little project to do.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/02/14/331/gravedigg-what-will-die-next/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SAS and the R Programming Language</title>
		<link>http://johnleach.co.uk/words/archives/2009/01/08/327/sas-and-the-r-programming-language</link>
		<comments>http://johnleach.co.uk/words/archives/2009/01/08/327/sas-and-the-r-programming-language#comments</comments>
		<pubDate>Thu, 08 Jan 2009 00:16:21 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[free software]]></category>
		<category><![CDATA[nytimes]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[sas]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=327</guid>
		<description><![CDATA[This New York Times article about the R programming language is pretty good, though there is a hilarious quote in it from a proprietary software company that apparently makes a similar product. Anne H. Milley, director of technology product marketing at SAS says: “We have customers who build engines for aircraft. I am happy they are [...]]]></description>
			<content:encoded><![CDATA[<p>This <a href="http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=1">New York Times article</a> about <a href="http://www.r-project.org/">the R programming language</a> is pretty good, though there is a hilarious quote in it from a proprietary software company that apparently makes a similar product. Anne H. Milley, director of technology product marketing at <a href="http://www.sas.com/">SAS</a> says:</p>
<blockquote><p>“We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.”</p></blockquote>
<p>That&#8217;s pretty funny. She&#8217;s basically saying</p>
<blockquote><p>&#8220;It&#8217;s better to build important things with tools you can&#8217;t examine for yourself.&#8221;</p></blockquote>
<p>SAS claim to have over 40,000 customer sites worldwide.  The news article claims 250,000 people use R regularly.  The difference here isn&#8217;t in the number of users; it&#8217;s that, with R, every user is a potential developer.  SAS can&#8217;t possibly compete with that.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2009/01/08/327/sas-and-the-r-programming-language/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Virtualized Storage Talk at WYLUG</title>
		<link>http://johnleach.co.uk/words/archives/2008/11/10/325/virtualized-storage-talk-at-wylug</link>
		<comments>http://johnleach.co.uk/words/archives/2008/11/10/325/virtualized-storage-talk-at-wylug#comments</comments>
		<pubDate>Mon, 10 Nov 2008 16:47:32 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[lvm]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[talks]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[wylug]]></category>

		<guid isPermaLink="false">http://johnleach.co.uk/words/?p=325</guid>
		<description><![CDATA[I&#8217;m doing a talk tonight about virtualizing your storage with LVM on Linux at the West Yorkshire Linux User Group. Sorry about the short notice here (it was announced earlier in the week elsewhere though). My mate Paul Brook is talking about RAID on Linux too. Come along for the talk, or the beer, or [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m doing a talk tonight about virtualizing your storage with LVM on Linux at the <a href="http://www.wylug.org.uk/2008/11/wylug-monthly-meeting-monday-10th-november-2008/">West Yorkshire Linux User Group</a>.  Sorry about the short notice here (it was announced earlier in the week elsewhere though).</p>
<p>My mate Paul Brook is talking about RAID on Linux too.</p>
<p>Come along for the talk, or the beer, or the socialising &#8211; or all three.</p>
]]></content:encoded>
			<wfw:commentRss>http://johnleach.co.uk/words/archives/2008/11/10/325/virtualized-storage-talk-at-wylug/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
