time to bleed by Joe Damato

technical ramblings from a wanna-be unix dinosaur

Fibers implemented for Ruby 1.8.{6,7}


At Kickball Labs, Aman Gupta (http://github.com/tmm1) and I (http://github.com/ice799) have been working on an implementation of Fibers for Ruby 1.8.{6,7}. It is API-compatible with Fibers in Ruby 1.9, except for the “transfer” method, which is currently unimplemented. This patch will allow you to use fibers with mysqlplus and neverblock.

THIS IS ALPHA SOFTWARE (we are using it in production, though), USE WITH CAUTION.

Raw patches

Patch against ruby-1.8.7_p72: HERE.

Patch against ruby-1.8.6_p287: HERE.

To use the patch:
Download the Ruby source for Ruby 1.8.7_p72 or, if you prefer, Ruby 1.8.6-p287.

Then, perform the following:

cd your-ruby-src-directory/
wget http://timetobleed.com/files/fibers-RUBY_VERSION.patch
patch -p1 < fibers-RUBY_VERSION.patch
./configure --disable-pthread --prefix=/tmp/ruby-with-fibers/ && make && sudo make install
/tmp/ruby-with-fibers/bin/ruby test/test_fiber.rb

This will patch ruby and install it to a custom location: /tmp/ruby-with-fibers so you can test and play around with it without overwriting your existing Ruby installation.

Github

I am currently working on getting the ruby 1.8.6 patched code up on github, but Aman has a branch of ruby 1.8.7_p72 called fibers with the code at http://github.com/tmm1/ruby187/tree/fibers

What are fibers?

Fibers are (usually) non-preemptible lightweight user-land threads.

But I thought Ruby 1.8.{6,7} already had green threads?

You are right; it does. Fibers are simply ruby green threads, without preemption. The programmer (you), rather than a timer, gets to decide when to pause and resume execution of a fiber.
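To make the pause/resume idea concrete, here is a minimal sketch using the Ruby 1.9 Fiber API (Fiber.new, Fiber#resume, Fiber.yield), which the patch aims to be compatible with. The trace array is only there to show the order of execution; it is not part of any API.

```ruby
# Records the order in which things run (illustration only).
trace = []

fiber = Fiber.new do
  trace << :fiber_started
  Fiber.yield :paused_once   # hand control back to whoever called resume
  trace << :fiber_resumed
  :finished                  # the block's value is returned by the last resume
end

trace << :before_first_resume
first = fiber.resume         # runs the block up to Fiber.yield
trace << :between_resumes
second = fiber.resume        # continues after Fiber.yield and runs to the end
```

Nothing inside the fiber runs until resume is called, and nothing interrupts it between yield points. That is the "no timer" part.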

Why would I use fibers?

Bottom line: Your I/O should be asynchronous whenever possible, but sometimes re-writing your entire code base to be asynch and have callbacks can be difficult or painful. A simple solution to this problem is to create or use (see: NeverBlock) some middleware that wraps code paths which make I/O requests in a fiber.

The middleware can issue the asynch I/O operation in a fiber, and yield. Once the middleware’s asynch callback is hit, the fiber can be resumed. Using NeverBlock (or rolling something similar yourself) should require only minimal code changes to your application, and will essentially make all of your I/O requests asynchronous without much pain at all.
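A rough sketch of that pattern, with no real I/O involved: PENDING_CALLBACKS, async_query, query and handle_request are all hypothetical names invented for this illustration, and none of this is NeverBlock's actual code. It also assumes Ruby 1.9 or later, where Fiber.current comes from require 'fiber'.

```ruby
require 'fiber' # provides Fiber.current on Ruby 1.9 (assumption: 1.9+ API)

# Hypothetical stand-in for an event loop: callbacks waiting on "I/O".
PENDING_CALLBACKS = []
LOG = []

# Hypothetical async API: registers a callback instead of blocking.
def async_query(sql, &callback)
  PENDING_CALLBACKS << lambda { callback.call("result of #{sql}") }
end

# Looks synchronous to the caller, but pauses the current fiber instead of
# blocking the whole process. Must be called from inside a fiber.
def query(sql)
  fiber = Fiber.current
  async_query(sql) { |result| fiber.resume(result) }
  Fiber.yield # paused here; returns whatever is passed to fiber.resume
end

# The "middleware": run each request handler in its own fiber.
def handle_request(name)
  Fiber.new do
    LOG << "#{name}: start"
    row = query("select 1 /* #{name} */")
    LOG << "#{name}: got #{row}"
  end.resume
end

handle_request("A") # runs until its query, then pauses
handle_request("B") # starts while A is still waiting

# Toy event loop: fire the callbacks as the pretend I/O completes.
PENDING_CALLBACKS.shift.call until PENDING_CALLBACKS.empty?
```

The handler body reads like blocking code, yet request B starts before request A has its result. Only query knows that a fiber is involved.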

How do I use fibers?

There are already lots of great tutorials about fibers basics here and here.

Let’s take a look at something that drives home the point about being able to drop in some middleware to make synchronous code act asynchronous with minimal changes.

Consider the following code snippet:

require 'rubygems'
require 'sinatra'

# eventmachine/thin
require 'eventmachine'
require 'thin'

# mysql
require 'mysqlplus'

# single threaded
DB = Mysql.connect

disable :reload

get '/' do
  4.times do
    DB.query('select sleep(0.25)')
  end
  'done'
end
 

This code snippet creates a simple webservice which connects to a mysql database and issues long running queries (in this case, 4 queries which execute for a total of 1 second).

In this implementation, only one request can be handled at a time; the DB.query blocks, so the other users have to wait to have their queries executed.

This sucks because certainly mysql can handle more than just 4 sleep(0.25) queries a second! But, what are our options?

Well, we can rewrite the code to be asynchronous and string together some callbacks. For my contrived example, doing that would be pretty easy and it’d be only slightly harder to read. Let’s use our imaginations. Let’s pretend the code snippet I just showed you was some huge, ugly, scary blob of code and rewriting it to be asynchronous would not only take a long time, it would also make the code very ugly and difficult to read.

Now, let’s drop in fibers:

require 'rubygems'
require 'sinatra'

# eventmachine/thin
require 'eventmachine'
require 'thin'

# mysql
require 'mysqlplus'

# fibered
require 'neverblock'
require 'never_block/servers/thin'
require 'neverblock-mysql'
class Thin::Server
  def fiber_pool() @fiber_pool ||= NB::Pool::FiberPool.new(20) end
end

DB = NB::DB::PooledDBConnection.new(20){ NB::DB::FMysql.connect }

disable :reload

get '/' do
  4.times do
    DB.query('select sleep(0.25)')
  end
  'done'
end
 

NOTICE: The route handler hasn’t changed; we simply monkey patched Thin to use a pool of fibers and swapped the database connection for NeverBlock’s pooled one.

Suddenly, our application can handle 20 concurrent connections. This is all handled by NeverBlock and mysqlplus.

  • NeverBlock uses the fiber pool to issue an asynch DB query via mysqlplus.
  • After the asynch query is issued, NeverBlock pauses the executing fiber.
  • At this point other requests can be serviced.
  • When the data comes back from the mysql server, a callback in NeverBlock is executed.
  • The callback resumes the paused fiber, which continues executing.
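The pooling side of this can be sketched in a few lines. TinyFiberPool below is a hypothetical toy written for this post's explanation, not NB::Pool::FiberPool; the waiting array stands in for queries that are in flight, and the code assumes the Ruby 1.9+ Fiber API.

```ruby
require 'fiber' # provides Fiber.current on Ruby 1.9 (assumption: 1.9+ API)

# Hypothetical toy pool: a fixed set of fibers, plus a queue for overflow.
class TinyFiberPool
  def initialize(size)
    @queue = []                          # jobs waiting for a free fiber
    @idle  = Array.new(size) { new_worker }
  end

  # Run the block in a pooled fiber, or queue it if every fiber is busy.
  def spawn(&job)
    if (fiber = @idle.shift)
      fiber.resume(job)
    else
      @queue << job
    end
  end

  private

  def new_worker
    Fiber.new do |job|
      loop do
        job.call
        # Take queued work if there is any; otherwise park until spawn
        # resumes this fiber with a new job.
        unless (job = @queue.shift)
          @idle << Fiber.current
          job = Fiber.yield
        end
      end
    end
  end
end

pool    = TinyFiberPool.new(2)
waiting = [] # fibers paused on pretend I/O
order   = []

3.times do |i|
  pool.spawn do
    order << "job #{i} started"
    waiting << Fiber.current
    Fiber.yield                          # pretend to wait for a DB reply
    order << "job #{i} finished"
  end
end

# Pretend the replies arrive: resume each paused fiber in turn.
waiting.shift.resume until waiting.empty?
```

With two fibers and three jobs, the third job only starts once the first one finishes and frees its fiber. That is the same reason the example above tops out at 20 requests in flight.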

Pretty sick, right?

Memory consumption, context switches, cooperative multi-threading, oh my!

In our implementation, fibers are ruby green threads, but with no scheduler or preemption. In fact, our fiber implementation shares many code-paths with the existing green thread implementation. As a result, there is very little difference in memory consumption between green threads and our fiber implementation.

Context switches are a different matter altogether. The whole point of building a fiber implementation is to allow the programmer to decide when context switching is appropriate. In most circumstances, the application should be undergoing many fewer context switches with fibers and the context switches that do happen occur precisely when needed. As a result, the application can tend to run faster (fewer context switches ==> fewer stack copies ==> fewer CPU cycles).

The major advantage of fibers over green threads is that you get to control when execution starts and stops. The major disadvantage of fibers is that you have to code carefully, to ensure that you are starting and stopping your fibers appropriately.
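Two small examples of what "carefully" means, written against the Ruby 1.9 Fiber API. Whether the 1.8 patch raises exactly the same error is an assumption; in 1.9 it is FiberError. The variable names are made up for the illustration.

```ruby
# 1. Resuming a fiber that has already finished is an error.
fiber = Fiber.new do
  Fiber.yield :first
  :last
end

fiber.resume # => :first
fiber.resume # => :last, the fiber has now run to completion

error = nil
begin
  fiber.resume # nothing left to run
rescue FiberError => e
  error = e
end

# 2. Forgetting to resume means the rest of the fiber never runs.
ran_cleanup = false
forgotten = Fiber.new do
  Fiber.yield
  ran_cleanup = true # only reached if someone resumes the fiber again
end
forgotten.resume
# No second resume: ran_cleanup stays false and nothing warns you about it.
```

With green threads the scheduler would eventually get back to that second block on its own; with fibers, nobody does unless your code says so.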

Future Directions

Next stop will be “stackless” fibers. I have a fork of the fibers implementation in the works that pre-allocates fiber stacks on the ruby process’ heap. I am hoping to eliminate the overhead associated with switching between fibers by simply shuffling pointers around.

A preliminary version seems to work, although a few bugs that crop up when you use fibers and threads together need to be squashed before the code can be considered “alpha” stage. When it’s done, you’ll find it right here.

Written by Joe Damato

February 5th, 2009 at 5:25 pm

  • ARJ

    We've modified Joe's 1.8.7p72 fibers patch to work with 1.8.7p174 and 1.8.7p249. It's great and very useful to us.

  • FWIW, I've solved the same problem using Ruby's very own 'Kernel#callcc'. Pros: small (~40 lines), robust, works with any Ruby that has 'callcc', true drop-in replacement (no patching necessary). Cons: Probably less efficient (dependent on the run-time characteristics of the 'callcc' implementation).

    So as long as you don't want to create thousands of fibers per second, this is an alternative:
    http://www.khjk.org/log/2010/j...

  • Awesome! Keep up the good work!

    Any word from Japan on whether this could be accepted into 1.8.8?

  • Just curious, what are you using it in production for, and what were other alternatives?

  • awe

    awesome thanks tmm1
