time to bleed by Joe Damato

technical ramblings from a wanna-be unix dinosaur

I/O models: how you move your data matters


Above picture was shamelessly stolen from: http://computer-history.info/Page4.dir/pages/IBM.7030.Stretch.dir/

In this blog post I’m going to follow up on my threading models post (here) and talk about different types of I/O, how they work, and when you might want to consider using them. Much like threading models, I/O models come with terminology which can be confusing. The confusion leads to misconceptions which will hopefully be cleared up here.

Let’s start first by going over some operating system basics.

System Calls

A system call is a common interface which allows user applications and the operating system kernel to interact with one another. Some familiar functions are actually system calls: open(), read(), and write(). These ask the kernel to do I/O on behalf of the user process.

There is a cost associated with making system calls. In Linux, system calls are implemented via a software interrupt which causes a privilege level change in the processor – this switch from user to kernel mode is commonly called a mode switch (often loosely referred to as a context switch).

User applications typically execute at the most restricted privilege level available where interaction with I/O devices (and other stuff) is not allowed. As a result user applications use system calls to get the kernel to complete privileged I/O (and other) operations.
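For instance, here is a minimal sketch in C of asking the kernel to read a file (the path is just a placeholder):

```c
#include <fcntl.h>    /* open() */
#include <stdio.h>    /* perror(), printf() */
#include <unistd.h>   /* read(), close() */

int main(void)
{
    char buf[4096];

    /* Each of these calls traps into the kernel, which performs
     * the privileged work (talking to the device) on our behalf. */
    int fd = open("/etc/hostname", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    ssize_t n = read(fd, buf, sizeof(buf));
    if (n < 0) {
        perror("read");
        close(fd);
        return 1;
    }

    printf("read %zd bytes\n", n);
    close(fd);
    return 0;
}
```

Note that the read() above blocks: the process sleeps until the kernel has the data, which is exactly the first I/O model below.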

Synchronous blocking I/O

This is the most familiar and most common type of I/O out there. When an I/O operation is initiated in this model (maybe by calling a system call such as read(), write(), ioctl(), …), the user application making the system call is put into a waiting state by the kernel. The application sleeps until the I/O operation has completed (or has generated an error) at which point it is scheduled to run again. Data is transferred from the device to memory and possibly into another buffer for the user-land application.

Pros:

  • Easy to use and well understood
  • Ubiquitous

Cons:

  • Does not maximize I/O throughput
  • Causes all threads in a process to block if that process uses green threads

This method of I/O is very straightforward and simple to use, but it has many downsides. In a previous post about threading models, I mentioned that doing blocking I/O in a green thread causes all green threads to stop executing until the I/O operation has completed.

This happens because there is only one kernel context which can be scheduled, so that context is put into a waiting state in the kernel until the I/O has been copied to the user buffer and the process can run again.

Synchronous non-blocking I/O

This model of I/O is not very well known compared to other models. This is good because this model isn’t very useful.

In this model, a file descriptor is created via open(), but a flag is passed in (O_NONBLOCK on Linux and other POSIX systems) to tell the kernel: if data is not available immediately, do not put me to sleep. Instead let me know so I can go on with my life. I’ll try back later.
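A sketch of both routes – passing the flag at open() time, and flipping it later with fcntl(). The helper names here are mine, not a standard API:

```c
#include <fcntl.h>   /* open(), fcntl(), O_NONBLOCK */
#include <stdio.h>   /* perror() */

/* Open a descriptor that will never block in read(). The path
 * would typically be a FIFO, device, or similar. */
int open_nonblocking(const char *path)
{
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        perror("open");
    return fd;
}

/* A descriptor that is already open can be switched to
 * non-blocking mode after the fact with fcntl(). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```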

Pros:

  • If no I/O is available other work can be completed in the meantime
  • When I/O is available, it does not block the thread (even in models with green threads)

Cons:

  • Does not maximize I/O throughput for the application
  • Lots of system call overhead – constantly making system calls to see if I/O is ready
  • Can be high latency if I/O arrives and a system call is not made for a while

This model of I/O is typically very inefficient because the I/O system call made by the application may return EAGAIN or EWOULDBLOCK repeatedly. The application can either:

  • wait around for the data to be ready (repeatedly making the same I/O system call over and over), or
  • try to do other work for a bit, and retry the I/O system call later

At some point the I/O will either return an error or it will be able to complete.
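For illustration, the first option looks something like this busy-polling loop (read_spin is a made-up helper name):

```c
#include <errno.h>
#include <unistd.h>

/* Spin on read() until the kernel finally has data for us.
 * Every failed attempt is a full system call -- this is exactly
 * the overhead described above. */
ssize_t read_spin(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                  /* got data (or EOF) */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                 /* a real error */
        /* No data yet: we could do other work here before retrying. */
    }
}
```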

If this type of I/O is used in a system with green threads, the entire process is not blocked but the efficiency is very poor due to the constant polling with system calls from user-land. Each time a system call is invoked a privilege level change occurs on the processor and the execution state of the application has to be saved out to memory (or disk!) so that the kernel can execute.

Asynchronous blocking I/O

This model of I/O is much better known. In fact, this is how Ruby implements I/O for its green threads.

In this model, non-blocking file descriptors are created (similar to the previous model) and they are monitored by calling either select() or poll(). The system call to select()/poll() blocks the process (the process is put into a sleeping state in the kernel) and the system call returns when either an error has occurred or when the file descriptors are ready to be read from or written to.
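Here is a minimal sketch of watching two hypothetical descriptors with select(), including a timeout (timeouts are discussed further below):

```c
#include <stdio.h>
#include <sys/select.h>

/* Wait until fd1 or fd2 is readable, or give up after 5 seconds.
 * The calling thread sleeps inside select() until then. */
int wait_for_input(int fd1, int fd2)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd1, &readfds);
    FD_SET(fd2, &readfds);

    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };

    /* The first argument is the highest fd in the set, plus one. */
    int maxfd = (fd1 > fd2 ? fd1 : fd2) + 1;
    int ready = select(maxfd, &readfds, NULL, NULL, &tv);
    if (ready < 0) {
        perror("select");
        return -1;
    }
    if (ready == 0)
        return 0;   /* timeout: no I/O ready, go do other work */

    /* We must check every fd to see which ones are ready. */
    if (FD_ISSET(fd1, &readfds))
        printf("fd1 is readable\n");
    if (FD_ISSET(fd2, &readfds))
        printf("fd2 is readable\n");
    return ready;
}
```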

Pros:

  • When I/O is available, it does not block
  • Lots of I/O can be issued to execute in parallel
  • Notifications occur when one or more file descriptors are ready (helps to improve I/O throughput)

Cons:

  • Calling select(), poll(), or epoll_wait() blocks the calling thread (entire application if using green threads)
  • Lots of file descriptors for I/O means lots that have to be checked (can be avoided with epoll)

What is important to note here is that more than one file descriptor can be monitored and when select/poll returns, more than one of the file descriptors may be able to do non-blocking I/O. This is great because it increases the application’s I/O throughput by allowing many I/O operations to occur in parallel.

Of course there are two main drawbacks of using this model:

  • select()/poll() block – so if they are used in a system with green threads, all the threads are put to sleep while these system calls are executing.
  • You must check the entire set of file descriptors to determine which are ready. This can be bad if you have a lot of file descriptors, because you can potentially spend a lot of time checking file descriptors which aren’t ready (epoll fixes this problem).

This model is important for all you Ruby programmers out there — this is the type of I/O that Ruby uses internally. The calls to select cause Ruby to block while they are being executed.

There are some work-arounds though:

  • Timeouts – select() and poll() let you set timeouts so your app doesn’t have to sleep endlessly if there is no I/O to process – it can continue executing other code in the meantime. This is what Ruby does.
  • epoll (or kqueue on BSD) – epoll allows you to register a set of file descriptors you are interested in. You then make blocking epoll_wait() calls (they accept timeouts) which return only the file descriptors which are ready for I/O. This lets you avoid scanning through all your file descriptors every time.

At the very least you should set a timeout so that you can do other work if no I/O is ready. If possible though, use epoll().
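For example, a minimal epoll sketch (Linux-specific; the fd argument is assumed to be an already-open non-blocking descriptor):

```c
#include <stdio.h>
#include <sys/epoll.h>

#define MAX_EVENTS 64

/* Register a descriptor with epoll, then wait for readiness.
 * epoll_wait() returns only the descriptors that are actually
 * ready, so there is no need to scan the whole set. */
int epoll_example(int fd)
{
    int epfd = epoll_create1(0);
    if (epfd < 0) {
        perror("epoll_create1");
        return -1;
    }

    /* Tell the kernel which descriptor (and which events) we care about. */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        perror("epoll_ctl");
        return -1;
    }

    struct epoll_event events[MAX_EVENTS];

    /* Block until something is ready, or give up after 5000 ms. */
    int n = epoll_wait(epfd, events, MAX_EVENTS, 5000);
    for (int i = 0; i < n; i++)
        printf("fd %d is ready for reading\n", events[i].data.fd);

    return n;
}
```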

Asynchronous non-blocking I/O

This is probably the least widely known model of I/O out there. On Linux, this model is provided by the POSIX AIO interface (the aio_* functions in librt) and by the kernel-level libaio library.

In this I/O model, you can initiate I/O using aio_read(), aio_write(), and a few others. Before using these functions, you must set up a struct aiocb including fields which indicate how you’d like to get notifications and where the data can be read from or written to. Notifications can be delivered in a couple different ways:

  • Signal – a signal (such as SIGIO) is delivered to the process when the I/O has completed
  • Callback – a callback function is called when the I/O has completed
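Here is a toy sketch using the POSIX aio_* calls (on Linux these are provided by glibc/librt, so link with -lrt; the file path is a placeholder):

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static char buf[4096];
static struct aiocb cb;

/* Runs on a separate thread once the read has completed. */
static void read_done(union sigval sv)
{
    (void)sv;  /* unused */
    printf("read %zd bytes asynchronously\n", aio_return(&cb));
}

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY);   /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    /* The aiocb describes the whole operation: which fd, where the
     * data should go, and how we want to be notified. */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;
    cb.aio_sigevent.sigev_notify          = SIGEV_THREAD;
    cb.aio_sigevent.sigev_notify_function = read_done;

    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    /* We are free to do other work here while the kernel reads. */
    while (aio_error(&cb) == EINPROGRESS)
        usleep(1000);

    sleep(1);   /* toy program: give the callback thread time to print */
    close(fd);
    return 0;
}
```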

Pros:

  • Helps maximize I/O throughput by allowing lots of I/O to be issued in parallel
  • Allows the application to continue processing while I/O is executing, with a callback or POSIX signal delivered when done

Cons:

  • Wrapper for libaio may not exist for your programming environment
  • Network I/O may not be supported

This method of I/O is really awesome because it does not block the calling application and allows multiple I/O operations to be executed in parallel, which increases the I/O throughput of the application.

The downsides to using libaio are:

  • Wrapper may not exist for your favorite programming language.
  • Unclear whether libaio supports network I/O on all systems — may only support disk I/O. When this happens, the library falls back to using normal synchronous blocking I/O.

You should try out this I/O model if your programming environment supports it, and it either supports network I/O or you don’t need network I/O.

Conclusion

In conclusion, you should use synchronous blocking I/O when you are writing small apps which won’t see much traffic. For more intense applications, you should definitely use one of the two asynchronous models. If possible, avoid synchronous non-blocking I/O at all costs.

Remember that the goal is to increase I/O throughput to scale your application to withstand thousands of requests per second. Doing any sort of blocking I/O in your application can (depending on threading model) cause your entire application to block, increasing latency and slowing the user experience to a crawl.

Written by Joe Damato

October 27th, 2008 at 8:58 am

Posted in systems


  • Mike Ryan

    Synchronous blocking IO is GREAT when you have a lot of file descriptors to perform IO on. The key is to use a polling mechanism such as select or poll. You put all your FDs in non-blocking mode and poll a list of them using one of those system calls. Your application will then block until any one of them is ready to be read or written.

    This is the model used in many high-performance, high-throughput servers. At one point, it was even the default model used in Apache (still exists in mpm-event). I would hardly consider it something "to be avoided at all costs"!

  • joe

    @Steve -- that isn't asynchronous because you don't receive asynchronous notifications. Doing select()'s with a zero timeout is like synchronous non-blocking I/O.

    @mike -- ???? The model you describe is not synchronous blocking -- it is asynchronous non-blocking which I actually recommended.

    Thanks for reading guys.

  • Mike Ryan

    Wow, remind me to drink my coffee before replying next time. At 10 am no less..

  • Filipe Santos

    Great Round-Up!

  • Ryutlis Wang

    Thanks for this nice article.
