Why do I care?
Threading models are often confusing; there are many different models with different trade-offs, and dissecting the details can be tough the first time around. Any large-scale project should consider which threading model(s) a programming language supports and what implications those models have for system design, so that the resulting software performs as well as possible.
Probably the biggest source of confusion surrounding threading models is the terminology used to describe the different components. I am going to try to explain the terminology that is, to my knowledge, the most commonly used.
This could be a blog post in and of itself, but let’s try to stay high level here. When I write “user-land” I am referring to the context in which normal applications run, such as a web-browser, or email client. When I write “kernel-land” I am referring to the context in which the kernel executes, typically a more privileged execution context that allows interaction with memory, I/O ports, process scheduling, and other funky stuff.
What is a process?
A process is a collection of various pieces of state for an executable that includes things such as virtual address space, per process flags, file descriptors, and more.
What is a thread?
A thread is just a collection of execution state for a program. Depending on the implementation this can include register values, execution stack, and more. Each process has at least one thread, the main thread. Some processes will create more threads. How new threads are created is where we begin considering the trade-offs.
Let’s look at some different threading models. I’m going to list the Pros and Cons first in case you don’t feel like reading the full explanation :) Let’s get started.
The 1:1 model, or one kernel thread for each user thread, is a very widespread model that is seen in many operating system implementations like Linux. It is sometimes referred to as “native threads.”
Pros:

- Threads can execute on different CPUs
- Threads do not block each other
- Shared memory

Cons:

- Setup overhead
- Linux kernel bug with lots of threads (read more here)
- Low limits on the number of threads which can be created
What does this mean? It means that each user thread (execution state in user-land) is paired with a kernel thread (execution state in kernel-land). The two commonly interact via system calls and signals. Since state exists in the kernel, the scheduler can schedule threads created in the 1:1 model across different CPUs to execute in parallel. A side effect of this is that if a thread executes a system call that blocks, the other threads in the process can be scheduled and executed in the meantime. In this model, different threads can share the same virtual address space, but care must be taken to synchronize access to the same memory regions. Unfortunately, the kernel has to be notified each time a new thread is created in user-land so that corresponding state can be created in the kernel. This setup cost is overhead that must be paid on every thread creation, and there is an upper bound on the number of threads and the amount of thread state the kernel can track before performance begins to degrade.
You may be familiar with libpthread and the function pthread_create. On Linux, this creates user and kernel state.
The 1:N model, or one kernel thread for N user threads, is a model that is commonly called “green threads” or “lightweight threads.”
Pros:

- Thread creation, execution, and cleanup are cheap
- Lots of threads can be created (10s of thousands or more)

Cons:

- Kernel scheduler doesn’t know about threads so they can’t be scheduled across CPUs or take advantage of SMP
- Blocking I/O operations can block all the green threads
In this model a process manages thread creation, termination, and scheduling completely on its own, without the help or knowledge of the kernel. The major upside of this model is that thread creation, termination, cleanup, and synchronization are extremely cheap, and it is possible to create huge numbers of threads in user-land. This model has several downsides, though. One of the major downsides is not being able to utilize the kernel’s scheduler. As a result, all the user-land threads execute on the same CPU and cannot take advantage of true parallel execution. One way to cope with this is to create multiple processes (perhaps via fork()) and then have the processes communicate with each other. A model like this begins to look very much like the M:N model described below.
MRI Ruby 1.8.7 has green threads. Early versions of Java also had green threads.
The M:N model, or M kernel threads for N user threads, is a model that is a hybrid of the previous two models.
Pros:

- Take advantage of multiple CPUs
- Not all threads are blocked by blocking system calls
- Cheap creation, execution, and cleanup

Cons:

- Need scheduler in userland and kernel to work with each other
- Green threads doing blocking I/O operations will block all other green threads sharing same kernel thread
- Difficult to write, maintain, and debug code
This hybrid model appears to be a best-of-both-worlds solution that includes all the advantages of 1:1 and 1:N threading without any of the downsides. Unfortunately, the cost of the downsides outweighs many of the advantages, to the extent that in many cases it isn’t worth building or using an M:N threading model. In general, making a user-land scheduler cooperate with a kernel scheduler makes programming in this model extremely difficult and error prone. Research comparing M:N and 1:1 threading was done for the Linux kernel to determine how its threading would evolve, and it showed the 1:1 model to be superior in general for Linux’s performance profile and use cases. On the other hand, in specific, well-understood problem domains, M:N may be the right choice.
Erlang has what many consider to be an M:N threading model. Prior to Solaris 9, Solaris supported an M:N threading model.
So which should I use?
Well, it is a tough call. You need to sit and think awhile about what your specific system needs and how intelligent your libraries are. In some implementations of the 1:N threading model, I/O operations are all abstracted away into a non-blocking I/O subsystem. If your library of choice does not (or cannot, due to language design) hook into this non-blocking I/O subsystem, it may block all your green threads, clobbering your performance.
You should strongly consider the threading model(s) supported by the programming languages and libraries you choose, because this decision will have an impact on your performance, application execution time, and I/O behavior.
Thanks for reading!