time to bleed by Joe Damato

technical ramblings from a wanna-be unix dinosaur

Yo Dawg: Using a package management system to install a package management system

Consider the following scenario: You would like to run a common Linux distro (Debian Etch, CentOS/RHEL, whatever) for stability, the large community surrounding it, and maybe even for third-party support.

There’s a catch though.

You also want to easily use and deploy a small number of custom packages. Why? Maybe you want to apply a patch for a library, compiler, interpreter, or something else you use. Sure, you could build a .deb or .rpm, but there is a bit of a learning curve; is that learning curve worth it just so you can apply a handful of patches?

At Kickball Labs, we wanted to use the “stable” versions of packages that come bundled with Debian for the base system, but we also wanted to be able to use new packages that have features we are interested in. We decided to layer pacman on top of apt and install a small number of custom packages to a /custom directory on the filesystem. This enables us to use stable packages by default, but lets us override them when we feel it is necessary.
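
How the override actually takes effect is not spelled out here. One plausible sketch, assuming custom packages put their binaries in /custom/bin and their libraries in /custom/lib, is to place those directories ahead of the system ones in the search paths. The helper name prepend_dir is made up for this illustration.

```shell
# Sketch: let binaries and libraries under /custom win over the stock Debian ones.
# Assumption: custom packages install into /custom/bin and /custom/lib.

# prepend_dir DIR LIST -> prints LIST with DIR in front, unless DIR is already in it
prepend_dir() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;          # already present, leave untouched
    *)        printf '%s\n' "$1${2:+:$2}" ;; # DIR first; no colon if LIST is empty
  esac
}

PATH=$(prepend_dir /custom/bin "$PATH")
LD_LIBRARY_PATH=$(prepend_dir /custom/lib "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

With something like this in a shell profile, a command resolves to the copy in /custom/bin when a custom package provides it and falls back to the Debian binary otherwise.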

What sucks about RPM and APT (imho)

  1. Getting other people to use them – OK, so you’ve bought in to RPM or APT and you don’t mind reading all the docs and cuddling up with the man pages. But what about the rest of your team? Unless there is only one person constantly cranking out custom packages, everyone is going to have to learn RPM or APT. Do you really want to waste valuable engineer brain cycles reading and debugging busted packages when instead you could be writing code?
  2. Too much work to add 1 patch – Let’s say I want to add one patch to libX to fix a memory leak. Here’s what I have to do for a Debian package:
    1. Download and unpack the library source.
    2. Add a debian/ sub-directory.
    3. Create a changelog, control, and files file.
    4. Create a file with a list of the patches that are being applied.
    5. Drop in the patch.
    6. Test the package.

    Wow. Extremely painful. Especially for just one patch. Hell, you might even throw the deb away after if you decide you don’t like the patch.

  3. Source control – So you don’t mind the previous points. They don’t bother you all that much. But what about source control? How do you keep track of your Debian package files? You could keep an entire copy of the library’s source with your debian/ sub-directory in your git/svn/whatever. That kind of sucks, though. What if you got your source code from the git/svn of the project instead of via a tarball? Yeah, I guess you could put all that into source control too. You could also check your debian/ sub-dirs into a repository and then symlink them into the source for the library…. What a pain.

pacman and the almighty PKGBUILD

This is where pacman saves the day.

  1. pacman is simple – It doesn’t try to solve Global Warming. It just provides a dead simple set of command line switches for installing, removing, upgrading, and syncing packages. Not many options, but that is exactly what I want. You can just put a bunch of packages in a directory, point a webserver at it, and it’s a pacman package server.
  2. PKGBUILD files are simple – PKGBUILD files are just plain text files with a few fields. The fields are easy to understand and you can learn how to write your first PKGBUILD in 5 minutes.
  3. Easily use with source control – Since the actual PKGBUILD file is plain text, your source control system should be able to easily keep track of changes. You don’t need to check in all the source, either. You can just point the PKGBUILD at a URL and it will automagically run wget and unpack the source. You can include a source tarball if you really want to, of course.
  4. Quickly create a new PKGBUILD or add a patch to an existing one – To add a new patch to an existing PKGBUILD I just add the filename to the source= line, and add a patch -p N < file line and I'm done. If the PKGBUILD doesn't exist, I can easily create a new one because the file format is dead simple.
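
As a rough illustration of that last point, the sketch below writes a small PKGBUILD into a scratch directory. The package name libx, its version, the URL, and the patch file name fix-leak.patch are all invented for the example; only the overall shape (a source array plus a patch line inside build()) follows the description above.

```shell
# Sketch only: libx, 1.0, example.com and fix-leak.patch are hypothetical.
workdir=$(mktemp -d)

cat > "$workdir/PKGBUILD" <<'EOF'
pkgname=libx
pkgver=1.0
pkgrel=1
pkgdesc="libx with a memory leak fix applied"
arch=(i686 x86_64)
# Adding a patch, step 1: list the patch file next to the upstream tarball.
source=(http://example.com/libx-${pkgver}.tar.gz fix-leak.patch)
# A real PKGBUILD normally also carries checksums (md5sums=...) for each
# source entry; they are left out of this sketch.

build() {
  cd $startdir/src/$pkgname-$pkgver

  # Adding a patch, step 2: apply it before configuring.
  patch -p1 < $startdir/src/fix-leak.patch || return 1

  ./configure --prefix=/custom
  make || return 1
  make DESTDIR=$startdir/pkg install
}
EOF

echo "wrote $workdir/PKGBUILD"
```

Dropping the patch later means deleting those two lines again, which is a one-commit change in whatever source control you use.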

Getting it on Debian

This part is kind of weird. We want to get pacman on Debian. There isn't an apt package, so what now? Well, we can build a .deb file that installs pacman so we can use PKGBUILDs. Basically, we use a package management system to install a package management system.

There's gotta be a "Yo Dawg" in there somewhere.

Get it here and be sure to get its dependency (libdownload) here.
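
For the curious, a wrapper .deb can be as small as a staging directory plus a control file. The sketch below is not necessarily how the package linked above was built (the post does not say). It assumes pacman has already been compiled and that make DESTDIR=... install would populate the staging tree. The package name, version, architecture, maintainer, and description are all placeholders.

```shell
# Sketch: stage a minimal .deb that wraps pacman. All metadata values are placeholders.
stage=$(mktemp -d)/pacman-deb
mkdir -p "$stage/DEBIAN"
chmod 755 "$stage/DEBIAN"   # dpkg-deb rejects control directories with tighter permissions

# dpkg-deb needs a DEBIAN/control file with at least these fields.
cat > "$stage/DEBIAN/control" <<'EOF'
Package: pacman-package-manager
Version: 0.0.1
Architecture: i386
Maintainer: Example Maintainer <ops@example.com>
Description: pacman package manager, installed alongside apt
EOF

# In a real build, install the compiled pacman tree into the staging dir first:
#   make DESTDIR="$stage" install
# Then turn the staging dir into a .deb (only attempted if dpkg-deb is available).
if command -v dpkg-deb >/dev/null 2>&1; then
  dpkg-deb --build "$stage" "$stage.deb"
fi
```

The resulting file installs with dpkg -i like any other .deb, after which pacman takes over for everything under /custom.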

A look at some PKGBUILDs

Let's take a look at some PKGBUILDs that we use at Kickball Labs.

The first is a simple PKGBUILD for ltrace, a program like strace but for library calls. It just downloads the source, passes in some custom options to configure, builds the binary, and then installs to the package directory.

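pkgdesc="ltrace is a debugging program which runs a specified command until it exits"
# (pkgname, pkgver, source, and the remaining header fields are not shown here)

build() {
  cd $startdir/src/$pkgname-$pkgver

  # install under /custom instead of the system prefix
  ./configure --prefix=/custom --sysconfdir=/custom/etc
  make || return 1
  make DESTDIR=$startdir/pkg install
}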

Download it here.
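
Turning a PKGBUILD like this into an installed package is roughly a two-command affair. The sketch below assumes makepkg and pacman are on the PATH and that it runs in the directory holding the PKGBUILD. The .pkg.tar.gz glob is an assumption, since the package file name depends on the pacman version and its configuration. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
# Sketch: build a package from the PKGBUILD in the current directory, then install it.
DRY_RUN=${DRY_RUN:-1}

# Print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = 0 ]; then
    "$@"
  fi
}

build_and_install() {
  # makepkg reads ./PKGBUILD, fetches the sources and runs build().
  run makepkg || return 1
  # Install the resulting package file; the glob is an assumption about its name.
  for pkg in ./*.pkg.tar.gz; do
    run sudo pacman -U "$pkg" || return 1
  done
}

build_and_install
```

The wrapper around each command is only there so the sketch is safe to paste; on a real build box the two commands are all you need.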

This next PKGBUILD is a bit more intense. It is our PKGBUILD for Ruby, with a bunch of extra patches (fibers, ruby GC patches, and ruby thread bugfixes).

pkgdesc="An object-oriented language for quick and easy programming"
arch=(i686 x86_64)
source=(ftp://ftp.ruby-lang.org/pub/ruby/stable/ruby-${_pkgver}.tar.bz2 thread_timer.patch fibers.patch ruby-186-gc-new.patch dump_heap.patch)

options=('!emptydirs' 'force')

build() {
  sudo apt-get install libreadline5-dev zlib1g-dev libncurses5-dev libssl-dev libgdbm-dev libdb4.4-dev

  cd ${startdir}/src/${pkgname}-${_pkgver}

  patch -p1 < ${startdir}/src/fibers.patch || return 1
  patch -p0 < ${startdir}/src/thread_timer.patch || return 1
  patch -p1 < ${startdir}/src/ruby-186-gc-new.patch || return 1
  patch -p1 < ${startdir}/src/dump_heap.patch || return 1

  # include /custom in cflags/ldflags so extensions compile
  export CFLAGS="-I/custom/include -g3 -gdwarf-2 -ggdb -O0"
  export LDFLAGS="-L/custom/lib"
  export LIBS="-L/custom/lib -ltcmalloc_minimal"

  ./configure --prefix=/custom --enable-shared --disable-pthread
  make || return 1
  make DESTDIR=${startdir}/pkg install
}

Download it here.


Package management is painful. If you have any plans on building a service that scales to multiple machines, you had better have a good solution for creating and distributing packages. pacman is good for this because:

  1. It's easy to learn and use, encouraging you to make everything (from libraries to configuration files and more) a PKGBUILD.
  2. The simple plain text file format works great with your source control system of choice.
  3. Applied a patch you didn't like? Just roll the PKGBUILD file back in your source control system and rebuild the package.
  4. Create a package repository by just putting the tarballs generated from your PKGBUILD files in a directory and pointing a web server at it. This is great for bringing up new hardware in a datacenter - just install pacman, point it at your repository, and install your base package, which sets up all your passwd, hosts, and other config files.
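
A sketch of that last point, under some assumptions: the repository name custom, the web root /var/www/packages, the host packages.example.com, and the package name base-config are all hypothetical. One detail worth knowing is that pacman also expects a small database file next to the packages, which the repo-add tool that ships with pacman generates.

```shell
# Sketch: publish built packages and point a fresh machine at them.
# "custom", the paths and packages.example.com are hypothetical names.
repo_name=custom
repo_dir=${REPO_DIR:-/var/www/packages}

# On the build host: copy packages into the web root and refresh the database.
# The database file name must match the repository section name below.
publish() {
  cp ./*.pkg.tar.gz "$repo_dir"/ &&
  repo-add "$repo_dir/$repo_name.db.tar.gz" "$repo_dir"/*.pkg.tar.gz
}

# On a new machine: print the pacman.conf section that enables the repository.
repo_stanza() {
  printf '[%s]\nServer = http://packages.example.com/\n' "$repo_name"
}

repo_stanza
# After appending that section to pacman.conf:
#   pacman -Sy             # sync the package database
#   pacman -S base-config  # "base-config" stands in for your base package
```

Here publish is only defined, not called, so running the sketch just prints the config section.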

Written by Joe Damato

April 27th, 2009 at 12:21 am