Arch Linux, systemd, and RAID.

May 13, 2016

Long, long-time readers of my blog might remember Leandra, the server that I've had running in my lab in one configuration or another since high school (10th grade, in point of fact). She's been through many different incarnations and has run on pretty much every x86 CPU ever made since the 80386. She's also run most of the major distributions of Linux out there, starting with Slackware and most recently running Arch Linux (all of the packages of Gentoo with none of the spending hours compiling everything under the sun or fighting with USE flags). It's also possible to get a full Linux install going with only the packages you need in a relatively small amount of disk space; my multimedia machine, for example, is only 2.7 gigabytes in size and Leandra as she stands right now has a relatively svelte 1.1 gigabytes of systemware.

However, Arch Linux was an early adopter of something called systemd, which aims to be a complete replacement for the traditional UNIX-like init system. It tries to manage dependencies between services, parallelize startup and shutdown of system features, automatically start and stop stuff, replace text-based system logs with a binary database, and all sorts of bleeding-edge stuff like that.

Some people love systemd. Some people hate systemd. Personally, I think it is, as my besainted grandmother would say, enough to piss off the Pope. That's not really what I'm writing about, though. What I'm writing about is a problem I ran into getting Leandra back up and running after building a fairly sizeable RAID array with logical volumes built on top of it.

Here's the situation: Leandra has a RAID-5 array for a couple of file structures that are going to get pretty large, on the order of multiple terabytes each. That RAID-5 has a physical volume inside of it, and a volume group inside of that (I'm breaking this down really finely for Linux users and not sysadmins). The volume group has a couple of logical volumes inside of it to hold the soon-to-be very full filesystems. So, here's what's supposed to happen:

  • Boot loader pulls the kernel and initramfs off of the disk.
  • Kernel comes up, unpacks the initramfs into memory, mounts it as the root filesystem, and runs a bunch of stuff inside of it to initialize the system.
  • RAID starts up.
  • LVM starts up, activating the logical volumes that hold the file systems that need to be mounted for the boot process to continue.
  • The kernel pivots to the real root partition, mounts the LVM file systems, and system services start up.
  • Leandra is online. Yay.
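
If you want to check each of those layers by hand once the machine is up, something along these lines should work. It's a minimal sketch: active_arrays is a helper name I made up for illustration, it reads /proc/mdstat-style text on standard input, and it doesn't handle arrays in the auto-read-only state (those carry an extra field on the line).

```shell
#!/bin/sh
# List the md arrays the kernel reports as active, along with their RAID level.
# Reads /proc/mdstat-formatted text on stdin so it can also be run on saved output.
# active_arrays is a hypothetical helper name, not part of mdadm.
active_arrays() {
    awk '$2 == ":" && $3 == "active" { print $1, $4 }'
}

# On a live system (the LVM tools need root):
#   active_arrays < /proc/mdstat   # RAID layer
#   pvs; vgs; lvs                  # LVM: physical volumes, volume groups, logical volumes
#   findmnt                        # what actually ended up mounted
```

If the RAID layer shows up but lvs comes back empty, the boot got stuck somewhere between the third and fourth steps.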

Here is what was happening:

  • ...
  • RAID starts up.
  • LVM doesn't start up. There is no output, and the boot process blithely continues.
  • The LVM file systems can't be mounted because they're not available.
  • Several dozen system services try to start up and time out 90 seconds later because the LVM file systems aren't there.
  • Error. Enter root password to drop to an emergency shell to figure out what the hell went wrong.
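
From the emergency shell, the journal for the current boot is the place to start looking. This is a rough sketch of how to narrow it down, assuming systemd's timeout messages contain the words "timed out" (the exact wording may differ between versions); timed_out_lines is a name I made up for illustration.

```shell
#!/bin/sh
# Keep only the journal lines that mention a timeout.
# Assumption: the messages of interest contain the phrase "timed out".
timed_out_lines() {
    grep -i 'timed out'
}

# From the emergency shell:
#   journalctl -b --no-pager | timed_out_lines
#   systemctl --failed      # units that ended up in the failed state
#   journalctl -b -p err    # only messages at priority "err" or worse
```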

After much trial and error, scratching my head, picking through debugging output (the output of journalctl -xb), searching the web, and reading forum posts and bug reports I figured out what was going wrong. Under ordinary circumstances, when you get to the mkinitcpio stage of installing the system you're supposed to select the elements of the HOOKS (discrete functional modules of the boot process) list you need and put them in the correct order. I was following the steps in the official documentation to build an "LVM on top of RAID" system, which are unfortunately incorrect due to the presence of systemd. Here's what was going sideways under the hood:

  • ...
  • RAID starts up because systemd activates it.
  • LVM does, in fact, start up the way it's supposed to because udev (the device manager) activates it.
  • You can't see it because the status output on the screen goes away when the display kicks into framebuffer mode (it changes from 80x25 VGA text mode to 1280x1024 high-resolution graphics mode that just happens to implement the console).
  • The kernel pivots out of the initramfs into the real root partition and starts systemd.
  • systemd re-runs the LVM service because it doesn't know that it's already online. That service is a prerequisite for all of the other system services systemd is in charge of.
  • The already-initialized LVM subsystem freaks out.
  • The boot process crashes to the "enter root password to drop to an emergency shell" part of the show.
  • The Doctor says several things that, were he several centuries younger, would have gotten his mouth washed out with soap by his grandmother.

The official documentation is incorrect because it's written from the perspective of a system that's still using the old-school init system and not systemd. The existing system service scripts and those inside of the initramfs are in conflict with one another. HOWEVER - here's how to fix it.

When you build the /boot/initramfs-linux*.img files (the initramfs I mentioned earlier) with the command mkinitcpio -p linux, you're building udev support into them unless you've tinkered with the HOOKS list in the file /etc/mkinitcpio.conf (not tinkering overmuch with it is generally a good idea). What you need to do is the following:

  • Put "dm_mod" and "raid456" into the MODULES list in /etc/mkinitcpio.conf. "dm_mod" is the device-mapper kernel module that LVM is built on. "raid456" is the kernel module for the RAID type you're using (I'm using RAID-5, you might use another).
  • Take "udev" out of the HOOKS list and replace it with "systemd". This is your solution!
  • Right after "block" in the HOOKS list, add "mdadm" to activate the RAID array.
  • Right after "mdadm" in the HOOKS list, add "sd-lvm2". This means "systemd, activate LVM" rather than letting udev do it.
  • Your HOOKS list should look something like this: HOOKS="base systemd autodetect modconf block mdadm sd-lvm2 filesystems keyboard fsck"
  • Save /etc/mkinitcpio.conf.
  • Rebuild the initramfs: mkinitcpio -p linux
  • Reboot. Sacrificing Windows NT 4 installation disks optional. Invoking the shade of Alan Turing is never a bad idea.
  • That should do it.
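
Before rebuilding, the ordering rules above can be sanity-checked with a few lines of shell. This is only a sketch of the rules as I've described them (no udev, systemd present, then block, mdadm, and sd-lvm2 back to back); check_hooks is a name I made up, and it assumes HOOKS is the plain space-separated string shown above.

```shell
#!/bin/sh
# Check a HOOKS string against the ordering described in this post:
# no "udev", "systemd" present, and "block mdadm sd-lvm2" in a row.
# check_hooks is a hypothetical helper, not part of mkinitcpio.
check_hooks() {
    i=0 sys=0 blk=0 md=0 lvm=0
    for h in $1; do                 # unquoted on purpose: split on spaces
        i=$((i + 1))
        case "$h" in
            udev)    echo "udev is still in HOOKS"; return 1 ;;
            systemd) sys=$i ;;
            block)   blk=$i ;;
            mdadm)   md=$i ;;
            sd-lvm2) lvm=$i ;;
        esac
    done
    if [ "$sys" -gt 0 ] && [ "$blk" -gt 0 ] &&
       [ "$md" -eq $((blk + 1)) ] && [ "$lvm" -eq $((md + 1)) ]; then
        echo "ok"
    else
        echo "expected: systemd ... block mdadm sd-lvm2"
        return 1
    fi
}

# Usage, assuming the string form of HOOKS:
#   . /etc/mkinitcpio.conf && check_hooks "$HOOKS"
```

It only checks the order of the hook names. It can't tell you whether the resulting initramfs will actually boot, so keep a rescue disk handy.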

If you want to see exactly what I did, I checked Leandra's basic configs into GitHub here: https://github.com/virtadpt/arch-boxen/tree/master/leandra/fakeroot

Good luck, and happy hacking.

Note: Yes, I know something's weird with my CSS and my <ul> lists aren't showing up correctly. I'll mess with it later.