RAID Configuration on My Home Virtual Machine Server


I didn’t think the low-level details of my virtual machine server’s disk configuration were a terribly interesting topic, but I get asked about it quite often, so I figured that it might be time to document it!

NOTE: I’m planning to post a video on YouTube related to this post. I’m going to recreate my KVM server’s setup in a virtual machine. I’ll jabber about what I’m doing as I’m doing it, and I’ll be able to grab screenshots of the various steps. The device names won’t quite match up, but I can massage things to match my actual setup. I’ll be adding those screenshots to this blog when they become available!

NOTE: I recorded a rough draft of that YouTube video. I made lots of stupid mistakes, did things in the wrong order, and had to backtrack to make corrections. I took notes, though, and I’m ready to record a better video. I’m doing a bad job of actually getting that recorded, because I took those notes two months ago!

Choosing your RAID configuration

Writing this post got me thinking a little. Are folks having trouble trying to decide what sort of RAID configuration to run on their homelab or home NAS servers? I decided to write a whole mess of words about choosing a RAID configuration and posted it over at Butter, What?!

The article at Butter, What?! is rather long and detailed. You don’t need to read it. You should probably be running RAID-Z2 or RAID 6. If you think I’m wrong about that, go read the article and tell me why!

Where does the operating system live?

My server has a pair of 240 GB Samsung 850 EVO SATA SSDs. At the time, these were just about the smallest SSDs you could buy that also had good performance. That is much more space than my Ubuntu installation would ever require, but I knew I would be dedicating the majority of these SSDs to an lvm-cache. We’ll talk about that later.

I didn’t do anything remotely fancy at this point. I simply used Ubuntu Server’s install utility to create a 30 GB partition at the beginning of each drive, told the installer to use those partitions in a RAID 1 configuration, and had it create an LVM Volume Group (VG) on top of that RAID 1 array.
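
If you would rather build the same thing by hand instead of through the installer, it only takes a handful of commands. This is just a rough sketch; the device, VG, and LV names are stand-ins rather than my actual ones.

```
# Hypothetical 30 GB partitions on each SSD go into a RAID 1 mirror
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

# LVM Volume Group on top of the mirror, with a root volume for Ubuntu
pvcreate /dev/md0
vgcreate vg_os /dev/md0
lvcreate -L 25G -n root vg_os
mkfs.ext4 /dev/vg_os/root
```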

I didn’t have to use the LVM layer on the RAID device where my operating system is stored, but LVM gives me flexibility. If I do something stupid, and I fill up that 30 GB volume at some point in the future, it will be easy to expand it using LVM!
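
Growing an LVM-backed ext4 file system is nearly a one-liner, assuming the VG still has free extents. Using the same made-up names as above, it would look like this:

```
# Grow the root LV by 10 GB and resize the ext4 file system with it
lvextend --resizefs -L +10G /dev/vg_os/root
```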

My host operating system is not encrypted. I only encrypt the storage used for the virtual machines, and I don’t use any swap space.

When the server boots, I have to SSH in and manually run a short script that unlocks the rest of the storage, mounts anything that needs to be mounted, and fires up all my virtual machines. Someone has to enter the passphrase to unlock the LUKS encrypted devices, so I can’t start the virtual machines automatically.
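
I’m not going to paste my exact script here, but it doesn’t do anything more exotic than this sketch. Every device, VG, and VM name below is a placeholder.

```
#!/bin/bash
# Unlock the LUKS container on the RAID 10 (this is where the
# passphrase prompt happens) and activate its Volume Group
cryptsetup open /dev/md1 vm_crypt
vgchange -ay vg_vm

# Mount the file system that holds the disk images
mount /dev/vg_vm/images /var/lib/libvirt/images

# Fire up the virtual machines
for vm in nas www devbox; do
    virsh start "$vm"
done
```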

Why not use full-disk encryption for the host?

Full-disk encryption is a bit of a misnomer. If a machine is going to boot, something has to be left unencrypted. You can get tricky and use a removable USB disk as your boot device, but that introduces its own problems.

These days, it is possible to have GRUB load a kernel and initrd from an encrypted disk. Traditionally, a LUKS encrypted machine has an unencrypted kernel and initrd sitting on its boot partition.

In either case, I would need a way to unlock the disks, and I don’t have a monitor or keyboard plugged into this machine. There are some tricks you can play by loading the Dropbear SSH server in your initrd. That still leaves you with an unencrypted SSH server anyway, right?

Leaving the entirety of the host operating system bootable without unlocking the rest of the system doesn’t leave me much more vulnerable. This server is up and running 24/7. Disk encryption isn’t protecting me from remote attackers. My VMs are useless if they’re left encrypted, so if they’re running, they have access to the encrypted data.

Encryption is protecting me from physical theft. If someone breaks in and hauls away my hardware, whoever they sell the stolen goods to will not have access to my data. That’s 99% of what I’d be worried about.

What is the biggest flaw of this encryption strategy?

I’m probably leaking some sort of data on my KVM host. The config files for all my VMs are stored unencrypted, so if someone steals my hardware, they know the names of all my virtual machines. There are log files storing things like my SSH session information, so the thief can find out when I logged in.

It would be simple to disable logging and move all the libvirt configuration data to the encrypted file system.
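
One hedged way to handle the libvirt side would be to park /etc/libvirt on the encrypted file system and bind mount it back into place from the unlock script. This is a sketch of the idea, not something I’ve actually done on this server, and the paths are assumptions.

```
# One-time move of the libvirt configuration onto the encrypted volume
systemctl stop libvirtd
mv /etc/libvirt /var/lib/libvirt/images/etc-libvirt
mkdir /etc/libvirt

# The unlock script would then bind mount it back before starting libvirtd
mount --bind /var/lib/libvirt/images/etc-libvirt /etc/libvirt
systemctl start libvirtd
```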

If your desired level of paranoia requires you to hide this sort of information, you should just take the extra steps and encrypt as much data as possible. It is better to hide everything, because it is easy to accidentally let something leak!

Storage for the virtual machines

When I set this server up, I manually created a two-disk RAID 10 array. Yes, Linux’s MD RAID allows you to create a RAID 10 array with fewer than four disks. Converting a RAID 1 to a RAID 10 is a real pain in the neck, but adding additional disks to a small, existing RAID 10 is easy. Planning ahead is smart!
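
Creating a two-disk RAID 10 doesn’t require any special incantations. Something along these lines, with hypothetical device names, is all it takes:

```
# Two 4 TB drives in an md RAID 10; mdadm is happy with only two members
mdadm --create /dev/md1 --level=10 --raid-devices=2 /dev/sdc /dev/sdd
```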

I run a LUKS layer on top of the RAID 10 array. You could do it the other way around and put a LUKS device on each disk that makes up the RAID device, but then your server has to encrypt every block once for each disk it lands on. Since every block in my RAID 10 is written to two disks, running LUKS on top of the array instead of underneath it cuts the encryption work in half.
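
In other words, the LUKS container wraps the whole array rather than each individual disk. With the same made-up names, that layer looks something like this:

```
# Encrypt the array once, then unlock it to get a /dev/mapper device
cryptsetup luksFormat /dev/md1
cryptsetup open /dev/md1 vm_crypt
```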

I run LVM on top of that LUKS device. Some people prefer to use Logical Volumes (LV) as block devices for their KVM virtual machines. If I had any virtual machines that required optimal disk performance, I would do the same. My requirements are quite light, so the majority of this Volume Group (VG) is allocated as an ext4 partition mounted at /var/lib/libvirt/images.
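
Stacked up, and with placeholder names again, that part of the setup is roughly this:

```
# LVM on top of the unlocked LUKS device
pvcreate /dev/mapper/vm_crypt
vgcreate vg_vm /dev/mapper/vm_crypt

# Most of the VG becomes one big ext4 file system for disk images
lvcreate -l 90%FREE -n images vg_vm
mkfs.ext4 /dev/vg_vm/images
mount /dev/vg_vm/images /var/lib/libvirt/images
```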

I tend to use qcow images for all my virtual machines. They’re easy to back up. They’re easy to duplicate. They’re easy to shuffle around from my server to my desktop or laptop. They can also be deduplicated, though I don’t use that feature often anymore. For me, the convenience is worth the decreased disk performance.
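
Everything I do with those images boils down to qemu-img and plain old file copies. I’m assuming the qcow2 format here, and the file names are made up.

```
# Create a fresh 40 GB sparse image for a new virtual machine
qemu-img create -f qcow2 /var/lib/libvirt/images/testvm.qcow2 40G

# Backing up or duplicating a VM is just a copy while it is shut down
cp /var/lib/libvirt/images/nas.qcow2 /mnt/backup/nas-$(date +%F).qcow2
```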

Tell me more about this oddball implementation of RAID 10!

Linux’s MD RAID 10 implementation is designed to keep a copy of each block on two separate disks. Of course, you can configure it to keep more copies if you like, but we’re just going to talk about what it does when keeping two copies.

If your RAID 10 has two disks, this works exactly like a RAID 1 mirror. If your RAID 10 has four, six, or eight disks, then Linux’s RAID 10 functions exactly the way you’d expect a traditional RAID 10 to work.

It gets more interesting when you use an odd number of disks. I currently have three 4 TB 7200 RPM disks in my RAID 10. A copy of the first block of data is stored on disk 1 and disk 2. A copy of the second block is stored on disk 2 and disk 3. A copy of the third block is stored on disk 3 and disk 1.

This staggering of your data repeats until you reach the end of your disk.
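
This layout is also what made it painless to grow my array from two disks to three. I’m not copying these commands out of my shell history, so treat them as a sketch with hypothetical names; reshaping a RAID 10 also requires a reasonably modern kernel and mdadm.

```
# Add a third 4 TB drive and reshape the RAID 10 across all three
mdadm --add /dev/md1 /dev/sde
mdadm --grow /dev/md1 --raid-devices=3

# Once the reshape finishes, grow the layers stacked on top of the array
cryptsetup resize vm_crypt
pvresize /dev/mapper/vm_crypt
```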

Why RAID 10? Why not RAID 5 or RAID 6?

I am certain that I could dedicate an entire blog post to this section, but I’ll try to keep it down to a handful of paragraphs!

First of all, don’t use RAID 5. Once hard drives started pushing past a few hundred gigabytes in size, the odds of encountering a read error during a RAID rebuild quickly started to approach 100%. Also, failure rates are high enough that the odds of a second disk failing while an array is rebuilding make me uncomfortable. You should have two disks’ worth of redundancy.

RAID 10 performs better than RAID 5 or RAID 6. The differences in performance characteristics are more complicated than that, but I’m trying to keep this to a few paragraphs. I wasn’t too concerned with performance, especially since I was planning to use lvm-cache, but the performance advantage did help nudge me in this direction!

I was more concerned about my initial costs. My largest VM was going to be my NAS. I had only about 1 TB of data that needed to be stored there, and my RAW photo collection was only growing by 5 to 10 GB per month. A pair of 4 TB disks in a mirror would have me covered for a long time.

RAID 6 would give me more storage for my dollar over the long term, but I would have had to buy four drives right away, and that would have cost a lot more up front!

Odds are that the math will work out differently in your case!

lvm-cache

I had to stop writing when I got to this heading. My lvm-cache configuration is a bit off. We didn’t have lvm-cache when I set this up originally. We had to use dm-cache directly; lvm-cache is a convenience layer on top of dm-cache. lvm-cache requires the cache devices to be a part of the volume group that you want to cache, and I’m not set up to allow that!

I have one Volume Group (VG) built on my SSDs, and a second VG built on my RAID 10 array. I can’t easily shrink my SSD VG, so I cheated. I used a Logical Volume (LV) on the SSDs as another Physical Volume (PV) that I could add to the RAID 10’s VG. You have more information than I did when I built this, so you can avoid this workaround!
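
The workaround boils down to something like this, with made-up VG and LV names. Recent LVM releases won’t even scan LVs for PV signatures unless you flip a switch in lvm.conf, so you really are better off avoiding this.

```
# Carve an LV out of the SSD VG and hand it to the RAID 10's VG as a PV
lvcreate -L 180G -n cache_pv vg_ssd
pvcreate /dev/vg_ssd/cache_pv
vgextend vg_vm /dev/vg_ssd/cache_pv

# Newer LVM versions need scan_lvs = 1 in the devices section of
# /etc/lvm/lvm.conf before they will look for PVs on top of LVs
```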

I also forgot that I’ve had my lvm-cache disabled for almost an entire year! I recall having to turn it off around the time when we moved into our new house, but I don’t specifically remember why.

I just set my lvm-cache back up—that’s why I had to take a break from writing! I can’t properly explain how I set up my cache layer, because my setup is extra convoluted.

I’m also not prepared to tell you how much value there is in running lvm-cache. I remember being disappointed for a number of reasons, but that was over a year ago. I’m sure progress has been made, and I know the usage of my NAS VM has shifted quite a bit over the last year.

I’m going to let the cache populate itself, and I’ll report back in a few weeks!

What should I have done to make using lvm-cache easier?

I should have added the SSD RAID 1 and the spinning RAID 10 arrays to a single VG. I tend to dislike doing this. Putting different classes of storage in different VGs is a good mental abstraction, and it means you’re less likely to accidentally create an LV on the wrong grade of storage.

I don’t want to accidentally put a PostgreSQL server’s database files on a RAID 5 with its slow write speeds, and I don’t want to put my NAS on an expensive NVMe RAID 1! When both arrays live in the same VG, you have to explicitly state which Physical Volume (PV) you want to use when creating your LVs.
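
Here’s a simplified sketch of what that single-VG layout could look like. The names are hypothetical, /dev/md2 stands in for a RAID 1 built from the leftover SSD space, and vm_crypt is the unlocked LUKS device on the spinning RAID 10.

```
# One VG spanning both classes of storage
vgcreate vg_storage /dev/md2 /dev/mapper/vm_crypt

# Explicitly place the big data LV on the slow PV...
lvcreate -L 3T -n images vg_storage /dev/mapper/vm_crypt

# ...and the cache pool on the fast PV, then glue them together
lvcreate --type cache-pool -L 150G -n images_cache vg_storage /dev/md2
lvconvert --type cache --cachepool vg_storage/images_cache vg_storage/images
```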

Conclusion

I hope you found this interesting. My disk configuration may not be the right choice for you, but I hope that the description of my setup might make you think of some possibilities you hadn’t considered before!

I’ve had this setup running since 2015, and it has been faring quite well. It started with the pair of 4 TB disks in a RAID 10, and I added a third disk to that array last year. I don’t expect to make any major changes, and it won’t surprise me if I continue to use this setup for another four years!

I did my best to include the right amount of detail in this overview blog. Do you think I should dive deeper into any particular aspect of this setup? Do you have any questions? Leave a comment below, or stop by the Butter, What?! Discord server to chat with me about it!
