I am excited! I have been working toward this goal for quite some time. As I am writing this, I am migrating data off my server’s old RAID 10 onto a single 14 TB USB hard drive. I should be finished and ready to physically remove all but one of the old 4 TB hard drives from my homelab server. Isn’t that awesome?!
I don’t know who this blog post is for. Sometimes I write things with an audience in mind, but this isn’t one of those times. I had some plans, I executed those plans, and I thought I should document that. I hope something in here is of use to you!
I can’t decide if I am talking about my NAS or just the large RAID storage
I am quite certain that when I chat about my NAS that I am also referring to the big, honkin’ stack of disks in a RAID, and not just the fact that the server shares files over my network. I am also aware that enough time has gone by that my quartet of 4-TB drives in a RAID 10 are no longer big nor honkin’.
I have had a RAID of one sort or another in my home since the very end of the twentieth century. In those days, individual hard disks just weren’t big enough to store all my data on a single file system, and we had nearly an entire decade in the middle of that time span where hard drives were poorly manufactured.
Hard drives have gotten reliable again, and disk sizes have outpaced the rate at which my data is growing. I don’t actually need a RAID to store the bulk of my data anymore.
I need a redundant array of inexpensive computers instead of just disks
I think it is safe to say that Tailscale was the tiny domino in front of the bigger choices that led me to the point I am at right now. Tailscale and fast Internet connections mean I can drop a tiny server anywhere in the world and treat it like it is on my local network.
Today I have just over six terabytes of data, and that is growing at a rate of around one terabyte each year. I figure I should have at least three copies of that data, and at least one of those copies should be in a different physical location. At least one of those copies should have some amount of history available just in case I accidentally delete an important file.
One of those copies of my data has been living on the RAID 10 array on the NAS virtual machine running on my homelab server, but those drives are getting full, and they are very old now.
tl;dr Let’s just list all the places where my data lives!
Hello. This is Pat from about six paragraphs in the future. I realized that I am going to use a lot of words explaining all the pieces of my backup and redundancy plan, and it might be prudent to just make a list of all the places where my data lives before I do that.
- Seafile on a Raspberry Pi
  - 14 TB of storage
  - off-site at Brian’s house
  - 90 days of history
- Homelab server
  - 14 TB of storage
  - I can bug out with the USB hard drive
  - opposite side of the house! (might survive a fire?!)
  - 90+ days of daily btrfs snapshots
- My workstation
  - 12 TB of storage
  - No snapshots
- My laptop
  - 1 TB of storage (not enough!)
  - Bulky data like video files aren’t synced here
Three full copies of 100% of my data. One copy is in a different location. Two copies have history. Backups are out of band.
A quick note about the Western Digital Easystore!
I am probably more excited about this than I should be, but the Western Digital Easystore USB hard drive that I bought has S.M.A.R.T. support!
This has been hit or miss for me over the years with USB enclosures and hard drives. Sometimes cheap USB enclosures work while expensive ones don’t. In my experience, though, the majority of USB drives don’t support S.M.A.R.T. at all. The 14 TB Seagate USB hard drive that I bought for my Seafile Raspberry Pi has no S.M.A.R.T. support.
I have no idea if the extremely similar Western Digital Elements drives work with S.M.A.R.T. I don’t even know that all or even most Western Digital Easystore USB drives support S.M.A.R.T. I only know that I have had luck with the one I just bought.
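If you want to check whether your own USB enclosure passes S.M.A.R.T. data through, smartmontools will tell you quickly. This is just a sketch; `/dev/sdX` is a placeholder for your actual drive, and some USB-to-SATA bridges only cooperate when you give smartctl an explicit SAT passthrough hint:

```shell
# Quick health check; replace /dev/sdX with your USB drive's device node.
sudo smartctl -H /dev/sdX

# Some USB-SATA bridges need the SAT passthrough hint to respond:
sudo smartctl -d sat -a /dev/sdX
```

If neither invocation gets you attribute data, your enclosure most likely doesn't pass S.M.A.R.T. through at all, just like my 14 TB Seagate.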
What’s involved in my storage syncing and backups?
The heart of my storage and backup plan is my Seafile server. That server is a Raspberry Pi 4 with a 14 TB USB hard drive, and it lives at Brian Moses’s house. The server is only accessible via my encrypted Tailscale network.
You can think of Seafile as self-hosted Dropbox. As I am writing these words, this Markdown file is synced up to my Seafile server at Brian’s house every time I hit save. Not long after that, the Seafile clients on my laptop and homelab server will download a copy of those changes. This takes about 30 seconds.
I have Seafile set to keep file-change history for 90 days. I could probably already pull six different versions of this blog post out of that history, and I have only written five paragraphs so far!
Any files that Seafile drops on the USB hard drive on the server will be snapshotted once each day.
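A daily read-only btrfs snapshot job can be as small as a script dropped into cron. This is a hedged sketch, not my actual script; the mount point and the `.snapshots` directory are illustrative names:

```shell
#!/bin/sh
# Take a dated, read-only snapshot of the Seafile data volume.
# /mnt/storage and its .snapshots directory are illustrative paths.
SRC=/mnt/storage
DEST=/mnt/storage/.snapshots/$(date +%Y-%m-%d)

# -r makes the snapshot read-only, so nothing can quietly modify it later.
btrfs subvolume snapshot -r "$SRC" "$DEST"
```

Drop something like that in `/etc/cron.daily/` and you get one snapshot per day. Pruning snapshots older than 90 days is a similar little script built around `btrfs subvolume delete`.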
Why did I plug a USB hard drive into my homelab server?
I haven’t decided how useful this is, but I think the concept is pretty neat!
I have a hole in my backup plan. I would really like to have an up-to-date copy of all my data on my laptop, but the 1 TB NVMe in my laptop is just too small. I keep saying that if there is a fire, I can grab my laptop on the way out of the house. It would be nice if I didn’t have to worry about pulling down 6 TB of video from Brian’s house right after a fire, wouldn’t it?
NOTE: I still have to drill a hole to run the wires to the network cupboard correctly. Only one of those computers is doing anything. I thought my homelab server looked lonely, so I brought in some old computers from the garage to sit on the table with him.
When I talk about a fire, I don’t literally mean a fire. Maybe the weather forecast is predicting massive flooding, and we should drive to Oklahoma. Maybe there is an earthquake and we notice terrifying cracks in the walls. Maybe there is a gas leak. Maybe a CIA agent shows up, and we have to take a glass of water to a UFO.
There are a lot of bad things that can happen where I wouldn’t have to get out of the house in seconds. Emergencies where I would have time to pack my laptop bag.
In those cases, I can just grab the USB hard drive and take it out the door with me!
I am keeping a RAID for the virtual machines on my homelab server
I set aside a 1 TB partition on my new 14 TB USB hard drive for storage of virtual machine disk images. Why 1 TB?!
If I don’t count my existing NAS virtual machine, my VM disk images add up to something not much more than 200 GB. That is just too big to comfortably fit on the old SATA SSDs that my homelab uses for booting and lvmcache. One full terabyte is plenty of room for these virtual machines to grow, and it will be easy to replace this volume with a $50 SSD if I have to.
I built a RAID 1 out of the 1 TB partition on the USB hard disk and one of the old 4 TB hard disks, then I moved all my KVM qcow2 images to that new 1 TB RAID 1.
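For anyone curious about the mechanics, building a RAID 1 out of two existing partitions is a short mdadm incantation. The device names here are placeholders for illustration, not my actual layout:

```shell
# Mirror a 1 TB partition on the USB disk (/dev/sdb1 here) with a
# 1 TB partition on one of the old 4 TB disks (/dev/sdc1 here).
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

# Watch the initial sync, then persist the array definition so it
# assembles automatically at boot (standard location on Debian).
cat /proc/mdstat
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
```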
I think this is super cool! I can walk away with that USB hard disk and my virtual machines will just keep running. Home Assistant will continue to run my home automation, and Octoprint will continue to run my Prusa MK3S.
The opposite will work. If the aging hardware in my homelab machine fails, I can install Debian on any other computer. Then all I have to do is plug the USB hard drive in, point the QEMU configuration to the right place, and I can boot Home Assistant and Octoprint on the new machine.
NOTE: I need to remember to set up a job to regularly back up my QEMU config files to the USB hard drive, or else it will be a pain to replicate all the machines on a new server!
I think it is really cool that I will be able to easily carry all my virtual machines away with me if I ever have to run out the door.
I thought of a new goal that I should keep in mind!
This is something I have been doing ever since I hauled the Seafile Pi to Brian’s house. I just never put the idea into words.
Faster is always nice, but I am going to do my best to make sure my storage, synchronization, and backups work well even if my network is limited to 100-megabit Ethernet.
Last year, my workstation just didn’t have enough storage to hold much video, so I would edit files stored on NFS over my Infiniband link. That was great, but Infiniband only works over extremely short distances or with fiber optics.
Installing a big disk in my workstation and putting it behind an lvmcache fixed that problem. I can accumulate terabytes of video, but the files I am currently working on will always be accessible at NVMe speeds.
My Raspberry Pi is stuck at 100 megabit for some reason. I tried quite a few cables, switch ports, and switches. Those are cables and ports that negotiate gigabit just fine with my other Raspberry Pi. My Internet connection at home is only 150 megabit, anyway, so this hasn’t been a big deal.
Just about the only time this causes any sort of issue is when we record podcasts. We generate a few dozen gigabytes of video files in two or more locations, and it takes an hour or three to get all those files uploaded or downloaded.
This only happens about twice a month, but it is rare that I am in a rush to open Davinci Resolve immediately after an interview. It is usually fine letting this wait a day.
How much did this cost? Is this a better value than using smaller disks in a RAID 5 or RAID 6?
Ugh! This is one of those situations where it is tough to make direct comparisons. It would have surely cost more money if I put two hard disks on the Pi, the homelab server, or both, but maybe three or four smaller drives in a RAID 5 could provide some redundancy without bringing up the cost by much.
More disks would require more SATA ports or more USB ports, and I am not terribly confident that sticking three or four USB disks in a RAID 5 would be stable. It would probably work, but mdadm might kick good drives out if they happen to respond too slowly.
You can get 14 TB USB hard drives for about $200, assuming you wait for a good deal. I think it is safe to say that even if we include tax, I paid less than $700 for my three hard drives.
I bought the Pi long enough ago that I got a good deal on it, so including it in the math would feel like cheating. I am just going to ignore the compute side of things and assume you already have some sort of server setup at home like I do.
I have the Seafile Pi hosted for free at Brian Moses’s house, and it is currently storing just under six terabytes of data. That would cost me $300 annually if I were using Google Drive or $360 with Dropbox, and I think I am about to be at the point where I would be charged for my third year with either of those services. Thank goodness I hosted my own file-sync service!
RAID is not a backup!
I always feel like I need to say this. RAID is there to reduce downtime or maybe increase performance. If one or maybe two drives fail, you can just replace them, and everything will be fine. That can save you hours of work. You won’t have to reinstall an operating system. You won’t have to restore from backup. You won’t have to reconfigure anything.
If your disk controller or its driver goes wonky, you might ruin the data on every disk in your RAID. That could take your data and every single one of your ZFS or btrfs snapshots with it. Snapshots are nice to have, and can be a vital part of a backup plan, but snapshots aren’t much of a backup on their own!
Earlier, I mentioned that my backups are out of band. That means my backups are done outside of normal channels. In my case, Seafile is copying data to and from the server via its own protocol.
If your backup destination shows up as a normal disk to your operating system, then it is potentially open to most of the same problems, accidents, and attacks as the data you are trying to back up. This is even worse if you leave that backup storage connected all the time. If some ransomware can encrypt and hijack your files, then it can do the same to the backups on your USB drive or mapped share.
You should have another layer in there to make sure you can’t lose your backup.
Did I get to the end of this weird sideways-upgrade project?
I am willing to answer that question in the affirmative! Three of the four disks from the old RAID 10 array have been removed from the server. All my virtual machines are booted from disk images stored on the 1 TB partition on the USB hard disk. That 1 TB partition is now in a RAID 1 array with a 1 TB partition on the youngest of the ancient 4 TB disks. That mdadm RAID 1 array is encrypted using LUKS.
The fresh NAS virtual machine is running Debian 11. There are no file shares on this NAS, so it probably isn’t really a network-attached storage, but the hostname implies that it is still a NAS! The remaining 12 TB of the USB drive is encrypted using LUKS and attached directly to this new NAS virtual machine. It now has a big btrfs file system with 99% of the contents of the old, retired NAS virtual machine.
I have a Seafile client running on the new NAS, and that client seems to be syncing every relevant Seafile library that should have a copy on the NAS.
My homelab server has always had a simple script that I run manually after boot. It unlocks the LUKS-encrypted storage, scans that storage for logical volumes, mounts everything that needs mounting, then fires up the virtual machines that live on that encrypted storage. That has all been updated to match the new disk layout.
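I haven’t published that script anywhere, but its shape looks roughly like this. Every name in here (the md device, the LUKS mapping, the volume group, the mount point, and the VM list) is a placeholder:

```shell
#!/bin/sh
# Unlock the encrypted storage, activate LVM, mount, and start the VMs.
# All device, VG, path, and VM names below are placeholders.
cryptsetup open /dev/md0 crypt-storage   # prompts for the passphrase

# Find and activate any logical volumes on the unlocked device.
vgscan
vgchange -ay vg0

mount /dev/vg0/vms /var/lib/libvirt/images

for vm in home-assistant octoprint nas; do
    virsh start "$vm"
done
```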
I have not set up automated daily btrfs snapshots. I will do this soon. I promise!
What is my next home network or server upgrade?!
It seems like I made a lot of changes in 2022! I upgraded my OpenWRT router to a much more capable piece of hardware, so I am now fully prepared to call the ISP and upgrade to symmetric gigabit fiber. I installed the latest OpenWRT on a couple of other routers, sprinkled them around the house, and set up 802.11r WiFi roaming.
I have done some work to get my aging AMD FX-8350 homelab server down under 2 kWh of power consumption per day. I probably just shaved a bit more off that by removing some hard drives, but I wouldn’t mind taking this further!
I have been watching my friends on Discord pick up tiny boxes like the Beelink SER5 5560U for $300 or the Beelink with an N5095 for $140. The Ryzen 5560U is a huge upgrade for me, and also extremely overkill. The N5095 would sip power but is comparable in speed to my overpowered dinosaur of an FX-8350, though my FX-8350 has four times the RAM of the $140 Beelink. That’s something a cheap RAM upgrade could fix, but the more a sideways move like this costs, the longer it will take to pay for itself.
What do you think? Should I downsize into a Beelink N5095 whether it is cost effective or not? I do enjoy the idea of seeing how much homelab and NAS can be crammed into a Star Wars lunch box, but I am also not excited about turning my ancient FX-8350 into e-waste for no real reason. Let me know what I should do in the comments, or stop by the Butter, What?! Discord server to chat with me about it!