Deciding Not To Buy A Radeon Instinct Mi50 With The Help Of Vast.ai!

| Comments

I am quite grumpy about both cloud AI and local AI. I just can’t figure out where I want to land, and I don’t even want to choose just one or the other. I am happy to use the cloud where it makes sense, then use a local LLM or image generator where that might work out better. The trouble is that none of it makes sense!

I have been interested in the idea of grabbing a used server GPU on eBay for a long time. The 24 GB Tesla P40 used to be a great deal, but they increased in price from $150 to nearly $400 almost overnight. They still might be a reasonable deal at $400 if you really do need 24 GB of VRAM, but the even older 24 GB Tesla M40 can be had for less than $100.

Pat with a GPU

This is an AI-generated image of me holding an immitation of my MSI Radeon 6700 XT generated using OmniGen V1 at fal.ai. I tried to generate a picture of me with a Radeon Instinct Mi50, but that didn’t go so well!

My friend Brian Moses beat me to that, and his results were more than a little underwhelming. Don’t get me wrong here! What you get for $100 is fantastic, but my 12 GB Radeon 6700 XT in my desktop is significantly faster. I feel like I am better off dealing with the LLM startup times just so I can keep my performance.

Why is the 16 GB Radeon Instinct Mi50 interesting to me?

The Tesla GPUs are awesome because they run CUDA. That makes everything related to machin learning so much easier. Some software won’t even run with AMD’s ROCm, and AI software tends to run slower on ROCm even when the AMD’s hardware should be faster.

That means that there are things that just won’t currently run on an Instinct enterprise GPU, and getting the things that will work up and running will be more work. I am already running a ROCm GPU, so I am confident that the things that I already have working could be made to work easily enough on a server with a Radeon Instinct GPU.

I would love to have 24 GB of VRAM, but I am mostly happy being stuck with my current 12 GB, so the 16 GB Radeon Instinct ought to be comfortable enough. The Mi50 has a massive 1 terabyte per second of VRAM bandwidth. That is three times more bandwidth than my GPU or either of the reasonably priced used Nvidia Tesla GPU models. That should be awesome, because LLMs love memory bandwidth.

There are currently several 16 GB Instinct Mi50 GPUs listed on eBay for $135. They will require a custom cooling solution, but I have a 3D printer, so that shouldn’t be a problem. There are 32 GB Instinct Mi60 GPUs with slightly more performance starting at around $300. I think that is quite a reasonable price for 32 GB of VRAM, but I don’t even know if these Instinct GPUs will be suitable for my needs, so I have only been looking at the 16 GB cards.

Why is 16 GB enough VRAM for my needs?

I am using a 5-bit quantization of Gemma 2 9B to help me write this blog post. It is definitely up to the task. This model uses around 9 gigabytes of VRAM on ROCm with the context window bumped up to 12,800 tokens. Dropping down to a 4-bit quant and lowering the context window a bit would allow this model to fit comfortably on a GPU with 8 gigabytes of VRAM.

Older Stable Diffusion models run great on my current GPU, and my settings give me decent images at around 20 seconds each. I have to run Flux Schnell’s VAE on the CPU, and I can generate decent 4-step images in around 40 seconds. I believe I am using a 4-bit quant of Flux Schnell.

AI Guy holding a floppy disk in a datacenter

Squeezing all of Flux Schnell into VRAM would be a nice upgrade.

My Radeon 6700 XT has roughly 35% faster gaming performance than an Instinct Mi50 according to Tech Power Up’s rankings. My hope was that the Mi50 would process more tokens per second since LLMs are usually limited by memory bandwidth, and the Mi50 has boatloads of bandwidth.

I keep waffling back and forth!

I just couldn’t bring myself to order an Instinct Mi50.

I do hate that I have to remember to spin up the Oobabooga webui every time I work on a blog post, and I have to remember to shut it down before playing games. I am also not good at remembering that I even need to spin it up before I start writing, so I wind up having to wait an extra 20 seconds while I am already prepared to paste in my first query.

I would like to have a dedicated GPU with reasonable performance in my homelab. Then I could just hit a key from Emacs and have some magic show up in the buffer five seconds later. I also feel that I would do a better job at even bothering to set things up to automatically query the LLM if I knew the LLM would always be there.

Flux Schnell Waffle Guy

What’s wrong with the Instinct Mi50? It has no fans, yet it needs to be cooled. I would have to choose and buy a blower fan, and I would have to find or design a 3D-printed duct.

I don’t have any PCIe slots in my homelab any longer, because I have downsized to mini PCs. I can definitely plug my old FX-8350 server back in, but I am not confident that there is enough room for the length of an Instinct Mi50 with a blower fan clamped to the end. I can for sure make it work, but it might require swapping to a different case.

I have been on a repeating cycle in my brain for weeks. “I want an LLM available 24/7!” “The Instinct Mi50 is so cheap!” “I don’t want to do the work only to find out it is too slow!” “I want an LLM available 24/7!”

I managed to find a cloud GPU provider with a 16 GB Instinct Mi50 GPU available to rent. That meant that I could test my potential future homelab GPU without spending hours getting the hardware assembled and installed in my homelab.

Trying out the Radeon Pro VII at Vast.ai

I was disappointed at first, because I didn’t see any Instinct GPUs in the list of currently available machines. I only saw some goofy Radeon Pro VII. What the heck is that?

It turns out these are the workstation version of the Instinct Mi50. Same cores. Same VRAM. Same clock speeds. They are almost identical except for the cooler and the extra video output ports. While the Instinct cards rely on a rackmount server’s own airflow to keep them cool, the 16 GB Radeon Pro VII has its own fan. You can get these on eBay, but they cost more than a 32 GB Instinct Mi50. I don’t know about you, but I’d rather put in the effort to finagle a fan onto the 32 GB card than pay the same for a 16 GB card!

The Xeon Gold CPU in the available server is a pretty good match for my old FX-8350. They have similar single-core performance, while the Xeon has twice as much multi-core performance. That is because my old FX-8350 has half as many cores!

My Radeon Pro VII testing was disappointing

I used Vast.ai’s Ubuntu 22.04 image with ROCm 6.2 preinstalled. I cloned Oobabooga’s text-generation-webui repository, and I copied up the exact same Gemma 2 9B 5-bit model that I use locally.

Everything installed fine. My model loaded right up. I also saw that rocm-smi had me at 71% VRAM usage, and it was peaking at 236 watts during inference. Everything seemed to check out. My LLM was running on the GPU, and the GPU was running at full speed.

I pasted in a blog-editing session from my local Oobabooga install’s history, and I got the Radeon Pro VII into precisely the same state as my local machine. Then I had them both regenerate the conclusion section for my last blog post, since that is my most successful use case for my local LLM so far. It also uses as much of the context window as I am likely to ever need to use.

In fact, I clicked the regenerate button three or four times just to be sure the results were consistent.

Instinct Mi50 Gemma 2 9B results

My Radeon 6700 XT managed to crank out results at 14.75 tokens per second. The Radeon Pro VII’s best run managed 7.64 tokens per second.

I think it is fair to say that the Instinct Mi50 equivalent machine that I rented at Vast.ai was roughly half as fast as my own GPU. I tried poking around with the model settings, but I didn’t manage to coax any extra performance out of it.

Some posts in r/LocalLlama lead me to believe that I might have been able to push something closer to 40 tokens per second with an Instinct Mi50 GPU with a model like Gemma 2 9B. I had high GPU utilization. I was consuming the right amount of VRAM. It is possible something else is going wrong, and it might be something out of my control, but I don’t have a good guess as to what that could be.

I don’t think I have much use for a 24/7 LLM server today if it can only manage 7 tokens per second. I was already having trouble convincing myself that I needed to buy an Instinct GPU to try out. Even though I am concerned that this could just be a flaw in the cloud-GPU setup, this still just might be how fast this GPU will be. I don’t want to put in the effort with a used GPU just to wind up with an LLM that is too slow to be useful.

I have only nice things to say about Vast.ai so far

I have not tried any other cloud GPU services, and I definitely need to tell you that I didn’t do a ton of comparative research here. Vast.ai was on one of three or four price lists that I checked. They are at just about the lowest price point of any cloud GPU provider, but that means they are also one of the more manual options. They were saying nice things about Vast.ai all over Reddit, the prices are great, and most importantly, they had the GPU I was most interested in testing available.

How is Vast.ai more manual than some of their pricier competition? There are companies that will let you fire up a preconfigured LLM, Stable Diffusion, or Flux Schnell service in a matter of seconds. You have to bring your own Docker configuration to Vast.ai, though they do have a library of basic templates that you can choose from.

I added $5 to my account a few days ago, and I have $4.78 left over. It only cost me three cents to do my little benchmark of the Gemma 2 9B on the Radeon Pro VII.

I spent the other 19 cents messing around with the Forge webui and Flux Schnell on a middle-of-the-road 16 GB Nvidia A4000 GPU. Why I chose this GPU will probably be a good topic for another blog post, but telling you how things went will fit well in this blog post!

Flux Schnell Confused Robot

I was generating nice 4-step 1152x864 images at around 20 seconds per image. That’s half the time it takes to generate a 640x512 image on my Radeon 6700 XT. The A4000 didn’t go any faster at lower resolutions. That was a bit of a surprise but is also kind of neat! It is unfortunate that this is still miles away from sites like Mage.space that will generate similar Flux Schnell images in 3 seconds.

Spending 20 cents per hour to generate Flux Schnell images in 20 seconds seems like a good deal compared to Mage.space’s $8 per month subscription fee. The problem is that it took 15 minutes for my Forge Docker image to download Flux Schnell from Huggingface.com. This is mostly a problem with Huggingface, because I have the same complaint here at home.

You can avoid most of that 15-minute wait by storing your disk image at Vast.ai with Forge and Flux Schnell already installed, but that would cost almost $9 per month. Why pay $9 per month for slower image generation when you can pay Mage.space to manage everything for you at $8 per month?

This even more of a bummer since I discovered that Runware.ai added an image-generation playground to their API service. I can pop in there and generate all the Flux Schnell images that I need for a blog post in less time that it takes a server at Vast.ai to boot. Booting the machine at Vast.ai costs me a nickel. Spending the same time generating images at Runware.ai costs a fraction of that, and I don’t have to wait 15 minutes to get started.

I am excited that I bought some credits to use at Vast.ai. When I can’t get something cool to work locally, I can spend a few nickels or dimes messing around on someone else’s GPU. That is a cheap way to find out if it is something that I can’t live without having locally. I can always decide that I just have to buy a 24 GB Radeon 7900 XTX or Nvidia 4090 later, right?!

The costs are small

I am grumpy about all of this. The prices on everything short of a current-generation GPU with 24 GB of VRAM are almost inconsequentially small.

I can add sluggish but not glacially slow LLM hardware to my homelab for $135, and I can power that new hardware for $70 per year. I don’t know how you want to amortize those dollars out, but lets call it $17 per month for the first year and $6 per month for every year after. That is peanuts if you ignore the work of configuring and maintaining the software.

That is about what it would cost to pay for a service like [Mage.space][(https://mage.space)] for image generation, and they’ll do the work of installing new models and keeping the software up to date for you. I can hit OpenAI’s API as hard as I know how to, and it will only cost me pennies a month. Both of these services are faster than anything I can run locally even if I shelled out $1,800 on an Nvidia RTX 4090.

Flux Schnell Robot Counting Money

The cost of keeping a disk image ready to boot at Vast.ai would be about the same. There is a ton of flexibility available with this option, but they still can’t boot a machine instantly, because they have to copy my disk image across the globe to the correct server. It will still take two or three minutes at best for me to start generating images with Flux Schnell.

You can be flexible, cheap, and slow at home. You can give up instant 24/7 availability to be flexible, cheap, and fast at Vast.ai. Or you can give up flexibility to be fast and cheap with services like Mage.space and OpenAI’s API.

AI in the cloud is almost free when you pay by the image!

Everything in this blog post fit together really well when I started writing it, because a Mage.space subscription, storing a disk image at Vast.ai, or idling your own machine-learning server at home would all cost a bit less than $10.

I didn’t discover that Runware.ai implemented an image-generation playground until I was almost finished writing this entire blog post. Runware.ai didn’t have a playground a month ago!

This fits my needs so much better. I can generate 4-step 1024x1024 Flux Schnell images at Runware.ai in about 8 seconds, and I can log in and start typing the prompt for my first image in about 10 seconds. They will let me generate 166 Flux Schnell images for a dollar. All of this is fantastic, and it meets my personal needs so well.

I don’t think it is possible for me to use up more than 20 cents a month in OpenAI credits, and I would be surprised if I ever manage to use up an entire dollar’s worth of Runware.ai credits in any given month. Both of these services are five to ten times faster and cheaper than running local LLMs or local image generation.

Runware.ai’s playground isn’t as flexible as Mage.space. There are fewer models to choose from, and there is no way to automatically run a matrix of models, prompts, and scales like I can locally. It is tough to complain with Runware.ai’s price and performance.

I did some more searching just before publishing this blog when I wanted to try out OmniGen. I figured it’d be fun to combine a photo of myself with a GPU in an odd location, and that was fun! My search helped me discover that Fal.ai has much fancier Flux Schnell image generation features than Runware.ai. Fal.ai costs a bit more, but it is still tiny fractions of a penny per image.

I don’t know why I didn’t figure this out sooner. I remember when Brian was using Fal.ai to train a lora to generate weird images of himself!

What about sharing an LLM server with your friends?!

I don’t have any good plans here, but I thought the idea was worth mentioning. Tailscale would make it ridiculously easy to share an Oobabooga text-generation webui and a Forge image-generation webui with a bunch of friends. Splitting the energy costs of a small server and the price of a used Tesla M40 or Instinct Mi50 GPU would quickly approach the price of a cheap cup of gas-station coffee every month with only a handful of friends contributing.

Flux Schnell three GPUs

That handful of three or four friends could easily make a used 24 GB Nvidia 3090 in a shared server approach the cost of the cloud services, and that would be fast enough to be properly competitive with OpenAI or Runware.ai.

Just about the most interesting reason to run your AI stuff locally is to protect your privacy. Running on a GPU at your friend’s house may or may not be something you consider private. Is it better to be one of millions of queries passing through OpenAI’s servers, or would I prefer to have all my queries logged by Brian Moses?

Conclusion

I started writing this conclusion section an hour before I discovered that Runware.ai has a playground, and my original plan was to explain how it makes me grumpy that both local and cloud machine-learning services have roughly equal pros, cons, and costs. Now I am suddenly happy to learn that I can run both LLM queries AND generate images with cloud services faster than I ever could at home for literal pocket change.

I am still excited about local LLMs. They have gotten quite capable at sizes that will run reasonably fast on inexpensive hardware. There is a good chance I will still wind up buying an Instinct Mi50 to run a local LLM, but I am waiting until I actually NEED something running 24/7 here in my home.

I would love to replace my Google Assistant devices around the house with something that runs locally, and an in-house LLM that can respond in a few seconds would be a fantastic brain to live near the core of that sort of operation. I would want this to be tied into Home Assistant, so I certainly wouldn’t want it to rely on a working Internet connection.

What do you think? Are you running AI stuff locally or using cloud services? Do you keep flip-flopping back and forth on what you run in the cloud vs. what you run locally? Tell us what you prefer doing in the comments, or join the Butter, What?! Discord community and tell me all about your machine-learning shenanigans. I look forward to hearing about your ML adventures!

Should You Buy An Intel X540-T2 10-Gigabit Ethernet Card?

| Comments

I bought a couple of 2.5-gigabit Ethernet switches with 10-gigabit SFP+ ports, and if I wanted to be able to utilize the 10-gigabit connection from my home office to the other side of the house, then I needed to buy a 10-gigabit Ethernet NIC for my PC. I did absolutely zero research. I saw a 2-port 10gbe PCIe card on Aliexpress for $17, so I added it to my cart alongside the three Xicom 10-gigabit SFP+ copper modules that I was already ordering.

Stable Diffusion and/or Flux Generated AI Person

The Intel X540 card has been working well, but I have had two problems. I am going to tell you about my problems so that you can make an informed decision. The only thing that I learned while searching the Internet to see if my problems were common is that the general vibe is that these cards can be problematic. Not in any specific way. They just aren’t great cards.

I think it is important for me to say that I am currently having zero problems with my 10-gigabit Intel X540-T2 Ethernet card. It was a good gamble at $17, and I will buy another if I need more 10-gigabit cards in the near future.

Why are these network cards so cheap?

The cards on Aliexpress were originally intended for something not quite compatible with the PCIe standard. There is an extra connector in the back that has been cut off. You wouldn’t be able to put this card in a normal PCIe slot if that connector were still intact. I don’t know what sort of proprietary machine these were originally meant for.

My Intel X540T2 10-gigabit Ethernet Card

The part circled in red notes where the PCB was cut at the factory after manufacturing. You can find nearly indentical cards in an image search where there is another edge connector at this location.

People on the Internet would like you to believe that these cards are assembled from salvaged components from used servers. I don’t think that is likely, but I wouldn’t be surprised if they’re assembled from Intel X540 chips that didn’t make the cut after testing by one of the reputable manufacturers.

The Intel X540 also happens to be a rather outdated 10-gigabit Ethernet chipset. This means that it doesn’t support 2.5-gigabit or 5-gigabit Ethernet, and it uses quite a bit more power than something newer. My cable only had to run 10’ to reach the switch, so I was confident that I would get a stable connection at full speed. You might want to find something that supports 5-gigabit Ethernet if your cabling situation is sketchier than mine.

My unique and serious problem

I couldn’t find anyone else with this problem, and I am not even quite 50% certain what actually solved my problem. I don’t expect that you will encounter this problem, but I feel like I need to tell you about it.

When I first installed the X540-T2 in my workstation, I couldn’t get it to boot. My fans and lights would turn on for a second, then everything would shut off. I reseated the card, and everything was happy. At least, I thought everything was happy.

When I rebooted my computer a few days later to check a BIOS setting, it shut off and wouldn’t turn back on again. Reseating the card didn’t help this time, so I scrounged up an old computer and tried the Intel X540 card in there. It worked just fine. It worked over, and over, and over again.

The most likely cause of this symptom is inadequate power delivery. My 850-watt power supply’s fan was making a gentle ticking sound last month, so I decided to replace it with a quieter fan than it came with. I have been running an old 500-watt power supply in the mean-time, and since it has been working just fine with my CPU and GPU maxed out, I haven’t been in a hurry to swap the repaired PSU back in.

Stable Diffusion and/or Flux Generated AI Person

So I did the work to swap the correct power supply back into my case. I noticed while doing this that the metal bracket on the Intel card wasn’t lining up well with the back of the case. I thought maybe it was pushing the 8x PCIe card slightly out of alignment in the 16x slot, so I gave the bracket a slight bend, and it fit much better.

A misalignment would explain why the computer wouldn’t boot, but it wouldn’t explain why a reboot would fail. You would expect it to run into a problem while running for days.

I checked the 12- and 5-volt lines on a drive power cable of the old power supply with a multimeter, and they weren’t dipping at all when powering up the computer, so I am not terribly confident that swapping the power supply was the fix. Those were the only two rails I could easily test.

One of these two things definitely helped, because I have been rebooting and power cycling my computer over and over again without issue. I am sure someone with a keen eye will notice the missing screw on the PCIe bracket in one of the photos on this blog post. I can only assume that I took that photo while swapping out the half-height bracket that was installed at the factory. There have been two screws installed every time I have had the card installed!

These Intel X540 cards tend to overheat

At least, I think an overheating issue is what I have run into. Heat is the number one complaint from owners of these cards. I don’t care how hot my silicon runs, but it definitely needs to be running within the design specifications.

I noticed the other day that some of my pings in Glances were failing. I pulled up speednest.net, and the results weren’t consistent. Reddit wasn’t loading full pages of content successfully, and my iperf3 tests to one of my mini PCs were bouncing between 0, 500, and 2,200 megabits per second.

I temporarily fixed the problem in an odd way. I fired up enough openssl speed benchmarks to max out all my CPU cores to get the chassis fans choochin’. It didn’t take long before all my network problems just straightened themselves out.

Stable Diffusion and/or Flux Generated AI Person

I have yet to properly address this issue. It hasn’t happened again, but I plan to bump up my minimum chassis fan speeds next time I am in the BIOS. My fan speeds are set really low to keep the noise down in here when I am recording podcasts, but I don’t think an extra 5% or 15% will make an audible difference.

The fans are usually set to spin at the minimum possible RPM. This is enough to keep my Ryzen 5700X and my Radeon RX 6700 XT at or below 45C when idle. That just doesn’t seem to move enough air past that poor Intel NIC.

The $60 Intel X540 cards on Amazon seem to ship with much larger heat sinks, and some cards ship with a tiny fan. I would expect the large heat sink to be a nice upgrade, but I definitely wouldn’t install a card with one of those tiny fans. I am trying to keep my office quiet, and I would expect to hear a high RPM fan like that.

My ACTUAL solution to my overheating Intel X540T2

My 10-gigabit LAN connection started getting weird again. My small set of hosts that I track in glances was showing timeouts again, so I did an iperf3 test. It came back wit the same weird results where it was bouncing between zero, hundreds of megabits, or several gigabits per second.

I know that I explained in the previous section that I bumped up my minimum case fan speeds in an attempt to help with this problem, but I wasn’t so certain that I ACTUALLY went into the BIOS to make that change. I was about to do boot into the BIOS to make those tweaks, but I changed my mind. I decided to pull the Intel NIC, pop off the heat sink, and see how things were doing in there.

Smokeping monitoring my Intel X540-T2 NIC

The heat sink was getting really hot. That should have been a good sign. That means that heat was leaving the chip on the NIC and transferring into the heat sink. Even so, whatever material they used for thermal transfer was dry and cracking, and there wasn’t actually any thermal compound directly between the chip and the heat sink.

I slathered on way too much of my old but preferred thermal compound, which is a tube of dielectric grease from Autozone that I have been using since 2008. I will probably be using this stuff for the rest of my life unless I manage to lose this tube. I didn’t know how hard the springs press the heat sink to the chip, so I figured I should err on the side of too much thermal grease. That way there wouldn’t be any air gaps.

Dielectric Grease as Thermal Paste

I honestly didn’t expect this to work, because the heat sink was already pulling what felt like a ton of heat out of the NIC. I have been monitoring the connection with Smokeping for more than ten days now, and I haven’t seen a single missed packet. I guess the upgrade to some nice, soft thermal compound was just enough to make the difference!

I do wish that I was smart enough to add my workstation to my Smokeping server BEFORE I repasted the heat sink.

Alternatives to the Intel X540

There’s no shortage of used Mellanox 10-gigabit Ethernet cards on Amazon between $25 and $50, but they all have SFP+ ports. You will need to add a $20 transceiver to plug in your Cat 6 cables. I could write an entire blog post about retired enterprise 10-gigabit network equipment, but that is a bit out of scope this time. I am trying to make use of the existing cables in my house this time instead of running short 40-gigabit connections between nearby computers.

Old Mellanox cards are probably the best option if you need 10-gigabit Ethernet and don’t want to use the inexpensive Intel X540 cards from Aliexpress.

Ethernet megabytes/s Rough Equivalent
100 megabit 12.5 Slow
gigabit 100 500 GB laptop hard disk
2.5-gigabit 250 3.5” hard disk or older SATA SSD
5-gigabit 500 the fastest SATA SSDs
10-gigabit 1,000 the slowest NVMe drives
40-gigabit 4,000 mid-range NVMe drives

There are plenty of PCIe 5-gigabit Ethernet adapters on Aliexpress that use the RTL8126 chipset for about the same price as the older Intel X540 cards. The only thing disappointing about these cards seems to be the speed.

The Intel cards consume almost 20 watts of power, while the Realtek 5-gigabit cards claim to use less than 2 watts. That doesn’t seem like too big a deal when you only have one card, but I have mini PC servers in my homelab that use less than 10 watts.

Since the Realtek cards fit in a smaller 1x PCIe slot, they are easier to fit into your build than the bigger Intel X540 cards.

I have a working 10-gigabit Ethernet card, and I am not going to put in work for a downgrade, but I think the right choice for me would have been buying a 5-gigabit Realtek card instead. Those cards just sip power, and I don’t have a use for the extra 5 gigabits today outside of running iperf3 tests to make sure the 10-gigabit link between the switch in my office and the switch in my network cupboard is actually running at 10 gigabits per second!

The problem with 40-gigabit and 56-gigabit network cards

The 40-gigabit Infiniband cards I was using a few years ago were fantastic. Those same Mellanox cards could run either 10-gigabit Ethernet firmware or 40-gigabit Infiniband firmware, so they’re the same family of 10-gigabit Ethernet cards I was recommending in the previous section—except you can get 40 gigabits per second out of them when running Infiniband.

That sounds awesome, except it is extremely challenging to find enough fast PCIe lanes to max out even one of the pair of 40-gigabit ports on those cards. You’re going to be buying older enterprise hardware, so the maximum supported PCIe version of those cards will be a generation or two out of date. If you are anything like me, the servers in your homelab are often made up of old computers that you used to run at your desk.

Those older computers might have even older PCIe slots, and the newest machines on your Infiniband network might have a GPU in the only properly fast PCIe slot on the motherboard. The best I ever managed in my setup was 16 gigabits per second of PCIe bandwidth. That worked out to 12.8 gigabits per second via iperf3 on my IP-over-Infiniband link.

That’s only 30% faster than 10-gigabit Ethernet, and I had no easy way to extend that connection from one end of the house to the other. I can run plain old 10-gigabit Ethernet over the old Cat 5e cable in my attic.

My first 20-gigabit Infiniband setup was a fantastic deal in 2016. I was getting 8 gigabits per second over iperf, and my entire setup to connect two computers cost less than a single 10-gigabit Ethernet card.

10-gigabit Ethernet is priced much better in 2024.

Conclusion

In the end, the Intel X540-T2 might be tempting with its low price, but it might be a bit of a gamble. I suspect that you will almost definitely get lucky and have a perfectly stable card, but there is the possibility that you will spend days wrestling with issues. While I personally wouldn’t hesitate to grab another X540 for myself, I think you should consider something like the $17 Realtek 5-gigabit cards unless you absolutely require 10-gigabit speeds—just make sure the other end of your connection actually supports 2.5- and 5-gigabit Ethernet!

This journey into the world of 10-gigabit Ethernet has been a reminder that even seemingly simple hardware choices can come with unexpected challenges. It highlights the importance of thorough research, understanding the nuances of different technologies, but also not being afraid to experiment. After all, what’s the fun in building a homelab without a few learning curves along the way?

If you’ve had similar experiences with networking hardware, or if you have any insights to share about overcoming these challenges, please leave a comment below! I’d love to hear your thoughts and experiences. And for even more in-depth discussions about homelab gear and tech troubleshooting, join the Butter, What?! Discord community.

10-Gigabit Ethernet Connection Between MokerLink Switches Using Cat 5e Cable

| Comments

I ordered and installed a pair of MokerLink 2.5-gigabit Ethernet switches in my home about two weeks ago. I put an 8-port managed switch with one 10-gigabit SFP+ port in my network cupboard, and I put a 4-port unmanaged switch with two 10-gigabit SFP+ ports in my home office. I got that all running at 2.5-gigabit speeds immediately, but my intention was to set up a path from my workstation to the cupboard at 10 gigabits per second.

My MokerLink 4-port 2.5-gigabit ethernet switch in its 3D-printed stand

The MokerLink 2.5-gigabit switch sitting in its custom 3D-printed stand with all its 10-gigabit SFP+ ports running

The hardware to set up the required 10-gigabit Ethernet just wound up taking a little longer to arrive.

Let’s talk about the title of this blog!

Brian Moses and I ran Cat 5e cable to almost every room in this house in 2011. That was eight or nine years before I bought the place!

Both 2.5-gigabit and 5-gigabit Ethernet are designed to run over Cat 5e or Cat 6 cable with lengths up to 100 meters. 10-gigabit Ethernet over copper requires Cat 6A cable. I am not using any Cat 6A cable, but I also don’t have 328 feet of cable between my office and the network cupboard!

You can run 10-gigabit over pretty crummy cables if they’re short enough, but how short is short enough? I haven’t found any good data on this. I have mostly only read anecdotes, so I figured I should document my own anecdote here!

Some SFP+ modules only support 10-gigabit Ethernet. I decided to try these Xicom SFP+ modules because they support 2.5-gigabit and 5-gigabit Ethernet. The latter would still be an upgrade, and the 3-pack of modules from Aliexpress cost about half as much as three of the cheapest SFP+ modules I could find on Amazon.

I had good luck. My connection across the house negotiated at 10 gigabits per second, and I haven’t seen any errors accumulating on the counters of the managed switch.

Exactly what kind of cabling am I working with here?

My house is a little over 50 feet wide, and the network jack in my office is almost that far away from the cupboard. The cable has to go up 8 feet to get to the attic, then it has to come back down 4 feet to reach the patch panel in the network cupboard. It isn’t a perfectly straight line through the attic, and Brian and I definitely didn’t pull the cable taut.

Let’s just say there might be 70 feet of Cat 5e between the port on the wall in my office and the patch panel.

Brian's Network Cupboard Conduit

The tube that collects all the cables leading to the 48-port patch panel in my network cupboard

The cable from the patch panel to the 10-gigabit SFP+ module is an extremely short color-coded patch cable that I crimped myself. It is probably Cat 5e.

The 10-foot cable connecting my office’s switch to the wall has some historical significance for me! It is a yellow Cat 5 cable that almost definitely somehow made the journey here from the days when I worked for Commonwealth Telephone in Pennsylvania. Almost all our patch cables there were yellow, and this cable is from the days before the Cat 5e standard even existed.

What if I couldn’t get 10 gigabit out of these cables?

What would I personally have done? Me?! I would have swapped that old Cat 5 cable from the nineties between the 4-port switch and the wall, and if I was still stuck at 5 gigabits per second, I would have stopped there. That is more speed than I actually need, anyway!

What could you do if you REALLY want that faster connection? The easiest thing to do would be to try a different brand of SFP+ module. They are not all identical. Maybe you could borrow a module or two from friends to see if you have better results.

Old 3com switches at Commonwealth Telephone

The only relevant photo I could find. This is a stack of old 3com switches stacked on what was at one time my desk! Every one of these switches would have had 23 yellow patch cables plugged in.

I honestly expect that most people will be lucky enough to have 10-gigabit Ethernet work across old Cat 5 or Cat 5e wiring in their homes. I definitely don’t live in the biggest house around, but I also don’t live in a small house, and it is only a single story. That makes it a pretty wide house. Your longest cable run is likely to be shorter than mine if you are in a two-story house unless you’re approaching 5,000 square feet.

Maybe. I don’t know. I don’t live in your house!

I don’t think my MokerLink 8-port switch supports LACP, but I was able to bond four ports!

There are four 2.5-gigabit Ethernet ports on my router-style Intel Celeron N100 mini PC. I don’t really need to bond these ports together because a single 2.5-gigabit Ethernet port is about as fast as the 14-TB USB hard disk connected to my NAS virtual machine. Even so, trunking these ports made it easier to properly verify that I actually have 10 gigabits of bandwidth between my office and my network cupboard.

10-gigbit iperf3 test

The documentation for MokerLink’s 5-port managed switch has slightly different screenshots that what show up in my 8-port switch’s trunk configuration. The 5-port switch lets you choose whether a trunk group is set to static or LACP. I can put any four ports in a single trunk group, but I don’t have a choice as to what sort of group it is, and the link doesn’t work when I set Proxmox to use LACP.

The trunk works great with the Proxmox’s default setting of balance-rr for the bond0 device, and I am able to push data at 9.36 gigabits per second using iperf3 as long as I use at least four parallel connections. Sometimes I have bad luck, and two or more of those parallel connections get stuck on the same 2.5-gigabit interface, but bumping it up to six or eight parallel connections almost always breaks 9 gigabits per second.

That is part of the bummer about channel bonding. A single TCP connection will max out at the speed of just one network interface. You need multiple connections to fully utilize all the ports in the group, and they don’t always wind up attaching to the ideal interfaces.

This isn’t a problem in situations where you would use bonded network interfaces in a business environment, because you probably have hundreds or thousands of clients sharing those two, three, or four network ports. When I want more speed at home, it is always between two clients.

Troubles with my Intel X540-T2 PCIe card from Aliexpress

I could easily burn through 2,500 words explaining the problems that I have had with this 10-gigabit Ethernet card in great detail. I will probably dedicate an entire blog post to this card when I am confident that everything is going smoothly, but I will try to hit the important bits here.

I did not carefully choose this card. I had the three pack of Xicom SFP+ modules in my cart when I searched for 10-gigabit PCIe cards. I saw a dual-port card with an Intel chipset for $16 and almost immediately added it to my cart.

Stable Diffusion or Flux Man Installing a Network Card

When I installed the card, my PC wouldn’t power on. I pulled the card, plugged it back in, and it booted just fine.

Then I noticed that I couldn’t get more than 6 gigabits per second out of the card, and dmesg said that the card was using two PCIe lanes. I wanted to reboot to check the BIOS, but the machine shut down instead and wouldn’t power back up. Many attempts at reinstalling the PCIe card failed to improve the situation.

I set the BIOS to 4x instead of auto for that slot, and I dug out an old computer for testing. The Intel card from Aliexpress worked just fine over and over again in that machine, so I gave it another shot in my PC. It worked on the first try now, it is using four PCIe lanes, and I can reach over 9 gigabits per second.

Will it work when I reboot next week? I did get the Intel X540-T2 card working just fine, but I am somewhat less than 50% certain what my problem actually was!

Should I have bought a different 10-gigabit PCIe card?

For my situation, I don’t think it makes much sense to spend more than $20 on a NIC for my workstation. There are no other 10-gigabit Ethernet hosts on my network. I can max out the 2.5-gigabit Ethernet ports on three devices and still hit 900 megabits per second in both directions at speedtest.net. That’s pretty neat, but not terribly useful.

This older Intel NIC is pretty neat at $16. That’s about what I would have to pay for a 2.5-gigabit Ethernet PCIe card anyway, so it felt like a no-brainer to give it a try. The Intel X540 is old enough that it predates the 2.5-gigabit and 5-gigabit Ethernet standards, so this card will only work at 1-gigabit or 10-gigabit and not anything in between.

My Network Cupboard

I still have the 8-port switch dangling in the network cupboard. I need to set aside some time to unscrew one of the old 1-gigabit switches and get that thing mounted correctly!

The next step up would be more legitimate-looking Intel X540 cards on Amazon for $60. After trying this one, I am not excited about paying so much for another 17-watt NIC. Prices only go up from there.

Conclusion

I think it is definitely time to stop buying 1-gigabit Ethernet hardware. You can get 2.5-gigabit switches for quite a bit less than $10 per port now. So much hardware is shipping standard with built-in 2.5-gigabit network adapters now, and you can sneakily upgrade some of your older machines with $7 2.5-gigabit USB dongles.

The 10-gigabit Ethernet stuff is fun, but it is both more AND less persnickety than I anticipated. I expected to have trouble with the inadequate wiring in the attic. I absolutely did not expect to have weird issues with an Intel network card, even if that network card might be made from discarded or unused enterprise network gear from a decade ago.

The 10-gigabit link between switches on opposite sides of my house is fantastic. This has turned this pair of switches into a single switch for all practical purposes, so it doesn’t matter which room my mini PC servers call home. They will always have a full 2.5-gigabit connection directly to any other server no matter how many of them are moving data at the same time.

I would very much love to hear from you in the comments! Are you running 10-gigabit Ethernet at home across older wiring? How old is that wiring, and how long do you think the runs are? What make and model of SFP+ module do you have on each end? It would be awesome if we could figure out which gear is working for people, and also what sorts of wiring they have managed to get a good connection on. If you are interested in chatting about DIY NAS and homelab shenanigans, then you should considering joining our Discord community. We are always chatting about servers, network gear, and all sorts of other geeky topics!

Upgrading My Home Network With MokerLink 2.5-Gigabit Switches

| Comments

I am changing course again. I used to have a small 40-gigabit Infiniband network segment in my home office. This was necessary at the time because I stored video files on my NAS that I needed access to on my workstation, and video editing is a lot smoother when those files feel just as fast as if they were sitting on a SATA SSD installed in my workstation. It was also quite inexpensive and exceedingly cool.

I eventually decided to move my NAS and homelab server out of my office to cut down on the fan and hard disk noise in here. I didn’t have a good way to extend my 40-gigabit connection all the way to the other side of the house, so I wound up installing a 12-terabyte hard disk and a 1-TB SSD for lvmcache in my workstation. That let me keep a synced copy of my video files on my workstation, my NAS on the other side of the house, and my off-site server at Brian Moses’s house.

My new MokerLink 2.5-gigabit Ethernet Switch

My home office’s new 4-port Mokerlink 2.5-gigabit switch with two 10-gigabit SFP+ ports in its new 3D-printed home

While I was setting that up, I made a sort of informal decision that I would attempt to keep my file storage setup configured in such a way that I would never need more than a 100-megabit Ethernet connection. This was a good fit for the symmetric 150-megabit Internet connection I had at the time.

Things have changed since then. I have upgraded our home FiOS internet service to symmetric gigabit. Brian upgraded his service to symmetric two gigabit. I also swapped out my old homelab server and NAS for a mini PC with four 2.5-gigabit Ethernet ports. I knew it would only be a matter of time before I added a second device with 2.5-gigabit Ethernet ports.

That day has already passed, and I have enough USB 2.5-gigabit Ethernet dongles to upgrade two more machines. It seems like it is about time to upgrade my network cupboard to 2.5-gigabit Ethernet, and it also seems like it would be fun to connect my home office and my workstation back to the cupboard using 10-gigabit Ethernet.

Here’s the tl;dr

Everything is working great. It took less than five minutes to swap out the switch in my office, and I was immediately seeing better than 2.3 gigabits per second between my desktop PC and my office’s Proxmox server via iperf3.

It didn’t take much longer to temporarily stash the 8-port managed MokerLink switch into my network cupboard. The web interface is basic but capable, and iperf3 is running just as fast in both directions between devices connected anywhere in my home.

The real summary is that the MokerLink 2.5-gigabit switches are exactly what they are supposed to be: boring! I plugged them in. They work. They are faster than the switches they replaced.

How fast is 2.5-gigabit Ethernet? Is it fast enough?

The speed of 2.5-gigabit Ethernet is just math, but math is boring, and it may not help us understand what these speeds are equivalent to. You want to know if the network is going to be your bottleneck, and you’d like to figure out if you could use the upgrade.

I put together a little table that describes what each Ethernet standard’s maximum performance is roughly equivalent to. The equivalents aren’t exact, but they’re reasonably close.

Ethernet megabytes/s Rough Equivalent
100 megabit 12.5 Slow
gigabit 100 500 GB laptop hard disk
2.5-gigabit 250 3.5” hard disk or older SATA SSD
5-gigabit 500 the fastest SATA SSDs
10-gigabit 1,000 the slowest NVMe drives
40-gigabit 4,000 mid-range NVMe drives

What is wrong with these rough equivalents? Hard disks don’t have a steady throughput. They are faster at the outer edge of the platter and slow down as they approach the center. The largest modern 7200-RPM disks can reach a little over 300 megabytes per second on the fast end, but they’ll drop down to something a little below 150 megabytes per second on the slow end.

Your 10-gigabit Ethernet network also has a lot more latency than your local NVMe drive. While it is no trouble for a mid-range NVMe to manage 500,000 input/output operations per second (IOPS), introducing a fraction of a millisecond and an extra protocol like CIFS or NFS will severely impact your IOPS when accessing a remote NVMe drive over 10-gigabit or 2.5-gigabit Ethernet.

Upgrading to 2.5-gigabit Ethernet is kind of a no-brainer these days. We are at or below $10 per switch port, USB 2.5-gigabit Ethernet dongles are under $10, and PCIe cards are in that ballpark as well. Not only that, but both 2.5-gigabit and 5-gigabit Ethernet will almost definitely work with the cables that are successfully carrying gigabit Ethernet signals.

I can tell you that one switch for each end of my house, three 10-gigabit SFP+ modules, a two-port 10-gigabit Ethernet card for my workstation, and a couple of USB 2.5-gigabit Ethernet adapters cost less than $200.

Is 10-gigabit Ethernet worth the hassle?

This is a fantastic question! 10-gigabit Ethernet pricing is getting pretty good, but the price of the switches and network cards isn’t the only thing to consider.

Both 10-gigabit Ethernet switches and network interface cards tend to use a lot more electricity, so they also generate a lot more heat. The protocol is also much pickier about the cabling that it will manage to run over. Some people have success on shorter runs over Cat 5e cable. You’re only supposed to need Cat 6 cable, but some people have trouble even then, and they have to try multiple SFP+ transceivers in the hopes that they will get a good connection.

Stable Diffusion or Flux Schnell Network Man

I wouldn’t even be worrying about connecting my office to my homelab with 10-gigabit Ethernet if I weren’t writing a blog. I don’t need that extra performance, but it will be fun to tell you whether things work or not!

I helped Brian Moses run Cat 5e to every room in this house almost a decade before I bought the house from him. I know that there is somewhere around 50’ to maybe 60’ of Cat 5e cable between my home office and the switches in my network cupboard.

That is short enough that I may have success over Cat 5e. The RJ-45 transceivers I picked out also support 2.5-gigabit and 5-gigabit Ethernet, so it will still be a worthwhile upgrade even if I only manage half the speed.

UPDATE: The trio of Xicom SFP+ modules arrived, and they are working great. The connection to the other side of the house is running from a 10-gigabit port on my office switch to the wall via a Cat 5 cable from the nineties, then through the attic on around 40 to 50 feet of Cat 5e cable Brian and I pulled in 2011, then connected from the patch panel in the network cupboard to the 10-gigabit port on the switch. It connected at 10 gigabits per second, and I managed to run enough simultaneous iperf3 tests over various 2.5-gigabit and 1-gigabit Ethernet connections to push around 6 gigabits per second over that link.

When are you going to tell us what you ACTUALLY BOUGHT?!?!

MokerLink has three sizes of switch that I was considering, and each size is available in a managed or unmanaged variety. I waffled a lot on whether I was going to pay the extra $20 to put a managed switch in my network cupboard.

The number one reason on most people’s list for using a managed switch is to set up secure VLANs for their skeevy, untrusted Internet of Things hardware and oddball IP cameras with sketchy firmware that might phone home, and they don’t want that gear with janky firmware being used as a foothold to hack their important computers.

I am not worried about that in the slightest. I don’t run anything important outside of my Tailscale network. As far as I am concerned, my private and protected VLAN is my Tailnet. That is all I need, but I bought a managed MokerLink switch for my cupboard anyway. I thought $20 was a small price to pay to let you know how I feel about MokerLink’s managed switches.

The three sizes on my radar were the 8-port and 5-port 2.5-gigabit Ethernet switches, each with a single 10-gigabit SFP+ port, and the 4-port model with a pair of SFP+ ports. I thought that I wanted two SFP+ ports in the cupboard so that future Pat would have more options available, but I realized that I could probably connect every in-use port in the house to a single 8-port switch. Math also told me that four 2.5-gigabit ports in the cupboard wasn’t going to be enough.

I put a MokerLink 8-port managed 2.5-gigabit switch in the cupboard. Then I placed a 4-port unmanaged switch with a pair of SFP+ ports in my home office.

I only need a single 10-gigabit port in the cupboard to connect to my office. When they day comes that I need a second port, then it will definitely be time to pick up a dedicated 10-gigabit switch!

Fanless switches are awesome!

We put a pair of small, inexpensive, fanless gigabit Ethernet switches in Brian’s network cupboard in 2011. They are still running in my network cupboard today. Aside from power outages, they have been running 24/7 for 13 years.

They don’t use much power. They don’t generate much heat. Even more important, though, is that they don’t have any moving parts.

Moving parts like fans will eventually fail. It was important to me that my new switches have no moving parts, and these two MokerLink switches have no fans. There is a very good chance that both of these switches will still be in service in my house in another 13 years.

The network is the road that all of my computers have to travel on. I don’t want that road to be exciting, fancy, or even interesting. I want it to work, work well, and continue to function almost indefinitely.

Why MokerLink?

The homelab community in general seems to be pleased with MokerLink’s hardware, and I have friends who are already using some of their gear. The prices are pretty good, too!

I was extremely tempted to try one of the more generic-looking 2.5-gigabit switches. There are more than a few that look physically identical and have the same specs as MokerLink’s offerings, and those switches are 30% to 40% cheaper than MokerLink’s devices. The price gets even lower if you shop at Aliexpress.

I know that 40% sounds like a lot, but with these small switches, that winds up being only $25 or so.

My Network Cupboard

The 8-port MokerLink 2.5-gigabit switch is just hanging by its cables because I can’t install it until the 10-gigabit modules arrive!

MokerLink doesn’t have an 8-port 2.5-gigabit switch with TWO SFP+ ports. I did see some rather sketchy-looking and quite inexpensive 8-port switches that do, but I was weirded out by their odd faux-VLAN toggle button on the front of their hardware. This option seems similar to WiFi client isolation. I was tempted by the extra SFP+ port, but extremely turned off by the odd VLAN option.

I am a dopefish and have a poor memory!

The back corner of where my two desks meet is REALLY far back there. I can’t reach that far unless I stand up and bend way over. It is a reasonable place to keep my office’s network switch. I can see if things are working, and I can reach back there if I have to when I am setting up a new laptop or server.

I designed a custom bracket to hold my old gigabit switch, and that bracket helps wrangle the patch cables and keeps everything tidy. I figured I should print a new bracket for my smaller MokerLink switch.

When I was working on that, I noticed that I was already out of switch ports, and I was kicking myself for not buying an 8-port switch for my office. I wound up redesigning the bracket to hold both the new MokerLink switch and my old gigabit switch.

The bracket is awesome, and I will leave it installed, but I didn’t realize until after I was finished installing it that I won’t need both switches.

I completely forgot that I would soon be using the two SFP+ ports! I’ll be moving the uplink cable from the wall, and I’ll be moving my workstation to the other. I will have two ports free, and that is more than enough for testing things at my desk.

I also remembered why I didn’t order an 8-port switch for my office—MokerLink doesn’t even make a fanless 8-port switch with TWO SFP+ ports!

How is everything working out?

You have already read more than 1,500 words, and I haven’t even told you how the new hardware is working out so far. I will have to stick a tl;dr up at the top!

Everything is working great. I am getting slightly better than 2.3 gigabits per second via iperf3 in either direction. This is true between devices connected to my office switch, and between devices plugged into my office switch to and from devices connected to my network cupboard switch.

The web interface on the managed MokerLink switch is basic but quite functional. It is possible that I missed something, but I couldn’t find a way to limit MokerLink’s web management interface to particular ports of particular VLANs.

I would expect that if you want to divvy up your network into VLANs for security purposes, that you’d want to make sure your guest WiFI, IoT, and DMZ VLANs wouldn’t be able to hammer away at the password to your management interface. I also wouldn’t expect that interface to be well hardened. It is up to you to decide if this is a deal-breaker for you.

I don’t have enough 2.5-gigabit devices to simultaneously beat the crap out of every port on these switches, but the review on the unmanaged 8-port Mokerlink switch at servethehome seems to indicate that you can get quite close to maxing every port out.

I thought I would finally be able to max out Tailscale on the Intel N100 CPU!

I have a vague recollection of us learning that Tailscale’s encryption speed tops out at somewhere around 1.2 gigabits per second on the older Intel N5095 CPU. Depending on what you are measuring, the Celeron N100 is a 25% to 40% faster CPU. Surely it won’t quite be able to max out a 2.5-gigabit Ethernet port, right?!

My older model but roughly comparable mini PC with a Ryzen 3550H CPU gets really close to maxing out all its CPU cores while reaching 2.15 gigabits per second via iperf3 over the encrypted Tailscale interface.

My Intel N100 mini PC manages 2.26 gigabits per second with a sizable chunk of CPU going unused. That is around 90 megabits short of maxing out the physical link, which is about how far short Tailscale falls when running over gigabit Ethernet as well.

I won’t have a good way to find the actual limit until my 10-gigabit hardware arrives in a couple of weeks, but I did skip ahead and set up a bonded LACP link using two of my unused ports on my router-style Proxmox server. We’ll see if that will help me max out Tailscale on the Intel N100 in a couple weeks!

UPDATE: I am able to get pretty close to 100% CPU utilization running iperf3 over Tailscale over my four bonded 2.5gbe ports. I am stuck at about 2.26 gigabits per second in the inbound direction, but I can reach 2.84 gigabits per second when going outbound.

Honorable mention

I picked up a 5-pack of CableGeeker’s 5’ flat-pack Cat 6 patch cables in a variety of colors. I dropped these cables down through a hole in the network cupboard to the table where our mini PCs and servers sit.

I am finally running out of good patch cables after more than two decades of hoarding. I have enough stock that I could crimp some fresh ones, but I thought it would be useful to color-code these cables that are difficult to trace, and I don’t know that I have enough colors on hand to do that.

I am using 10’ CableGeeker flat-pack cables all over my office now, and I have packed my various network toolkits and laptop bags with their cables. They are well made, and they have all done a good job of carrying 2.5-gigabit signals. We will find out if the modules capable of carrying 10-gigabit Ethernet in a couple of weeks.

I am a big fan of flat network cables. They roll back up nicely, and they take up less room in your bag even if you’re not careful about how you wind them up. They’re also easier to hide along floorboards.

What’s next?

I wasn’t going to publish this until after I had a chance to properly install the 8-port switch in my network cupboard, but I realized that I might need to bring it into my office for testing if I wind up having trouble with the 10-gigabit SFP+ modules. It isn’t going to be easy to fish it back out once installed in its proper home in the cupboard!

I have a three-pack of Xicom 10-gigabit SFP+ copper modules and a 2-port Intel 10-gigabit PCIe card ordered and on the way. The SFP+ modules arrived and they are working splendidly!

I did stick the corner of the new switch in between the two pieces of pegboard in my network cupboard, and the new switch will definitely fit where it needs to go. I am definitely relieved by this. Cutting a new swinging 19” rack would be a fun little CNC project, but the project that Brian and I worked on 14 years ago has historical significance. I’d like to see that simple rack still doing its job in another 14 years!

If you want to dive deeper into the world of home networking and connect with other tech enthusiasts, you should check out the Butter, What?! Discord community. We have a dedicated channel for homelab setups and discussions where you can share your experiences, ask questions, and get some helpful advice.

Leave a comment below and let me know what you think of this post!

I Am Running XFCE On My Proxmox Host?!

| Comments

I was supposed to buy one power-sipping Celeron N100 mini PC to replace the aging and inefficient AMD FX-8350 in my homelab. That is what I did at first, and it worked out great. My Intel N100 with a single 14-TB USB hard disk averages around 15 watts, while the FX-8350 with four 4-TB hard disks was averaging just over 70 watts. Not only is it more efficient, but the mini PC is very nearly as fast and has nearly double the usable amount of storage as the old server. How awesome is that?

Then I bought a second N100 mini PC and decided to experiment with it before putting it directly into my homelab. I learned that the N100 has enough GPU to play some reasonably modern games, and it can emulate consoles up to and including the Nintendo Wii.

Minisforum UM350 Proxmox Machine In Its Temporary Home

My Proxmox host mini PC with XFCE in its temporary home where I can easily plug and unplug USB devices while I continue testing

That’s what put this weird thought in my head. Why not load Steam and Retroarch on the host OS of one of my Proxmox nodes and leave it on the TV in the living room? That is as far as I got with the question because I don’t have wired Ethernet running to the TV, and I am not going to put one of my Proxmox nodes on WiFi.

Then I bought a third mini PC with a Ryzen 3550H CPU. This older CPU roughly comparable to the Intel N100 in both power consumption and horsepower, but the Ryzen has an integrated GPU that is 3 or 4 times faster.

These two extra Proxmox nodes have both been sitting on my desk for enough weeks just waiting for me install them in their permanent homes that it getting another hare-brained idea was inevitable. I at least am hoping that my new idea is a good idea!

Installing XFCE on the Proxmox install was easy!

I just had to run apt install task-xfce-desktop. Aside from creating a non-root user account for myself, that was it. You will have to either reboot or manually fire up lightdm yourself, and you’ll be able to log into XFCE.

I don’t believe Proxmox uses network-manager, so the network controls in XFCE aren’t going to work. That is fantastic, because I wouldn’t want XFCE goobering up my Proxmox networking settings!

I had to add myself to the sudoers group and log back in before I could install Steam. That was easy.

OBS Studio via VNC

I installed flatpak so I could use it to install OBS Studio with VAAPI. That was also easy, and it went quite smoothly.

It took me longer than I want to admit to remember the name of the project that replaced Synergy. I remembered that the modern fork of Synergy is called Barrier. Barrier was in apt, and it only took me a few minutes to get that working. Now I could move my mouse to the right edge of my monitor and start controlling the Proxmox host on my office TV.

One of our lovely friends in our Discord community pointed out that Barrier is also dead now, and that the new fork is called Input Leap. This was not in apt, so I am just going to leave my barrier setup running for now. It is locked down to my Tailscale network, so I am not worried about it being an older piece of network code.

Why on Earth am I doing this?!

I have two reasons for wanting to run a desktop GUI, and I have one extra reason for wanting one of my Proxmox nodes to live in my office.

The first was inspired by a recent hardware problem on my desktop PC. My PSU fan started making some noise, so I had to shut it down to work on that. I happened to have a spare power supply that was beefy enough to swap in its place while I replaced the power supply’s fan, but only just barely. This could have easily been a situation where I would have had to wait for Amazon to deliver me a new piece of hardware to get up and running again!

Proxmox with desktop logged in

The Proxmox server is using less than 3 GB of RAM with the XFCE desktop running, OBS Studio running, and the Octoprint container running

I thought it would be nice to have a spare mini PC with a desktop GUI at my desk that I could use in an emergency. I could quickly plug in my monitor, keyboard, and mouse so I could use Discord, Emacs, and Firefox. That is plenty of functionality to keep me chugging along, and it is way more comfortable than using my laptop.

This idea all by itself seems like a good enough excuse to try this out.

If you’ve been wondering why I wouldn’t attempt to pass the GPU through to a virtual machine and isolate the desktop GUI inside, here is the first part of the explanation. When my computer stops working, I want to be able to plug my important peripherals right in and use them. I don’t want to be futzing around with passing through all the appropriate USB devices. I wouldn’t even have a good place to sit down and do that work!

A dedicated OBS streaming box might be a nice thing to have!

I am running out of USB 3 ports on my desktop PC. Some of my video gear is plugged in via USB hubs. Sometimes I need to disconnect and reconnect my USB to HDMI adapters to get them to work correctly. Even just reducing the number of USB cables that have to run toward my desktop computer will be an improvement.

I have my podcasting camera mounted upside down. This lets me see my Sony ZV-1 vlogging camera’s little display while pushing the lens as close to my monitor as possible. The USB to HDMI dongle that I use doesn’t allow for simple flipping and inverting via the V4L2 driver, so I have to pass it through something like OBS Studio to transform the camera.

Having a dedicated streaming mini PC to handle my cameras, encoding, and streaming seems like it’d be handy.

My desk with my podcasting camera

The latency and frame rate of the OBS Studio output from my mini PC over VNC are both terrible, but it is more than adequate for making tweaks to the OBS Studio settings!

I am able to encode 1080p HEVC video using the GPU via VAAPI. It only uses around 40% of this tiny iGPU’s horsepower, but it has had weird encoder hiccups during the first couple seconds of recording. I suspect it just isn’t ramping up the speed of the GPU quickly enough, but it is fine once it settles in. Worst case I have to dial that back from h.265 to h.264.

I haven’t decided exactly how I am going to tie this into my recording and streaming setup, but I am excited to see that this little box is more than capable of handling these tasks.

I simplified my virtualized Octoprint setup

A lot of people run Octoprint on a Raspberry Pi. I’ve always run it in a virtual machine on my homelab server. When I moved my old FX-8350 homelab server out of my office to quiet things down in here, that meant my 3D printer was now 60’ away from the Octoprint virtual machine, and I surely wasn’t going to run a 60’ USB cable across my house!

I brought one of my old OpenWRT routers out of retirement and installed socat. That let me extend a virtual serial port across my network so I was able to keep using my Octoprint instance. It also gave me an extra access point to add to my 802.11r WiFi roaming network, which was a nice bonus.

Octoprint on my Proxmox host via Tailscale

The socat setup worked most of the time, but every once in a while I would have to restart the socat processes. This didn’t happen often, but these days I only fire up my last remaining Octoprint server once every month or two. The socat process now almost always needs to be kicked before I can start printing.

I was able to plug my Sovol SV06 directly into my new quiet Proxmox server in my office. I did have to cheat, though. If I wanted to run the cable along the wall, I would have needed a 30’ USB cable to make that journey. I decided to run a cable directly across the carpet from my 3D-printer stand to my desk, so there’s 32” of USB cable on the floor with a temporary piece of duct tape helping to make sure I don’t snag it with my foot.

This cable across my floor is a terrible solution, but the Sovol SV06 will definitely be my last printer that is slow enough or old enough to be used with Octoprint. I don’t have a timeline for retiring it, but it is definitely going to happen. That means this short span of USB cable across my office floor is temporary. We just don’t know how temporary!

Is this a good idea?

Should you do this with one of your production Proxmox servers at your company? Absolutely not. Should you do this at home? If you know what you are doing, I would say that it is worth a shot. I am only eating up 2.6 gigabytes of RAM having a desktop session logged in with OBS Studio running. That will be less than 10% of the available RAM in this machine once I am finished shuffling SO-DIMMs around.

Way back when I built my arcade cabinet, I started telling people that the best computer in the house to use as a NAS is the one that already has to be powered up 24 hours a day. An arcade cabinet is way more fun when you can just walk up to it, hit a button, and immediately start jumping on goombas.

Mario Galaxy on an Intel N100 Mini PC

If you’re already paying a tax on your electric bill to keep one computer running all day long, why not give it one more task instead of buying another machine and paying that same tax again?

The important thing to bear in mind is that you may be tying the uptime of these tasks together. When you can’t play Dead Cells because your GPU has somehow gotten itself into a weird state, and the only way to fix it is a reboot, then all the virtual machines on that Proxmox server will also wind up being restarted.

If that means your PiHole VM and Jellyfin container have to be stopped, then nobody in your house will be able to access websites, and someone’s movie stream may stop. Nobody is going to lose any sales, but it is up to you to decide how much money it is worth to avoid this situation.

Conclusion

This little mini PC project has proven to be fun, and it was as easy to get a desktop GUI running on Proxmox as I had hoped! Having a single power-sipping mini PC fitting into two or three roles at the same time seems like a good value, and I am excited to see just how much use I get out of these tertiary use cases.

Odds are pretty good that I will never need to use this Proxmox mini PC as an emergency workstation, but I will feel better knowing that it is available. I do expect to get some use out of the video recording and streaming capabilities. I have some work to do there because I want to be able to use the output from OBS Studio on the mini PC as a virtual webcam in Chrome on my desktop when I connect to Riverside.fm.

What do you think? Do you think it is silly to run a GUI on a server? I would usually be the first person to think so! Are you already doing something similar with one of your Proxmox servers? Or are you running your emergency GUI in a virtual machine? Does it seem like a good idea to overload one of my Proxmox nodes as a video capture, encoding, and streaming machine? Tell me about it in the comments, or join the Butter, What?! Discord community to chat with us about it!

I Added A Refurbished Mini PC To My Homelab! The Minisforum UM350

| Comments

I saw the refurbished Minisforum UM350 on Amazon for $155, and my brain immediately started tingling. Geekbench 5 says that this older Ryzen 3550H is a little faster than an Intel N100, though the spec sheet implies that it may use a lot more power than an Intel N100 when you start to push it hard, but surely it can’t be too bad at idle! It has TWO DDR4 SO-DIMM slots, and the listing INCORRECTLY says that it can run with a pair of 32 GB SO-DIMMs. Not only that, but this Minisforum box has a 2.5-gigabit Ethernet port.

Minisforum UM350 Ryzen 3550H mini PC

So you’re saying that for $10 to $15 more than a brand new Trigkey N100, I can get a comparable machine that supports twice as much RAM, has a faster network port, and should have a much faster GPU?! The lowest price I have seen on a mini PC with 2.5-gigabit Ethernet is around $200. This seemed like it was worth a try!

Mine doesn’t work with 32 GB SO-DIMMs

It sort of works. Memtest86 sees that there are two sticks of RAM installed, but the BIOS and Memtest86 both only see 32 GB. Memtest86 was getting 15 GB/s throughput with the stock pair of 8 GB SO-DIMMs, but it was only getting 9 GB/s with the pair of 32 GB SO-DIMMs. That makes me suspect that it is really only using one of the DIMMs. It did pass a full run of Memtest86 with the pair of 32 GB SO-DIMMs installed.

The BIOS is pretty basic and didn’t give me much insight here.

My suspicion is that there is something about my particular pair of 32 GB SO-DIMMs that this mini PC doesn’t like. Just the fact that it can fully utilize one 32 GB SO-DIMM is a strong indication that it should be able to use a second, and there isn’t anything about the first-generation Zen CPUs that would suggest that it won’t support 64 GB.

I had a lot of weird problems with the 32 GB SO-DIMM that went away when I put the stock RAM back in. My USB keyboard had to be unplugged and replugged a lot. I was able to install Proxmox, but it kept locking up shortly before the fresh install finished booting. It didn’t lock up every time, though. It booted clean the first time. It was definitely acting weird.

If my Minisforum UM350 would have worked with 64 GB of RAM, it would have been an AMAZING value for my homelab. Even so, I feel that the 2.5-gigabit Ethernet port alone was worth the extra $10!

Minisforum vs. Beelink or Trigkey

I am pretty convinced that Beelink and Trigkey are the same company. The packaging has the same corporate address, and their cases are identical.

All of these mini PCs share some common features. They vary in size, but they would all be considered small. Almost every model ships with some sort of hardware to mount the mini PC to the VESA mount on the back of your monitor.

This particular Minisforum PC is a little bigger than any of the Beelink or Trigkey boxes that I have seen, but not by a huge margin. I enjoy that it has a tool-free lid granting you access to the RAM, NVMe, and WiFi card. That kind of feels like a wash because the NVMe is tucked underneath the edge of the shell, and it is a bit of a challenge to get it in to place to push it straight into the slot.

Minisforum UM350 Ryzen 3550H mini PC

You can see some scuff marks on the tool-free lid of my refurbished Minisforum UM350!

I don’t know why, but I was under the impression that Minisforum wasn’t a Chinese company. I thought they were a European company shipping rebranded hardware shipped from China.

Buying a Minisforum product has all the same advantages and disadvantages from Beelink, Trigkey, Topton, or Sovol. The documentation is thin and often poorly translated. Customer service is probably on a different schedule than you. Even so, you will be getting a lot of hardware for your money.

I am extremely pleased with the Ryzen 3550H!

I expected that the 3550H would be reasonably comparable to the much newer Intel Celeron N100. I didn’t expect the aging Ryzen mobile CPU to be 15% faster, and I also didn’t expect the Minisforum box to idle at a lower wattage than either of my Intel N100 mini PCs!

After installing and updating Proxmox on my new mini PC, I plugged it into my spare Tasmota smart outlet, and I have been logging electricity usage. It is using 0.11 kilowatt hours each day. That works out to an average of only 4.6 watts.

Ryzen 3550 Minisforum UM350 Kilowatt hours per day and watts

That is 1.6 watts less than I have ever seen on my Topton CWWK N100 router-style mini PC!

I also ran a couple dozen concurrent copies of openssl speed to keep all the CPU cores pegged, and the Minisforum UM350 topped out at 28 watts. That is only 2 watts more than I have ever seen my Intel N100 mini PCs use. That isn’t too bad, either. It is using 7% more electricity while being about 15% faster. That is a decent trade!

How is the iGPU?

I was really curious about this. The Ryzen 3550H has the exact same iGPU silicon with the same core counts as my Ryzen 5700U laptop, but while the Ryzen 3550H’s GPU is clocked at only 1.3 GHz, my laptop’s GPU is clocked at 1.9 GHz.

None of these machines have what you would call powerful GPUs, but the Intel N100 can easily play emulated games up to and including the Nintendo Wii, and I was able to play Gunfire Reborn at just a little under 60 FPS at 720p. My laptop is a much more capable gaming machine than my N100 mini PCs, and I kind of expected the Ryzen 3550H to be closer to my laptop.

Why Gunfire Reborn? It is a fun game that isn’t very old, and it is still being updated and improved. It also runs reasonably well on even the slowest mini PCs.

I am quite pleased with how the Ryzen 3550H stacked up. I swapped my Intel N100 gaming PC’s NVMe right into the UM350, and I fired up Gunfire Reborn with the exact same settings. It runs at 102 FPS at 720p, and it still managed 92 FPS at a full 1080p. Mind you, this is with all the settings dialed down to low.

Intel N100 vs. Ryzen 3550H Geekbench 5

Just for reference, my Ryzen 5700U laptop runs at about 132 FPS at 720P and 118 FPS at 1080p. That is about 30% faster than the Ryzen 3550H. That is in the ballpark of what I expected.

Does this matter? Probably not. I don’t think anyone is buying a refurbished Minisforum UM350 to play games. It is likely more interesting to you just how well this thing transcodes video for Jellyfin or Plex. The silicon used for encoding video is entirely separate from the silicone used for playing games, so how much better this Ryzen iGPU runs Gunfire Reborn doesn’t mean it will transcode video faster than the Celeron N100!

I had a little trouble getting the Jellyfin Proxmox helper script working. I do know from friends in our Discord community that the Plex and Jellyfin hardware encoding on this Ryzen iGPU is more than up to the task, but I am quite curious how it stacks up to the Intel QuickSync hardware on the Celeron N100. I am going to give this another try soon, but I didn’t want to let it slow down the publishing of this blog post!

Did Pat get hosed by not being able to use 2x32 GB of RAM in the Minisoforum UM350?!

Nope! I am going to turn this into a really good value for me!

My $155 Minisforum UM350 came with 2x8 GB SO-DIMMs. That is technically enough for my needs, but at the same time it feels like a bummer.

My $143 Trigkey N100 came with a single 16 GB SO-DIMM. I then spent $105 on 2x32 GB of DDR4 SO-DIMMs. The intention was to buy a second N100 mini PC and upgrade both to 32 GB.

Minisforum UM350 Ryzen 3550H Mini PC

Can you see how the NVMe with the thick heat sink is partially blocked by the chassis?

Are you keeping score? That means I was going to pay $390 for a pair of N100 mini PCs each with 32 GB of RAM, and I would have had an extra pair of 16 GB SO-DIMMs left over.

What if I buy another Trigkey N100? I can upgrade it to 32 GB, then I can use my leftover pair of 16 GB SO-DIMMs to upgrade the Minisforum UM350.

This may mean my homelab is getting out of control, but it WILL give me three comparable mini PCs each with 32 GB of RAM for my homelab for $545.

It might just be an excuse to add one extra mini PC to my homelab, but it’d only lower the cost of each mini PC with 32 GB of RAM from $195 to $180.

Conclusion

I could write twice as many words about how pleased I am with this Ryzen 3550H mini PC from Minisforum, but it won’t do you any good. This is an older model that you won’t likely be able to buy, and that bums me out!

That said, you CAN still buy a Beelink or Trigkey mini PC with a Ryzen 5560U or Ryzen 5700U, and those models are priced quite aggressively. They both have more CPU and GPU chooch than my Ryzen 3550H, but they are missing the 2.5-gigabit Ethernet. That might be just fine for your use case, and you can add 2.5-gigabit Ethernet for about $7. My suspicion is that the 5560U and 5700U would both idle just as well as my refurbished Minisforum PC because my Ryzen 5700U laptop uses 5 or 6 watts at idle, and that includes the extra power for the screen.

What do you think? Did I get a good deal on my UM350 mini PC? Do you own one, and are you happy with it? Do you think I need to grab a Ryzen 5560U mini PC to test its power efficiency? Let me know in the comments, or join the Butter, What?! Discord community to tell us about your experience or see what we are up to with out homelab mini PCs!

I Bought The Cheapest 2.5-Gigabit USB Ethernet Adapters And They Are Awesome!

| Comments

I very nearly typed “so you don’t have to” as the end of this blog post’s title, but that wouldn’t be the least bit accurate! I bought these because I posted a link to a deal for a Sabrent 2.5-gigabit USB Ethernet adapter for $20 on our Discord server, and someone piped up and said they buy theirs from Aliexpress because the cheap ones over there use the same Realtek R8169 Ethernet chip. I fully expected these inexpensive network dongles to work great!

I now have two mini PCs in my homelab with 2.5-gigabit Ethernet ports, so it feels like it is going to be time to start ordering some 2.5-gigabit switches. I figured I could order a couple of 2.5-gigabit dongles to upgrade some other machines around the house, and I could also use those dongles to make sure that the Cat 5e cable that Brian and I pulled through this house a decade ago is up to the task of pushing faster speeds from my homelab on the opposite side of the house to my home office!

Aliexpress 2.5 Gigabit Ethernet

I also don’t know that I literally chose the cheapest dongles on Aliexpress. Some cost more but had free shipping. Others never had free shipping. Some vendors had free shipping over $10. I figured that last one would be where I’d get a deal, so I bought a USB-A and USB-C dongle from the same store for a total of about $14 for the pair.

Oddly enough, these are the most premium-feeling USB network adapters I have ever owned. My 1-gigabit Ethernet dongle in my laptop bag is all plastic and weighs 20 grams. These new dongles are in an aluminum shell, and they weigh over 30 grams each!

Why am I excited about USB network adapters? Aren’t PCIe cards better?!

The best bang for the buck in mini PC gear for your homelab cluster will have DDR4 memory and almost always have 1-gigabit Ethernet ports. Part of the reason the older DDR4 mini PCs are such a good value is because large DDR5 SO-DIMMs cost 40% more. That isn’t a big difference if you’re only buying one mini PC or if you aren’t maxing out the RAM, but it can add up to a sizable amount of cash if you’re building a small cluster.

Not only that, but some of us already have mini PCs in our homelab that only have gigabit Ethernet. These $7 dongles are a fantastic way to turbocharge our older mini PCs, and they are such an inexpensive way to add 2.5-gigabit Ethernet without paying an extra $80 to $120 for a more expensive mini PC.

I now have two mini PCs in my homelab with built-in 2.5gbe ports. The third Celeron N100 mini PC in my cluster only has 1gbe. I will definitely be adding one of these $7 dongles to that $143 mini PC once I get around to upgrading my network cupboard to 2.5-gigabit Ethernet!

The cheap 2.5-gigabit dongles passed their first test without any hiccups!

I plugged the USB-A dongle into my workstation, then I plugged the USB-C dongle into my Minisforum UM350 mini PC, and I connected them up with one of my new 10’ flat-pack Cat 6 cables.

I have been a fan of flat Ethernet cables for many years, but I haven’t bought any in ages, and I needed a reasonably long cable for my network toolkit. The cables I decided to try were excited to tell me that they were Cat 6 cables at Cat 5e prices, so I was a little skeptical. Seeing them move data via iperf at 2.36 gigabits per second made me feel better about my purchase!

Aliexpress 2.5 Gigabit Ethernet

There isn’t much to say. The kernel modules automatically loaded on both my Ubuntu workstation and my Proxmox mini PC. Pings were fast, and throughput was right where it should be.

I would call that a successful first test.

How about testing the Cat 5e in my walls?!

Brian and I ran Cat 5e to every room in this house about five years before I moved in. I bet Brian wrote down which ports on the patch panel lead to each room, but I don’t have that document. I was smart enough to make a note of some of the important ports when I was setting things up, so I knew that my office was on port 13 on the 48-port patch panel.

I plugged one of the 2.5-gigabit USB dongles into my laptop and connected my laptop directly to the RJ-45 jack on my office’s wall using one of those 10’ flat-pack Cat 6 cables.

Then I walked across the house with a second 10’ flat-pack cable and plugged one of the extra 2.5-gigabit Ethernet ports on my router-style mini PC Proxmox server into port 13 on the patch panel.

I configured each of those network interfaces with new IP addresses on an unused subnet, and then I ran my usual iperf tests. I am not surprised to report that I was 2.36 gigabits per second in both directions. Huzzah!

It was nice to see my Realtek R8169 USB adapter communicating with an Intel i226-V chipset, and it was also great to see that adding a second inexpensive 10’ flat-pack cable to the connection didn’t degrade my signal.

Will the cheap 2.5-gigabit Ethernet adapters be stable over the long term?

I put both 2.5-gigabit adapters away after my tests were completed. That is kind of a bummer because I definitely want to make sure they continue to work for weeks, months, and years.

I am pondering which Mokerlink 2.5-gigabit switches I need to buy, and I don’t have much use for extra 2.5-gigabit network interfaces until I order and install a new switch in both my network cupboard and my home office. That work is at least a few weeks away.

We have several people on our Discord server who have had good luck with their R8169 USB adapters, and Brian Moses tells me he had his homelab server connected with an R8169 USB dongle for a few months. I don’t expect to see any long-term problems once I put these into service, but I will definitely report back with anything I see!

What’s next?

It is a good sign that the cabling in my walls is in good enough shape to run at 2.5 gigabits per second, but the Mokerlink 2.5-gigabit switches I want to put in my network cupboard and home office have 10-gigabit SFP+ ports. I do not have proof that my wiring will get me that far!

The next step is definitely to buy a pair of Mokerlink switches, and I will definitely shop around for 10-gigabit modules that actually support 2.5-gigabit and 5-gigabit Ethernet. My hope is that they will negotiate down if my office can’t support a full 10 gigabit.

Want to follow along on this journey? We’re building a homelab and DIY NAS community on the *Butter, What?! Discord server where we share tips, tricks, and troubleshoot issues. Swing by, join the conversation, and share your homelab setup with us! I’m also curious to hear your thoughts. What kind of network gear are you using in your own setups? Are you using inexpensive USB 2.5-gigabit Ethernet adapters? How are they working out for you? Are you using the Mokerlink switches that I am currently shopping for? Let me know how things are working out for you in the comments below!

When An Intel N100 Mini PC Isn’t Enough Build a Compact Mini-ITX Server!

| Comments

This really depends on what you mean when you say that a Celeron N100 mini PC isn’t enough. If an N100 mini PC isn’t powerful enough, you can spend a bit more on a Ryzen 5560U mini PC that is more than twice as fast, or splurge on a mini PC with a Ryzen 7840HS that is more than four times faster. Heck! You could buy a stack of four N100 mini PCs, or you can mix and match models to fit your unique needs!

Maybe that isn’t what you mean. Maybe you need to cram in more hard drives to store backups, the videos you record yourself on your vlogging camera, or movies and TV shows you acquire on the high seas. You should have no shame in hanging a big USB hard disk off the back of your mini PC, but if that isn’t enough, what are you supposed to do?

Flux Schnell Small Form Factor NAS Guy

You can definitely build something for your homelab that is much more well equipped than a mini PC, but you can also keep your up-front costs and electricity bill down by sticking with a mobile CPU like the Intel Celeron N100. You just have to upgrade your form factor!

My friend Brian Moses has been using, buying, and selling all sorts of mini-ITX motherboards with low-power CPUs, lots of 2.5-gigabit Ethernet ports, and plenty of SATA ports.

You can probably build what you need, and it might only be as big as two or three stacked shoe boxes. That isn’t as small as a stack of mini PCs, but it isn’t exactly huge!

Why is the Intel Celeron N100 so popular now?

Brian and I have both been excited about the various small form factor motherboards built around mobile CPUs for a long, long time. I built my homelab server around a motherboard that was rocking an AMD 5350 processor nine years ago, and Brian put together one of his DIY NAS builds a few years ago based around an Intel N5105 CPU.

I could write an entire blog post about how much of a bummer it was that there weren’t any good homelab or NAS motherboard choices with mobile CPUs in the seven years in between our two builds. Instead, I am just going to say that it is fantastic that power-sipping CPUs are showing up on motherboards again, and they have been showing up on motherboards with plenty of SATA ports. There’s no shortage of homelab-friendly mini-ITX motherboards to choose from today!

The Celeron N5095, N5105, and N6005 had some minor popularity, but not nearly as much as the Celeron N100. I can’t explain why. The Intel N100 is an incremental evolution over the N5095 and not some sort of revolution. It is only better in the ways that you would expect from a processor released two years later: it is a little faster and a bit more power efficient.

The previous generation is every bit as useful as the Intel N100. I imagine it just took a few years for everyone to start catching on, and the Intel N5095 that started the party is now just too old to be as relevant.

You didn’t answer the question, Pat!

You are right. Sorry about that!

The Celeron N100 is awesome because it is a little less than half as fast as a Ryzen 5600 desktop CPU, and you can get an entire mini PC with an N100, RAM, and storage for about the same price! The Intel N100 has a fantastic little GPU with more than enough video-encoding capacity to transcode several 4K Jellyfin or Plex streams. You can upgrade them to 48 gigabytes of RAM. They are also reasonably priced. You can usually buy a mini PC with RAM and storage for $145.

UPDATE: The Trigkey N100 with 16 GB of RAM and 500 GB NVMe has been going on sale almost once a week for less than $140. Sometimes as low as $134 shipped!

My own Intel N100 router-style mini PC idles at around 8 watts. That is the whole thing: motherboard, network interfaces, RAM, NVMe, and the CPU. It is using half as much electricity as a modern LED light bulb. Adding the external 14-terabyte hard disk drive brings that up to about 15 watts.

All of this applies to the Intel N5095 and its siblings.

You can go a long way with just a mini PC!

Everyone’s needs will be a little different, but 32 GB or 48 GB of RAM paired with an N100 is a pretty reasonable ratio of RAM to CPU horsepower, and so is 64 GB paired with a Ryzen 5560U. You can fit a lot of light- or medium-duty virtual machines and containers on these tiny boxes! What about storage?

I had a RAID array at home from the late nineties right up until just a few years ago. They just didn’t make large enough hard drives to hold all my data. The options were to manually split things between drives, or bundle them all together in a RAID and just dump everything in one place. I chose the latter because it also adds redundancy.

Hard drives have been getting bigger at a rate that exceeds my needs, so I am using single USB hard disks on each of my mini PCs and in my personal desktop computer. A copy of all my data lives on a 14 TB USB hard drive attached to the N100 mini PC at my house, and another copy lives on a 14 TB USB hard drive attached to a Raspberry Pi at Brian Moses’s house. The original copy of that data lives on a 12 TB SATA hard disk in the computer I am sitting at right now.

Assuming they survive long enough, those 14-terabyte disks will be rather old by the time I fill them up. When I do, I will be able to replace them with 30 terabyte disks.

What if I don’t like the idea of USB hard drives? What if I need way more storage than that?!

I haven’t had any real issues with the USB hard drives on my servers in the last four years. The USB disk on my N100 Proxmox server can read and write data at just over 200 megabytes per second, which is pretty close to the maximum speed of the drive that came in Western Digital’s enclosure. I have no qualms about recommending this configuration to other people.

I also don’t want to twist your arm! If you want to build a RAID using Linux’s mdadm or ZFS, I surely don’t want to stop you!

You don’t have to give up the power-sipping Intel N5095, N6005, or N100 to do that. If those offer enough horsepower for your needs, you can start shopping for a mini-ITX motherboard and a nice little case. There are plenty of options to choose from.

I like Brian Moses’s new off-site buddy NAS build!

I am not just saying this because I host Brian’s off-site NAS at my house. Many of the choices Brian makes in his builds can be attributed to my own recommendations, thoughts, and insights. In the case of the current iteration of his off-site NAS, my contribution was me exclaiming, “OMG Brian! You are absolutely correct! That 3D-printed NAS case from makerunit looks awesome!”

Brian has been selling Celeron N5105/N6005 and N100 motherboards in his eBay store for quite a while now. He wound up pairing one of those Celeron N6005 motherboards with six refurbished 12-terabyte SATA hard drives. He is running TrueNAS Scale and connecting to his other devices using the awesome Tailscale mesh VPN.

These Celeron N5105, N6005, and N100 motherboards from Topton are all awesome. You get six SATA ports, multiple m.2 NVMe slots, and four 2.5-gigabit Ethernet ports. They squeeze a lot of hardware into a small space, and they manage to utilize every one of the limited number of available PCIe lanes on these mobile processors.

It is a fantastic little server. Brian printed his case using two different shades of blue PETG filament, and he has his face logo on the front.

The makerunit 6-bay NAS case’s front cover is held on by magnets, and you can slide any of the six hard drives right out if you need to replace them. It might not be quite as slick as a case with real hot-swap drive bays, but it costs less than two rolls of filament, and it looks super cool!

The case has room for a full-height single-slot PCIe card, so you might be able to sneak in a slim GPU for some machine learning shenanigans. You really need at least 24 gigabytes of VRAM to run the more serious LLMs, but you can fit some really useful stuff into 8 or 12 gigabytes.

I don’t believe that you will find room to squeeze a GPU into the Topton N100 motherboards. It does have a 1x PCIe slot, and that would actually be just fine for running large language models that completely fit on one GPU, but you’re going to have clearance issue with other ports on the motherboard.

You don’t have to 3D-print a case

Finding cool new uses for your 3D printer is fun. So is finding a good excuse to buy a 3D printer, and I think all of makerunit’s cases qualify here. Even so, you still don’t have to 3D-print a case. There are no shortage of awesome mini-ITX and micro-ATX cases to build a compact NAS or homelab server around.

I am a fan of all the NAS cases from Jonsbo. The Jonsbo N1 and Jonsbo N2 look equivalent according to the spec sheet, but they each have a very different vibe, and they lay out the major components in very different ways. The Jonsbo N1 looks like something out of a sci-fi movie, while the Jonsbo N2 is a more functional and usable NAS case. I could argue with going with either one.

The Jonsbo N3 looks a lot like the Jonsbo N2. The Jonsbo N3 bumps you up to 8 hot-swap 3.5” hard drive bays instead of 5, and it gives you room for a dual-slot full-height PCIe GPU. Then there’s the even larger Jonsbo N4 that gives you room for a micro-ATX motherboard and a wide half-height GPU.

Or you can follow Brian Moses again and go even larger. He migrated his personal DIY NAS to a SilverStone CS382. It has 8 hot-swap 3.5” drive bays, room for a micro-ATX motherboard and full-size ATX power supply, and he is using an Icy Dock to squeeze eight additional 2.5” SATA SSDs into the single available 5.25” optical drive bay.

The SilverStone CS382 is a big case. That not only buys you room to fit your hands inside the machine when you need to work on things, but it also gives you room for a huge GPU and a less expensive and more capable ATX power supply.

There’s an option between a full mini-ITX NAS build and USB hard disks!

Topton makes a nifty little server, the R1 Pro, that we lovingly refer to as “the toaster NAS”. Why do we call it that? Because the pair of 3.5” SATA hard disks slide in from the top like two pieces of bread! It is reasonably priced, quite compact, and it sips power. Brian stocks them in his eBay store, but they are also available with different branding on Amazon and Aliexpress.

Stable Diffusion NAS Guy

The toaster NAS features the Intel N100 processor, one DDR4 SO-DIMM slot, one m.2 slot, and a pair of 2.5 gigabit Ethernet ports. It is basically a mini PC sandwiched between two hard drives but wrapped up in an attractive shell.

This is a fantastic middle ground between a basic mini PC and a 5-bay mini-ITX home server. Depending on how you are measuring things, you could almost fit four of these in the space that Brian’s 6-bay makerunit NAS is taking up under my network cupboard.

Can we trust these motherboards from Topton?

I have to admit that when I first heard about these motherboards a few years ago, I was worried that they’d be sketchy. You had to order them from Aliexpress. Who the heck is Topton? Do they make a reliable product? Will I have to ship it back to China if it doesn’t power on? What if it stops working in a week?

Then Brian bought one and used one. A few of our friends tried them, too, and we got more confident. We started hearing DIY NAS success stories. Then Brian started importing them by the dozen selling them in his eBay store, and we didn’t have to worry about something going wrong between us here in the United States and China, and we didn’t have to wait two or three weeks for the motherboard to show up. This helped a lot!

Today we know that Brian has shipped hundreds of Topton mini-ITX motherboards to customers. I spoke with Brian about failure rates and returns on more than one of our weekend pizza nights here at my house. I don’t want to blabber about specific numbers that I may or may not have been told in confidence and that I may not be remembering precisely, but I think it is safe for me to say that he has had some returned motherboards, and I believe every one of his very small number of returned motherboards worked just fine when he tested them.

We’ve gone from an anecdote, to anecdata, to real data over the last few years. At this point, I would try any of Topton’s motherboards without hesitation.

All of this advice works in reverse

I am not entirely certain that whatever insights this blog post provides qualify as advice, but I am going to refer to it as such anyway!

Maybe you have a huge but aging NAS in your home. Maybe you have a short stack of rack-mount servers in your closet. Perhaps you’ve been planning to downsize during your next upgrade, but you were planning to move from one or more giant 4U servers into something like the Jonsbo N2 or makerunit’s 6-bay 3D-printed case.

Stable Diffusion Mini PC Guy

Is it possible that this is bigger than you need? You might be better served with a toaster NAS and a couple of Trigkey or Beelink Intel N100 or Ryzen 5560U boxes in a little Proxmox cluster that would all fit inside your childhood Return of the Jedi lunch box. You may be better off with just a stack of mini PCs.

The neat thing is that there are no wrong answers. As long as you are treating your valuable data correctly and you have good backups, then getting everything else perfectly optimal is much less important. You can just do whatever seems fun, interesting, or useful to you!

Conclusion

We’ve explored a world of options here, from the humble USB hard drive to the spacious SilverStone CS382, and many options in between. There’s no single right answer when it comes to building your ideal homelab or NAS setup. The best choice depends on your unique needs, budget, and even your personal aesthetic preferences. Maybe you’re drawn to the sleek minimalism of a mini PC, or perhaps you crave the expandability of a larger case.

Whatever path you choose, remember that the journey is as important as the destination. Don’t be afraid to experiment, to try something new, and to learn along the way. Embrace the challenges and celebrate your successes. And most importantly, share your experiences with others!

Join the Butter, What?! Discord community and connect with a passionate group of individuals who are just as excited about homelabs as you are. Come share your builds, ask for advice, and learn from the collective wisdom of the group. Let’s build something awesome together! And don’t forget to leave a comment below and tell us what you think of this exploration of mini-ITX options. What are your favorite builds? What challenges have you faced?

GPT-4o Mini Vs. My Local LLM

| Comments

I am not some sort of machine-learning enthusiast. I am not trying to shoehorn large language models into every problem I encounter. I am just a dude who writes a blog post every week or so, and I have been messing around with various things to see how they might help streamline my workflow.

I am probably only scratching the surface, but I figured this is a reasonable time to write down what I have learned so far.

Stable Diffusion man using an llm

I am almost definitely not using the most optimal local models. I have tried a handful of different things available on Huggingface, and I settled in on things that seem to work well for my needs.

Large language models aren’t replacements for search engines

I used to see a lot of comments on Reddit and Hacker News that were excitedly posting what ChatGPT had to say about the topic at hand, or what ChatGPT said the answer was. More recently I am seeing questions asked on Reddit, not being answered in a way that the poster believes is adequate, so the original poster leaves a comment with the better advice they believe they got from ChatGPT.

Large language models make stuff up all the time. I asked one of the local models I was trying out about the specifications of my first childhood computer: the Texas Instruments TI 99/4a. Much to my surprise, that rather tiny large language model seemed to be correct! I didn’t verify that everything was perfect, but it matched my memory, and that was more than enough to impress me.

Then I asked it for the specs of an Apple 2e. It confidently told me that it had a dual Intel Xeon machine with 192 GB of RAM.

There is a lot of information hiding in a lossy compressed state inside these models. GPT-4o Mini definitely has more real-world information hiding in its weights than my local LLM, and the full GPT-4 model has way more information than that. Either has a better chance of being correct than my tiny local model, but they all suffer from the same problem.

Even if full GPT-4 will be correct more often, it will still be just as confidently wrong as my local model.

I have asked ChatGPT to summarize some of my 3D printing blog posts, and it has given me back lots of bullet points that are exactly the opposite of what I actually said. I only know this because I wrote the words. I’d be careful basing any important decisions off of a summary from ChatGPT if I were you.

Large language models can definitely supplement your use of search engines!

When Windows Copilot+ Recall was announced, I almost immediately started writing a blog post about how I felt about this sort of software. I remembered with absolute certainty that there was a Gnome project more than twenty years ago that attempted to keep track of your information for you in a similar way.

The project was called Dashboard. It monitored your email, instant messages, and text files in an attempt to gather related information before you needed it. It didn’t use AI, but it felt a lot like what Copilot is hoping to accomplish. The trouble is that I couldn’t remember the name of the project, and my Google searches were coming up completely empty.

Stable Diffusion man using an llm

I had a back-and-forth with ChatGPT about it. It went off the rails a lot, and ChatGPT was extremely confident of some wrong answers, but I didn’t eventually get it to mention both Dashboard and Nat Friedman in one of its responses.

Finding that nugget of information made it easy for me to find some old references to the actual Dashboard project!

This applies to OpenAI’s models and local models. They are so often breathtakingly incorrect, but we used to say the same thing about search engines twenty years ago. If you’re not a subject matter expert, you had better dig a little deeper to verify the responses from whichever chatbot you decide to use!

OpenAI’s API is stupidly inexpensive

OpenAI’s most expensive API costs $30 per million tokens, and their newest and cheapest model, GPT-4o Mini, costs only $0.60 per million tokens.

I have been using an Emacs package to help me quickly and easily send paragraphs and entire blog posts up to their API to have thing rewritten, rephrased, or have introductions written for me. I added $6.61 to my OpenAI account in November of 2023, and I have $5.83 left in my account as of July 2024.

I have no idea why I wound up adding $6.61 to my account. That is a really odd number, isn’t it?!

Sending a paragraph or two at a time for GPT-3.5 Turbo to rephrase was costing me a few pennies per blog post. Things got way more expensive when I started using the huge context window of GPT-4 to start each session with entire 2,500 word blog posts. That was costing me nearly an entire dime to work on a single blog post!

I used a local LLM to help me write an intro, conclusion, and call to action for my most recent blog post. When GPT-4o Mini went live, I sent it the exact same text, and I gave it the exact same prompt. That work cost me less than a penny. Not just less than a penny, but less than a tenth of a penny.

This was fun, because I have been messing around with various local models using the oobabooga webui for a few weeks, and I had forgot how lightning fast the OpenAI API can be!

My local LLM is slower and not as good as GPT-4o Mini

I didn’t buy some big, expensive LLM-friendly workstation with a $2,00 24 GB Nvidia RTX 4090. I have a little $300 12 GB Radeon 6700 XT. I didn’t buy it for running AI stuff. I bought it to play games! The AI stuff is a free bonus.

My local LLM can ingest an entire blog post in a handful of seconds, and it can respond to my prompt about that blog at around 25 tokens per second. That winds up settling down to something around 15 tokens per second as I get deeper into the chat and the context goes up.

How much better is GPT-4o Mini?

I gave both my local Gemma 2 9B LLM and GPT-4o Mini the following prompt:

If I give you the text of a blog post I am writing, can you write me a “conclusion” section with a call to action that includes joining our discord server and leaving a comment?

I pasted them both the contents of the blog post. I got this response back from GPT-4o:

Conclusion

In the world of 3D printing, choosing the right machine can be both exciting and daunting. While the Sovol SV06 has its merits, especially with the recent firmware upgrades that enhance its performance, it’s essential to recognize that the Bambu A1 Mini excels in various aspects, including user-friendliness and print quality. For those who value speed and ease of use, the A1 Mini is hard to beat, while the Sovol SV06 may appeal to enthusiasts looking for a larger build volume or the joy of tuning and customizing their machines.

Ultimately, your choice should align with your printing needs and preferences. But whether you opt for the Sovol or the Bambu, remember to keep your expectations realistic and informed.

I’d love to hear your thoughts! Have you had experiences with either the Sovol SV06 or the Bambu A1 Mini? Share your insights in the comments below. And if you’re passionate about 3D printing or want to connect with other enthusiasts, consider joining our Discord server! It’s a fantastic space to share tips, get advice, and stay updated on the latest in the 3D printing community. Let’s keep the conversation going!

This was the response I got from Gemma 2 9B:

Great! Here’s a conclusion with a call to action incorporating your Discord server and a comment section:

Sovol’s SV06 is a capable machine, especially when you consider the price. But, if you’re looking for an out-of-the-box experience with exceptional print quality and ease of use, the Bambu A1 Mini is hard to beat.

Want to explore the world of 3D printing and share your own projects? Join our Discord server where you can connect with fellow makers, get advice, and show off your creations! [Link to your Discord server]

Let me know in the comments what you think of the Sovol SV06 – would you consider it for your next 3D printing project?

I could never use any of these words without heavy editing. Every AI has a weird voice that feels way too pretentious for me.

I also don’t want to say that one of these did a better job than the other. The response from OpenAI is rather long winded, but it does contain a lot of useful information that Gemma’s response is missing. That said, the response I got from Gemma 2 9B is much more like the sort of conclusion I would write on my own.

While both GPT-4o Mini and my local LLM do appear to slowly type a response to you query, the difference is that GPT-4o Mini is only pretending to be as slow as it appears. It actually took GPT-4o Mini about two seconds to complete the entire task, whereas my local LLM took 25 seconds to ingest my blog post and return its entire first response.

That is definitely a usable pace, but I have spent a lot of money on hardware to get lesser responses that take longer. I could work with GPT-4o Mini to help me rewrite all 800,000 words of blog posts I have ever written and it wouldn’t even cost 5% of what I paid for my GPU.

Do you know what I think is neat? I had an LLM generate a conclusion for one of my recent blogs, and I decided to leave in a couple of sentences and phrases that absolutely never would have written on my own. I didn’t feel entirely comfortable speaking the way the machine wrote the words, but they were useful and relevant statements. If it didn’t work out, I could blame the robot!

The GPT-4o Mini API is extremely cheap, but my local LLM is almost free

The electricity required to have an LLM churning away on my workstation is a rounding error. My GPU maxes out at 174 watts, but llama.cpp doesn’t seem capable of utilizing all of the capacity, so it rarely goes far past 100 watts. It would cost less than twenty cents if you could somehow coax oobabooga into running for 10 hours straight with no downtime. That would be somewhere between 500,000 and 900,000 tokens.

The hardware is a sunk cost. I need my GPU to edit videos with DaVinci Resolve and play games. I need my CPU, RAM, and monitor just to be able to do any work at all. I am not investing in hardware to run a language model. I already own it. Running the model is effectively free.

Stable Diffusion neon pcb

Free isn’t cheap enough to justify the effort. My back catalog of blog posts should be somewhere around a million tokens. It would cost me 30 cents every time I have GPT-4o Mini ingest the whole thing, and it’d only cost $1.20 to get a full rewrite of my blog back out. If I were working with the API to actually do a good job of reworking my old posts, then I would most definitely have to go back and forth more than once with many of the paragraphs.

I can’t imagine having to spend more than $10 or so conversing with the OpenAI API to rewrite my entire blog. The OpenAI API would respond faster than my local API via oobabooga, and that alone would save me way more than $10 worth of my own time.

I would never actually do this, but this is by far the most extreme use case I can come up with for using an LLM, and it would only cost me ten bucks!

What if I don’t want to upload all my words to a third party?

This has to be the best reason by far to use avoid using an LLM in the cloud. Would Stephen King want to risk the chapters of his latest novel leaking? Maybe he wouldn’t care. I don’t think that a leak would have a significant impact on his life, but I think you understand the idea.

I have no qualms about uploading my words to OpenAI before I publish them. It feels icky in principle, but I’m not some sort of investigatory journalist trying to beat everyone to the latest scoop. The majority of the words that I write wind up on the public Internet anyway. Who cares if they manage to sneak out a week early?

You might not be as fortunate as I am. You might have actual private work that needs to stay private. I could totally see investing in a small server with an Nvidia 3090 to run Gemma 2 27B so your small business can have access to a reasonably powerful LLM. Spending a few thousands dollars to not leak your secrets is pretty inexpensive!

What is Pat actually running locally?

The two models that I have really settled on are CodeQwen1.5-7B-Chat-GGUF and gemma-2-9b-it-GPTQ, both of which are available at Hugging Face. I think Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 is also fantastic, but Gemma 2 seems to fit my blogging needs better.

1
2
3
4
5
6
7
8
9
10
$ du -sh *GPTQ*
5.4G  hugging-quants_Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
5.8G  marcsun13_gemma-2-9b-it-GPTQ

$ ls -sh *.gguf
4.5G codeqwen-1_5-7b-chat-q4_k_m.gguf
4.0G internlm2_5-7b-chat-IQ4_XS.gguf
4.4G internlm2_5-7b-chat-q4_k_m.gguf
6.0G internlm2_5-7b-chat-Q6_K.gguf
4.2G Llama-3-Instruct-8B-SPPO-Iter3-IQ4_XS.gguf

I was mostly using InternLM 2.5 before Gemma 2 9B and Llama 3.1 8B were released. I tried three different quantizations of InternLM with the context window set to about 12,000 tokens. They eat up about 6, 6.5, and 7 gigabytes of VRAM on my Radeon 6700 XT, and they seem to have the same speed and quality for my use cases.

InternLM supports a context window as large as one million. The more VRAM I leave free, the larger I can adjust the context window. I haven’t needed more than six or seven thousand tokens of context yet.

I had trouble getting CodeQwen1.5 to load. I recall the errors in the stack trace seeming a little nonspecific, so I originally assumed there was just a compatibility issue somewhere. I dialed back CodeQwen’s massive default context window, and it loaded right up. I was just running out of VRAM!

I have not properly used an LLM to help writing any code. I don’t tend to write much code. I just figured I should try out one of the programming-focused LLMs, and CodeQwen seemed to do a nice job of spitting out the short shell scripts that I asked for.

Stable Diffusion is also awesome!

I feel like I have to say this every time I write any words about machine learning. I have been using Stable Diffusion via the automatic1111 webui since I upgraded to my current GPU. It is so much fun generating silly images to break up the walls of words that are my blogs.

I can queue up a bunch of images with different config scales using a handful of different checkpoints, then I can wander off an make my morning latte. I will have several hundred images waiting for me when I get back, and I usually just pick the ones that make me giggle the most. The more fingers the better!

Why am I messing around with running these things locally at all?!

I’ve already said that it doesn’t make a lick of difference if my blog posts go up into the cloud before they are published, and OpenAI’s API is much faster and costs almost nothing. Why bother with any of this?

I think it is neat, and I am having some fun. I am excited that I know where the limits of local models seem to be, and I now understand how much GPU you need to buy to run something useful.

It is awesome that things are moving so quickly. When I bought my 12 GB GPU just over a year ago, I looked around to see what sort of value a large language model small enough to fit in my VRAM might add. At the time, those small models seemed to do a rather poor job.

A year later, and these small models are starting to feel quite capable! I imagine that things will continue to improve in the future.

Getting Llama 3.1 and Gemma 2 working with my Radeon GPU and oobabooga webui required a bit of a kludge!

At least, I think I am using Llama 3.1 8B. You have to roll back llama.cpp to an older version if you want to use the latest releases of oobabooga with AMD’s ROCm, and that version of llama.cpp doesn’t work with Llama 3.1 or Gemma 2, so I am running Llama 3.1 and Gemma 2 9B via ExLlama V2. I have no idea if I am doing this correctly.

Conclusion

As I continue to explore the capabilities of large language models and local alternatives, it’s clear that these tools have the potential to assist with my creative processes in interesting ways. My hope is that machine learning can take some of the monotonous work off my shoulders.

I believe it is doing that to a small extent, but at the same time it is creating more work for me while at the same time improving my writing. I’m not sure how much my blog posts are improving when I bounce things off of the artificial intelligence monstrosities, but it is interesting enough that I am going to continue to do so just to see where it leads.

This is the part where GPT-4o Mini suggested that I invite you to join the Butter, What?! Discord community, where we share our experiences, insights, and tips on leveraging AI in creative projects. I can’t say that, because it isn’t even true! There is hardly any machine learning nonsense going on in there, but it is a fun place where a lot of homelab, NAS, and 3D printing projects are often being talked about!

Additionally, I’d love to hear your thoughts on this topic! Have you experimented with local models or found innovative ways to integrate LLMs into your work? Leave me a comment below and let’s start a conversation!

The Sovol SV06 Got Much More Interesting In 2024

| Comments

First of all, I am NOT saying that you should buy a Sovol SV06. I wholeheartedly believe that the Bambu A1 Mini is by far the best $200 3D printer, and I really do think everyone should own one. The only measure where the Sovol SV06 beats the Bambu A1 Mini is size.

If you click one of my links and buy a Sovol, I will make a few dollars. If you buy a Bambu printer, I will make absolutely nothing. I still think you should buy a printer from Bambu.

Stable Diffusion 3D printer guy

I DO believe there are good reasons to buy a Sovol SV06, SV06 Plus, or SV08. I also believe that Sovol makes a decent printer, but you have to understand what you’re buying.

Now that we got that out of the way, let’s talk about why the Sovol SV06 recently got more awesome.

The Sovol SV06 now ships with Marlin’s input shaper

Do you already own a Sovol SV06? If you do, then you should definitely head over to Sovol’s website and download the latest firmware and Orcaslicer profiles. Your prints will run two or three times faster.

I updated my Sovol SV06 to support Marlin’s input shaping a little over 12 months ago. I had to dial it in on my own and create my own PrusaSlicer and Orcaslicer profiles. That work resulted in me being able to print a 21-minute Benchy with a 0.6 mm nozzle.

This was my Sovol SV06 going about as fast as Octoprint will allow!

This is neat, but I have to note here that I have seen my Bambu A1 Mini print a Benchy in less than 14 minutes at a higher resolution with its 0.4 mm nozzle. The A1 Mini can print faster than my Sovol SV06. Even so, if I slice a Benchy myself with Bambu’s stock profiles, it won’t print much faster than the 35-minute Benchy that Sovol is advertising for the SV06. The 14-minute Benchy on Bambu’s SD card is super optimized!

You don’t have to do any of the work I did to make my own Sovol SV06 fast. If you buy a Sovol SV06 today, it will ship with officially supported firmware that has the input shaper configured for you. Once you figure out how to get your Sovol up and running, you will be printing almost as fast as a Bambu printer.

If you already own a Sovol SV06, you are one firmware update away from printing almost as fast as a Bambu Lab printer!

The Sovol SV06 is ancient technology

The Sovol SV06 is an open-source clone of the venerable Prusa MK3. It is a solid, reliable, and well proven design, but it is built around technology that was state of the art in 2016.

It is also extremely important to note that while Prusa uses premium components, Sovol seems to be using the cheapest parts they can get away with. This isn’t necessarily a bad thing, but this is one of the reasons why a Prusa MK3S kit was $750 while an almost fully assembled Sovol SV06 costs $199.

Fast benchy on my Sovol SV06

One of my first 30-minute Benchy prints on my Sovol SV06

Almost every single Prusa ships with straight rods, good bearings, and a decent print bed. Some people get a good roll of the dice, and they wind up with a nearly perfect printer from Sovol. A few people will wind up rolling poorly, and they will receive a dud.

There are a lot of little things that a Prusa MK4, Prusa XL, or any printer from Bambu can do that a Sovol SV06 can’t, but the most important upgrade would be their modern bed-leveling systems.

Sovol’s printers are still using an inductive probe to locate the heated bed. These sense the position of the print surface indirectly, and their accuracy varies with temperature. That means they aren’t detecting the PEI surface that plastic has to stick to—they are sensing the metal underneath, which means it is up to you to explain to the printer where the nozzle is in relation to the metal surface.

Bambu Lab printers and the most recent Prusa printers detect when the nozzle makes actual contact with the surface. These printers know exactly where the nozzle is in relation to the PEI sheet. Temperature doesn’t have any impact on this like it does with a PINDA probe, and heat slightly changing the shape of the bed doesn’t matter. These more modern printers know how to get a perfect first layer almost every single time.

0.2mm nozzle benchy

The Bambu A1 Mini prints a very nice tiny Benchy with zero tuning or effort, using the 0.2 mm nozzle!

I only use my Sovol SV06 once or twice a month, and I always wonder if I am going to have to adjust the Z-offset for my next print. I have been doing this for more than a decade. I know how to make this adjustment quite well, but the trouble is that I don’t know if I will need to do it until after the start of my next print fails. This is a waste of my time.

If you are new to 3D printing, then this is the thing you are most likely to mess up. Buying a printer that doesn’t require you to dial in a z-offset may be worth several hours of wasted time.

You get a lot of nice features when you buy a Bambu A1 Mini. This feature in particular could be the difference between immediately seeing a successful print vs. hours of frustration.

Sovol is cutting as many corners as they can get away with

There are people all over r/Sovol and r/3DPrinting complaining about bent linear rods, crunchy bearings, and warped beds on their Sovol printers.

It is challenging to determine which folks are actually having a problem, and which folks are expecting perfection. Nobody is getting a perfectly flat print bed on a Sovol SV06, and you don’t need one. Marlin’s mesh bed leveling can compensate for a bed that is 0.4 mm or so out of flat.

Some people have DEFINITELY received beds that are too bent to be usable, but Reddit is flooded with people who think they need perfection. You don’t need perfection to get a good first layer, and Sovol isn’t likely to ship you perfection.

It isn’t just that Sovol chooses cheap components. Their customer service department is tiny, they are slow to respond to emails or tickets, and they don’t keep people updated about the status of their preorders.

Every time Sovol releases a new printer, you will see one or two people every day on Reddit complaining that Sovol isn’t answering their emails about when their printer is going to ship. If you’re expecting good customer service, then you need to spend more money to get it.

You are getting precisely what you pay for. You just need to understand that you aren’t paying for a premium product, and you aren’t paying for a cadre of customer support people. These are some of the reasons that you are getting a bigger printer from Sovol for less money.

Refurbished Sovol SV06 printers are where things get interesting

The Bambu A1 Mini and Sovol SV06 are both about $200. Just about the only reason to pick the Sovol is the bigger build volume. The Bambu is better in every single other way.

Refurbished Sovol SV06 printers are listed at $149. Now you get a bigger build volume AND you save about $50. Is that a good deal?

I’d still go with the A1 Mini. At the time I am writing this, the Sovol SV06 refurbs don’t seem to ship with the input-shaping firmware installed. That is something you will have to spend time upgrading, and it is something that you can mess up.

But that is just, like, my opinion, man! The hours I save using my Bambu A1 Mini instead of futzing with my Sovol SV06 are easily $50 to me. In fact, those hours are worth way more than $50 to me. The results that come out of my printer enable my other hobbies, so I prefer that my printers just work.

A whole lot of people enjoy tuning and upgrading their 3D printers. This is a valid hobby all on its own, and it makes a ton of sense to buy a Sovol SV06 or Sovol SV08 if that is what you enjoy. I certainly don’t want to deprive you of something that you would consider fun!

A few more complaints about the Sovol SV06

When I bought my refurbished Sovol SV06 in April of 2023, my biggest complaint was how loud these printers are. My Prusa MK3S was almost as quiet as a whisper when printing in stealth mode, while the SV06 has the drone of loud, cheap fans as soon as you flip the power switch, and things only get louder when you start printing.

This is still true today. My Sovol SV06 is louder than my Bambu Lab A1 Mini, but neither printer is quiet. You just can’t print fast and quiet. A 3D printer has to move a lot of air to cool the object you are printing, and the faster you are printing, the faster those fans need to go.

Sovol SV06 cable clips

*My SV06 has inadequate cooling for high speeds toward the front left. These are printed with the same settings, but with the overhang oriented in different directions.

Keep in mind that I have upgraded every fan on my Sovol SV06, and I have dialed in these bigger fans to spin as slowly as I can get away with to keep the noise down. My Sovol SV06 is probably quieter than stock.

A lot of people will tell you that there are upgrades and other things that you HAVE to do to your Sovol SV06. They are exaggerating. You don’t have to do anything. The Sovol SV06 prints just fine with 100% stock hardware.

That said, it is hard to resist the urge to upgrade. Those crappy 4010 fans on the extruder assembly are loud and underpowered. Upgrading the part-cooling fan will improve overhangs and bridges, and it will allow you to print a little faster. Swapping out the power-supply fan will make the printer quieter. Spending two bucks on a knock-off CHT-style nozzle will increase your maximum print speed by up to 35%.

If you are planning on immediately spending $20 or $30 on upgrades and investing hours of your time into getting those things working, it might be worth considering putting another $100 in and just skipping ahead to the even bigger and better 256-mm Bambu A1.

I feel like I have to mention the Sovol SV08

The Sovol SV08 is an impressive machine. It is an open-source clone of the Voron. It’s massive 350x350x340 mm build volume is very nearly as big as the $1,999 Prusa XL, but the Sovol SV08 is currently priced at $579.

If you need a machine as big as the Prusa XL, then the Sovol SV08 seems like a steal!

I am looking at things from the other direction. I rarely need a printer larger than my Bambu Lab A1 Mini, but I would enjoy owning a printer larger than my Sovol SV06. That printer SHOULD be a full-size Bambu A1 with an AMS Lite, but there is a tickle in my brain telling me I should put a Sovol SV08 in place of my Sovol SV06.

I liked the value proposition from this angle a lot better when the Sovol SV08 launched with early-bird pricing of $450 while the Bambu A1 was $399 or $559 with an AMS Lite. The trouble is that the Sovol SV08 has crept up in price, and Bambu sure seems to have made their anniversary sale pricing permanent.

Now it is $579 for the Sovol SV08, $339 for the Bambu A1, or $489 for the Bambu A1 with AMS Lite. That has me leaning back toward the convenience of a Bambu Lab printer with an AMS.

Conclusion

In the world of 3D printing, choosing the right machine can be a daunting task. While the Sovol SV06 does have its merits, especially with the recent firmware upgrades that enhance its performance, it is essential to recognize that the Bambu A1 Mini excels in various ways, including user-friendliness, build quality, print quality, and performance. For those who value ease of use, speed, and price, the A1 Mini is hard to beat, while the Sovol SV06 may appeal to enthusiasts on an even tighter budget who are looking for a larger build volume or the joy of tuning and customizing their own printers. Not all of us find joy in working on our 3D printers, and that is OK!

Ultimately, your choice should align with your printing needs, preferences, and budget. Whether you opt for a Sovol or a Bambu 3D Printer, remember to keep your expectations realistic.

I’d love to hear your thoughts! Have you had experiences with either the Sovol SV06 or the Bambu Lab A1 Mini? It would be awesome if you would share your insights in the comments below. And if you’re passionate about 3D printing or want to connect with other enthusiasts, consider joining our Discord server! It’s a fantastic space to share tips, get advice, and stay updated on the latest in the 3D-printing community.