The Sovol SV06 Got Much More Interesting In 2024

| Comments

First of all, I am NOT saying that you should buy a Sovol SV06. I wholeheartedly believe that the Bambu A1 Mini is by far the best $200 3D printer, and I really do think everyone should own one. The only measure where the Sovol SV06 beats the Bambu A1 Mini is size.

If you click one of my links and buy a Sovol, I will make a few dollars. If you buy a Bambu printer, I will make absolutely nothing. I still think you should buy a printer from Bambu.

Stable Diffusion 3D printer guy

I DO believe there are good reasons to buy a Sovol SV06, SV06 Plus, or SV08. I also believe that Sovol makes a decent printer, but you have to understand what you’re buying.

Now that we got that out of the way, let’s talk about why the Sovol SV06 recently got more awesome.

The Sovol SV06 now ships with Marlin’s input shaper

Do you already own a Sovol SV06? If you do, then you should definitely head over to Sovol’s website and download the latest firmware and Orcaslicer profiles. Your prints will run two or three times faster.

I updated my Sovol SV06 to support Marlin’s input shaping a little more than 12 months ago. I had to dial it in on my own. I had to create my own PrusaSlicer and Orcaslicer profiles. That work resulted in me being able to print a 21-minute Benchy with a 0.6 mm nozzle.

This was my Sovol SV06 going about as fast as Octoprint will allow!

This is neat, but I have to note here that I have seen my Bambu A1 Mini print a Benchy in less than 14 minutes at a higher resolution with its 0.4 mm nozzle. The A1 Mini can print faster than my Sovol SV06. Even so, if I slice a Benchy myself with Bambu’s stock profiles, it won’t print much faster than the 35-minute Benchy that Sovol is advertising for the SV06. The 14-minute Benchy on Bambu’s SD card is super optimized!

You don’t have to do any of the work I did to make my own Sovol SV06 fast. If you buy a Sovol SV06 today, it will ship with officially supported firmware that has the input shaper configured for you. Once you figure out how to get your Sovol up and running, you will be printing almost as fast as a Bambu printer.

If you already own a Sovol SV06, you are one firmware update away from printing almost as fast as a Bambu Lab printer!

The Sovol SV06 is ancient technology

The Sovol SV06 is an open-source clone of the venerable Prusa MK3. It is a solid, reliable, and well proven design, but it is build around technology that was state of the art in 2016.

It is also extremely important to note that while Prusa uses premium components, Sovol seems to be using the cheapest parts they can get away with. This isn’t necessarily a bad thing. This is one of the reasons why a Prusa MK3S kit was $750 while an almost fully assembled Sovol SV06 costs $199.

Fast benchy on my Sovol SV06

One of my first 30-minute Benchy prints on my Sovol SV06

Almost every single Prusa ships with straight rods, good bearings, and a decent print bed. Some people get a good roll of the dice, and they wind up with a nearly perfect printer from Sovol. A few people will wind up rolling poorly, and they will receive a dud.

There are a lot of little things that a Prusa MK4, Prusa XL, or any printed from Bambu can do that a Sovol SV06 can’t, but the most important upgrade would be their modern bed-leveling systems.

Sovol’ printers are still using an inductive probe to locate the heated bed. These sense the position of the print surface indirectly, and their accuracy varies with temperature. That means they aren’t detecting the PEI surface that plastic has to stick to. They are sensing the metal underneath. That means it is up to you to explain to the printer where the nozzle is in relation to the metal surface.

Bambu Lab printers and the most recent Prusa printers detect when the nozzle makes actual contact with the surface. These printers know exactly where the nozzle is in relation to the PEI sheet. Temperature doesn’t have any impact on this like it does with a PINDA probe, and heat slightly changing the shape of the bed doesn’t matter. These more modern printers know how to get a perfect first layer almost every single time.

0.2mm nozzle benchy

The Bambu A1 Mini prints a very nice tiny Benchy with zero tuning or effort using the 0.2 mm nozzle!

I only use my Sovol SV06 once or twice a month, and I always wonder if I am going to have to adjust the Z-offset for my next print. I have been doing this for more than a decade. I know how to make this adjustment quite well. The trouble is that I don’t know if I will need to do it until after the start of my next print fails. This is a waste of my time.

If you are new to 3D printing, then this is the thing you are most likely to mess up. Buying a printer that doesn’t require you to dial in a z-offset may be worth several hours of wasted time.

You get a lot of nice features when you buy a Bambu A1 Mini. This feature in particular could be the difference between immediately seeing a successful print vs. hours of frustration.

Sovol is cutting as many corners as they can get away with

There are people all over r/Sovol and r/3DPrinting complaining about bent linear rods, crunchy bearings, and warped beds on their Sovol printers.

It is challenging tell which folks are actually having a problem, and which folks are expecting perfection. Nobody is getting a perfectly flat print bed on a Sovol SV06, and you don’t need one. Marlin’s mesh bed leveling can compensate for a bed that is 0.4 mm or so out of flat.

Some people have DEFINITELY received beds that are too bent to be usable, but Reddit is flooded with people who think they need perfection. You don’t need perfection to get a good first layer, and Sovol isn’t likely to ship you perfection.

It isn’t just components that Sovol choose cheap components. Their customer service department is tiny, they are slow to respond to emails or tickets, and they don’t keep people updated about the status of their preorders.

Every time Sovol releases a new printer you will see one or two people every day on Reddit complaining that Sovol isn’t answering their emails about when their printer is going to ship. If you’re expecting good customer service, then you need to spend more money to get it.

You are getting precisely what you pay for. You just need to understand that you aren’t paying for a premium product, and you aren’t paying for a cadre of customer support people. These are some of the reasons that you are getting a bigger printer from Sovol for less money.

Refurbished Sovol SV06 printers are where things get interesting

The Bambu A1 Mini and Sovol SV06 are both about $200. Just about the only reason to pick the Sovol is the bigger build volume. The Bambu is better in every single other way.

Refurbished Sovol SV06 printers are listed at $149. Now you get a bigger build volume AND you save about $50. Is that a good deal?

I’d still go with the A1 Mini. At the time I am writing this the Sovol SV06 refurbs don’t seem to ship with the input-shaping firmware installed. That is something you will have to spend time upgrading, and it is something that you can mess up.

But that is just, like, my opinion, man! The hours I save using my Bambu A1 Mini instead of futzing with my Sovol SV06 are easily $50 to me. In fact, those hours are worth way more than $50 to me. The results that come out of my printer enable my other hobbies, so I prefer that my printers just work.

A whole lot of people enjoy tuning and upgrading their 3D printers. This is a valid hobby all on its own, and it makes a ton of sense to buy a Sovol SV06 or Sovol SV08 if that is what you enjoy. I certainly don’t want to deprive you of something that you would consider fun!

A few more complains about the Sovol SV06

When I bought my refurbished Sovol SV06 in April of 2023, my biggest complaint was how loud these printers are. My Prusa MK3S was almost as quiet as a whisper when printing in stealth mode, while the SV06 has the drone of loud, cheap fans as soon as you flip the power switch, and things only get louder when you start printing.

This is still true today. My Sovol SV06 is louder than my Bambu Lab A1 Mini, but neither printer is quiet. You just can’t print fast and quiet. A 3D printer has to move a lot of air to cool the object you are printing, and the faster you are printing, the faster those fans need to go.

Sovol SV06 cable clips

*My SV06 has inadequate cooling for high speeds towards the front left. These are printed with the same settings, but with the overhang oriented in different directions.

Keep in mind that I have upgraded every fan on my Sovol SV06, and I have dialed in these bigger fans to spin as slowly as I can get away with to keep the noise down. My Sovol SV06 is probably quieter than stock.

A lot of people will tell you that there are upgrades and other things that you HAVE to do to your Sovol SV06. They are exaggerating. You don’t have to do anything. The Sovol SV06 prints just fine with 100% stock hardware.

That said, it is hard to resist the urge to upgrade. Those crappy 4010 fans on the extruder assembly are loud and underpowered. Upgrading the part-cooling fan will improve overhangs and bridges, and it will allow you to print a little faster. Swapping out the power-supply fan will make the printer quieter. Spending two bucks on a knock-off CHT-style nozzle will increase your maximum print speed by up to 35%.

If you are planning on immediately spending $20 or $30 on upgrades and investing hours of your time into getting those things working, it might be worth considering putting another $100 in an just skipping ahead to the even bigger and better 256-mm Bambu A1.

I feel like I have to mention the Sovol SV08

The Sovol SV08 is an impressive machine. It is an open-source clone of the Voron. It’s massive 350x350x340 mm build volume is very nearly as big as the $1,999 Prusa XL, but the Sovol SV08 is currently priced at $579.

If you need a machine as big as the Prusa XL, then the Sovol SV08 seems like a steal!

I am looking at things from the other side the other side. I rarely need a printer larger than my Bambu Lab A1 Mini, but I would enjoy owning a printer larger than my Sovol SV06. That printer SHOULD be a full-size Bambu A1 with an AMS Lite, but there is a tickle in my brain telling me I should put a Sovol SV08 in place of my Sovol SV06.

I liked the value proposition from this angle a lot better when the Sovol SV08 launched with early-bird pricing of $450 while the Bambu A1 was $399 or $559 with an AMS Lite. The trouble is that the Sovol SV08 has crept up in price, and Bambu sure seems to have made their anniversary sale pricing permanent.

Now it is $579 for the Sovol SV08, $339 for the Bambu A1, or $489 for the Bambu A1 with AMS Lite. That has me leaning back towards the convenience of a Bambu Lab printer with an AMS.

Conclusion

In the world of 3D printing, choosing the right machine can be a daunting task. While the Sovol SV06 does have its merits, especially with the recent firmware upgrades that enhance its performance, it is essential to recognize that the Bambu A1 Mini excels in various ways, including user-friendliness, build quality, print quality, and performance. For those who value ease of use, speed, and price, the A1 Mini is hard to beat, while the Sovol SV06 may appeal to enthusiasts on an even tighter budget who are looking for a larger build volume or the joy of tuning and customizing their own printers. Not all of us find joy in working on our 3D printers, and that is OK!

Ultimately, your choice should align with your printing needs, preferences, and budget. Whether you opt for a Sovol or a Bambu 3D Printer, remember to keep your expectations realistic.

I’d love to hear your thoughts! Have you had experiences with either the Sovol SV06 or the Bambu Lab A1 Mini? It would be awesome if you would share your insights in the comments below. And if you’re passionate about 3D printing or want to connect with other enthusiasts, consider joining our Discord server! It’s a fantastic space to share tips, get advice, and stay updated on the latest in the 3D printing community.

Everyone Struggles to Identify Bottlenecks

| Comments

Identifying bottlenecks is a skill that many of us struggle with. Learning to pinpoint bottlenecks is important, and learning which bottlenecks to ignore can save you a lot of money.

In this blog post, we’ll explore the concept of bottlenecks and how they can impact your computing setup. We’ll discuss common types of bottlenecks, such as CPU, GPU, network, and IOPS, and provide tips and strategies for identifying and addressing them. Whether you’re a seasoned homelab enthusiast or just starting out, understanding bottlenecks is crucial for optimizing your system’s performance and getting the most out of your hardware investments.

What is a bottleneck?

You have seen a bottle. When you pop the top and tip that bottle over, the liquid won’t instantly exit the container. Its exit is slowed by the narrow neck of the bottle. The wider the neck, the faster the liquid can flow.

Stable Diffusion Bottleneck Guy

You see bottlenecks everywhere. The door to your house will only let one person through at a time. The escalator at the mall only has room for two people on each step. The ramp onto the 4-lane highway is one lane wide. Your faucet is only an inch wide, so it takes a long time to fill a pot of water.

Every IT department faces similar constrictions at every level of their operation.

You will always have a bottleneck, and that is OK!

This section almost came in right before the conclusion, but I think it is a better idea to discuss this closer to the top. No system will be perfectly balanced. There will always be a bottleneck somewhere. Every time you eliminate one bottleneck, that will just move the bottleneck elsewhere.

What matters most is that your system is designed in such a way that your bottleneck is acceptable. Performance may be the primary driver behind your design, but cost is usually a significant factor as well.

10-gigabit Ethernet has gotten pretty cheap, while 40-gigabit Ethernet hardware is still extremely expensive. Just because your 10-gigabit network is slower than the disks in your NAS doesn’t mean that the bottleneck is a problem. Would upgrading to 40-gigabit Ethernet ACTUALLY improve your workflow? Four times faster than fast enough is also fast enough.

You may learn that plain old gigabit Ethernet will do the job well enough, and that 2.5-gigabit Ethernet won’t cost you much more.

Your CPU might be bottlenecking your GPU!

PC gamers seem to love using the word “bottleneck,” and they really seem to enjoy using it as a verb. I dislike the verbing of the word bottleneck because it seems like a significant percentage of the people using “bottleneck” as a verb aren’t using the term correctly.

There are real bottlenecks in a gaming PC. The CPU, the GPU, and your monitor each have to do work for every frame that is displayed. Every game needs a different balance, but whichever component is maxed out first is your bottleneck.

If your GPU is near 100% utilization and your FPS isn’t keeping up with your monitor, then your GPU is a bottleneck.

If your GPU is significantly below 100% utilization and your FPS isn’t keeping up with your monitor, that means your CPU is the bottleneck.

When your CPU and GPU are both barely breaking a sweat while running your game at your monitor’s maximum refresh rate, then it might be time for a monitor upgrade! Your 60 Hz 1080p monitor might be a bottleneck, because your other components have room to render at a higher resolution or more frames per second. It might be time for a nice 34” 3440x1440 165 Hz upgrade!

I didn’t have a terribly appropriate photo to use in the section, so I dropped in a test clip of Severed Steel running on my Ryzen 5700X with a Radeon 6700 XT. You can see that the game can’t quite manage to maintain a constant 144 frames per second, and my GPU is at 97% utilization and max wattage, so GPU is holding me back from keeping up with my monitor!

The network is always the bottleneck of your NAS

I hate to use the word “always.”“ It always feels like a lie. As long as your NAS is ONLY being used as a NAS, then this is almost definitely correct. Sharing files tends to be light-duty work.

A 10-gigabit Ethernet connection can move data at about 1 gigabyte per second. That sounds fast, but it really isn’t! Three 20-terabyte hard disks can nearly max that out. A pair of SATA SSDs will be a little faster, and any NVMe you can buy will be a lot faster.

I paid less than $100 for a low-power motherboard and CPU combo ten years ago, and it could share files over CIFS or NFS as fast as the PCIe slot could go via my 20-gigabit Infiniband network. The available PCIe slot was my bottleneck then because it maxed out at around 8 gigabits per second, so I was definitely underutilizing my network at the time!

Stable Diffusion Bottleneck Guy 2

This is what encouraged me to start writing this post. There was a tiny Celeron N100 server posted on r/homelab recently that had slots for four NVMe drives. It is the more compact sibling of my own Celeron N100 homelab box with 5 NVMe slots.

So many comments on that Reddit post were complaining that each NVMe slot only has a single 1x PCIe 3.0 connection. These folks are all bad at finding bottlenecks! That little server only has a pair of 2.5-gigabit Ethernet ports, so any single NVMe installed in that miniature server will be twice as fast as all the network ports combined.

What if your NAS isn’t just a NAS?

I wholeheartedly believe that you cram as much stuff onto your homelab server hardware as possible. Serving files over CIFS or NFS might occasionally max out your network port, but it usually leaves you with a ton of CPU and disk I/O left to burn. You might as well put it to good use!

Stable Diffusion stack of hard disks

Running Jellyfin or Plex in a virtual machine or container on a NAS is quite common. I am fortunate enough that my Jellyfin server rarely has to transcode any video, because most of my Jellyfin clients can directly play back even the 2160p 10-bit files in my meager collection.

I do have one device that requires transcoding. My Jellyfin server can transcode 2160p 10-bit video at 77 frames per second, and it can transcode 1080p 8-bit video at well over 200 frames per second. That means the GPU on my little Celeron N100 is the bottleneck when transcoding video, and I will be in trouble if I need to feed more than three devices a 10-bit 2160p video at the time.

That bottleneck is so wide that I will never need to worry about it, and the transcoding runs a little faster when two or more videos are going simultaneously, so I wouldn’t be surprised if I could squeeze in a fourth playback session!

Sometimes, the bottleneck is IOPS and not throughput

Old mechanical hard disks have reasonably fast sequential throughput speeds. A 12-terabyte hard disk will be able to push at least 100 megabytes per second on the slow inside tracks, and a 22-terabyte has enough platter density to push nearly 300 megabytes per second on the fast outer tracks.

Every 7200 RPM hard disk has a worst-case seek time of 8 milliseconds. That works out to only 120 I/O operations per second (IOPS). A single hard disk has enough throughput to easily stream a dozen 4K Blu-Ray movies, but it might only be able to insert 120 records per second into your PostgreSQL database.

The cheapest SATA SSD can handle tens of thousands of IOPS, while the fastest NVMe drives are starting to reach 1,000,000 IOPS. These drives are fast when streaming a Blu-Ray, and they don’t slow down when you start updating random people’s phone numbers in you 750 gigabyte customer database.

The vast majority of people adding a NAS to their home network are storing video files from the high seas, or they are storing backups. If you fit either of these descriptions, then you probably only need inexpensive 3.5” hard disks.

My personal video storage is mostly footage taken straight off my own cameras, and I work with that footage in DaVinci Resolve. I layered a few hundred gigabytes of lvmcache on top of slow video storage,because 8-millisecond seek times add up to noticeable lag when you are bouncing around a timeline that references three or four videos.

One seek to get to the correct video frame, at least one more seek to backtrack to the previous keyframe, then maybe a third seek to pull in enough video to start rendering—that adds up to around 100 milliseconds on a mechanical hard disk before the GPU even gets to start decoding and rendering the video, while it would take less than one millisecond on any solid-state storage device. That is a difference you can feel!

Caching to an SSD is a great way to smooth out some of the rough edges. The SSD can catch thousands of those database updates and flush them back to the slow disk later on. My SSD cache is big enough to hold one or two projects’ worth of video files, so it is usually only holding on to the data that I need to work with this week.

Conclusion

In summary, understanding and addressing bottlenecks is crucial for optimizing the performance of your NAS and homelab setup. Identifying which component is constraining your system can make a world of difference, and recognizing these limitations can help you make informed decisions about upgrades and configurations, or even whether or not you should worry about upgrading anything at all!

It is your turn to contribute to the conversation! Share your insights, experiences, or questions related to this topic in the comments below. Have you encountered any unexpected bottlenecks in your own setup? How did you overcome them? Was the upgrade to reduce your bottleneck worth the expense? Let’s learn from each other and continue to refine our systems.

If you’re interested in connecting with a community of homelab, NAS, and even gaming enthusiasts, consider joining the *Butter, What?! Discord community. You can engage in discussions, share knowledge, and stay up to date on the latest trends, developments, and deals in the world of homelabbing.

Is Machine Learning Finally Practical With An AMD Radeon GPU In 2024?

| Comments

I don’t know for certain that I have enough to say here to justify writing an entire blog post, but let’s see what happens!

It has been a little over a year since I upgraded this computer from an ancient Nvidia GTX 970 to an AMD Radeon RX 6700 XT. I really needed that upgrade, but I was stubborn, and I didn’t want to pay those inflated crypto-mining GPU prices, so I hung on to that GTX 970 for way longer than I wanted to.

Stable Diffusion Nervous LLM Kid

I think I did a good job. Fifteen months later, and I have only seen my new GPU go on sale for at most $120 less than what I paid. I am happy to have paid less than $10 per month for the privilege of being able to have so much more fun playing games, so I think the timing of my upgrade was pretty decent!

I am not some sort of software developer trying to teach an LLM to learn how to read and do other stuff good too. I am just a guy hoping to use existing models in some sort of useful way.

One of the first things I learned immediately after installing my new GPU was that running AI models with an AMD GPU on Linux was a crapshoot.

At the time of my upgrade, getting Stable Diffusion to run with an Nvidia GPU was as easy as running one command and waiting for things to download. Getting it running on my Radeon took several attempts, and I felt like I was lucky to get it working at all. Every time I wanted to update my Stable Diffusion install, it was almost certain that something would break.

Getting Llama up and running seemed like it would be nearly impossible, but things are much improved today!

I had Oobabooga’s text-generation-webui up and running in thirty minutes

Since you are reading this, you can probably do it in less than half the time.

The first problem I had was picking out a model or two to download. I don’t know if I will find something better, but so far I have been pleased with MistralMakise-Merged-13B. It seems reasonably capable, and it fits well in my 12 GB of VRAM.

NOTE: So far, I am happier with DiscoPOP-zephyr-7b-gemma, and I am using it to help me put the finishing touches on this blog post before I send it to my editor for review.

My second problem was easily solved by punching some error messages into Google, but it took a few clicks before I found the solution. It is mentioned in their documentation under the AMD heading, but that section is way down near the bottom, and I managed to miss it.

1
2
3
4
5
6
# I had to uncomment those two lines in one_click.py

# Remove the '# ' from the following lines as needed for your AMD GPU on Linux
# os.environ["ROCM_PATH"] = '/opt/rocm'
os.environ["HSA_OVERRIDE_GFX_VERSION"] = '10.3.0'
os.environ["HCC_AMDGPU_TARGET"] = 'gfx1030'

If you have a functioning Nvidia GPU, CUDA will just work. If you have a working AMD GPU, things are a little more complicated. When you have the RIGHT Radeon GPU with ROCm correctly installed, Oobabooga’s text-generation-webui will also probably just work.

When you have a different Radeon, you have to give pytorch some hints as to which ROCm bits to actually use. This is a pain point, but if this is the only real problem we’re going to be running into today, then things are already infinitely better than they were a year ago!

Installing ROCm and OpenCL might be a pain!

The documentation says that I need ROCm 5.6 to use text-generaton-webui, but I already have ROCm 5.4.6 installed. That is the version that seems to work well with DaVinci Resolve Studio 19, my RX 6700 XT, and Mesa 24. It seems to be working just fine for text-generation-webui as well!

I would love to tell you the correct way to install ROCm and OpenCL, but I always goof something up, and I wind up jumping through hoops to fix it. That means I don’t REALLY know how to install these things. I know how to goof it up, correct at least some of my mistakes, then wind up with a working installation. I am not even confident that doing things in what seems to be the correct way would even get me to the correct destination!

The newer versions of ROCm can’t easily exist alongside the bleeding-edge versions of Mesa. If you install ROCm 5.6 or newer you can expect to not be able to play games or run DaVinci Resolve. At least, that was the case when I set things up last. This should be a problem that will eventually straighten itself out.

I don’t think this is currently any better or worse than it was a year ago. This is something AMD really, really, really needs to do better. Really.

Should you bother running an LLM on your local machine?

I am sure there are some fantastic reasons to avoid using the ChatGPT API. I do not enjoy the idea of sending all my words up to some company’s servers, but all the words I write are published to one of my blogs, so that doesn’t really matter for me.

The ChatGPT API is quite inexpensive. It didn’t even cost me an entire dime when I was messing around with sending every paragraph of my blog posts up to ChatGPT for rewrites and suggestions. That was with GPT-3.5-Turbo.

Stable Diffusion Man With Tiny Robot

GPT-4o is still inexpensive, but I could easily get up into dollars instead of cents. One of the problems is that GPT-4 offers a much bigger context window, so I can send entire blog posts up as context. Even though GPT-4o is still inexpensive per token, it encourages me to send up 1,500 tokens of context with each query.

OpenAI’s API is FAST. I forgot just how fast it was until I decided to ask GPT-4o and my local Mistral LLM to rephrase the same paragraphs. I ran the paragraph through ChatGPT first because I have some shortcuts to make that happen in Emacs, and I was surprised that it was able to ingest my context and give me a full answer almost instantly. The local LLM on my $300 GPU took a noticeable number of seconds to give me a response.

OpenAI’s API isn’t just ridiculously fast—it also gives significantly better responses than my tiny GPU ever could. I can’t even run an equivalent to GPT-3.5-Turbo locally, and GPT-4 and GPT-4o are orders of magnitude bigger than that.

Speed doesn’t matter if you don’t want to send your private information to a third party.

Sometimes, quantity beats quality!

The game shifts a little when you can run something locally and do not have to pay for every single transaction.

My GPU can consume an additional 175 watts when running at full tilt. It would take something like four hours of me constantly interacting with a local LLM to add 10 cents to my electric bill, and I certainly can’t ask her enough questions to keep her running without lots of breaks. My cost to keep the LLM running and answering all my questions is effectively zero.

I absolutely love being able to run Stable Diffusion locally. I can try out a handful of weird prompts to find something that makes me giggle. Then I can ask Stable Diffusion to generate a eight images at two different guidance scales using six different checkpoints. It will grind away for ten to fifteen minutes while I make a latte, and I will have 96 images to evaluate when I sit down. Usually one will be goofy enough to break up a wall of words in a blog post.

I can’t max out my GPU with an LLM for long, but asking Stable Diffusion to generate 96 images will keep my GPU maxed out for ten minutes. That means I can generate more than 2,000 images for a dime.

I can see myself doing something similar for my blog-writing workflow in Emacs. Right now, I just send a paragraph or two to GPT-4o when I can’t find a synonym I like, can’t decide how to start the next paragraph, or just don’t like the flow of a sentence. OpenAI’s API is almost always just a lazy thesaurus for me. ChatGPT’s writing feels either way too pretentious or corporate for my taste, but it does often inspire me to reorder sentence fragments into something that reads more pleasant.

Stable Diffusion Man with Robot

When the LLM doesn’t cost me anything to run, why not throw everything into that blender to see what comes out? I could write some Emacs Lisp that will send every paragraph to the OobaBooga interface as soon as I hit the Enter key. I’ve already tried connecting my Emacs automations to my local LLM’s API, and it works great even if it feels so much slower than GTP-4o!

Maybe it could show me the rephrased paragraph next to the window I am typing in. Maybe I could puzzle out a prompt that would coax the robot into only speaking up if its rewrite or suggestion seems like it would be helpful to me. Perhaps I could send it the last two or three paragraphs and give it a chance to write the next one?

I think this sort of thing would have to be done one paragraph at a time, or at least be limited to a few paragraphs. When I asked ChatGPT to turn six columns of a Google Sheet into a Markdown table, it gave me back the results in a few seconds. It LOOKED like it was typing the results slowly, but I was able to hit the copy code button right away, and the entire table was available.

It took my local Mistral robot 90 seconds to give me the complete table of mini PC prices and performance. The latency would be too high if my local artificial brain works with too much text at once!

Not every employee needs to be Albert Einstein

My little Radeon 6700 XT with 12 GB of VRAM will never run an LLM that can compete with what can be run on even a single AMD MI300X with 192 GB of RAM, and it certainly can’t compete with a server full of those cards!

That is OK. I don’t need to hire Albert Einstein to figure out how fast my FPV drone falls when I dive down the side of a building. A high school student should be equipped to handle that task, just like my little Mistral 7B LLM can give me a handful of synonyms.

I don’t need to hire Douglas Adams to fix up my words, even if I wish I still could!

Let’s get back on topic

We are supposed to be talking about how much easier it is now to run machine learning stuff on a Radeon GPU. I feel like automatic1111’s stable-diffusion-webui and oobabooga’s text-generation-webui cover something like 90% of the machine learning tasks we might want to do at home. These are both reasonably easy to get going with ROCm.

The other popular machine learning project is the Whisper speech-to-text engine. There is a webui for this, but it doesn’t seem to make it simple to get going with a Radeon GPU. Even so, I am not certain that a webui would be the right place to use Whisper.

Whisper feels like it needs to be built into something else. I want it to transcribe my video footage and turn the text into subtitles. I want to automatically transcribe any audio files that land in a particular directory. I don’t want to be doing any of this manually.

DaVinci Resolve Studio has a fantastic speech-to-text workflow. You can delete words from the transcription, and Resolve will cut it right out of the video timeline. How cool is that?!

I very nearly had to delete this entire blog post!

The version 1.8 release of text-generation-webui showed up in my feeds right in the middle of writing the previous section. I did the thing that any competent professional writer might do, and I upgraded to the latest release!

My GPU acceleration immediately stopped working. That took my performance down from between about 12 to 25 tokens per second to an abysmal 2 to 5 tokens per second.

Stable Diffusion man with GPU

Someone already filed a bug report. I decided to put this blog on hold, and I figured I could check back in a few days. The stickied AMD bug thread had a fix that worked. I had to edit the requirements_amd.txt file to replace one of the packages with an older version.

There were two lines with two slightly different versions. I assume that they weren’t supposed to be there, so I deleted both before pasting in the URL from the comment.

Conclusion

I was eager to write this blog. It was exciting to no longer feel like a second-class citizen in the world of machine learning with my budget-friendly AMD Radeon GPU. Then I found out that no one attempted to run the text-generation-webui with a ROCm GPU in the two days between that dependency being updated and the release of version 1.8, and my citizenship level was once again demoted.

Is that the end of the world? Definitely not. Finding and applying a fix wasn’t a challenge, but even so, everything would have just worked if I had bought an Nvidia GPU, and everything would have just worked for the entirety of this past year. My 6700 XT is comparable in gaming performance and price to an RTX 4060 Ti, but I have 50% more VRAM. In theory, I should have as much or more machine-learning performance as well, except that there is so much less optimization work happening outside of the world of Nvidia’s CUDA.

What do you think? Are you running automatic1111’s stable-diffusion-webui or OobaBooga’s text-generation-webui on an AMD GPU? How have things been working out for you? Do you think it is worth the extra effort and problems to be able to own a Radeon GPU with 24 GB of VRAM for less than half the price of an Nvidia RTX 4090? Let me know in the comments, or stop by the Butter, What?! Discord server to chat with me about it!

Do All Mini PCs For Your Homelab Have The Same Bang For The Buck?

| Comments

I am writing way too many blogs about mini PCs lately, so I definitely wasn’t planning on publishing this. As I was staring at the graph, I just thought that I had to write about what I was seeing.

The simplified answer to the question in the title is yes. There are outliers that are a better deal. All the common mini PC models from the lowly $129 N5095 up to one with an 8845HS for $649 has similar enough value per unit of CPU horsepower or unit of RAM, though they do drift towards being a little more expensive as you climb the ladder. Even so, you do get other upgrades along the way like more NVMe slots, faster network interfaces, and better integrated GPUs.

Trigkey N100 mini PC and Trudy the cat

There is some variation. The math doesn’t work out perfectly, and the models don’t wind up being fractions that divide evenly with each other. You can’t literally say that one Ryzen 5560U Beelink box is exactly equivalent to two Celeron N100 Trigkey boxes, but it is quite close.

I am not even going to try to sneak an extra subheading in here before giving you the table.

CPU RAM Price Geekbench Geekbench
per $100
RAM per $100
n5095 16 GB $129.00 2100 1628 * 12.40 GB
n5095 32 GB $187.00 2100 1123 17.11 GB *
n100 1gbe 16 GB $134.00 2853 2129 * 11.94 GB
n100 1gbe 32 GB $192.00 2853 1486 16.67 GB *
n100 2.5gbe 16 GB $239.00 2786 1166 * 6.69 GB
n100 2.5gbe 32 GB $329.00 2786 847 9.73 GB
n100 2.5gbe 48 GB $380.00 2786 847 12.63 GB *
5560u 16 GB $206.00 6200 3010 * 7.77 GB
5560u 64 GB $321.00 6200 1931 19.94 GB **
5700u 16 GB $233.00 7200 3090 ** 6.87 GB
5700u 64 GB $348.00 7200 2069 18.39 GB *
6900hx 16 GB $439.00 10200 2323 * 3.64 GB
6900hx 64 GB $614.00 10200 1661 10.42 GB
6900hx 96 GB $721.00 10200 1415 13.31 GB *
7840hs 32 GB $550.00 12000 2182 * 5.82 GB
7840hs 64 GB $725.00 12000 1655 8.83 GB
7840hs 96 GB $832.00 12000 1442 11.54 GB *
8845hs 32 GB $649.00 12000 1849 * 4.93 GB
8845hs 64 GB $824.00 12000 1456 7.77 GB
8845hs 96 GB* $931.00 12000 1289 10.31 GB *

NOTE: Heavily rounded Geekbench 5 scores are guesstimated from searching Geekbench’s database. Precise numbers are from my own personal tests.

These mini PCs are discounted almost every single week, so the prices listed are the lowest usual sale prices that I have seen. Don’t buy at full price, but you may have to wait a while to see a sale as low as what I have listed in the table!

The lines with an asterisk next to the RAM are priced with the biggest RAM kit that they be capable of using. You can’t buy most of these without preinstalled RAM, so you will have a spare stick or two of RAM if you upgrade your own mini PC.

I have been keeping this data in a messy Google Sheet. If you want to be able to sort or do your own math, you should definitely to check it out!

I was surprised that 48 GB DDR5 SO-DIMMs are a good deal!

First of all, you can still save more than a few dollars on RAM if you choose an older mini PC that uses DDR4 RAM. DDR4 is still abundant, so the prices are still slowly going down.

You can only get DDR4 SO-DIMMs in 8 GB, 16 GB, or 32 GB sizes, but DDR5 SO-DIMMs are also available in 12 GB, 24 GB, and 48 GB sizes! That means you can squeeze 48 GB of RAM in a mini PC with only a single SO-DIMM slot or 96 GB in a mini PC with two SO-DIMM slots. You will have to pay a small premium per gigabyte compared to 32 GB SO-DIMMs.

Old sticks of DDR4 2666

My brain immediately said, “That can’t be a good value!” but I was wrong. If your goal is to pack as much RAM into your homelab cluster as possible, then it is very much worth paying that premium.

NOTE: I haven’t personally tested 48 GB SO-DIMMs in any of these mini PCs. We have friends on our Discord server who have used 48 GB DDR5 SO-DIMMs with the Topton N100 motherboard and N100 mini PCs. Your mileage may vary, but I expect these will work in the higher end Ryzen mini PCs as well. To play it safe, I would make sure you buy your RAM from a store with a good return policy!

There are some outliers on that table that just don’t fit in!

It isn’t too surprising that both the oldest and newest CPUs on the list are the worst value. The N5095 is not only the weakest CPU in the table, but it has the worst performance per dollar. Even so, it is one of the best deals around if you want lots of RAM in your homelab!

The Ryzen 8845HS and 7840HS are very nearly the exact same piece of hardware. The newer CPU has a stronger NPU, but they are identical in every other way. You’re paying a premium for the absolute newest CPU here, and you are paying a premium with either for DDR5 RAM.

Both models of the Beelink SER5 lean in the opposite direction. Their Ryzen 5560U and 5700U processors are getting pretty long in the tooth, so they are priced really well for their performance capability. Not only that, but they require DDR4 SO-DIMMs, so they also have the advantage of using less costly RAM.

Sometimes you get six of one, half a dozen of the other

Half of the table has RAM prices in the 6, 9, or 12 gigabytes of RAM per $100 range. Each mini PC starts out expensive with the stock RAM, and the pricing gets better as you upgrade. The value for RAM gets slightly worse the higher you go up the list, but Geekbench scores also tend to go up.

The big, messy spreadsheet has a column with the Geekbench score per gigabyte of RAM. That might be a good number to look at if you are working hard to minmax your mini PC cluster, but I can save you some time. The winners here again are the Beelink SER5 with a Ryzen 5560U or Ryzen 5700U.

The Ryzen 5700U being the value champ seems to be the theme of this blog, but I think it is important to note that almost everywhere on this list you are trading RAM for CPU chooch.

Prices go up at the high end, but you do get something for your money!

I like simplifying everything down to RAM or benchmark scores per dollar. I want my virtual machines to only need to be assigned RAM and CPU to go there job, and I don’t want them to worry about where they are running. I prefer to avoid passing in PCIe or USB devices unless I absolutely have to. Sometimes you need more than just RAM and CPU to get things done.

Stable Diffusion Mini PC Guy 1

The Beelink SER6 MAX starts to drift into the more expensive territory if you are simply breaking it down by RAM per dollar, but it adds some useful upgrades. It has a 2.5 gigabit Ethernet port, a pair of m.2 slots for NVMe storage, and a 40-gigabit USB4 port. I don’t currently need this, but these upgrades definitely add value, and these could be extremely useful for someone.

The Beelink SER7 and SER8 both have the 2.5 gigabit Ethernet port, and they both have dual 40-gigabit USB4 ports, but they lack the second NVMe slot.

On the lower end, there are mini PCs like the CWWK N100 mini PC router in my own homelab. It has four 2.5 gigabit Ethernet ports and FIVE NVMe slots. It costs more than a Beelink or Trigkey mini PC with the same CPU, but if you need more than one network port or NVMe, then it might be worth the extra cost.

As far as I am concerned, RAM is the king of virtualization

If you’re going to run Proxmox in your homelab, and you don’t know what to buy, then you should put RAM at the top of your list.

You can over provision disk images. Things will work. Things will run. Things may run forever, but you may run out of disk space some day. That is something that can be fixed.

You can over provision your CPU, and you definitely should! The odds of all your virtual machines needing to do hard work at the same time ought to be low. Even if they aren’t, the worst thing that happens if you don’t have enough CPU is that your tasks will run slower. The important thing is that they will run even if they run slowly.

Pat's Proxmox Machines

From a purely technical standpoint, you can over provision RAM. There are mechanisms that let your machines release RAM to the host to be used by other machines when they need it, but if you push this too far, processes or virtual machines will start to be killed.

Memory is a resource that even idle virtual machines need. A gigantic PostgreSQL eats up a lot of RAM whether you are asking it run queries or not. Your dozens of Nginx processes are waiting for hours, days, or weeks for a connection may not be using any CPU, but they are sitting somewhere on a stick of RAM ready to respond.

Unallocated RAM on a Proxmox server is just RAM waiting to be allocated to a new virtual machine tomorrow. It is handy to have some spare!

How big of a mini PC do I need?

This blog is making me feel like a liar, because a few weeks ago I wrote an entire blog saying that it was cheaper to buy a second N100 mini PC than to upgrade the RAM in my existing N100 mini PC. For my own purposes, it was a better value, but I am beginning to think I didn’t do the math correctly with a 48 GB DDR5 SO-DIMM!

The variety and combination of mini PCs you buy for your homelab will depend on your needs. My whole world fits on a single Intel N100 with 16 GB of RAM, and adding a second similar machine was a luxury. Adding a second server has also been fun. That’s a good enough excuse to do pretty much anything!

My biggest virtual machine is using 3 GB of RAM, and I could tighten that up quite a bit if I had to. Most of my virtual machines and LXC containers are using 1 GB or less. These are small enough that I could easily juggle them around on a Proxmox cluster where each physical server only has 8 GB of RAM.

Stable Diffusion Network Control Center

Your setup might be different. You might have a handful of Windows virtual machine that really need at least 8 GB of RAM to spread out into. If that’s the case, you’d have trouble with 16 GB mini PCs like I am using. There’s only 14 or 15 GB of RAM left over once you boot Proxmox, so you might only fit one of these bigger virtual machines on each mini PC. That’d make playing musical chairs with virtual machines a challenge, so you might be better off with mini PCs with at least 32 GB of RAM.

There’s a balance to be had here. I keep saying that I like the ratio of CPU horsepower to RAM in an N100 mini PC with 16 GB of RAM or a Ryzen 5560U mini PC with 32 GB of RAM. I am comfortable with that ratio without knowing anything at all about what you’re doing with your homelab. It just feels like a good middle ground.

My own homelab wouldn’t be short on CPU even with 48 GB in the N100 or 64 GB in the 5560U. We all have different needs.

One big mini PC, or lots of small ones?!

One big server is easier to manage, and you won’t have to worry about splitting up your large virtual machines while leaving room to fit your smaller virtual machines in between the cracks.

These mini PCs are fantastic because they are so power efficient. The Intel N5095 and N100 mini PCs DO idle more than a few watts lower than all the Ryzen mini PCs on this list, but the difference isn’t all that dramatic. My pair of Intel N100 mini PCs burn at least as much electricity as a single Beelink SER5, and the most expensive mini PCs with Ryzen 8845HS CPUs are probably comparable to the Beelink SER5. That means consolidating down to one beefy mini PC will almost definitely save you money on your electric bill.

Splitting things up provides redundancy. I am not doing anything this fancy, but you can set up a high-availability Proxmox cluster with three or four N100 mini PCs. If one fails, those virtual machines will magically pick up where they left off on one of the remaining mini PCs.

That is pretty cool, but even if you don’t set that up, it is nice knowing that all your services won’t go down due to a single hardware failure, and you could always restore the unavailable virtual machines from backup and run them on the remaining mini PCs. That is assuming you have enough free RAM!

That brings us to another neat feature of running lower-spec mini PCs. When your cluster gets low on RAM or CPU, you can buy another inexpensive mini PC. This works even if you start with only one tiny mini PC.

I prefer the idea of multiple smaller mini PCs

I don’t think there are any wrong choices to make here, but here’s what I have been thinking about for my own setup.

There are plenty of uses outside of our homelabs for these mini PCs. My Trigkey N100 mini PC that I bought for my homelab hasn’t made it into my homelab yet. I have had the Trigkey N100 plugged into the living room TV playing Steam, NES, SNES, and Nintendo Wii games for the last few weeks.

You can plug these into an old monitor and have an extra PC at your workbench or CNC machine, and most of them have brackets that let you hang them from your monitor’s VESA mount. Almost anything you might do with a Raspberry Pi can be done with an Intel N5095 or N100 mini PC.

Stable Diffusion Homelab

If you buy one big mini PC that can meet all your homelab needs, then it is going to be stuck in your homelab for a very long time.

If you have a variety of less powerful mini PCs, you can decide to upgrade and retire one early to reuse that slightly outdated mini PC for something else. You could send it to a friend’s house to use as an off-site Proxmox server and backup host. You could retire it to the living room to stream your Steam library. You could set it up to control your 3D-printer farm.

I expect to get better overall use out of three $150 mini PCs than I would out of a single $450 mini PC, and it is nice knowing that I can upgrade my homelab one slowly instead of having to replace nearly everything all at once.

Your stack of mini PCs doesn’t have to be identical

I fall into this trap a lot when I stare at the spreadsheet. I want to say things like, “Three N100 mini PCs are equivalent to one 6900HX mini PC,” but I shouldn’t be thinking that way. The obvious problem is that the various mini PCs don’t quite divide evenly into each other, but that isn’t the only thing wrong with this line of thinking.

Maybe you have a single virtual machine that needs 16 GB of RAM, then a while mess of other machines that only need 1 or 2 GB. That seems like a good excuse to build your homelab out of one Ryzen 5700U mini PC and one or two Intel N100 mini PCs.

You can mix and match whatever makes sense for you. Heck! Your homelab doesn’t even have to be 100% mini PCs!

We didn’t talk about iGPUs!

From a gaming perspective, the table at the top of this blog goes from worst to best. The Ryzen 7840U and 8845U have ridiculously fast integrated GPUs! They are a generation ahead of the Steam Deck, and they have even more GPU compute units. They can run a lot of games quite well, and they can even manage to run many of the most demanding games as long as you turn the settings down.

A common use case for the iGPU in your homelab would be video transcoding for Plex or Jellyfin. There isn’t nearly as big of a gap in transcoding performance as there is in gaming performance between these boxes. They only set aside a small corner of these GPUs for the video encoding and decoding hardware, and they are all really only aiming to make sure you have enough horsepower to encode and decode 4K video in real time.

Trigkey N100 Mini PC at my desk

My Intel N100 mini PC can transcode 2160p HDR video with Jellyfin at 77 frames per second. That is enough to keep three movies playing on three televisions. I don’t know where my upper limit is on the N100. Transcoding two movies at the same time utilizes more of the GPU and each stream maintains better more half of that 77 frames per second. I wouldn’t be surprised if there is enough room for me to feed five televisions with transcoded 10-bit 2160p video.

Even better, my Jellyfin server rarely has to transcode because the majority of my devices can play back the raw video streams straight out of the video files.

If transcoding is your goal, every mini PC on the list is up to the task. I wouldn’t be surprised if the performance goes up a bit towards the higher end of the list, but I bet it doesn’t go up all that much.

Conclusion

I guess the real conclusion is that the best bang for your buck is easily the Beelink SER5 with the Ryzen 5560U or with the N5095 and N100 boxes coming in a close second, but the important thing to remember is that these aren’t miles ahead on value. You can spend a little more on beefier mini PCs without paying too much of a premium.

Sometimes that premium even buys you something you can us. The Beelink SER6 MAX has a 2.5 gigabit Ethernet interfaces and room for two NVMe drives, and if that isn’t enough for you, the CWWK N100 mini PC in my homelab has four 2.5 gigabit Ethernet ports and room for FIVE NVMe drives. There are so many options available in small packages to piece together a nifty homelab cluster for yourself that might fit in a lunchbox or shoebox.

What do you think? Are you building out a full-size 42U server rack at home? Or are you trying to build something quieter and more efficient? Are you already running a short stack of mini PCs, or are you looking to build up a diminutive server farm at home? Tell us about it in the comments, or stop by the Butter, What?! Discord server to chat with me about it!

https://docs.google.com/spreadsheets/d/1cHuUnASp_mdcFfLnlvMJdhTCfzGfXC9ehUjTs9pWCik/edit?usp=sharing “Pat’s big messy mini PC homelab spreadsheet”

Microsoft Copilot+ Recall AI Looks Awesome!

| Comments

First of all, I don’t trust Microsoft to do a good job keeping our data safe. My recollection of history makes me feel that Microsoft’s execution regarding security often starts off poor, and there would potential for horribly incriminating or financially costly information to accidentally leak will be immense.

Stable Diffusion Kid at a terminal

There are a lot of posts on social media and news stories that are implying that the sky is falling. These posts all remind me of when people starting syncing everything to the cloud, and similar to the arguments that were made when people started posting everything to social media.

I don’t think this is fear mongering. I don’t think these posts are wrong. There is absolutely potential here for both new types of cyber crime and user error, and I can understand being fearful that your employee’s entire time spent on their computer could be subpoenaed. It is even more terrifying that this could also enable employers to scrub years of work of each employee looking for excuses to fire them.

Not only that, but I think a lot of the fears about storing your data in the cloud are extremely valid! I only store my important, private data in my own private cloud. Seafile works very much like Dropbox, except I own my Seafile server, and all my data is encrypted by the open-source Seafile client before being transmitted to my server.

I am not worried about any of this today. I don’t run Windows. I don’t use any Microsoft products.

I won’t be able to use Recall, but I am extremely excited about the concept and the future potential. This is the first time I have seen anything come out of the machine learning hype that could realistically save me significant time and help me avoid monotony.

I lost my trust in Microsoft in the early nineties when they put Stac Electronics out of business, and they left us stuck with Microsoft’s worse and much less reliable DoubleSpace in Stacker’s place.

Microsoft did a lot of shady things during the next couple of decades, and I have no idea if they could ever do enough to regain my trust.

Even though the title seems to imply otherwise, this post isn’t really about Microsoft or Microsoft’s specific new product. This is about the idea of compiling a database with all the text and images of essentially every moment of you computing life, and what that might be able to do for you.

I could save so much time researching blog posts!

You have absolutely no idea how much time I spend flipping between Emacs and a dozen browser tabs while writing a blog post. There are so many numbers involved in many of the things I write about, and I have to get those numbers correct.

I have to remember which tab has the documentation or notes. I have to skim the page to find the numbers that I need. Sometimes I have to do math.

Can you imagine if I had a local large language model (LLM) that had access the contents of all the recent web pages I visited, and all of my notes? How awesome would it be if that assistant was voice enabled?

  • “Hey, You! What is the max clock speed of an Intel N100?”
  • “How many kilowatt hours did I measure on my N100 server at idle?”
  • “Paste a Markdown table with the Geekbench scores of the N5095, N100, and Ryzen 5700U.”
  • “How long have I been running the Seafile Pi at Brian Moses’s house?”
  • “How long ago did I write about upgrading to a Radeon 6700 XT?”

Any of these functions would save me a minute or two, and that adds up fast!

This is a fantasy today, and I don’t think there are any open-source projects working on this today, but it sure feels like this is where things are headed.

This isn’t an entirely new idea

There was a Gnome project 20 years ago that used inotify to watch your emails, chat logs, and the text files you were editing to pop up related information. It might even be so old that it used dnotify!

If you got a ICQ message from your friend Brian asking you to meet up at the new pizza shop he’s been telling you about, it might have shown you a link to Brian’s recent email that talked about that pizza shop.

This was a really cool idea two decades ago. It was the very first thing that came to mind when I read the headline about Windows Recall, but it took me SO MUCH digging to find any sort of information about this project. I couldn’t even remember the name! It was Nat Friedman’s Dashboard.

Finding ways to connect instant messages to emails, and emails to text documents, and text documents to instant messages was extremely rudimentary when all the logic had to be built by hand with regular expressions.

I hope we see a new Gnome Dashboard in 2024!

I am hopeful that there will be a modern implementation of something like Gnome Dashboard. I want to collect regular screenshots. I want that information converted to text. I want it indexed, and I want to be able to query it!

I don’t need something as powerful a GPT-4o processing queries of my personal database. Give me a little LLM that fits in 4 GB or 8 GB of VRAM. It is OK if it sometimes thinks that March has thirty days. Even the best assistants goof up like this from time to time.

I don’t need my assistant to be Albert Einstein, and the quality of the small LLM will improve over time. The software will be optimized, and I will upgrade to a newer GPU with more VRAM every few years. Even a simple brain of an assistant would save me time today.

There are already some open-source projects popping up!

The first one I found is the rem project for macOS. Their first release was in 2023. Rem has a pretty good list of features that includes capturing the screen and recognizing text every 2 seconds. There is no built-in LLM capability, but there is a way to grab recent context from your screenshots to use as context with an LLM.

Stable Diffusion guy with a robot

There’s also Windrecorder for Microsoft Windows. The feature list seems less complete than rem’s, and I didn’t see a roadmap. Rem definitely has some of Recall’s features on their roadmap.

I don’t have macOS or Windows here, so I can’t try any of these. I haven’t yet found a project like this that runs on Linux, though I realize from the names and descriptions that the terminology used to describe these things is very different from Recall even though they do very similar things!

I understand that it isn’t that simple

I know that you can’t just feed a tiny local LLM with all the text I have ever written, all the text from months of screenshots of my activities, and all the text of the Wikipedia pages and other sites I have visited in that time. That is just way too much context to load into my little 12 GB Radeon 6700 XT.

All sorts of work has to be done to make the software summarize this stuff, and then summarize the summaries, and organize those summaries of summaries. Someone probably needs to figure out how to tie together old-school text search to find relevant document, then summarize those documents, then feed them into the LLM every time you ask a question.

What if someone steals your Microsoft Windows Recall database?!

No one will steal MY Recall database, because I won’t be able to run Windows Recall at all. I am running with the assumption that if there is any sort of open-source equivalent that I will have some amount of control over when it is active and what gets stored in the database.

It seems that Windows Recall makes no effort to keep usernames, passwords, and login screens out of the recording of your keystrokes and screenshots. If you log into your bank, it will most likely be in that database, and this makes the Recall database a juicy target.

I wouldn’t complain too much if I had to manually turn my AI tracker on when I start working, then turn it off when I start doing personal things. Maybe it would be interesting to have different databases for different projects and for personal stuff.

I would also bet that it would be pretty easy to tie in certain activities to the tracker. Maybe when you click on Bitwarden, your tracker would stop recording, and hopefully it would show you that in an extremely obvious way. If it detects that the processed text looks like it might be a private encryption key, your tracker could scrub that out of your database immediately.

Even just having an “oops button” that you can hit to have the last five minutes of activity expunged from the record would be nice.

I am ready to start recording every aspect of my computing life!

I have an incongruity here in my brain. I don’t like surveillance cameras. I don’t understand why so many people are pointing cameras at their backyards and living rooms, then plumbing those into Home Assistant. It feels icky. I don’t want to record myself lumbering around the house!

Yet I am excited about recording every second of my digital life. I believe this is because I can see an immediate value in being able to ask a virtual assistant to mine this data, and it will provide me with valuable time savings and possibly even improvements to the quality of my work from day one.

I will almost definitely be jumping on board with the first open-source screen-recording virtual assistant that shows up. I am excited to try something like this out to see to see how much it might improve my workflow, and I am excited to see where this technology will go as it improves!

What do you think? Are you skeeved out by Microsoft recording everything you do? Would you be willing to let an open-source AI assistant mine your screen for nuggets of useful data? Are you as excited about the prospects as I am? Let me know in the comments, or stop by the Butter, What?! Discord server to chat with me about it!

My Asus Vivobook 14 Flip 2-in-1 Laptop and Linux

| Comments

Why am I writing about running Linux on a laptop that was already an old model when I bought it two years ago? That seems like a good question!

When I bought the Asus 2-in-1 laptop, I decided I was going to treat it more like an appliance. I haven’t owned a computer running windows in almost thirty years. I heard good things about WSL2, and I figured Windows 11 would do a better job in tablet mode than Ubuntu, so I decided to give Windows 11 a shot.

Asus Vivobook Flip 14 and my Gotway Tesla v2 at Oak Point Nature Preserve

Not a close-up shot of my Asus Vivobook 14 Flip and my Gotway Tesla at a table by the pond

I wound up doing what one might consider work, which for me involves writing blog words in Emacs, in a WSL2 session. I did everything else with Windows. Weirdly enough, this worked better two years ago than it has recently. My WSL2 session would often go unresponsive after bringing the laptop out of suspend, and the more often that happened, the more frustrated I would become.

I decided to install Kubuntu 24.04 shortly after it entered beta.

tl;dr: What actually works?

Everything related to being a laptop worked out of the box. WiFi is fast. I am pretty certain that I tested the webcam, and it is as basic as ever. Games are running as fast as I would expect. The keyboard and touchpad seem happy.

The touchscreen works, but I had to switch to Wayland to get proper touch scrolling to work in Firefox. It isn’t surprising that I needed Wayland for this, but I was surprised that Wayland wasn’t the default with my 5700U CPU’s integrated AMD Radeon GPU.

Stable Diffusion person with tablet

The fingerprint sensor almost works, but it fails to train my fingerprints. It does get far enough to know exactly when I remove my finger from the sensor, so it might not be a total lost cause.

The most disappointing failure is lack of support for the accelerometer or lid-flip sensor. Without these sensors, Gnome or KDE can’t disable keyboard input and put the laptop into tablet mode.

1
2
3
[    2.233923] pcie_mp2_amd 0000:03:00.7: Failed to discover, sensors not enabled is 0
[    2.233941] pcie_mp2_amd: probe of 0000:03:00.7 failed with error -95
[   40.920347] raid6: avx2x4   gen() 37669 MB/s

NOTE: The failure to discover my accelerometer and its related friends is delaying my boot process by 39 seconds! I haven’t dug too deeply into this yet. I tried the stock kernel and a newer Xanmod kernel. I tried booting with both amd_iommu and iommu disabled. One of the combinations eliminated this delay, but it also made my NVMe super slow and goofy!

I enjoy using tablet mode. It is fantastic for playing games like Into the Breach. It is handy to be able to hide the keyboard out of the way and prop the tablet up to read documentation while working on a project. Sometimes it is just nice to lean back on the couch and surf through comments on Hacker News or Reddit, but it is also nice to know that they keyboard is there if you have to type out a comment!

I did figure out how to easily enable and disable the keyboard, though I don’t know if I am doing it correctly enough to recommend my solution, and rotating the screen via the command line is easy. It won’t be a problem adding a button to my dock to toggle tablet mode for myself. It is a bummer that it will be manual, but it isn’t a deal breaker for me.

While I do enjoy tablet mode, I don’t flip in and out of it constantly.

What is better on Linux than on Windows?

I really want to just scream, “EVERYTHING!” Except I already mentioned a few key 2-in-1 features that aren’t working, so we both know that is a lie.

I already know that I don’t have exact equivalents of the settings on Microsoft’s slider that goes from maximum power savings to maximum performance. I can turn down all my CPU governor settings, and I just can’t get a Geekbench result as low as I did on Windows 11 with the power-saving slider turned all the way down.

I am however getting 14% more single-core and multi-core performance on the same version of Geekbench. That is a pretty impressive boost!

Geekbench scores

Link to these Geekbench results

I don’t have good data on actual battery runtime. I never verify that the estimates Windows was giving me were what I was actually seeing, and I haven’t run my battery down far enough to know that I trust KDE’s estimates. KDE’s estimates do seem plausible when checking the time and punch quick numbers into a calculator, and KDE is usually estimating an extra hour of runtime.

I think an extra hour is pretty impressive, because I have my battery charge limit set to a maximum of 60%. That is the difference between a little more than four hours and five hours of runtime, and the gap would be even more impressive at full charge.

Improvements over Windows 11 that are way more exciting

Before we get to the exciting part, I should tell you that I had a brain fart while setting up Kubuntu. I didn’t check to make sure that I would have a swap partition in an accessible place to allow the kernel to hibernate, and I am realizing that I am not even certain how it would handle resuming from a LUKS-encrypted swap partition. The important thing that I need to say here is that my brain fart means I don’t currently have a way to hibernate.

Suspend is working just fine, and here is the exciting part. The laptop does exactly what I have asked. I told it to stay awake as long as it is plugged in, and by golly, it never goes to sleep. It also never randomly wakes up from suspend. These are both things that Windows 11 would do even though I asked it not to.

Not only that, but my Seafile client never stops working on Linux. On my Windows 11 installation, I ran Seafile inside WSL2 to sync my blog posts and other data that I would work on from Linux. Every time WSL2 got goobered up, my data would stop syncing. That isn’t surprising, but it was extremely annoying.

I am not sure if KDE is bad at tablets or my nonfunctional tablet-mode sensor is to blame

I have an old Cybernet all-in-one touchscreen computer at the workbench behind me. I believe it used to live in a hospital. It is running Ubuntu with Gnome, and it does a fantastic job of popping up the on-screen keyboard at the correct moments when I don’t have my Bluetooth keyboard turned on.

I don’t even think I have managed to get KDE to pop up the on-screen touch keyboard. I have tried clicking things in the taskbar that seem related to a popup keyboard, but it just hasn’t worked.

Asus Vivobook Flip 14 next to my 27-inch monitor

A 2-in-1 can be left open slightly and stood up right on its side. This lets you prop it up right next to your monitor!

Is it because KDE believes I have a keyboard even when I disable it? I am ham-fistedly disabling the keyboard in what is almost definitely the worst possible way, so this wouldn’t surprise me.

This is my first experience with Wayland and KDE Plasma. I am excited to try this out, because I know that I am in my last few years of being able to easily use XOrg with Sawfish and my decades of accumulation of custom window-management Lisp functions. I feel hopeful that I will be able to bend Plasma to my will when I can no longer use Sawfish on my desktop.

DaVinci Resolve might not be ready for Ubuntu 24.04

When I tried to install Resolve, it complained about all sorts of missing packages that have new names now. When I forced it to install anyway, it wouldn’t launch. I found some tips on Reddit about deleting a few libraries out of Resolve’s installation directory to force it to use installed libraries, and that got it to launch.

Using Resolve with a Radeon GPU on Linux is tricky. Resolve is only happy with some versions of AMD’s ROCm and OpenCL packages, and I believe the rather old GCN 5th generation iGPU in my laptop doesn’t work with the newer ROCm releases. Getting this to a point where Resolve would agree that I had an OpenCL-capable GPU while still having a working Mesa library for OpenGL was a challenge, but I might have gotten it right.

When I try to open a project, Resolve appears to hang and its log file fills with ALSA sound card error messages.

I am glad I tried this on my laptop before upgrading my workstation to Ubuntu 24.04, because a working DaVinci Resolve installation is something that I can’t live without!

I am hopeful that this will smooth itself out before Ubuntu 24.04.1 is released.

Conclusion

I have enjoyed owning a 2-in-1 tablet for the past two years. I really do think all laptops should be made with screens that flip around all the way, and I hope to never own a normal laptop ever again.

I use it as a laptop more than 95% of the time, so I am not TOO disappointed that I have to manually switch myself into a tablet mode. I usually find myself needing tablet mode to read documentation while working with a piece of hardware, or when I want to play Into the Breach. It isn’t a big deal to manually adjust the screen orientation for those tasks.

Using my Asus Vivobook as a laptop is infinitely more comfortable and enjoyable for me when running Linux, so I can most definitely say that I am excited to have a better experience for the majority of my use cases.

Using an Intel N100 Mini PC as a Game Console

| Comments

Let’s get this out of the way immediately. A mini PC with an Intel N100 processor is almost definitely the wrong device to plug into your TV to play video games. It is a rather slow CPU, but that CPU is also paired with a really slow GPU. The N100 is overkill for the majority of the games I have been playing on my mini PC, yet it doesn’t have nearly enough horsepower for so many games that would be fun on the TV.

This blog is kind of a tangent for me. I bought this mini PC to double the CPU and RAM of my homelab. It wasn’t intended to play video games, but I just had to try it out, and I am glad I did.

My Intel N100 Gaming PC Test at my desk

The diminutive Trigkey N100 mini PC on my desk next to my cigar box knickknack storage

What’s the tl;dr here? What sort of games can you play with an Intel N100 mini PC? We have played Super Meat Boy, Super Mario World via Retroarch, Super Mario Galaxy 2 via Dolphin, and Dead Cells. The fanciest game I have had moderate success with is Gunfire Reborn. It will run at 720p upscaled to 1080p using FSR at just a few frames shy of 60 frames per second.

I am only gaming at 1080p with my Trigkey N100 mini PC

My aging 70” Vizio television only has one HDMI port that can handle 4K at 60 Hz, and I am using that for my Chromecast with Android TV dongle. I had to dial the mini PC down to 1080p to be able to run at 60 Hz.

NOTE: I am quite aware that I could be running Retroarch on the Chromecast. Just writing that sentence makes me want to give it a shot to see how modern of a console it can emulate, but I do know for sure that it won’t go as far as the N100. That said, I am quite confident that the Chromecast will have no trouble running my favorite NES and SNES games, and those are the games I am most likely to emulate.

This is important. I was most excited about using the N100 box for emulated games, and all the games from my childhood were synced to the 60 Hz of our CRT televisions. I needed at least 60 Hz.

I am running Kubuntu 24.04 on my mini PC. I haven’t managed to get Linux and the Intel GPU driver to send commands to the TV to switch HDMI ports when I wake up the mini PC. I may be doing something wrong. I don’t enjoy having to switch inputs with a remote control like a cave man, but this is also only a test run.

Streaming games via Steam works great!

This is not a surprise, but I think it is worth documenting. I dusted off my old Steam Link hardware from 2015. Streaming games at 1080p via the Steam Link gave me 16 milliseconds of latency on Ethernet and 14 to 15 milliseconds of latency on 5.8 GHz WiFi.

My Trigkey N100 manages 13 milliseconds of latency over WiFi. That isn’t too much of an improvement, though the gap might get bigger if I had a nice WiFi 6 access point available. The Trigkey N100 excels when plugged into Ethernet, because its gigabit Ethernet port can move video packets down the cable several milliseconds faster than the 100-megabit port on the Steam Link. If my math can be trusted, the gigabit Ethernet upgrade alone is shaving more than four milliseconds off the latency. The N100 gets all the way down to 8 milliseconds of streaming latency when wired.

That is only ½ of a frame of additional latency when gaming at 60 frames per second. That is quite reasonable and very difficult to notice in the vast majority of games. I wouldn’t want to play a round of Team Fortress 2 with that extra latency, but I am also bummed out if I can’t play Team Fortress 2 at 144 Hz where the frame latency is even lower. I also completely failed to do well playing Dead Cells with an extra 8 milliseconds of latency.

Those are the sort of twitchy games where perfect timing is everything. On the other end of the spectrum would be games like Red Dead Redemption 2. I only started to notice the latency there when I was streaming to my Android tablet between T-Mobile’s network and my home’s fiber Internet connection.

That said, the Steam Link from a decade ago does a fantastic job. There are tons of games that you can play happily with 16 milliseconds of additional latency.

Retroarch has been fantastic on the N100!

I have one of my Playstation 4 controllers paired to the Trigkey mini PC. Sony’s controllers have always been my favorite for use with emulators. Their unique d-pad is fantastic for NES games, their buttons feel great, and the Bluetooth implementation has always been pretty friendly to work with on Linux.

I don’t trust the cheap knock-off controllers to have usable d-pads, but there are some premium third-party controllers. Playstation 4 controllers seem to have significantly gone up in price since I bought mine.

I haven’t tried one, but I have had my eye on the Gulikit King Kong 3 Max controller. It has rear buttons like the Steam Controller, but it has twice as many! The sticks use hall-effect sensors, so they should never drift. The d-pad looks great, and it can even use a snazzy low-latency dongle. The best part is that it doesn’t cost that much more than a Playstation 4 or Playstation 5 controller. If I didn’t already have four Playstation 4 controllers, I’d be buying one of these.

The Intel N100 is more than fast enough to use Retroarch’s run ahead feature. I have been a fan of this feature ever since it was first announced, and I am not confident that the Chromecast running Retroarch could pull this off.

Super Mario Bros 3 CRT Royale Shader

This is the CRT Royale shader with Intel in the name. It requires about 70% of the N100’s GPU to run!

When playing games on an old console using an actual CRT television, your button presses are very nearly locked to the drawing of the pixels on the tube. When you push jump, Mario will begin that action on the very next frame that is displayed. The least responsive games would only be two frames behind.

This isn’t possible on a modern LCD. At best, your monitor will be one frame behind. In reality, it is probably at least two or three frames behind.

Retroach is able to run the emulator ahead of what is displayed on the screen. When you hit the jump button, Retroarch will roll the game back the desired number of frames, and Mario will instantly be in the air. It works as though you pressed jump 16 or 32 milliseconds in the past. It feels like magic.

You can see how the magic works if you crank up the number of frames that Retroarch is running ahead to 5 or 6 frames.

I have been messing around with various CRT shaders to make my modern television look a bit more like the fuzzy CRT televisions I grew up with. This is the only time I have been disappointed that my television’s HDMI port has me stuck at 1080p. Having four times as many pixels available makes for even more pleasant CRT shaders.

What is the newest system an Intel N100 can emulate?

My searching of Youtube says that the N100 is only just barely too slow to emulate a Nintendo Switch, but it is supposed to be able to handle the Wii U just fine. I would feel better if I could test this myself, but I haven’t been able to verify either of these claims. The Wii U emulator keeps crashing on me, and the Switch emulators are having legal troubles.

I have collected several stars in Super Mario Galaxy 2. It runs at a solid 60 frames per second, but it was a huge pain getting the buttons mapped on my Dualshock 4 controller. I am shocked that the Dolphin emulator doesn’t have profiles for common gamepads.

Mario Galaxy 2 running on Dolphin running on my Trigkey N100 mini PC

Mario Galaxy 2 running at 60 FPS on the Trigkey N100.

I tried doubling up the internal rendering resolution. That lands you at something a little over 720p. My little N100 just couldn’t just keep up with that, so my game dropped to just under 50 frames per second. I was able to turn on 2x MSAA. That doesn’t make for aa clean and smooth of a picture as bumping up the internal rendering resolution, but it does look better than the extremely blocky edges.

I was going to try using Gamescope to upscale Dolphin using FSR, but apparently Gamescope has some dependency issues on Ubuntu 24.04. I probably won’t get a chance to try this on my Trigkey N100 before it moves on to its next task, but I am certain it has enough horsepower to make that work, so I might have to give that a try on my gaming PC just to see how it would look!

If the Intel N100 isn’t the best choice for couch gaming, what would be better?

First of all, the N100 may be fantastic for your own use case. If the goal is to have a machine that can run emulators extremely well and also stream your Steam library, then $150 for an Intel N100 mini PC is pretty reasonable, even if it is leaning into overkill territory.

An N100 is a much better value than the Zimaboard 832. The Zimaboard costs around $60 more, and it isn’t even fast enough to run Gamecube games at 60 frames per second.

There is a wide variety of mini PCs available. The Beelink SER5 with a Ryzen 5560U or Ryzen 5700U often goes on sale for under $250. I haven’t tried one of these, but my laptop has a Ryzen 5700U processor, and it is several times faster the Celeron N100 for gaming.

I know my laptop will run games like Severed Steel or Borderlands 2 at more than 60 frames per second and should be able to at least manage console-like frame rates in Borderlands 3 with very low settings. The question I would be asking here is whether or not several times faster is really worth an extra $100, especially when I can stream these games and more from my much more capable gaming PC. It will all depend on what games you want to play, and how much you are willing to pay to be able to do it!

NOTE: Don’t pay full price for a mini PC. They have coupons and discounts every single week!

You can push ahead to a Beelink SER6 with a Ryzen 6900HX for around $460. That is a pretty good upgrade, and should be able to play Red Dead Redemption 2 with fairly high-quality settings, and it ought to be able to manage Cyberpunk 2077 with low settings. The Ryzen 6900HX has the same RDNA2 GPU cores as the Steam Deck while also bringing a few extra CPU cores along with it. If it runs well on the Steam Deck, it will run on the Beelink SER6.

The Beelink SER7 is available with a Ryzen 7840HS with RDNA3 GPU cores. This is just about the fastest mobile GPU setup you will find, but you’ll wind up paying $569 to get it. This is starting to hit the price range where you could build your own low-end gaming PC with way more GPU horsepower AND have the ability to upgrade it later.

That brings us to the Steam Deck. At $399 for the base model, the Steam Deck is almost definitely the best bang for your buck in PC gaming for your TV. Not only does the Steam Deck perform just as well as more expensive mini PCs, it also has its own built-in screen, controller, and battery.

You can plug the Steam Deck into your TV, or you can take it with you on the go. That is a pretty nice upgrade on a piece of hardware that costs a bit less, isn’t it?!

This is getting into Playstation or Xbox price territory!

There are a lot of advantages to gaming on a console from Sony, Microsoft, or Nintendo. You don’t have to worry about OS updates, driver issues, or games that just don’t like your particular combination of hardware.

One of the biggest advantages of PC gaming is the ability to upgrade, and using a mini PC negates that option almost entirely. So why on Earth would anyone consider buying a mini PC that costs as much as an Xbox?!

Pat's Steam Game Valuation

For me, the answer is quite simple. I have more than 2,000 games in my Steam library, and I don’t own a single Xbox or Playstation 5 game. As soon as I plugged in my mini PC, I had dozens of games in my backlog that I was ready to start playing on the TV!

Even if you are starting from scratch, the cost of gaming on a PC is so much lower than a console. Steam sales reach 75%, 85%, and even 90% discounts. Humble Bundle regularly has bundles where the headlining game is worth the entire price of the $12 to $20 bundle, but you get five to nine other games essentially for free.

There is a hidden cost to these lower prices. When you see a game for the Playstation 5, you know it will run as expected on your Playstation 5. Computer games list minimum and recommended specs, and you don’t know for sure what sort of frame rates or visual fidelity you will see when you just barely meet those specs. There’s some thinking, math, and guesswork involved.

Pat’s goofy Proxmox idea?!

I haven’t decided if this is a worthwhile endeavor, but I think it is worth writing about. You can install Proxmox on Debian. You can install Steam on Debian. You can install Dolphin, Cemu, and all sorts of other emulators on Debian.

Why shouldn’t I have a second Proxmox node that does double duty playing and streaming games from my Steam library? It would be a fantastic location to run my Jellyfin server, and it would give me another physical location in the house to place a 14 TB hard disk to store one more copy of my backups. That way, if my laundry room catches on fire, maybe the server in the living room will survive.

I could definitely do this over WiFi, but if one of my Proxmox nodes is going to be in the living room, I want that node to have Ethernet.

Why shouldn't I?!

That is the conundrum that will almost definitely prevent me from doing this. The television in our living room has only a narrow wall behind it underneath the end cap of a vaulted ceiling. It is a terrible spot to try to work from in the attic, and I fully expect there to be an extra piece of lumber to drill through roughly eight feet above the floor.

I need a really, really, really good excuse to put in that kind of effort, and I definitely don’t want to spend time in the attic now that it is 100 degrees outside. If I do decide this is worth the effort, it will have to wait until winter!

The WiFi in my house isn’t exactly slow, and I do have three access points that are all easily reachable from the TV. There’s somewhere between 300 and 500 megabits available in either direction to the closest access point, and that is plenty to do the job, but WiFi is unpredictable, and I don’t want a Proxmox backup job to hork up a Steam streaming session!

Conclusion

As seems to be the case more often than not these days, I don’t think I am at the conclusion. I think this is only the beginning of mini PC gaming for me, and I believe I have only seen the tip of the iceberg so far!

What is next in my goofing around with an N100 mini PC series of blogs? I was thinking it might be fun to haul that heavy Trigkey box over to my desk and plug it into my monitor, keyboard, and mouse. It ought to be fun do see how slow it feels doing actual work and web browsing on such a slow device.

What do you think? Do you have a machine running Steam on your TV? Is it a full-blown gaming rig, or did you settle for something like a mini PC or Steam Deck? Are you planning on giving something like this a try? Tell me what you are up to in the comments, or stop by the Butter, What?! Discord server to chat with me about it!

I Am Using Swap On Linux Again

| Comments

Let’s start with the motivating factor that encouraged me to write this blog post today. Then we can go back to the ancient 1990s to figure out how we got here.

The workstation that I am sitting at right now chews through swap space. It doesn’t do it quickly, but I can see swap utilization growing every day until I have a reason to reboot the machine. My swap utilization grows quickly enough that I have been more than 16 gigabytes deep into swap in just a week or two of uptime.

Four sticks of DDR4 RAM

What winds up in swap on my desktop computer? My Seafile client definitely winds up accounting for the majority of my swap utilization. There might be a gigabyte or two of Firefox in there. Other processes each only account for a couple hundred megabytes each, but a dozen processes like that add up quickly.

I also noticed that I was nearly one gigabyte deep into swap on my Proxmox server right before I rebooting for the upgrade from Proxmox 8.1 to 8.2.

Why is it a good idea to let my desktop dig itself so deep into swap?

There always winds up being some stuff sitting in RAM that your system may never need to use again. Sometimes these are memory-mapped libraries and executables. Those unused bits can just be freed.

Sometimes a program generates data that it will never need again, or may only need once every few hours, or even once a day. Sometimes a program has memory leaks. I don’t want to make any accusations. Maybe Seafile really does need a few gigabytes of data in memory that it seems to never touch. Who am I to judge?

Swap on Linux

Allowing this rarely used data to be swapped out to my NVMe drive is paving over minor problems while also freeing up actual RAM for programs that will actually make use of it. I never notice a judder, slowdown, or any sort of odd pacing from data having to be moved back in from swap. Either that rarely happens, or our modern NVMe drives with their hundreds of thousands of input/output operations per second (IOPS) make things happen so fast that we can’t see it.

I had one of my four sticks of RAM in my desktop computer fail a few years ago. Crucial wouldn’t do an RMA unless I mailed them all four sticks, and that seemed like it would be a pain in the butt, so I limped along with 24 gigabytes of RAM for a while. To my surprise, I wasn’t really limping. If I ran a game while I accidentally left Davinci Resolve running, I would drop down to only two or three gigabytes of disk cache, but that never caused any performance issues.

In all fairness, though, running Resolve and a heavy game at the same time was and still is way more likely to leave me short of VRAM, and that can turn many games I play into a stuttering mess!

In the old days with slow mechanical disks, you needed every megabyte of RAM you could possibly cram into your machine to use as disk cache. This barely matters today with fast SSDs.

Swapping to your SSD will help stave off the day when 16 GB or 32 GB of memory just isn’t enough for you. Now that my workstation is back up to 32 GB of RAM, I am in good shape. My swap utilization probably grows by half a gigabyte each day, but I rarely see less than 8 GB of RAM dedicated to disk cache.

The over-simplification here is that after a week or two, my computer is moving unused data out of $30 worth of RAM into $0.80 worth of NVMe with no perceptible change in responsiveness.

Swap used to be horrible because disks were slow

Thirty years ago, the rule of thumb was that your Linux machine should have twice as much swap as it had RAM. That rule of thumb persisted long after its useful shelf life had ended.

A 7200 RPM hard disk can come in as low as 120 IOPS in the worst case. This was true 30 years ago, and it is true today. When the data you need is behind the head, the drive has to wait for the disk to make an entire revolution before that data can pass back under the head to be accessed.

Memory was getting faster every year, and sequential disk throughput was also getting faster, but not as quickly. IOPS were stagnant. We eventually got to a point where swapping to mechanical disks was just bad for performance nearly 100% of the time.

My guess would be that it was almost 20 years ago that I stopped enabling swap on most of my machines.

Early SSDs were fast, but they were fragile!

One of the first SSDs worth buying was the 80 GB Intel X25-M. Its sequential write speeds were half as fast as most mechanical hard disks, it cost ten times as much per gigabyte, and even the cheapest mechanical hard disks had at least three times the capacity.

Mechanical hard disks still had a worst-case scenario of 120 IOPS, and their average IOPS would only be about double that. The Intel X25-M could easily manage 20,000 IOPS. It was like night and day. It felt like I added a turbocharger to my laptop, and it would have been an amazing drive to use for swap as far as performance was concerned!

I believe my Intel X25-M was warrantied for 35 TB of writes. If I used that X25-M in the same way I use my workstation’s Samsung 980 NVMe, I would be less than a terabyte away from the warranty, and this Samsung drive is only around two years old.

Even if I wanted to use my X25-M for swap, it would have been a challenge. I replaced an aging 120 GB laptop hard disk, and I was already trying to squeeze down into something that was 1/3 smaller. The performance was amazing, but I was fighting to fit everything I needed on that disk!

It is hard to kill a modern SSD with writes, so don’t be afraid to set up a swap partition, volume, or file!

My desktop computer’s NVMe is one of the lower-end offerings from one of the premium manufacturers. My 1 TB Samsung 980 has a 600 TB write warranty, while the information reported by S.M.A.R.T. suggests that Samsung expects it to last twice as long as that.

Stable Diffusion Guy With An NVMe Drive

I am using this NVMe for swap, for an lvmcache volume that caches my Steam game library, for my root volume, and my home volume. I have accrued 34.2 TB of writes since March 22, 2022. My 5-year warranty will have long since expired by the time I reach 600 TB of writes, and it will probably have been replaced by a much larger NVMe long before I get there!

Not only that, but S.M.A.R.T. says my Samsung 980 can handle double the volume of writes that it is warrantied for. This makes a lot of sense. If Samsung only warrantied the drive right up to the edge of failure, then they’d probably be replacing a lot of drives!

Freeing up 8 GB of RAM isn’t a lot, until it is?!

There is a common theme when buying computer hardware. It might be inexpensive to put together a machine with 64 GB of RAM instead of 32 GB, or four 20 TB SATA disks instead of three, or two NVMe drives instead of one.

Sometimes you reach a limit, and when you need to cross that threshold, you have to buy more expensive components. Maybe your motherboard only has four SATA ports, two m.2 slots, and support for only 64 GB of RAM. Maybe your case only has room for four 3.5” hard drives.

Proxmox Server Swap Utilization

NOTE: Half of the swap space on my Proxmox server belongs to the high-availability daemons that I am not even utilizing.

Do you need more than four SATA disks? You might need to buy a more expensive case. Do you need two or three NVMe drives? You might have to pay a couple of hundred dollars more for a fancier motherboard. Maybe that new motherboard choice is bigger, and it won’t fit in your petite mini-ITX build now.

Sometimes you are just at the limits of the class of machine you own or that you are purchasing. Sometimes you have a laptop, so you only have two SO-DIMM slots, and one m.2 slot.

Sometimes beefier components just cost more. Maybe a 32 GB DIMM costs precisely twice as much as a 16 GB DIMM. That’s fantastic, but a 64 GB DIMM might cost three or four times as much as a 32 GB DIMM.

Finding ways to stay on the low side of the pricing hockey stick can save you a lot of money.

Conclusion

Every single excuse to avoid enabling swap space is gone, and there are moderate benefits to be had. The benefits are even bigger once you have filled up all your DIMM slots, and you have no more room to add more RAM, even if you wanted to spend more money!

How do you feel about swap space on modern systems? Have you been avoiding swap for similar reasons? Have you started enabling swap again now that storage is both fast and durable? Tell me about it in the comments, or stop by the Butter, What?! Discord server to chat with me about it!

It Was Cheaper To Buy A Second Mini PC Homelab Server Than To Upgrade My RAM!

| Comments

Was it ACTUALLY cheaper to buy a second mini PC than to upgrade the RAM in my existing N100 server? It depends on how you look at it.

In the most literal and simplistic sense, it was not cheaper. I paid $60 more than I would have just for a RAM upgrade, but I think it will all make sense when we talk about why I made this choice.

Every N100 mini PC and mini-ITX motherboard has only one SO-DIMM slot!

This is a bummer, right? I believe all of the slightly older, slower, and less power-efficient N5095, N5105, and N6005 also had only one SO-DIMM slot, but the mini-ITX NAS motherboards built around that CPU had two DDR4 SO-DIMM slots, and those motherboards could support up to 64 GB of RAM.

My Proxmox N100 Mini PC, Brian Moses's Makerunit NAS Case, and my UPS

On the left, my CWWK N100 Proxmox server and 14 TB USB HDD. In the center, Brian Moses’s 3D-printed Makerunit 4-bay NAS running TrueNAS Scale. On the right, my aging APC BR-800. Up top, my network cupboard.

The N100 NAS motherboards and mini PCs only have a single DDR5 SO-DIMM slot. You can pay a premium price for a 48 GB DDR5 SO-DIMM, but that is as far as they will go.

Not only that, but all the good deals on mini PCs tend to ship with RAM and an NVMe drive already installed. My router-style N100 mini PC homelab server arrived at my door with 16 GB of DDR5. If I want to upgrade the RAM, I have to pull out that 16 GB stick, put it in a drawer, and hope I figure out something useful to do with it later on.

I scratched my head, then started checking prices of N100 mini PCs on Amazon. Then I felt smart. I hope.

I definitely got more for my money by buying a second mini PC instead of a stick of RAM!

This is all going to be a little subjective because we aren’t quite comparing apples with apples. The inexpensive mini PC I chose was a Trikey N100 with 16 GB of DDR4 RAM and a 500 GB NVMe for $149. I bought this instead of a 32 GB stick of DDR5 RAM for $89.

I think this was a smart move. The 500 GB NVMe is worth at least $35. The 16 GB stick of DDR4 is worth at least $32, but that’s where the oranges come in, because 16 GB of DDR5 costs about 50% more.

I don’t know how best to assign a monetary value to the rest of the mini PC, but I reckon it is worth more than the remaining $82, and there’s definitely more value here than the $59 difference between the stick of DDR5 RAM and my new Trigkey mini PC. There’s a WiFi card in there, a fast little CPU, and a gigabit Ethernet port.

Why was I shopping for a memory upgrade?

I didn’t actually need this upgrade at all. The virtual machines and containers on my Intel N100 Proxmox server only eat up around 10 gigabytes of RAM. I have some room to grow into, and there’s even some superfluous stuff running that I could always eliminate if the need ever arises.

Right before rebooting after my upgrade from Proxmox 8.1 to 8.2, I noticed that my server was around 500 megabytes deep into swap after 62 days of uptime. I know well and good that my server would most likely be 500 megabytes into swap even if I had 128 GB of RAM in there, but that tickled my brain enough to search Amazon for prices.

Trigkey Mini PC

When I realized how little it would cost to just buy a second Intel Celeron N100 mini PC, it was quite the challenge to resist putting one in my Amazon shopping cart.

Except that I did resist that urge. I was staring at a $145 deal on that same Trigkey N100 mini PC with 16 GB of RAM. By the time I decided that I just had to have it, the deal was gone.

There were a lot of Celeron N100 mini PC deals in the $155 to $165 range, but I somehow convinced myself that I wasn’t going to pull the trigger until I could put one in my cart at under $150.

So what am I really saying here? I wound up buying something that I don’t really need mostly for fun, but at least I got it for under $150!

The bummer about shopping for mini PCs

If you can make it work out in your favor, this bummer can be awesome.

All the aggressively priced mini PCs that are constantly being discounted ship with RAM and storage, but they never ship anywhere near maxed out. That means that if you are going to install upgraded components, you’re going to be pulling components out, and you might not have anywhere to make use of them.

You can get barebones mini PCs with no RAM or storage installed, but most of the time, those options cost more than the mini PCs that ship with some RAM and storage!

For my own homelab purposes, and probably for many of your purposes, I believe that the Celeron N100 paired with 16 GB of RAM is a nice balance. It is fortunate that they happen to ship with a reasonable amount of RAM like this.

Trigkey N100 Mini PC

The Trigkey N100 Mini PC has the same manufacturer address as Beelink Mini PCs, and it looks almost identical in construction!

I was a fan of the N5105 and N5095 mini PCs last year, but they almost always ship with 8 GB of RAM and only 128 GB of storage. That was fine with last year’s pricing, but today you can get a faster N100 with twice as much RAM and four times as much storage for about $25 more. It is almost silly to buy the N5095 mini PCs unless you have absolutely no use for the extra RAM or storage.

The Ryzen 5560U and 5700U Beelink mini PCs have the same problem, but in the opposite direction. They offer roughly twice as much CPU horsepower and possibly four times as much GPU grunt as the Celeron N100, but those faster Ryzen mini PCs almost always ship with 16 GB of RAM. Even worse, I believe they ship with a pair of 8 GB SO-DIMMs.

How could you make this work out in your favor? Maybe you could buy an N100 with 16 GB of DDR4 RAM, and N5095 with 8 GB of DDR4 RAM, and a single 32 GB DDR4 SO-DIMM. You upgrade the N100 to 32 GB, put that 16 GB SO-DIMM in the N5095, then throw the extra 8 GB in a drawer.

I will stick with Celeron N100 mini PCs with 16 GB of RAM in my homelab for now. Maybe they’ll get an upgrade someday when DDR5 is obsolete and people are just throwing away sticks of DDR4!

The Intel Celeron N100 paired with DDR4 RAM should be slower, and that is OK!

Instead of upgrading my RAM, I added a second Celeron N100 to my homelab Proxmox cluster. Even though I didn’t precisely double the amount of CPU horsepower available in my homelab, I did come pretty close, and it didn’t cost all that much extra for that upgrade.

Much to my surprise, the DDR4 mini PC clocked in at a few percentage points ahead of the DDR5 mini PC! This isn’t a big deal. The differences are small enough that it could just be a variation in the way the wind is blowing today, but I expected this minuscule difference to be pointing in exactly the opposite direction!

Last week, my entire homelab had 16 GB of RAM paired with around 2,786 units of multi-core Geekbench 5 score. Had I swapped in a 32 GB stick of DDR5, I would still have 2,786 units of CPU horsepower.

Instead, my homelab now has 32 GB of RAM paired with 5,639 units of multi-core Geekbench score. That is the combined equivalent CPU horsepower of a $250 Ryzen 5560U mini PC

You can compare the scores of my Trigkey DDR4 vs. my CWWK DDR5 N100 mini PCs.

The differences in these benchmarks, and why Geekbench 5?!

One of the reasons that I am still using Geekbench 5 is that I own a copy. I haven’t upgraded because Geekbench 6 does not include an AES encryption test.

My personal network relies very heavily on Tailscale, and my various off-site locations are starting to have Internet connections faster than some of my machines can encrypt data. When shopping for upgrades, being able to see how much of a bump to encryption speed I might be seeing is awesome.

In most of Geekbench’s tests, my new DDR4 mini PC is faster than my old DDR5 mini PC by 2% to 5%. There are a few tests that the DDR5 box does excel at, and one of those tests is AES encryption. The more expensive N100 mini PC is ahead there by a whopping 36%.

This makes me pleased that my first homelab server has the faster RAM, because it is equipped with four 2.5 gigabit Ethernet ports. It can’t encrypt Tailscale data that fast, but the extra 36% will help use more of the potentially available pipe.

The Trigkey mini PC with DDR4 RAM only has one network port, and it is a slower gigabit Ethernet port. It can easily transfer 900 megabits per second of encrypted data via Tailscale, so that encryption deficiency isn’t going to be a real-world bottleneck for me.

Mini PCs with an N100 and DDR5 cost quite a bit more, and that is also OK!

The Beelink EQ12 Mini PC with DDR5 goes on sale frequently for $229. Never buy any of these mini PCs at full price. There will almost always be one or two sales every single week.

The Beelink N100 with DDR5 is a very different machine than the DDR4 model, so you don’t just get a potential boost in performance from the faster DDR5 RAM. You also get a pair of 2.5 gigabit Ethernet ports instead of a single gigabit Ethernet port. Depending on your use case, this is either a huge upgrade or a complete nothing burger.

My current homelab machine is even weirder. It has a Celeron N100 paired with a single stick of DDR5 RAM, but it also has four 2.5 gigabit Ethernet ports and FIVE m.2 NVMe slots! I am not yet utilizing all these extra features, but I did have to pay extra for them.

I didn’t account for the increase in my electric bill

My CWWK N100 mini PC burns through 0.34 kWh of electricity every day. That is a little over $15 per year where I live.

I won’t get to put a power meter on the Trigkey box to get a good reading until I put it into service running Proxmox, but I am expecting the numbers to be comparable. The CWWK box is burning some extra power with its four 2.5 gigabit Ethernet ports, and the Trigkey is running a fan.

They won’t be identical, but they ought to be pretty similar.

I paid $60 more to add this second mini PC to my homelab than I would have paid if I had just upgraded my CWWK mini PC’s RAM, and I will pay that much again in electricity over the next four years.

This might only work out to ACTUALLY being a good deal if I can manage to get some good use out of it!

I want to mess around with this N100 mini PC before I put it to work!

I haven’t completely filled up my Proxmox server with virtual machines yet. I have several gigabytes of RAM to spare, and I am barely pushing this tiny CPU at all. I don’t have to rush to load Proxmox on the Trigkey server to quickly migrate a couple of virtual machines to alleviate pressure. I can take my time.

I have lots of questions about the Celeron N100. How terrible is the GPU? Can it play any games? How is the latency streaming games using Steam’s remote-play feature compared to my ancient Steam Link hardware?

I have already farted around with some of this stuff, but I don’t have much recorded data to compare it to.

What little data I have so far on gaming with a Celeron N100

Dead Cells runs locally at more than 100 FPS. This isn’t terribly surprising, but I am excited about this because the Nintendo Switch dips under 30 FPS any time there’s a lot of fire on the screen. Would an N100 on my TV be my new Dead Cells console?

I tried working my way backward in time. Borderlands 2 couldn’t even reach 20 FPS on the title screen at 720p with low settings with FSR upscaling to 1080p. Borderlands GOTY Enhanced Edition could break 30 FPS with the same settings. I don’t even like playing first-person shooters at 60 FPS, so I can’t imagine doing it at 30 FPS!

Trigkey N100 Mini PC as a gaming machine

I squeezed the Trigkey N100 with Steam onto our TV stand in the living room right next to the PlayStation 3!

Gunfire Reborn is a well-optimized game with relatively simple graphics. I could get it running at over 50 FPS, but most people would consider this to be playable!

I spent ten minutes streaming games from my computer to the Celeron N100 mini PC over gigabit Ethernet. I was seeing around 8 milliseconds of streaming latency over gigabit Ethernet and 13 milliseconds with the mini PC connected to 5.8 GHz WiFi and the gaming PC connected to gigabit Ethernet. Is that good? That is roughly half of a frame at 60 FPS. My cheap Alldocube Android tablet was seeing around 24 milliseconds of latency over WiFi.

I can tell you that Dead Cells was unplayable with 24 milliseconds of latency, but Red Dead Redemption 2 only felt a little off with 70 milliseconds of latency over T-Mobile’s 5G network.

Emulation might be where it is at

I am a huge fan of emulating old gaming systems. I built an arcade cabinet with all my favorite childhood games installed. I own an Anbernic RG35XX, and I use it to play all sorts of NES and SNES games. It doesn’t take a lot of horsepower to emulate these older systems.

I have not yet tested this myself, but it looks as though the Celeron N100 has enough horsepower to emulate just about anything up to and including the Nintendo Wii U. That is exciting, because there are a few good romhacks of The New Super Mario Bros. Wii with completely new and awesome levels to play. I haven’t gotten to play those since I gave away my modded Nintendo Wii, and I bet there are fun romhacks for the Wii U that I haven’t even heard of.

My choice won’t make sense for everyone!

My chonkiest VM has 3 GB of RAM allocated to it, and the majority of the services I might want to run here at home would probably fit in 1 GB or less. That makes 16 GB feel spacious, and it isn’t a big deal for me to split my virtual machines up between two or more hosts with only 16 GB of memory each.

Stable Diffusion Geeky Dude

Someone with different needs might feel claustrophobic in this sort of space. If you have lots of virtual machines that each require more than 8 GB of RAM, then you’re going to have a hard time divvying those up between Proxmox servers with only 16 GB of RAM.

Just because this was a good deal for me does not mean it will be a good idea for you, but even if it isn’t, maybe the line of thinking that got me here will lead you down a comparable path.

Conclusion

This is for sure not the conclusion. This is only the beginning, right?! I don’t know what experiments outside of emulation and video games I will come up with for this Intel N100 mini PC from Trigkey, but I expect I will come up with something!

I haven’t decided where its permanent home will be. Should it live right next to the CWWK N100 mini PC underneath my network cupboard? Should it live in my office so it can marginally streamline my Octoprint setup for my Sovol SV06 3D printer? It would be nice to attach another 14 TB USB hard disk to it so I could have a second on-site backup of my data on the exact opposite side of the house.

What do you think? Did I make the right move by buying a second mini PC instead of a RAM upgrade? Are you now thinking about doing the same thing to upgrade your homelab? Is it disingenuous to say this was cheaper than buying a stick of RAM? I still feel like I saved money, but am I wrong about that? Let me know in the comments, or stop by the *Butter, What?! Discord server to tell me why I am wrong!

Please DO Share Our Links on Mastodon - Heres Why!

| Comments

I don’t have a ton to say here. This title came to me as soon as I saw the opposite topic posted on It’s FOSS News. When I saw that their complaint might have been about only 15,000 and 115 megabytes of traffic all coming in over the span of a few seconds, I figured I should post something. Maybe you will all share it on Mastodon to try to make my site break for a little while!

I have no real content to post today, so here is a picture of my adorable cat. Her name is little Miss Trudy Judy. She is quite friendly!

Little Miss Trudy Judy the cat

I can at least tell you why you should share our links on Mastodon. You should share them wherever you like. All our blogs are sitting behind Cloudflare, and I am pretty certain that I have patshead.com just about as heavily cached as Cloudflare will let me.

It has been quite a few years since I dialed up all the caching settings, so I hope they are still working correctly. We might just find out together if you share this post on Mastodon!

For good measure, here is a Stable Diffusion image of a guy who must be working hard to keep my little virtual private server with its massive 512 megabytes of RAM online.

Stable Diffusion Power Guy

I hope you are having a good day. I have been trying to do a better job of posting on my Mastodon account. It would probably encourage me to post more if you happened to give me a follow!