Enabling Transparent Hugepages Can Provide Huge Gaming Performance Improvements


My gaming rig is getting rather long in the tooth. I am running a slightly overclocked Ryzen 1600 and an aging Nvidia GTX 970 with its thermal limit pushed to its maximum. I wouldn’t even be able to play any games from the last few years if it weren’t for Proton-GE’s ability to enable AMD’s FidelityFX Super Resolution (FSR) in almost every game I play.

I haven’t done a ton of science. I don’t have a handy way to benchmark most games. I did run a Borderlands 3 benchmark with my potato settings. I have nearly every knob turned to the lowest setting, and I bet I have some extra things disabled in config files. I run Borderlands 3 at 1280x720 with FSR upscaling to 2560x1440, and my hope is that the game can stay above my monitor’s 102 Hz refresh rate. It doesn’t always stay that high.

NOTE: I backed off the overclock of my aging QNIX QX2710 monitors while replaying Borderlands 3. I dropped them down to 86 Hz, and I will probably keep them here until my next monitor and GPU upgrade. It is easier to hit 86 frames per second in newer games, and it is enough of a step up from 60 Hz that I don’t feel too bad about giving up the extra frames. Why I landed on 86 is probably a long enough story for its own blog post. Can you believe these crazy monitors are still working great nine years later?

Borderlands 3 Benchmark

The benchmark came in at 92 frames per second with Transparent Hugepages (THP) effectively disabled. Ubuntu’s default setting is madvise, which only hands out hugepages to programs that explicitly ask for them. That went up to just over 99 frames per second when I set THP to always.

Your mileage will most definitely vary, but when you’re constantly dropping just below your monitor’s refresh rate, that 8% improvement is huge! It is easy and free to give it a try:

pat@zaphod:~$ echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
[sudo] password for pat: 
always
pat@zaphod:~$ 

That command won’t do anything permanent. You will be back to the default setting next time you reboot.
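Before flipping the switch, it helps to know what the setting was so you can put it back without rebooting. Here is a minimal sketch, assuming the standard sysfs path used in the command above. The helper name thp_mode is made up for this example. The kernel lists every mode in that file and wraps the active one in square brackets.

```shell
# Standard sysfs location of the THP switch on Linux.
THP_FILE=/sys/kernel/mm/transparent_hugepage/enabled

# Print only the active mode. The file looks like "always [madvise] never",
# and the bracketed word is the one in effect.
# thp_mode is a hypothetical helper; it takes an optional file path.
thp_mode() {
    sed -E 's/.*\[([a-z]+)\].*/\1/' "${1:-$THP_FILE}"
}

if [ -r "$THP_FILE" ]; then
    echo "current THP mode: $(thp_mode)"
fi

# To undo the change without rebooting, write the old value back, e.g.:
#   echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
```

If you decide you want the setting to survive a reboot, one option is adding transparent_hugepage=always to the kernel command line. How you edit the command line depends on your bootloader, so treat that as a pointer rather than a recipe.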

What are Transparent Hugepages? Why do they help performance?

Let’s keep this to two or three short paragraphs. Your software usually allocates memory in 4 KB pages, and your CPU has to keep track of which physical locations on your sticks of RAM correspond to those 4 KB pages. The CPU has a small cache of recently used page mappings, called the translation lookaside buffer (TLB). If your game is flipping through more pages than fit in that cache, things will slow down.

Hugepages are usually 2 MB instead of 4 KB. That means the CPU has to keep track of only a tiny fraction of those mappings. It is sort of like having a TLB that is suddenly 500 times larger.

When something is in the cache, it is just like when an item is on the shelf at the store. When something isn’t in the cache, you have to ask an employee to fetch the item from the back room. Every time something isn’t on the shelf, you have to wait. Just like the CPU.
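The "500 times" figure is just the ratio of the two page sizes. Here is the arithmetic as a quick sketch. The 1 GB working set is a made-up number for illustration, not something measured from a real game.

```shell
small_page=$((4 * 1024))             # 4 KB page, in bytes
huge_page=$((2 * 1024 * 1024))       # 2 MB hugepage, in bytes
working_set=$((1024 * 1024 * 1024))  # hypothetical 1 GB of game data

# How many 4 KB pages does one hugepage replace?
ratio=$((huge_page / small_page))

# How many mappings are needed to cover the working set each way?
small_count=$((working_set / small_page))
huge_count=$((working_set / huge_page))

echo "one hugepage covers $ratio small pages"
echo "1 GB needs $small_count small pages or $huge_count hugepages"
```

So the exact ratio is 512, and a 1 GB working set drops from 262144 mappings to 512.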

THP have been a HUGE boost to my Team Fortress 2 performance!

Team Fortress 2 on Linux is stuck in a stupid place right now. The game uses a modern enough version of DirectX on Windows to work well with modern graphics hardware, but it is stuck using OpenGL on Linux. Since it is a multiplayer game, they don’t let us run the Windows version under Proton to get a performance boost. Valve have updated Portal 2 and Left 4 Dead 2 to use DXVK on Linux, and I hope they do the same for Team Fortress 2, but I am definitely not holding my breath.

Team Fortress 2 on Linux needs a lot of single-threaded CPU grunt, and I have always had trouble keeping the game up at my monitor’s 102 Hz. This is another one of those things I can’t easily benchmark.

NOTE: Not much going on in the video. I had OBS running a replay buffer, but this was the only time I remembered to hit the key to save a replay!

The game runs fine until I walk into a busy fire-fight on a server with tons of fancy hats and lots of explosions and effects. Then my frame rate drops far enough below my refresh rate that the game stops feeling smooth and I start having trouble landing pills with my demoman.

Enabling THP has helped dramatically with TF2. As far as I can tell, I have yet to drop below 102 frames per second, and I certainly haven’t dropped as low as my new 86 Hz refresh rate.

Quite a while ago I used mastercomfig.com to generate some potato settings for my game. The settings went so far that the weird cubic lighting made the game sort of resemble Minecraft. I am still using mastercomfig.com to lower my settings, but I have backed off several notches from the potato-grade settings.

It is a bummer that I have to play this ancient game with my GPU so underutilized that it sits clocked at the minimum frequency, but I am super stoked that I can play without my frame rates helping me to lose!

Will THP help with other games?

As I said, I am not using a ton of science here. I was playing through Dying Light when I learned that THP might help gaming performance. My unscientific test there was loading the game, waving the camera around in the room where I spawned, then reloading the game with THP enabled and doing the same thing. The numbers seemed to be leaning at least 5% higher, but we are just going by my memory between reloads and hoping I pointed the camera at similar things.

Some games need more CPU. Some games need more GPU. Some settings lean more on one than the other. Even after that, things will depend on how much CPU and GPU your machine has. Some games could run slower, though I don’t think I have seen that yet. Some games might run the same. Some games might run a little better.

The only way to find out is to try.

THP can cause performance issues

There are reasons that the Linux kernel doesn’t enable transparent hugepages by default. Some programs run extremely poorly or cause problems with THP enabled, the most famous of which is probably PostgreSQL.

I have been running THP on my desktop for a couple of weeks now. I haven’t rebooted in nearly two months. I have had one hiccup so far. I wandered into my office and noticed that my glances window had a red process using 100% of a CPU core. It was khugepaged. Its job is to scan memory and collapse runs of 4 KB pages into 2 MB hugepages.

In my haste, I didn’t see the root cause of my problem right away. I figured my web browser was my longest-running process that uses a large amount of RAM, so I closed and reopened Firefox. The problem went away for a few minutes, but then it was back.

It turned out that when I closed DaVinci Resolve the night before, it didn’t actually completely shut down. There were no windows visible, but there were processes eating up memory and using a very small but constant amount of CPU. I killed Resolve and haven’t seen khugepaged since. That was a few days ago.
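If you want to keep an eye on this yourself, here is a rough sketch of two checks: how much memory is currently backed by hugepages, and whether khugepaged is busy. It assumes a Linux box with /proc mounted and the procps version of ps. The helper name anon_huge_kb is made up for this example.

```shell
# Print the AnonHugePages value (in kB) from a meminfo-style file.
# Defaults to the system-wide /proc/meminfo.
# anon_huge_kb is a hypothetical helper name.
anon_huge_kb() {
    awk '/^AnonHugePages:/ { print $2 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "system-wide AnonHugePages: $(anon_huge_kb) kB"
fi

# Show khugepaged's CPU usage, if the kernel thread exists and ps supports -C.
ps -C khugepaged -o pid,pcpu,comm 2>/dev/null || echo "khugepaged not found"
```

The same helper should work on a single process by pointing it at /proc/PID/smaps_rollup, which reports an AnonHugePages line on reasonably recent kernels. That is a handy way to find out whether a particular game or a leftover background process is the one holding hugepages.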

Conclusion

I know some of you are rocking much newer GPUs than my GTX 970, and you probably don’t need to wrestle an extra 5% out of your games. I am glad GPU prices are getting better, but I paid $340 for this GPU within a week or so of release, and it was the second fastest available. More modern cards that perform roughly as well cost almost as much. Prices are getting better, but I feel like I will get quite a bit more bang for my buck if I can hold out on my next upgrade a little while longer.

If you need to squeeze a little extra out of your aging gaming rig, you should most definitely try enabling transparent hugepages. It is easy to try, easy to undo, and it seems very unlikely that it would have a negative impact on your gaming performance.
