TL;DR

Linux users can now leverage Nvidia GPU VRAM as swap space via a user-space daemon that uses CUDA APIs. This approach is compatible with consumer GPUs and offers an alternative to traditional swap methods, potentially increasing effective memory capacity.

Linux users with Nvidia GPUs can now repurpose VRAM as swap space through a new user-space daemon that bypasses kernel and driver limitations, expanding available memory on hybrid laptops and desktops.

The method involves a daemon that allocates VRAM via the CUDA driver API and exposes it as a block device using the NBD protocol. That device is then registered as swap space like any other Linux swap device. The approach works with consumer Nvidia GPUs that support CUDA and needs no custom kernel modules or kernel modifications, which makes it resilient to driver and kernel updates. On the test machine, an AMD/ATI + RTX 3070 laptop with 16 GB RAM and 8 GB VRAM, the setup allocated 7 GB of VRAM for swap. Combined with zram and SSD swap, that brought total addressable memory to approximately 46 GB, roughly triple the physical RAM.
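To make the NBD step more concrete, here is a minimal Python sketch of the request and reply framing that any NBD server has to handle. The constants and field layout follow the public NBD protocol specification; the function names are hypothetical and are not taken from the tool, whose actual implementation language and structure are not described in the source.

```python
import struct

# Constants from the public NBD protocol specification.
NBD_REQUEST_MAGIC = 0x25609513
NBD_SIMPLE_REPLY_MAGIC = 0x67446698
NBD_CMD_READ, NBD_CMD_WRITE, NBD_CMD_DISC = 0, 1, 2

# Big-endian layouts: a request header is 28 bytes, a simple reply header is 16.
REQUEST_FORMAT = ">IHHQQI"  # magic, flags, command type, handle, offset, length
REPLY_FORMAT = ">IIQ"       # magic, error code, handle


def parse_request(header: bytes) -> dict:
    """Decode one NBD request header (hypothetical helper name)."""
    magic, flags, cmd, handle, offset, length = struct.unpack(REQUEST_FORMAT, header)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("not an NBD request header")
    return {"flags": flags, "cmd": cmd, "handle": handle,
            "offset": offset, "length": length}


def build_reply(handle: int, error: int = 0) -> bytes:
    """Encode a simple reply header; read data, if any, follows it on the wire."""
    return struct.pack(REPLY_FORMAT, NBD_SIMPLE_REPLY_MAGIC, error, handle)
```

Each read or write request carries a byte offset and a length, so a server in this style only needs a backing store that can copy a byte range in or out. In the setup described here, that backing store would be the VRAM allocation.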

The implementation sidesteps the typical limitations encountered with Nvidia’s peer-to-peer (P2P) API, which restricts direct VRAM access on consumer GPUs. Instead, it uses CUDA memory copy operations (cuMemcpyHtoD and cuMemcpyDtoH) to read and write VRAM, which are supported without special permissions. The setup involves cloning the repository, installing the daemon, and configuring systemd services for automatic startup and power-aware management. Benchmarks show that while VRAM-based swap is slower than NVMe for sequential transfers, it offers significantly lower latency for sporadic access, making it suitable for specific use cases.
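The copy-based access pattern can be illustrated with a small Python sketch. It is not the daemon's code: a `bytearray` stands in for the VRAM allocation so the example runs without a GPU, and the class and method names are hypothetical. Comments mark where a real implementation would presumably call `cuMemcpyHtoD` and `cuMemcpyDtoH`.

```python
class VramBlockStore:
    """Byte-addressable store standing in for a VRAM allocation (illustrative only)."""

    def __init__(self, size: int) -> None:
        self.size = size
        # Assumption: a real daemon would hold a CUDA device pointer here instead.
        self._device = bytearray(size)

    def _check(self, offset: int, length: int) -> None:
        # Reject requests that fall outside the allocated region.
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("request outside device bounds")

    def write(self, offset: int, data: bytes) -> None:
        """Host-to-device copy; the GPU-backed equivalent would use cuMemcpyHtoD."""
        self._check(offset, len(data))
        self._device[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        """Device-to-host copy; the GPU-backed equivalent would use cuMemcpyDtoH."""
        self._check(offset, length)
        return bytes(self._device[offset:offset + length])
```

In a design like this every swap page crosses the bus as an explicit copy rather than a direct mapping, which is the trade-off that lets the approach avoid the P2P restrictions on consumer cards.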

Why It Matters

This development could significantly impact users of hybrid and portable laptops, enabling them to extend effective memory capacity without hardware upgrades. It offers a new way to utilize otherwise idle VRAM, potentially improving performance in memory-constrained environments. Additionally, it demonstrates a workaround for Nvidia driver limitations, opening possibilities for further GPU-based system enhancements.

Background

Traditional swap on Linux relies on disk or SSD devices, which can be slow and limited in capacity. Recent efforts to use GPU memory as swap have been hindered by Nvidia driver restrictions, particularly with consumer GPUs. This new approach utilizes CUDA APIs that are supported across many Nvidia consumer cards, providing a practical alternative. The concept builds on existing knowledge of using user-space tools and NBD devices to extend system memory, but applies it specifically to GPU VRAM, which is typically underutilized in portable systems.

“This method allows Linux users to harness idle Nvidia VRAM as high-priority swap, bypassing kernel and driver limitations.”

— Developer of the tool

“Using GPU VRAM as swap could extend usable memory on hybrid laptops without hardware upgrades, improving performance in constrained environments.”

— Linux system administrator

What Remains Unclear

It is not yet clear how this approach performs under sustained heavy load or in different hardware configurations. Compatibility with all Nvidia consumer GPUs and across various Linux distributions remains to be fully tested. Additionally, the long-term stability and power management implications are still being evaluated.

What’s Next

Further testing across diverse hardware setups is expected, along with potential integration into mainstream Linux distributions. Developers may also explore optimizing performance and reducing overhead, as well as extending support for other GPU vendors or configurations.

Key Questions

Can I use this method with any Nvidia GPU?

It works with Nvidia GPUs supporting CUDA, including most consumer RTX and GTX cards, but performance and compatibility may vary depending on the specific model and driver version.
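One way to check whether a machine exposes a CUDA-capable device is to query the driver library directly. The sketch below is a hedged illustration, not part of the tool: it assumes the driver is installed as `libcuda.so.1` on the default library path, and the function name is hypothetical. It uses the CUDA driver API calls `cuInit` and `cuDeviceGetCount`.

```python
import ctypes
from typing import Optional


def cuda_device_count() -> Optional[int]:
    """Return the number of CUDA devices, or None if the driver cannot be used."""
    try:
        # Assumption: the Nvidia driver library is installed under this name.
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None
    # Both calls return 0 (CUDA_SUCCESS) on success.
    if libcuda.cuInit(0) != 0:
        return None
    count = ctypes.c_int(0)
    if libcuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value
```

A result of `None` or `0` suggests the method described here would not apply to that machine. A positive count only shows that a CUDA device is visible, not that the daemon has been tested on it.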

Will using VRAM as swap impact GPU performance?

Potentially, especially under high load, since VRAM reserved for swap is no longer available to graphics workloads. As a swap device, the benchmarks show VRAM is slower than NVMe for sequential transfers but has lower latency for sporadic access.

Is this safe for my hardware?

As it uses supported CUDA APIs and user-space tools, it is generally safe, but users should monitor system stability and avoid over-allocating VRAM, which could impact GPU performance or stability.

How do I install and configure this tool?

Clone the repository, run the install script, and configure systemd services as documented. The setup is straightforward but requires compatible Nvidia drivers and Linux kernel support.

Will this work with integrated graphics only?

No, it requires an Nvidia GPU with CUDA support. Integrated AMD or Intel graphics cannot utilize this method.

Source: Hacker News
