u/SuddenPenalty8153

Hi everyone,

I'm planning a major core upgrade for my workstation/home lab, and I’m torn between two completely different paths. My main use cases are Local LLM deployment (Ollama/Agents) and potentially setting up a "2 Gamers 1 CPU" virtualized environment using Proxmox (GPU passthrough).

Right now, I own an RX 9060 XT (16GB VRAM), and I have access to an RX 9070 XT (16GB VRAM). Crucial context: The 9070 XT is not mine, but I am allowed to use it indefinitely on the strict condition that I keep it active in a functional PC.

(And before anyone suggests it: NO, I do not have the money to jump to AM5. This entire project is strictly based on flipping second-hand AM4 parts to keep costs down).

I am evaluating these two options based strictly on net performance and system versatility:

1. Go the Dual AMD route (The 1-Host Project): Build a single-host Proxmox server using both cards. This fulfills the condition to keep the borrowed 9070 XT active. To make this work, I would have to replace my core system with an AM4 workstation board that supports native PCIe Gen 4 x8/x8 bifurcation (ASUS Pro WS X570-ACE), a 16-core Ryzen 9 5950X, and an 850W+ PSU to handle transient power spikes. Selling the parts I am replacing (my old board, CPU, and PSU) finances about 53% of this upgrade.
2. Go the NVIDIA route (The Split Project): Abandon the single-host virtualization project entirely. I would sell my RX 9060 XT and buy a used RTX 3090 (24GB) for my main AI rig. However, because of the agreement regarding the borrowed RX 9070 XT, I cannot just leave it on a shelf while waiting for the current "Ramageddon" market madness to pass; I would have to build a separate, dedicated gaming PC around it immediately using cheaper/spare parts. Selling the 9060 XT alone finances roughly 70% of the RTX 3090 purchase.
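For the single-host option, whether both GPUs can be passed through cleanly depends on how the board splits devices into IOMMU groups. Here is a minimal sketch I would run on the Proxmox host to check that, assuming a Linux host with IOMMU enabled and the standard sysfs layout under /sys/kernel/iommu_groups. The function name list_iommu_groups is my own, not part of any tool.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the PCI addresses inside it.

    Assumes the usual Linux sysfs layout: <root>/<group>/devices/<pci address>.
    Returns an empty dict if the directory is missing (IOMMU off, or not Linux).
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    groups = {}
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    # Sort by group number so the printout is easy to scan.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        print(f"group {group}: {', '.join(devices)}")
```

Ideally each GPU (plus its own audio function) shows up in a group of its own. If a GPU shares a group with unrelated devices, passing it to a VM gets a lot messier.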

The Transitional Strategy (Dual AMD's silver lining): Going the Dual AMD route right now would buy me some valuable time. It allows me to establish the system topology, keep the borrowed GPU running, and deploy my workloads immediately, giving me breathing room to wait for the market to stabilize. From there, I could comfortably plan my next upgrade path: either replacing my 9060 XT down the road with a second 9070 XT for a perfectly balanced dual-AMD stack, or jumping straight to dual RTX 3090s if the software ecosystem demands it.

Here is where I need your technical insight regarding Performance and Versatility:

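VRAM Capacity vs. Compute Speed & Latency: The Dual AMD setup gives me a combined 32GB of VRAM, which puts 70B models within reach locally. The catch is that a Q4 quant of a 70B model is roughly 40GB of weights, so in practice that means a lower-bit quant (around 3 bits per weight or less) or offloading some layers to system RAM. On top of that, tensors will be split across two cards over a x8/x8 bus, which means inter-GPU latency penalties and a heavy reliance on ROCm or Vulkan backends.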
The 3090 Route: Limits me to 24GB of VRAM (topping out at 32B/34B models without offloading), but gives me 936 GB/s of memory bandwidth on a single card, native CUDA compatibility out of the box for all agent frameworks, and zero multi-GPU overhead.
Versatility vs. "Jank" Factor: The single-host Proxmox build offers the versatility of running a consolidated home server, but comes with the heavy maintenance tax of 4x16GB DDR4 stability issues on AM4 (Daisy Chain topology), IOMMU group headaches, and kernel-level anti-cheat blocks (Vanguard/Valorant) for the gaming VMs. The 3090 route limits immediate model size but offers software ecosystem versatility (Docker, PyTorch, stable agents) and zero VM maintenance fatigue.
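To sanity-check the capacity and bandwidth trade-off, here is a back-of-envelope sketch. It rests on assumptions of mine: about 4.5 bits per weight for a Q4-class quant, a flat 2GB allowance for KV cache and runtime overhead, and the common rule of thumb that decoding a dense model reads every weight once per token, so memory bandwidth divided by model size gives an upper bound on tokens per second. Real numbers will vary with the quant format, context length, and backend.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion, bits_per_weight, vram_gb, overhead_gb=2.0):
    """True if weights plus a flat overhead allowance (an assumption) fit in VRAM."""
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


def decode_ceiling_tok_s(bandwidth_gb_s, model_gb):
    """Rough upper bound on tokens/s when decoding is memory-bandwidth bound."""
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    q4_bpw = 4.5  # assumed bits per weight for a Q4-class quant
    for name, params, vram in [("70B on 2x16GB", 70, 32), ("32B on 24GB", 32, 24)]:
        size = weights_gb(params, q4_bpw)
        print(f"{name}: ~{size:.1f} GB of weights, fits={fits(params, q4_bpw, vram)}")
    # 936 GB/s is the RTX 3090 figure quoted in this post.
    ceiling = decode_ceiling_tok_s(936, weights_gb(32, q4_bpw))
    print(f"32B Q4 on a 3090: at most ~{ceiling:.0f} tok/s")
```

By this estimate a 70B model at about 4.5 bits per weight needs around 39GB for weights alone, so on 32GB it would need a lower-bit quant or partial CPU offload, while a 32B model at the same quant sits comfortably inside 24GB.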

Given the financial picture (53% of the dual-GPU infrastructure covered by selling old parts vs. 70% of the 3090 covered by selling the extra GPU), which path offers the best balance of raw AI performance and long-term deployment versatility? Is the ability to reach 70B models by splitting them across two AMD cards worth the virtualization overhead and ROCm friction compared to a clean single-GPU RTX 3090 setup?
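One caveat on those percentages: they only compare fairly once you multiply them by the actual totals, and the NVIDIA route also carries the cost of the separate gaming PC. A tiny sketch of that arithmetic is below. The function name out_of_pocket is mine and the prices are placeholders, not real quotes.

```python
def out_of_pocket(total_cost, subsidized_fraction, extra_cost=0.0):
    """Cash still needed after resale proceeds cover part of the purchase.

    extra_cost covers anything the resale does not touch, e.g. the spare-parts
    gaming PC that the NVIDIA route requires for the borrowed 9070 XT.
    """
    if not 0.0 <= subsidized_fraction <= 1.0:
        raise ValueError("subsidized_fraction must be between 0 and 1")
    return total_cost * (1.0 - subsidized_fraction) + extra_cost


if __name__ == "__main__":
    # Placeholder totals only; swap in real local prices.
    dual_amd = out_of_pocket(total_cost=1000.0, subsidized_fraction=0.53)
    nvidia = out_of_pocket(total_cost=1000.0, subsidized_fraction=0.70, extra_cost=300.0)
    print(f"Dual AMD route: {dual_amd:.2f}")
    print(f"NVIDIA route:   {nvidia:.2f}")
```

Depending on what the second PC costs, the route with the higher subsidy percentage can still end up being the more expensive one.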

Would love to hear your thoughts, especially from those who run multi-GPU AMD setups for inference or those who abandoned "2 Gamers 1 CPU" builds due to upkeep fatigue.

Thanks!

Single RTX 3090 (24GB) vs. Dual AMD (16GB + 16GB) for Local LLMs? Performance vs. Versatility in a "2 Gamers 1 CPU" dilemma.
