Wanting more than the 16GB of VRAM on an RTX 5080 for local models, the author added a refurbished RTX 3090 (24GB) and got Qwen 3.6 27B at Q8 running past 80 tokens a second, up from around 30 on a single card at Q4.
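A rough back-of-envelope sketch of why the second card matters. The bits-per-weight figures and the 2GB headroom are assumptions for illustration, not numbers from the write-up (real quantized files vary by format and runtime), and the sizes cover weights only, not the KV cache or activations.

```python
# Rough VRAM arithmetic for quantized model weights.
# ASSUMPTION: ~4.5 and ~8.5 bits per weight for Q4 and Q8 (approximate,
# block-quantization overhead included); the source does not state these.
BITS_PER_WEIGHT = {"Q4": 4.5, "Q8": 8.5}

# Nominal VRAM per card in GB.
VRAM_GB = {"RTX 5080": 16, "RTX 3090": 24}


def weights_gb(params_billion, quant):
    """Approximate size of the weights alone, in decimal GB."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits(params_billion, quant, vram_gb, headroom_gb=2.0):
    """True if weights plus a hypothetical headroom allowance fit in vram_gb."""
    return weights_gb(params_billion, quant) + headroom_gb <= vram_gb


if __name__ == "__main__":
    pooled = sum(VRAM_GB.values())
    for quant in ("Q4", "Q8"):
        size = weights_gb(27, quant)
        print(f"27B at {quant}: ~{size:.1f} GB of weights, "
              f"fits in 24 GB: {fits(27, quant, 24)}, "
              f"fits in {pooled} GB pooled: {fits(27, quant, pooled)}")
```

Under these assumptions a 27B model at Q8 needs close to 29GB for weights alone, which is more than either card holds by itself but comfortably inside the pooled total.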
The real content is the plumbing. Splitting one PCIe x16 slot into two x8 needs a board that supports it (an Asus Prime X570-Pro here), plus three non-obvious BIOS changes: disable CSM so the system boots in UEFI mode, enable Above 4G Decoding, and turn on Resizable BAR. Miss the boot-mode step (leaving CSM on) and the two cards refuse to cooperate.
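One way to sanity-check those settings after a reboot, sketched under the assumption of a Linux host (the write-up does not say which OS was used). It reads standard sysfs entries: whether `/sys/firmware/efi` exists, each GPU's negotiated PCIe link width, and its largest BAR. Treating a BAR above 256 MiB as a sign that Resizable BAR is active is a heuristic, not a guarantee, and the function names are made up for this example.

```python
from pathlib import Path

REBAR_HINT_BYTES = 256 * 2**20  # heuristic threshold, an assumption


def booted_in_uefi(sys_root="/sys"):
    """The kernel exposes /sys/firmware/efi only when booted via UEFI."""
    return (Path(sys_root) / "firmware" / "efi").is_dir()


def largest_bar_bytes(device_dir):
    """Size of the largest region listed in a PCI device's sysfs 'resource' file."""
    largest = 0
    for line in (Path(device_dir) / "resource").read_text().splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        start, end = int(fields[0], 16), int(fields[1], 16)
        if end > start:  # unused regions are all zeros
            largest = max(largest, end - start + 1)
    return largest


def gpu_report(sys_root="/sys"):
    """Link width and largest BAR for every display-class PCI device."""
    devices = Path(sys_root) / "bus" / "pci" / "devices"
    report = []
    if not devices.is_dir():
        return report
    for dev in sorted(devices.iterdir()):
        try:
            # PCI class codes starting with 0x03 are display controllers.
            if not (dev / "class").read_text().strip().startswith("0x03"):
                continue
            width = int((dev / "current_link_width").read_text().strip())
            bar = largest_bar_bytes(dev)
        except (OSError, ValueError):
            continue
        report.append({"address": dev.name, "link_width": width,
                       "largest_bar": bar,
                       "rebar_likely": bar > REBAR_HINT_BYTES})
    return report


if __name__ == "__main__":
    print("UEFI boot:", booted_in_uefi())
    for gpu in gpu_report():
        print(gpu)
```

With both cards seated and the slot split working, each GPU would be expected to report a link width of 8; note that some GPUs narrow or slow the link when idle, so a reading taken under load is more trustworthy.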
Forty gigabytes of mixed VRAM, much of it from one refurbished card, puts Q8-quality local inference within reach of an ordinary desktop. That reads rather differently on a day a vendor pulled two models from everyone with no notice.