Has anyone tried sharing a GPU server instead of everyone renting their own?
Has anyone tried running a shared open-model server instead of everyone renting their own GPU?
Instead of spinning up separate RunPod/Vast instances, I'm wondering whether it makes more sense to run one larger GPU server with Qwen/DeepSeek/GLM etc. loaded, then let multiple people use it with rate limits and queues.
Kind of like joining a game server instead of renting your own.
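To make the "rate limits" part concrete, here is a minimal sketch of a per-user token bucket, which is one common way to cap request rates. The class name, fields, and numbers are all made up for illustration and are not taken from any particular inference server. Time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical per-user limiter: one instance per user of the shared server."""
    rate: float          # tokens refilled per second (assumed value, tune per user)
    capacity: float      # maximum burst size
    tokens: float = 0.0  # tokens currently available
    last: float = 0.0    # timestamp of the previous call, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at the burst capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        # Admit the request only if enough tokens remain to pay for it.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One request per second sustained, with bursts of up to two.
bucket = TokenBucket(rate=1.0, capacity=2.0, tokens=2.0)
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

Rejected requests could go into a queue instead of being dropped, which is where the fairness question comes in.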
My assumption is that most people's workloads are bursty enough that overall GPU utilization would be much higher on one shared server than across separate, mostly idle instances.
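A back-of-envelope way to check that assumption, as a rough sketch: model each user as independently active some fraction of the time, and compare a private GPU per user with a shared pool of a few slots. The numbers (10 users, 10% duty cycle, 3 slots) are invented for illustration, and real workloads are unlikely to be independent.

```python
from math import comb


def p_more_than_k_active(n_users: int, duty: float, k: int) -> float:
    """Probability that more than k of n independent users are active at once
    (binomial model, an assumption made for this sketch)."""
    return sum(
        comb(n_users, i) * duty**i * (1 - duty) ** (n_users - i)
        for i in range(k + 1, n_users + 1)
    )


n_users, duty, slots = 10, 0.1, 3  # illustrative values only

# With one private GPU each, every GPU is busy only `duty` of the time.
dedicated_util = duty

# On the shared pool, mean demand is spread over fewer slots.
# This ignores requests that have to wait, so it is an approximation.
shared_util = n_users * duty / slots

# Chance that demand exceeds the pool at a given moment, i.e. someone queues.
contention = p_more_than_k_active(n_users, duty, slots)

print(f"dedicated: {dedicated_util:.0%}, shared: {shared_util:.0%}, "
      f"contention: {contention:.1%}")
```

Under these made-up numbers, utilization goes from 10% to roughly a third, while someone has to queue a little over 1% of the time. Correlated usage (everyone online in the evening) would make contention worse than this model suggests.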
Has anyone done this? If not, what's the biggest downside: privacy, noisy neighbours, latency, fairness, or something else?