
Finally got my hands on a Mac Studio 96GB (running Gemma 31B locally). Feedback on this setup?
Been running a few Mac minis at home for OpenClaw, but wanted to try moving more of my day-to-day AI workloads off subscription-based services and onto local models. Ever since Anthropic pushed OpenClaw out of the subscription, using Opus / Sonnet has been way too costly. Even though local models are not there yet, I think they'll get there soon. It felt like the right time to actually invest in the hardware.
Managed to find a Mac Studio 96GB on eBay. Not cheap (4.5k), but I couldn't find one anywhere else.
So far it runs Gemma 3 pretty well. Really happy with it for most things, except that prefill is a bit slow.
The model I really want to run locally is DeepSeek 4, after the new inference engine from this week. But from what I understand, you need at least 128GB of unified memory to do that properly, so the 96GB is a bit of a bottleneck there.
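For anyone who wants to sanity-check the memory question, here's a rough back-of-envelope sketch. It only counts the weights (parameters times bits per weight) plus a flat allowance for KV cache and runtime overhead. The `usable_fraction` and `overhead_gb` defaults are placeholder assumptions, not measured values, and the parameter counts you plug in are up to you, so treat the output as a ballpark and not a guarantee.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         total_memory_gb: float,
         usable_fraction: float = 0.75,   # assumption: share of unified memory available to the model
         overhead_gb: float = 8.0) -> bool:  # assumption: flat allowance for KV cache and runtime
    """Rough check: do weights plus overhead fit in the usable memory budget?"""
    budget_gb = total_memory_gb * usable_fraction
    return weights_gb(params_billions, bits_per_weight) + overhead_gb <= budget_gb


if __name__ == "__main__":
    # A 31B-parameter model at common quantization levels
    for bits in (4, 8, 16):
        size = weights_gb(31, bits)
        print(f"31B @ {bits}-bit: ~{size:.1f} GB weights, "
              f"fits in 96GB: {fits(31, bits, 96)}, fits in 128GB: {fits(31, bits, 128)}")
```

Swap in the parameter count and quantization of whatever model you're eyeing to see how close 96GB vs 128GB gets you. Long contexts grow the KV cache, so the flat overhead number is the weakest part of this estimate.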
Anyone else running a similar setup? What are you running on yours and is the jump to 128GB+ actually worth it for local inference?