u/Kitchen_Breakfast_49

Been running a few Mac minis at home for OpenClaw, but wanted to try moving more of my day-to-day AI workloads off subscription-based services and onto local models. Ever since Anthropic pushed OpenClaw out of the subscription, using Opus / Sonnet is way too costly. Even though local models aren't there yet, I think they'll get there soon. It felt like the right time to actually invest in the hardware.

Managed to find a Mac Studio 96GB on eBay. Not cheap (4.5k), but couldn't find one anywhere else.

So far it runs Gemma 3 pretty well. Really happy with it for most things, except that prefill is a bit slow.

The model I really want to run locally is DeepSeek 4, now that the new inference engine came out this week. But from what I understand, you need at least 128GB of unified memory to do that properly, so the 96GB is a bit of a bottleneck there.
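For anyone who wants to sanity-check the memory math, here's the rough back-of-the-envelope sketch I use. It only counts the weights (parameters × bits per weight / 8) and then reserves a chunk of unified memory for the OS, KV cache, and everything else. The 25% reserve and the 150B parameter count are made-up illustrative values, not real specs for DeepSeek or any other model, so plug in your own numbers.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 params * (bits / 8) bytes per param / 1e9 bytes per GB
    return params_billions * bits_per_weight / 8


def fits_in_memory(
    params_billions: float,
    bits_per_weight: float,
    memory_gb: float,
    reserve_fraction: float = 0.25,  # ASSUMPTION: headroom for OS + KV cache
) -> bool:
    """Rough check: do the weights fit in what is left after the reserve?"""
    usable_gb = memory_gb * (1 - reserve_fraction)
    return weights_gb(params_billions, bits_per_weight) <= usable_gb


if __name__ == "__main__":
    # Hypothetical 150B-parameter model at 4-bit quantization (illustrative only).
    for memory_gb in (96, 128):
        ok = fits_in_memory(150, 4, memory_gb)
        print(f"{memory_gb}GB: weights ~{weights_gb(150, 4):.1f} GB, fits={ok}")
```

Real usage depends heavily on context length, the runtime, and how much memory the system actually lets the GPU use, so treat this as a first-pass filter rather than a guarantee.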

https://preview.redd.it/bdg5oe8uyz0h1.png?width=692&format=png&auto=webp&s=91c0c6d5c937fec0838734e1125be9ac7b48b2a3

Anyone else running a similar setup? What are you running on yours, and is the jump to 128GB+ actually worth it for local inference?

Finally got my hands on a Mac Studio 96GB (running Gemma 31B locally). Feedback on this setup?