u/Independent_Deer2931

TL;DR ———
My new project will burn through API tokens like crazy:

What's the best model to use in place of Sonnet 4.6 or Opus 4.7?
Is hosting an LLM on a virtual server possible, or should I just wipe my gaming computer and run it from that?
I'm using it for: planning / logic / reasoning / insight / forecasting likely outcomes, given the proper documents

Thank you guys in advance! This means a lot to me! :P

——————

Prior to this, I had wanted to set up a local Mac Mini that runs Hermes Agent alongside a local LLM.

Fast forward to now. I'm working on a project that I can already tell will eat a huge number of tokens. Today I ran the first session as a test, just to make sure it worked before adding more features, and even that burned through an enormous amount of usage.

Note: I was using the Anthropic API directly with:
‘CLAUDE-SONNET-4.6’

So what I want to know is: are there any LLMs that are genuinely on par with Sonnet 4.6, at the very minimum, or better? Specifically for logic, reasoning, prediction, insight, judgment, and forecasting company metrics, given that the model has access to our internal documents and can read them whenever it needs to.

I understand that for this level of output I'm going to need a pretty decent rig, both to store the model and to run it at a reasonable speed.
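
For context, here is the rough sizing math as I understand it (a sketch, please correct me if it's off): the weights alone take about parameters × bits per weight / 8 bytes. The model sizes and bit widths below are made-up examples, not specific models, and this ignores the KV cache and runtime overhead, so real memory use will be higher.

```python
def weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width.
    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical model sizes (billions of parameters) and bit widths.
    for params in (8, 32, 70):
        for bits in (16, 8, 4):
            print(f"{params}B @ {bits}-bit: ~{weights_gib(params, bits):.1f} GiB")
```

If that math is right, the question is basically whether the weights (plus some headroom) fit in my GPU's VRAM or a VPS's memory.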

Is there any chance I can run this virtually, on a pretty beefy VPS or a dedicated server that hosts it for me? I don't really know how any of this works, but if it's possible and there are genuinely good options, please let me know.

If not, my backup plan is to fully wipe the gaming rig I have at home and use it as a dedicated machine to download, store, and run the model, along with anything else that helps run it locally.

I don't want to get a Mac Mini: resale prices are high, plus a new Apple M-series chip is coming soon.

Please share your best insight and knowledge in this area. This will be my first time running a model locally or hosting one myself, and I need some guidance and advice.

Am I able to host an LLM on a beefy VPS, or should I just use my gaming PC?

Which is best: Mac Mini or VPS?