u/H4llifax

Somewhat capable and quick local model for CPU?

I exhausted my budget for the week and started looking at local models.

My Hermes lives on a VPS with no GPU and 24 GB of RAM. I tested three models and gave each essentially the same task: look up a skill by name that contains the project context, then tell me about the project.

Phi4-mini: Failed hard. After half an hour of waiting, it produced a long, rambling answer with no usable content.

Llama3.1:8b: Stayed on track better, but apparently couldn't figure out how to use the skill view tool. The wait was a bit shorter, about 25 minutes.

Qwen2.5:7b: Took about 15 minutes to view the skill, then another ~36 minutes to produce a somewhat usable answer that did reference project-specific knowledge.

I think I could make Qwen2.5 work, but it's too slow. Nothing meaningful will get done at that pace even if the quality were OK (and I am sure it will be noticeably worse than my regular cloud model).

Is there any model out there that is usable for Hermes on CPU at a somewhat usable speed? It doesn't need to be super fast, but I'd rather wait 5 minutes than 50 for a simple response.
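
To put a number on the gap, here is a rough sketch of the arithmetic in Python. It only uses the wall-clock times reported above and the 5-minute target from the question; the helper name `required_speedup` is made up for illustration, and the timings are approximate single runs, not a proper benchmark.

```python
# Approximate wall-clock minutes per model, taken from the test runs above.
observed_minutes = {
    "Phi4-mini": 30,        # "half an hour"
    "Llama3.1:8b": 25,      # "~25 minutes"
    "Qwen2.5:7b": 15 + 36,  # skill lookup plus the answer itself
}

TARGET_MINUTES = 5  # the wait that would still feel acceptable


def required_speedup(observed: float, target: float = TARGET_MINUTES) -> float:
    """Return how many times faster a run must be to finish within the target."""
    return observed / target


speedups = {name: required_speedup(m) for name, m in observed_minutes.items()}

for name, factor in speedups.items():
    print(f"{name}: {observed_minutes[name]} min, needs roughly {factor:.1f}x speedup")
```

For the only model that produced a usable answer, that works out to roughly a tenfold speedup, so small tuning gains alone are unlikely to close the gap.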

reddit.com
u/H4llifax — 5 days ago

How to delegate to a different profile?

I'm using Minimax M2.7 as my main model, and recently added a Gemma4 profile for VLM use cases. However, to my knowledge delegate_task always uses the main model. Minimax found a workaround, which is to simply call the hermes CLI, but that doesn't feel very clean.

What is a clean approach to calling other profiles on demand, like in my case where my main model has no vision capability and I want to use a secondary model for vision tasks?

I'm using the Telegram gateway as my primary way of interacting with Hermes.

u/H4llifax — 13 days ago

Sorry for this rant, but I feel like venting to someone.

Recently I set up an agent on a cloud VPS. All was well until I started noticing that web searches were failing. Turns out, people have started blocking bots on their sites. So seemingly basic things like a web search devolve into me, the human, delving into topics like browser stealth, residential proxies and subscription services for said stealth.

Like, really. The internet is full of bad bots, so this agentic AI revolution will be stopped in its tracks by bad actors who make it imperative to make your service less, not more, accessible to agents. Sorry people, but I'd rather just pay a search engine provider for their service than enter this arms race of bot stealth. And in fact I don't _really_ want to do that either; I'm very accustomed to search being free.

I hate how yet another great thing is ruined by the fact that some people do bad things.

Thanks for reading the rant.

u/H4llifax — 1 month ago