u/Embarrassed_Fruit467

I don't know how to get my local LLM to make tool calls

I've been using Hermes with an API, so last week I tried switching to a local model, Qwen 3.6 35B Q8 on llama.cpp, but I couldn't get it to make any tool calls. The model runs at about 50 t/s. It's running on Windows in an Ubuntu terminal.

The server running the local LLM has a 5080 and 64 GB of DDR5.

Do I need to do additional config, or is the model too weak?

u/Embarrassed_Fruit467 — 9 days ago