u/FUTC-Photography

Minimax-M3 is completely unpredictable

I've mostly been using Kimi K2.6 which I like a lot, but decided to give Minimax M3 a go since they just released it. I feel like it is much more chaotic and verbose, as well as hallucinations being more common. Now it just suddenly keeps stopping mid action. I'm using Ollama Cloud and I'm wondering if anybody else is experiencing this behavior? Right now I wouldn't use it in production. It almost seems like a model with increased temperature. It deals with complex long-running tasks pretty well, but I feel like it has a tendency to make mistakes or miss links between information and then frantically tries to fix it. Kimi K2.6 felt much more predictable and calm even if it made mistakes.

u/FUTC-Photography — 7 days ago

▲ 2 r/LocalLLM

Hermes + Qwen3.6:35b-MLX how to turn off thinking/reasoning?

I am relatively new to the whole local LLM thing, I've got an M1 Max Macbook Pro with 32gb of unified memory that can run qwen3.6:35b surprisingly well, especially with MLX. I decided to try out Hermes after seeing networkchuck's video on it, and was able to connect it to ollama.

Here's my issue: Thinking is great for a lot of complex tasks, but a lot of the time I don't need thinking/reasoning (for example when I use an agent to help me study Japanese) and qwen3.6 has a tendency to end up in thinking loops. Is there a way to turn off reasoning/thinking for qwen3.6 from inside Hermes or when interfacing with it through Telegram? An easy way to toggle between thinking and not thinking would be amazing.

reddit.com

u/FUTC-Photography — 19 days ago