Noticing qwen-27b@q2 performing better than qwen-35b@q8?
This is with the latest qwen3.6 models. Is this odd? I code with Qwen models, and the 27b@q2, even heavily quantised, performs way better than the 35b@q8.
Has anyone else tested across quant levels?
Edit: for anyone asking about quants and setup, I'm seeing this on Unsloth dynamic K_XL quants:
qwen3.6-27b-UD-q2_k_xl and qwen-3.5-35b-UD-Q8.
Running the latest llama.cpp with opencode. The Unsloth dynamic quant makes the q2 more usable than expected.
For some odd reason I find 35b-a3b is really smart but simultaneously behaves kind of dumb. It feels like I'm using a 4b model rather than a 35b. I suspect MoE behavioural capacity is tightly linked to the number of active params rather than the total. My guess is that total params only contribute to how much the model knows, not how complex a task it can execute. For my use case I need it to understand complexity rather than be accurate. But I don't think enough active params light up to cover the complexity of the task, and that makes the 35b-a3b go wonky. Maybe I should only give 35b-a3b baby tasks? I need a bit more investigation to close in on that conclusion. It would be helpful if anyone else could test this too.
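
To put rough numbers on the active vs total params idea, here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration, not a measurement: I'm reading "a3b" as roughly 3B active params per token, guessing about 3.0 average bits per weight for the UD q2_k_xl file (dynamic quants mix precisions, so the real figure may differ), and using 8.5 bits per weight for Q8. The helper names are made up for this post.

```python
# Back-of-envelope comparison of the two setups.
# All parameter counts and bits-per-weight values are ASSUMED, not measured.

def weight_gib(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight size in GiB from params (in billions) and average bits per weight."""
    return total_params_b * 1e9 * bits_per_weight / 8 / 2**30


def active_ratio(dense_active_b: float, moe_active_b: float) -> float:
    """How many times more params the dense model uses per token than the MoE."""
    return dense_active_b / moe_active_b


# Hypothetical figures: dense models use all params per token, the MoE only ~3B.
models = {
    "27b dense @ q2_k_xl": {"total_b": 27.0, "active_b": 27.0, "bpw": 3.0},  # bpw is a guess
    "35b-a3b MoE @ q8": {"total_b": 35.0, "active_b": 3.0, "bpw": 8.5},      # bpw is a guess
}

for name, m in models.items():
    size = weight_gib(m["total_b"], m["bpw"])
    print(f"{name}: ~{size:.1f} GiB of weights, ~{m['active_b']:.0f}B params active per token")

ratio = active_ratio(
    models["27b dense @ q2_k_xl"]["active_b"],
    models["35b-a3b MoE @ q8"]["active_b"],
)
print(f"dense model uses ~{ratio:.0f}x more params per token")
```

If those assumptions are in the right ballpark, the 35b-a3b file is several times larger on disk but runs each token through far fewer params than the 27b, which would at least be consistent with the "feels like a 4b model" impression. It doesn't prove anything about capability on its own, it just frames what to test.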