u/Character_Bison5968

A training-free, cross-family weight merge of Qwen2.5-7B-Instruct with 8 donor models from 4 architecture families. It lifts GSM8K by +3.3 pp, ARC-Challenge by +3.2 pp, and IFEval by +2.6 pp (absolute) over the unmerged anchor. No fine-tuning. I'm interested in your thoughts; here is the model card link

u/Character_Bison5968 — 20 days ago