u/d4mations — reddlx

📌 Daily Github Digest - oMLX Closed Issues → 2026-05-20

Issues Closed: 10

[ISSUE] #972 — Share sensitivity data artifacts across same model quantizations
https://github.com/jundot/omlx/issues/972

[ISSUE] #1068 — DFlash strips thinking tokens
https://github.com/jundot/omlx/issues/1068

[ISSUE] #1260 — Cancelled HF downloads don't clean up `._____temp/` partial shards
https://github.com/jundot/omlx/issues/1260

[ISSUE] #1276 — feat: expose draft_window_size / draft_sink_size / verify_mode for long-context agentic workloads
https://github.com/jundot/omlx/issues/1276

[ISSUE] #1121 — deepseek flash oq2 mtp model pls
https://github.com/jundot/omlx/issues/1121

[ISSUE] #1155 — DeepSeek-V4-Flash-oQ2 FAILED 0:00 [reshape] Cannot reshape array of size 3102720 into shape (129280,6).
https://github.com/jundot/omlx/issues/1155

[ISSUE] #1296 — oQ: deepseek_v4 fails with "Missing mtp.0.{e,h}_proj.biases" after #TEMP guard — concrete repro + fix paths
https://github.com/jundot/omlx/issues/1296

[ISSUE] #1300 — Can’t select DeepSeek-V4-Flash-bf16 for oQ
https://github.com/jundot/omlx/issues/1300

[ISSUE] #1288 — Server Settings restart vs save UI is confusing
https://github.com/jundot/omlx/issues/1288

[ISSUE] #1259 — FYI: some failing tests
https://github.com/jundot/omlx/issues/1259

u/d4mations — 2 days ago

▲ 701 r/oMLX +1 crosspost

Qwen 3.7 plus and max

u/Namra_7 — 4 days ago

▲ 10 r/oMLX

📌 Daily Github Digest - oMLX Closed Issues → 2026-05-15

Issues Closed: 7

[ISSUE] #1105 — Add ParoQuant Support
https://github.com/jundot/omlx/issues/1105

[ISSUE] #1169 — Hermes support
https://github.com/jundot/omlx/issues/1169

[ISSUE] #1144 — [Bug] v0.3.9.dev1 - MTP does not support vlm (e.g. image)
https://github.com/jundot/omlx/issues/1144

[ISSUE] #1207 — [Bug] Decode tg TPS speed drops 50% and pp TPS speed drops 13% when SpecPrefill is enabled after commit 936434f381d1e2e03e8a0f410a2adacbe5d2…
https://github.com/jundot/omlx/issues/1207

[ISSUE] #1223 — omlx launch <tool> ignores configured API endpoint, hardcodes 0.0.0.0
https://github.com/jundot/omlx/issues/1223

[ISSUE] #1221 — Why is the text-to-image model invisible after downloading via the oMLX web dashboard?
https://github.com/jundot/omlx/issues/1221

[ISSUE] #1115 — Severe TG throughput regression on tool result ingestion turns during agentic sessions
https://github.com/jundot/omlx/issues/1115

u/d4mations — 7 days ago

▲ 4 r/oMLX

📌 Daily Github Digest - oMLX Closed Issues → 2026-05-13

Issues Closed: 10

[ISSUE] #1115 — Severe TG throughput regression on tool result ingestion turns during agentic sessions
https://github.com/jundot/omlx/issues/1115

[ISSUE] #1204 — Unable to benchmark oQ-MTP
https://github.com/jundot/omlx/issues/1204

[ISSUE] #1106 — Title: _extract_tensor_bytes SIGABRT on hybrid model (Qwen3.6-35B-A3B) with SSD cache — reproducible on 0.3.8.x and 0.3.9.dev1
https://github.com/jundot/omlx/issues/1106

[ISSUE] #1202 — admin /chat returns HTTP 500 (unhandled): unexpected '}' — unclosed Jinja2 expression in chat.html
https://github.com/jundot/omlx/issues/1202

[ISSUE] #1117 — STTEngine.transcribe drops the `language` parameter — Qwen3-ASR is forced into auto-detect, producing empty output for short audio
https://github.com/jundot/omlx/issues/1117

[ISSUE] #1167 — Error during chat streaming: 'NoneType' object has no attribute 'abort_request'
https://github.com/jundot/omlx/issues/1167

[ISSUE] #1199 — Auto-built sensitivity proxy still comes without MTP tensors?
https://github.com/jundot/omlx/issues/1199

[ISSUE] #1190 — DFlash failure when max context is limited and reached
https://github.com/jundot/omlx/issues/1190

[ISSUE] #859 — Feature request: integration Github copilot-cli
https://github.com/jundot/omlx/issues/859

[ISSUE] #1138 — Unable to benchmark oQ-MTP
https://github.com/jundot/omlx/issues/1138

u/d4mations — 9 days ago

▲ 41 r/oMLX

oMLX 0.3.9.dev2 released.

Highlights:
- Gemma 4 MTP on the vision path (thanks to @Prince_Canuma's mlx-vlm). Image+text decodes much faster now.
- Gemma 4 on the DFlash engine (thanks to @bstnxbt's dflash-mlx)
- ParoQuant support
- omlx launch copilot joins claude / codex / opencode / openclaw / pi
- Restart server button right in the admin UI
- oQ auto-builds a proxy when the model can't fit in RAM

Plus a lot of bug fixes and 20 new contributors in this cycle.


u/d4mations — 9 days ago

▲ 16 r/oMLX