r/LocalLLM
NewMx: Compress LLM prompts by 30-40% with zero model changes
I built a deterministic codec that replaces common natural language phrases with single Unicode glyphs. Each glyph tokenizes as ONE token under cl100k_base (GPT-4's tokenizer).
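For anyone wondering what a deterministic phrase-to-glyph codec looks like in principle, here is a minimal sketch. The table, the glyphs, and the function names `encode` and `decode` are all made up for illustration. This is not the actual NewMx mapping or API, and I have not checked that these particular glyphs are single tokens under cl100k_base.

```python
# Hypothetical phrase-to-glyph table, for illustration only.
# The real NewMx mappings are not reproduced here.
PHRASE_TO_GLYPH = {
    "please summarize the following": "∑",
    "step by step": "⇒",
    "in other words": "≡",
}
GLYPH_TO_PHRASE = {glyph: phrase for phrase, glyph in PHRASE_TO_GLYPH.items()}


def encode(text: str) -> str:
    """Replace each known phrase with its glyph, longest phrases first."""
    # Longest-first ordering keeps a short phrase from clobbering part of a longer one.
    for phrase in sorted(PHRASE_TO_GLYPH, key=len, reverse=True):
        text = text.replace(phrase, PHRASE_TO_GLYPH[phrase])
    return text


def decode(text: str) -> str:
    """Expand every glyph back into its phrase."""
    for glyph, phrase in GLYPH_TO_PHRASE.items():
        text = text.replace(glyph, phrase)
    return text


prompt = "please summarize the following report step by step"
packed = encode(prompt)
restored = decode(packed)
```

The round trip is only lossless if the input does not already contain one of the glyphs, so a real codec would need some escaping rule for that case. To check whether a glyph really costs one token, something like `len(tiktoken.get_encoding("cl100k_base").encode(glyph))` should do it.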
What it does:
- 3,135 phrase mappings (419 exact + 38 intent families)
- 6.19% aggregate token reduction on a 1.46M-line corpus
- 30-40% token savings on the prompts that compress at all (~92% of cases)
- ~4k-token decode table, prepended once per session
- Break-even at ~1,054 prompts (much lower with prompt caching)
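The break-even figure can be sanity-checked with simple arithmetic. This sketch assumes the decode table is a one-time cost per session and that every prompt saves the same average number of tokens. The function names are hypothetical, and the post does not say how the ~1,054 figure was actually derived.

```python
import math


def break_even_prompts(table_tokens: float, avg_saved_per_prompt: float) -> int:
    """Number of prompts needed before the savings repay the decode table."""
    return math.ceil(table_tokens / avg_saved_per_prompt)


def implied_saving(table_tokens: float, break_even: float) -> float:
    """Average tokens saved per prompt implied by a given break-even point."""
    return table_tokens / break_even


# Rounded figures from the list above: ~4k table tokens, ~1,054 prompts.
saving = implied_saving(4000, 1054)  # a little under 4 tokens per prompt
```

Under these assumptions the quoted numbers imply an average saving of just under 4 tokens per prompt. Anything that lowers the effective cost of the table, such as prompt caching, lowers the break-even point in proportion.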
No fine-tuning. No model cooperation. Works with any LLM API.
pip install newmx
GitHub: github.com/CCC-Studios/newmx
Would love feedback from anyone testing on their workloads!
u/JustHereForOneMeme — 10 days ago