u/Comfortable-Rock-498

▲ 41 r/familyguy

▲ 322 r/LetsTalkLLMs+3 crossposts

OSS models decisively overtook Proprietary models in market share (based on the last 3 months of OpenRouter data)

dirac.run

u/Comfortable-Rock-498 — 18 days ago

▲ 66 r/openrouter

Possibly exploitable routing on openrouter?

As per docs, openrouter supports 3 routing types:

"price": prioritize lowest price
"throughput": prioritize highest throughput
"latency": prioritize lowest latency

In most agentic loops, cache pricing matter much more than new read/writes.

However, and I found it accidentally, it seems that this price is input/output pricing and does not seem to take cache pricing into account?

Go to https://openrouter.ai/deepseek/deepseek-v4-flash?sort=price which sorts by price and you will see that Deepseek official provider, while the cheapest (see screenshot), ranks no 10 in that list.

The other providers offer nominally lower price in input/output tokens but price their cache reads 10x higher than Deepseek.

Looking at the token share among providers seem to confirm this hypothesis. Deepseek's effective pricing is like 1/5th of the nearest competition, it only gets 1/3rd of the token share!

If true, a provider that wants to exploit this only needs to set their read/write pricing lower, and they would get requests routed to them while being more expensive effectively. Alibaba in the screeenshot costs effectively 6x more than Deepseek and gets ~23% routing share, possibly due to this exploit.

u/Comfortable-Rock-498 — 29 days ago

▲ 257 r/codex+1 crossposts

Just got this email from openAI

u/Comfortable-Rock-498 — 1 month ago

▲ 7 r/LLM+1 crossposts

Outsourcing plus LocalAI will soon become more economical vs Frontier labs

signalbloom.ai

u/Comfortable-Rock-498 — 1 month ago

▲ 614 r/openrouter+3 crossposts

Inference provider tiers by Cache-hit rates, using openrouter data

u/Comfortable-Rock-498 — 1 month ago

▲ 5 r/MiniMax_AI

was just doing some refactoring strategy testing among different models including both deepseek, kimi k2.6 and glm 5.1. M-2.7 did surprisingly well, especially considering it is the smallest model in class by a margin

u/Comfortable-Rock-498 — 2 months ago

▲ 417 r/LocalLLaMA+1 crossposts

u/Comfortable-Rock-498 — 2 months ago

▲ 3 r/GoogleGeminiAI+1 crossposts

Scored 65.2% vs google's official 47.8%, and the existing top closed source model Junie CLI's 64.3%.

Since there are a lot of reports of deliberate cheating on TerminalBench 2.0 lately (https://debugml.github.io/cheating-agents/), I would like to also clarify a few things

Absolutely no {agents/skills}.md files were inserted at any point. No cheating mechanisms whatsoever
The cli agent was run in leaderboard compliant way (no modification of resources or timeouts)
The full terminal bench run was done using the fully open source version of the agent, no difference between what is on github and what was run.

I was originally going to wait for it to land on the leaderboard, but it has been 8 days and the maintainers do not respond unfortunately (there is a large backlog of the pull requests on their HF) so I decided to post anyways.

HF PR: https://huggingface.co/datasets/harborframework/terminal-ben...

It is astounding how much the harness matters, based on this and other experiments I have done.

u/Comfortable-Rock-498 — 2 months ago

Me when reading anything these days on bot infested internet

Who the hell is that? I bet he took the trophy... get him!

W-We are smarter than Madonna

Anything Brian says with his index finger up, I find hilarious

Don't conflate 'Minimal' with Minimal Effort

OSS models decisively overtook Proprietary models in market share (based on the last 3 months of OpenRouter data)

Possibly exploitable routing on openrouter?

Just got this email from openAI

Outsourcing plus LocalAI will soon become more economical vs Frontier labs

Inference provider tiers by Cache-hit rates, using openrouter data