Every Chinese reasoning model has the same 400 error on turn 2. www.github.com/tbosancheros39/opencoded-thinking-fix
I have been banging my head against this for months. You ask DeepSeek a question, it answers fine. You ask a follow up, boom. HTTP 400. Same with Kimi, same with GLM, same with MiMo and MiniMax. I thought the models were broken. They are not. The clients are.
This is what is actually happening.
These models think before they speak. Not metaphorically, actually. They output a hidden field called reasoning_content, basically their internal notes. "User wants a weather app, I should check API docs, maybe use React..." You never see this field. It is invisible. But the model needs it back on the next turn.
OpenCode drops it. Cursor drops it. Claude Code drops it. VS Code Copilot drops it. Every single tool built against the OpenAI spec drops it, because reasoning_content is not in the OpenAI spec. It is a proprietary extension that DeepSeek, Kimi, GLM, MiniMax and Xiaomi MiMo all require anyway.
The first turn always works because there is no history to validate. So you test one round trip, it passes, you ship it, and your real users hit the wall on turn 2. This has been sitting in the open since January. Five months.
I know this because I logged it. My plugin has patched 12,551 messages across 200+ real sessions. Every single one of them was missing reasoning_content that should have been there. The plugin just fills the gap so the model can keep going.
The providers literally warn about this in their docs.
DeepSeek: "If your code does not correctly pass back reasoning_content, the API will return a 400 error."
Kimi: "You must keep the reasoning_content of every historical assistant message."
GLM: "When using interleaved thinking plus tools, you must explicitly preserve reasoning content."
MiMo: "Any assistant message with tool calls must preserve its full reasoning_content field, otherwise the API will return a 400 error."
MiniMax: "The complete model response must be appended to maintain reasoning chain continuity."
All five say the same thing. All five get ignored by the same clients.
I scanned Chinese, Russian and Western dev communities for evidence. The same bug shows up everywhere, independently.
15 OpenCode GitHub issues. One from January 28, 2026. Three PRs tried to fix it. None merged.
101K Russian developers read about GLM errors in OpenCode on Habr. A Russian dev patched LangChain source himself because the maintainers said they will not add support for provider specific fields.
31K Chinese developers viewed a cnblogs article explaining the workaround. A Tencent Cloud user wrote: "Feels like most people are hitting this. Qclaw and Workbuddy are dragging their feet, almost a month without fixing."
The CodeRouter blog put it best: "Your multi turn agent will deterministically 400 on turn 2. Affects every major agent framework."
17 platforms total. OpenCode, Cursor, VS Code Copilot, JetBrains, Roo Code, Kilo Code, n8n, Continue.dev, Claude Code Router, Codex CLI, GitHub Copilot, Make, OmniRoute, ZeroClaw, OpenClaw, Qwen Code, Hermes Agent.
That is not a provider bug. That is a protocol famine. The OpenAI spec has no slot for reasoning notes, so every client built on it silently drops them. Chinese providers built thinking mode on top anyway. The result is a five month old bug that breaks the cheapest and most capable models on the market.
I built a fix because I got tired of waiting. It is three layers, use what you need.
Plugin stops the crashes. 92 lines. Drop it in, restart OpenCode. Detects reasoning models and fills missing reasoning_content with empty strings. No more 400s.
Proxy replays real thinking. 422 lines. Runs on localhost:3457. Caches actual reasoning text per session, injects it on the next turn. Your model sees its own notes and keeps going like nothing happened.
Watchdog keeps the proxy alive. Systemd service, set and forget.
They stack. Plugin is the safety net, proxy is the optimization, watchdog is insurance.
If you maintain any tool that routes to DeepSeek, Kimi or GLM, check your message serialization. If you are building {role: "assistant", content: msg.content} from the response, you are dropping reasoning_content and your users are hitting this wall right now. They just are not telling you because they switched to Claude and moved on. The models are fine. The spec is the problem. The fix is simple. Someone just had to ship it. You can find logs in npm - sdk@openai-compatible Qwen does not have this problem.