Turned my StackChan into a German-speaking smart home assistant. Here's what actually works (and what doesn't).
Got the OpenELAB Kickstarter StackChan a while back. CoreS3-based, two servos, cute little guy. The vision: ditch the Chinese cloud, give it a custom personality, connect it to my Home Assistant setup.
The custom firmware rabbit hole
First I tried building custom firmware (AI_StackChan_Ex with M5Unified 0.2.7). Spent around 9 hours on it. The backlight came on fine, but the panel rendered nothing. Turns out the OpenELAB Kickstarter variant has a compatibility issue with M5GFX that's still unresolved. Tried the Moddable toolchain path next, but ESP-IDF Python environment conflicts killed that too.
After those 9 hours I flashed the stock xiaozhi firmware via M5Burner. Took two minutes. Display worked immediately.
The NVS trick
Stock firmware reads its server URL from an NVS key. You can overwrite just the NVS partition with esptool, point it at your own server, and the device has no idea it's not talking to the original Chinese backend anymore. WiFi credentials get wiped in the process, so you set those up again on-device after flashing.
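For anyone who wants to try this, here's a rough sketch of how the NVS image input could be prepared. It only writes the CSV that ESP-IDF's NVS partition generator expects (header `key,type,encoding,value`, then a namespace row, then data rows). The namespace `wifi`, the key `ota_url`, and the URL are placeholders I made up for illustration, not confirmed names from the stock firmware; check what your firmware actually reads before flashing anything.

```python
import csv
import io

NVS_NAME_MAX = 15  # NVS namespace and key names are limited to 15 characters


def build_nvs_csv(namespace: str, entries: dict[str, str]) -> str:
    """Return CSV text in the layout ESP-IDF's NVS partition generator reads."""
    for name in (namespace, *entries):
        if len(name) > NVS_NAME_MAX:
            raise ValueError(f"NVS name too long (max {NVS_NAME_MAX}): {name!r}")

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["key", "type", "encoding", "value"])
    writer.writerow([namespace, "namespace", "", ""])  # opens the namespace
    for key, value in entries.items():
        writer.writerow([key, "data", "string", value])  # one string entry per row
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical namespace, key and URL: replace with what your firmware uses.
    csv_text = build_nvs_csv("wifi", {"ota_url": "https://example.invalid/ota/"})
    with open("nvs.csv", "w", encoding="utf-8", newline="") as fh:
        fh.write(csv_text)
    # Next steps, run by hand. Size and offset are assumptions and must match
    # the device's real partition table:
    #   python nvs_partition_gen.py generate nvs.csv nvs.bin 0x4000
    #   esptool.py write_flash 0x9000 nvs.bin
```

Since this replaces the whole NVS partition, it's worth dumping the original with esptool's read_flash first so you can restore it if something goes wrong.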
The backend
Forked an existing open-source StackChan server, containerized it, deployed it to my own VPS. Added a few patches: a set_emotion tool so the LLM can dynamically change the face expression, and a handful of additional tools.
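To give an idea of what a set_emotion patch can look like, here's a minimal sketch, not the actual code from my fork. It has a JSON-schema style function declaration for the LLM plus a handler that validates the argument and pushes a frame to the device. The emotion names, the frame layout (`{"type": "emotion", ...}`) and the `send_frame` callback are all assumptions for illustration; the real server and firmware may use different names.

```python
import json
from typing import Callable

# Hypothetical set of expressions; use whatever the firmware can actually draw.
ALLOWED_EMOTIONS = ("neutral", "happy", "sad", "angry", "sleepy", "surprised")

# Function declaration handed to the LLM so it knows the tool exists.
SET_EMOTION_TOOL = {
    "name": "set_emotion",
    "description": "Change the robot's face expression to match the reply.",
    "parameters": {
        "type": "object",
        "properties": {
            "emotion": {"type": "string", "enum": list(ALLOWED_EMOTIONS)},
        },
        "required": ["emotion"],
    },
}


def handle_set_emotion(args: dict, send_frame: Callable[[str], None]) -> dict:
    """Validate the LLM's tool call and forward it to the device."""
    emotion = args.get("emotion")
    if emotion not in ALLOWED_EMOTIONS:
        # Report the error back to the LLM instead of sending a bad frame.
        return {"ok": False, "error": f"unknown emotion: {emotion!r}"}
    send_frame(json.dumps({"type": "emotion", "value": emotion}))
    return {"ok": True, "emotion": emotion}
```

Restricting the argument to an enum keeps the model from inventing expressions the display can't show, and returning an error dict lets it retry with a valid one.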
What actually works
- Full German conversation with a custom personality (female, cheeky, Austrian inflection, max 3 sentences per response, always addresses me by name)
- Home Assistant control via voice: lights, scenes, sensors, you name it
- Weather from HA's built-in forecast entity
- Different face expressions that change dynamically based on what the LLM decides to feel
- Web search for current news and facts (Tavily)
- Persistent memory across sessions — stored in a JSON file on the server, auto-injected into the system prompt on connect
- My n8n automation status on demand
- Accessible from anywhere, not just local network
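The persistent memory item is the easiest one to reproduce, so here's a rough sketch of the idea: facts live in a JSON file on the server and get appended to the system prompt whenever the device connects. The function names, the file layout (a flat list of strings) and the prompt wording are my assumptions for this example, not the exact code running on my server.

```python
import json
from pathlib import Path


def load_memory(path: Path) -> list[str]:
    """Read remembered facts; a missing or corrupt file counts as empty."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    return [str(item) for item in data] if isinstance(data, list) else []


def remember(path: Path, fact: str) -> None:
    """Append a fact to the JSON file, skipping exact duplicates."""
    facts = load_memory(path)
    if fact not in facts:
        facts.append(fact)
    path.write_text(json.dumps(facts, ensure_ascii=False, indent=2), encoding="utf-8")


def build_system_prompt(base_prompt: str, facts: list[str]) -> str:
    """Inject remembered facts into the system prompt used for a new session."""
    if not facts:
        return base_prompt
    bullet_list = "\n".join(f"- {fact}" for fact in facts)
    return f"{base_prompt}\n\nThings you remember from earlier sessions:\n{bullet_list}"
```

On connect the server would call something like `build_system_prompt(persona, load_memory(Path("memory.json")))`, and `remember` could be exposed to the LLM as another tool so it decides what is worth keeping.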
What doesn't work
The servos. Stock firmware completely ignores motor control frames from the server: the display changes, the head stays still. Servo movement requires custom firmware, which is blocked by the M5GFX rendering issue. There's an open GitHub issue for it, and I'm waiting to see if that goes anywhere.
Wake word is still Chinese. There's no way to change that without custom firmware.
Function calling and Google Search grounding can't be used at the same time in the Gemini Live API; it's a documented API limitation. That's why I went with Tavily as a regular tool instead of native search grounding.
Overall
The stock firmware path is dramatically underrated. You skip all the toolchain pain, get a stable device, and if you're willing to run your own backend, you have full control over the AI behavior, personality, and integrations. The missing piece is servo animation, and that's a hardware/firmware problem, not a backend problem.
If the M5GFX issue ever gets fixed, custom firmware becomes interesting again. Until then, this setup does everything I actually wanted it to do.