
Open-sourced today: flutter-dev-agents — an MCP server for testing Flutter apps on real iOS and Android devices
Hey r/FlutterDev. Solo dev, been shipping Flutter apps for a few years. Posting because I open-sourced something today that I genuinely wouldn't have launched if it hadn't already been battle-tested for a month against my own apps.
The thing that broke me.
You know that bug where the Android permission dialog button changes wording across OS versions — "Allow" on Android 13, "While using the app" on 14, and in Polish it's "Podczas używania aplikacji" but encoded with NBSP between words, so your tap_text fails byte-equality silently and the test just hangs there pretending everything's fine?
I lost half a Sunday to that one. There's a different version of the same problem on iOS 17 vs iOS 26. The fix isn't hard, but the diagnosis is hours of "why is nothing happening" because the screenshot still shows the dialog and the logs still say "Tap dispatched" and the test still passes its 30-second wait. Selector maintenance was eating ~40% of my testing time and I was getting bitter about it.
What I built.
flutter-dev-agents — an MCP server that lets autonomous agents (Claude Desktop, Claude Code, Cursor, any OpenAI-compat local LLM) drive my Flutter apps on real Galaxy S25s, real iPhone 15s, Pixel 8 emulators, iPhone 17 simulators, whatever's plugged in. 110 tools across:
- Android: uiautomator2 + raw adb fallback (Samsung One UI sometimes drops accessibility taps; adb shell input bypasses that).
- iOS: WebDriverAgent on the device, pymobiledevice3 for lockdown services. The iOS 17+ --rsd routing through tunneld's HTTP API took me a weekend to debug; it's now documented as an ADR so nobody else has to.
- Flutter-specific: Patrol integration, flutter run --machine for hot-reload control, debug-log streaming, widget tree dumps.
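For anyone wondering what the raw adb fallback boils down to, here is a minimal sketch, not the project's actual code. It shells out to the standard `adb shell input tap X Y` command, which injects a touch event directly instead of going through the accessibility layer. The function names `adb_tap_command` and `adb_tap` and the 10-second timeout are hypothetical, chosen for illustration.

```python
import subprocess
from typing import List, Optional


def adb_tap_command(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    """Build the argv for a raw tap at screen coordinates (x, y).

    `serial` targets one device via `adb -s`, which matters when
    several phones or emulators are attached at once.
    """
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]
    cmd += ["shell", "input", "tap", str(x), str(y)]
    return cmd


def adb_tap(x: int, y: int, serial: Optional[str] = None, timeout: float = 10.0) -> None:
    """Run the tap; raises CalledProcessError if adb exits non-zero."""
    subprocess.run(adb_tap_command(x, y, serial), check=True, timeout=timeout)
```

Keeping the argv builder separate from the subprocess call makes the command easy to log and to unit-test without a device attached.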
The tap_text thing I described above? Fixed properly now — NFC normalization plus folding NBSP/NNBSP/thin-space/zero-width-space, with case-fold fallback in substring mode. tap_text("Podczas używania aplikacji", system=True) just works regardless of which Unicode whitespace variant the Android Settings team decided to ship that week.
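Roughly what that matching looks like, as a minimal sketch rather than the project's real implementation. It assumes the whitespace variants fold to a plain space and the zero-width space is dropped outright (that last choice is my assumption for the sketch). The names `normalize_label` and `label_matches` are hypothetical.

```python
import unicodedata

# Whitespace variants that show up in system dialogs.
# Assumption: zero-width space is removed rather than turned into a space.
_FOLD_TABLE = str.maketrans({
    "\u00a0": " ",  # NBSP (no-break space)
    "\u202f": " ",  # NNBSP (narrow no-break space)
    "\u2009": " ",  # thin space
    "\u200b": "",   # zero-width space
})


def normalize_label(text: str) -> str:
    """NFC-normalize, then fold the whitespace variants above."""
    return unicodedata.normalize("NFC", text).translate(_FOLD_TABLE)


def label_matches(needle: str, candidate: str, substring: bool = False) -> bool:
    """Compare a requested label against on-screen text.

    Exact mode requires equality after normalization. Substring mode
    tries a case-sensitive containment check first, then falls back to
    a case-folded one.
    """
    n, c = normalize_label(needle), normalize_label(candidate)
    if not substring:
        return n == c
    if n in c:
        return True
    return n.casefold() in c.casefold()
```

With this, `label_matches("Podczas używania aplikacji", "Podczas\u00a0używania\u00a0aplikacji")` succeeds even though the two strings differ byte for byte.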
What's genuinely different vs other mobile MCPs
I checked the existing landscape before posting — there are mobile MCPs out there. Most are iOS-simulator-only (mobile-next/mobile-mcp, ambar/simctl-mcp), or Android-only (martingeidobler/android-mcp-server), or Figma→Flutter codegen (mhmzdev/Figma-Flutter-MCP, different use case).
What this one does that I didn't find elsewhere:
- Cross-session device locks: I run 3 Claude Code windows at once, one per project, sometimes on the same physical device pool. Filesystem-coordinated locks mean window 2 doesn't grab the S25 while window 1 is mid-tap. Stale-lock cleanup when a holder process dies.
- Tiered tool surface: 110 tools is too many for Claude Desktop's UI ceiling and Cursor's 40-tool cap. MCP_TOOL_TIER=basic exposes a curated 26-tool subset. Small local LLMs (Qwen 3B class) work with the basic tier; larger models get the full catalog.
- Defense-in-depth screenshot pipeline: Anthropic's API rejects images > 2000px on any edge. I hit this three times in production before getting the cap pipeline right. Now: per-use-case cap at 1600px, dispatcher safety-net at 1900px, BASIC-tier compress_png + inspect_image_safety so the agent has a recovery path even when an external MCP (computer-use, raw adb screencap) feeds in a 2400px PNG.
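To make the cross-session lock idea concrete, here is a minimal sketch of filesystem-coordinated locking with stale-lock cleanup. It is an illustration under assumptions, not the project's code: one lock file per device serial, created atomically with `O_CREAT | O_EXCL`, holding the owner's PID as JSON. The function names, the file layout, and the JSON payload are all hypothetical, and the liveness probe is POSIX-only.

```python
import json
import os
import time
from pathlib import Path


def pid_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 checks existence without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def acquire_device_lock(lock_dir: Path, serial: str) -> bool:
    """Try to claim a device; returns False if a live process holds it."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / f"{serial}.lock"
    for _attempt in range(2):  # second pass runs only after stale cleanup
        try:
            # O_EXCL makes creation atomic: exactly one process wins.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            try:
                holder_pid = int(json.loads(lock_path.read_text())["pid"])
            except (OSError, ValueError, KeyError, TypeError):
                return False  # unreadable or half-written: treat as held
            if pid_alive(holder_pid):
                return False
            lock_path.unlink(missing_ok=True)  # holder died: stale lock
            continue
        with os.fdopen(fd, "w") as handle:
            json.dump({"pid": os.getpid(), "acquired_at": time.time()}, handle)
        return True
    return False


def release_device_lock(lock_dir: Path, serial: str) -> None:
    """Drop the lock file; a no-op if it is already gone."""
    (lock_dir / f"{serial}.lock").unlink(missing_ok=True)
```

One caveat with this simple version: two processes that spot the same stale lock at the same moment can both try to clean it up. The atomic create still lets only one of them win the retry, but a production version would want a tighter protocol around the unlink.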
Honest gaps I'm not pretending don't exist
- iOS device control needs Xcode → macOS-only for iOS work. Linux is fine for Android-only.
- I'm a solo maintainer. PRs welcome, co-maintainers especially.
- No hosted SaaS planned. The MCP protocol is local-first; SaaS would break the security model. Documented in the ROADMAP's "Not on the roadmap" section.
- Patrol-dependent for true UI tests on Flutter. If you're not using Patrol, you get device-level control (taps, screenshots, logs, app lifecycle) but not Flutter-aware test orchestration.
Try it (genuinely 5 minutes)
pip install mcp-phone-controll
claude mcp add phone-controll -- python -m mcp_phone_controll
Then in Claude Code:
Using phone-controll, run mcp_ping, pick my Android device, take a screenshot.
You should see three tool calls returning ok: true and a PNG in ~/.mcp_phone_controll/sessions/. If anything's red, the structured next_action field tells you what to do — that's the whole MCP contract.
Links
GitHub: https://github.com/michal-giza/flutter-dev-agents
PyPI: https://pypi.org/project/mcp-phone-controll/
15-min onboarding: https://github.com/michal-giza/flutter-dev-agents/blob/main/docs/GETTING-STARTED.md
The 4 operational gotchas that cost me an hour each the first time: https://github.com/michal-giza/flutter-dev-agents/blob/main/docs/operational-gotchas.md
Roadmap: https://github.com/michal-giza/flutter-dev-agents/blob/main/ROADMAP.md