How do you handle real browser automation with Hermes? Logins, navigation, credential management — what actually works?
Simple web searches work fine out of the box. But I'm trying to build workflows where Hermes actually does something useful in a browser — logging into a site, navigating to a specific page, extracting a value, and using it downstream.
Example use case: find a tracking number in an online shop. Visit site → log in → navigate to order history → extract tracking number → optionally run shipment tracking → report back.
And that's where things fall apart for me.
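For reference, the use case above looks roughly like this as a plain Playwright script. This is only a sketch under assumptions: the `/login` and `/account/orders` paths, the CSS selectors, and the tracking-number pattern are all hypothetical placeholders, and nothing here is Hermes-specific. It also does nothing about bot protection.

```python
import re
from typing import Optional

# Hypothetical pattern: a "Tracking number:" style label followed by an
# uppercase alphanumeric code. Real carriers use different formats.
TRACKING_PATTERN = re.compile(
    r"(?i:tracking\s*(?:number|no\.?|#)?)\s*[:#]?\s*([A-Z0-9]{8,30})"
)


def extract_tracking_number(page_text: str) -> Optional[str]:
    """Return the first tracking-number-like code in the text, or None."""
    match = TRACKING_PATTERN.search(page_text)
    return match.group(1) if match else None


def fetch_tracking_number(base_url: str, username: str, password: str) -> Optional[str]:
    """Log in, open the order history, and pull a tracking number from the page."""
    # Imported lazily so the parsing helper above works without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Hypothetical login page and selectors; adjust per site.
            page.goto(f"{base_url}/login")
            page.fill("#username", username)
            page.fill("#password", password)
            page.click("button[type=submit]")
            page.wait_for_load_state("networkidle")
            # Hypothetical order-history path.
            page.goto(f"{base_url}/account/orders")
            return extract_tracking_number(page.inner_text("body"))
        finally:
            browser.close()
```

Keeping the extraction step as a separate text-only function means it can be checked against saved page text without launching a browser.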
The problems I'm running into:
1. Bot protection: Cloudflare, hCaptcha, DataDome, you name it. Even headless Playwright/Selenium trips these immediately. Does anyone have a working approach here? Residential proxies? Stealth plugins? Just accepting defeat on certain sites?
2. Credential handling: If a site has no API, how do you actually pass credentials to Hermes? Storing them in .env? A secrets manager? Inline in the prompt (please no)? I'm not thrilled about any of these options security-wise, but I need something structured.
3. Tool choice: Are you using Playwright MCP, Puppeteer, something else entirely? What's your actual stack for "Hermes opens a browser and does stuff"?
4. Reliability: Even when bot protection isn't an issue, DOM scraping is fragile. A site redesign breaks everything. Are you prompting Hermes to be resilient, or just accepting that these workflows need maintenance?
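To make point 2 concrete, here is a rough sketch of the .env-style option, where the tool code reads the secret from the environment so it never has to appear in the prompt. Everything in it is an assumption for illustration: the `SHOP_USERNAME` / `SHOP_PASSWORD` variable names, the `SiteCredentials` class, and the `load_credentials` helper are hypothetical and not part of any Hermes API.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class SiteCredentials:
    username: str
    # repr=False keeps the password out of logs and printed objects.
    password: str = field(repr=False)


def load_credentials(prefix: str, env: Mapping[str, str] = os.environ) -> SiteCredentials:
    """Read <PREFIX>_USERNAME and <PREFIX>_PASSWORD from the environment."""
    user_key = f"{prefix}_USERNAME"
    pass_key = f"{prefix}_PASSWORD"
    missing = [key for key in (user_key, pass_key) if not env.get(key)]
    if missing:
        # Report which variables are missing, never their values.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return SiteCredentials(username=env[user_key], password=env[pass_key])
```

Usage would be something like `creds = load_credentials("SHOP")` inside the browser tool, with only the tool's result (not the credentials) going back to the model. Whether that is secure enough depends on where the environment itself is stored.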
I feel like authenticated browser automation is the gap between Hermes being a cool demo and being genuinely productive.
Any hint, tip, or pointer is greatly appreciated — happy to hear even partial solutions or "here's what I tried and it didn't work either."
What setups are actually working for you?