Multiple scraping issues, multiple solutions tried: about to ditch Hermes
Disclaimer: I'm a web developer (who knows a thing or two about how things work).
I set up Hermes (on Hostinger) to check two real estate websites (Greek market) for new properties. Long story short: the platforms do have their own notification systems, but they never worked for us (slow to notify, or they ask for a premium subscription, or whatnot).
I'm using DeepSeek (flash/pro) as my model of choice (direct API, not OpenRouter).
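For context, the check itself is conceptually just a diff against listings already seen. A rough sketch of that idea, not what Hermes actually runs: it assumes each scraped listing is a dict with a stable "id" field, and the function name and JSON state file are hypothetical.

```python
import json
from pathlib import Path


def find_new_listings(listings, seen_path):
    """Return listings whose id has not been seen before, and remember them."""
    path = Path(seen_path)
    # Load the ids recorded by previous runs (empty set on the first run).
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    new = [item for item in listings if item["id"] not in seen]
    # Persist the updated id set so the next run only reports newer listings.
    seen.update(item["id"] for item in new)
    path.write_text(json.dumps(sorted(seen)))
    return new
```

Running this once per day over whatever the scraper returns would only surface properties that were not there the day before, provided the scrape itself comes back with real content.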
I've had the following issues:
- random captcha (solved with Firecrawl)
- random responses with no captcha but no content returned either (WAF/empty content) (this is with Firecrawl)
- Camofox: DeepSeek has no vision / isn't multimodal, so it doesn't work, since Camofox takes a screenshot of the page and sends it to the model as an image
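The empty-content case is the nasty one, because a blank page looks exactly like "no new properties today". A minimal sketch of a guard, assuming the fetched HTML is available as a string; the 500-character threshold and the "captcha" keyword check are my own guesses, not anything Firecrawl or Hermes provides.

```python
def classify_fetch(html, min_chars=500):
    """Roughly label a fetched page as 'captcha', 'empty', or 'ok'."""
    text = (html or "").strip()
    # Assumption: challenge pages mention the word "captcha" somewhere.
    if "captcha" in text.lower():
        return "captcha"
    # Assumption: a real listings page is much longer than min_chars.
    if len(text) < min_chars:
        return "empty"
    return "ok"
```

Anything other than "ok" should count as a failed run to retry later, not as "nothing new".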
I also tried headless Chrome and ran into anti-bot protections there too.
I looked into the long-running thread and followed some of the advice there.
It would literally be faster if I actually just checked the websites myself once per day 😃