u/Gwapong_Klapish

Okay, so I've been in the web scraping game for quite some time now. I was browsing the GitHub top-100 stars list yesterday and saw FireCrawl sitting at #73 globally with over 120k stars. That's ahead of Node.js. That's in the same breath as projects that have been around for a decade.

For context, at the end of 2024 they celebrated 20k stars. They raised their Series A in August 2025 at 43k stars. Now it's 120k+. That's roughly 3x growth since the Series A, in under a year, for what is essentially a web scraping API aimed at AI developers. What in the world happened? How did a scraping API beat Node.js in stars?

The repo describes itself as "search, scrape, and clean the web for AI agents." Useful, I'd say. But 120k-star useful?? There are open-source alternatives like Crawl4AI with 65k stars doing very similar things for free. Is it just incredible timing with the AI/RAG pipeline wave, or is there a genuine technical moat here that the community is rewarding?

My main concern: is the star count organic? I'm not accusing anyone of anything, but a jump from ~20k to 120k in roughly 16 months is one of the most aggressive trajectories I've seen outside of projects with massive corporate backing (and I'm thinking of Microsoft's markitdown). FireCrawl got a $14.5M Series A from Nexus and YC. Is any of that marketing spend showing up in developer mindshare as stars? I'm genuinely curious how you break into the GitHub top-100 that fast.

Additionally, can someone explain the pricing to me without making my head hurt? On the surface it looks simple: 1 credit = 1 page scraped. But the moment you turn on anything useful (AI extraction, JSON output, Enhanced Mode), you're burning 5–9 credits per page. The Hobby plan at $16/month gives you 3,000 credits, which sounds great until you realize that's only ~333 pages with JSON + Enhanced Mode enabled, at 9 credits per page. A 500-page website would need 4,500 credits at that rate, so a single scrape exceeds your entire monthly allowance on the Hobby plan.

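
To make the credit math concrete, here's a rough sketch in Python. It only uses the numbers quoted in this post (3,000 credits for $16/month, 1 to 9 credits per page), which I'm assuming are accurate; the function names are made up for illustration and have nothing to do with FireCrawl's actual API or billing code.

```python
import math

# Numbers as quoted in this post, NOT an official price list (assumption).
HOBBY_CREDITS_PER_MONTH = 3_000
HOBBY_PRICE_USD = 16


def pages_per_month(credits: int, credits_per_page: int) -> int:
    """Whole pages you can scrape before the monthly credits run out."""
    return credits // credits_per_page


def credits_needed(pages: int, credits_per_page: int) -> int:
    """Total credits a crawl of `pages` pages would consume."""
    return pages * credits_per_page


def months_needed(pages: int, credits_per_page: int, credits: int) -> int:
    """Monthly allowances required to finish one crawl."""
    return math.ceil(credits_needed(pages, credits_per_page) / credits)


if __name__ == "__main__":
    # 1 = plain scrape, 5 and 9 = the range quoted for the extra features.
    for rate in (1, 5, 9):
        pages = pages_per_month(HOBBY_CREDITS_PER_MONTH, rate)
        usd_per_page = HOBBY_PRICE_USD / pages
        print(f"{rate} credits/page: {pages} pages/month, ${usd_per_page:.4f} per page")

    site_pages = 500
    needed = credits_needed(site_pages, 9)
    print(f"{site_pages}-page site at 9 credits/page needs {needed} credits")
```

With those inputs, the 9-credit case comes out to 333 pages per month, and the 500-page site needs 4,500 credits, which is one and a half monthly allowances.
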
Now before someone says "just self-host it": that's an option, yes, it's AGPL-3.0 open source. But the self-hosted version is deliberately crippled: no Fire-Engine (their proprietary anti-bot system), no proxy rotation, no Actions endpoint, no browser sandbox. The stuff that actually makes it worth paying for is cloud-only. AGPL also means commercial self-hosting has licensing implications your legal team needs to look at, and that's if you're within a company. If you're an individual developer, well, that can get quite expensive.

To be fair, the product genuinely seems excellent. Zapier, Shopify, Replit, and Apple are customers. The clean markdown output uses 67% fewer tokens than raw HTML. The MCP server integration means you can pipe live web data straight into Cursor or Claude. That's real value, and the community clearly feels it.

But I keep coming back to the same question: is this one of the best-marketed developer tools of the AI era, or is it genuinely the best technical solution? Someone kindly explain what is going on with FireCrawl.

FireCrawl just hit 121k GitHub stars and I have a LOT of questions: the hype, the pricing trap, and what's actually going on