u/0day2day

Open-source static + runtime analyzers for bot-detection JS

TL;DR Two open-source packages that tell you exactly which browser APIs a fingerprinting script touches and what it ships home. One reads the source statically, the other instruments a real browser. Same output shape, so you can diff them.

Why

If you're working anywhere near bot detection, scraping, or building a stealth browser, you eventually need to read the JS on the site. The problem is that those scripts are 400KB of minified, obfuscated, often-rotating code, and nobody has time to step through them by hand.

I've spent too much time going back and forth combing through minified JS, and I wanted one tool that would tell me, in a single pass: which APIs does this script probe, which network sinks does it fire, and which fingerprint surfaces actually leave the browser.

What

script2builtins is the static analyzer.

  • Parses with acorn (module, then script fallback).
  • Walks the AST, resolves aliases through string concat and variable reassignment.
  • Matches every property access against a curated catalog of fingerprinting APIs across navigator, screen, canvas, WebGL, audio, WebRTC, timing, headless tells, sensors, media permissions, and Intl.
  • Scans network sinks (fetch, XHR, sendBeacon, WebSocket, image src, script src, EventSource, Worker, navigation) and traces each body to figure out which cataloged values flow into it.
  • Flags dynamic hazards (eval, Function constructor, with, document.write, computed properties) where static reach ends.
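
To make the static pass concrete, here is a rough sketch of the idea, not the package's actual code. It walks a hand-written ESTree-shaped AST (the node shape acorn produces), resolves a simple variable alias and a string-concat property key, and matches the result against a catalog. The three-entry CATALOG, the analyze and staticKey helpers, and the returned { hits, hazards } object are all hypothetical names for illustration. The sample AST is written out by hand so the snippet runs without dependencies.

```javascript
// Hypothetical mini-catalog; the real catalog is far larger and its format is not shown in the post.
const CATALOG = new Set(["navigator.webdriver", "navigator.userAgent", "screen.width"]);

// Fold a property key to a string when it is statically known: a plain
// identifier (obj.key), a string literal, or a "+" concat of string literals.
function staticKey(node, computed) {
  if (!computed && node.type === "Identifier") return node.name;
  if (node.type === "Literal" && typeof node.value === "string") return node.value;
  if (node.type === "BinaryExpression" && node.operator === "+") {
    const left = staticKey(node.left, true);
    const right = staticKey(node.right, true);
    return left !== null && right !== null ? left + right : null;
  }
  return null; // dynamic key: static reach ends here
}

function analyze(ast) {
  const aliases = new Map(); // local name -> name it was assigned from
  const hits = new Set();
  const hazards = [];
  const resolve = (name) => aliases.get(name) ?? name;

  (function walk(node) {
    if (!node || typeof node.type !== "string") return;
    // Record `const n = navigator` style aliases.
    if (node.type === "VariableDeclarator" && node.id.type === "Identifier" &&
        node.init && node.init.type === "Identifier") {
      aliases.set(node.id.name, resolve(node.init.name));
    }
    if (node.type === "MemberExpression" && node.object.type === "Identifier") {
      const key = staticKey(node.property, node.computed);
      if (key === null) {
        hazards.push("computed-property");
      } else {
        const path = `${resolve(node.object.name)}.${key}`;
        if (CATALOG.has(path)) hits.add(path);
      }
    }
    // Generic descent into child nodes and child node arrays.
    for (const value of Object.values(node)) {
      if (Array.isArray(value)) value.forEach((child) => walk(child));
      else if (value && typeof value === "object") walk(value);
    }
  })(ast);

  return { hits: [...hits].sort(), hazards };
}

// Hand-built AST for: const n = navigator; n["web" + "driver"]; screen[k];
const ast = {
  type: "Program",
  body: [
    { type: "VariableDeclaration", kind: "const", declarations: [
      { type: "VariableDeclarator",
        id: { type: "Identifier", name: "n" },
        init: { type: "Identifier", name: "navigator" } },
    ] },
    { type: "ExpressionStatement", expression: {
      type: "MemberExpression", computed: true,
      object: { type: "Identifier", name: "n" },
      property: { type: "BinaryExpression", operator: "+",
        left: { type: "Literal", value: "web" },
        right: { type: "Literal", value: "driver" } } } },
    { type: "ExpressionStatement", expression: {
      type: "MemberExpression", computed: true,
      object: { type: "Identifier", name: "screen" },
      property: { type: "Identifier", name: "k" } } },
  ],
};

const report = analyze(ast);
console.log(report);
```

The aliased, concatenated access resolves to navigator.webdriver and lands in hits, while screen[k] cannot be folded and is recorded as a hazard instead of being guessed at. This sketch only handles aliases declared before use and only one level of member access; a real analyzer would need scope tracking and longer property chains.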

script2builtins-runtime is the dynamic companion.

  • Drives a real browser session (Puppeteer or Playwright).
  • Traps every catalog API, sink, and dynamic-execution point as it actually fires.
  • Emits findings in the same Report shape as the static analyzer, so you can lay them side by side.
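
One common way to trap property reads at runtime is a Proxy with a get handler. The sketch below shows that general technique only; the post does not say how script2builtins-runtime hooks its APIs, so treat this as an assumption. The trap helper, the fakeNavigator stand-in, and the probe function are hypothetical, and the stand-in object is there so the snippet runs under plain Node. In a real session, a hook like this would typically be injected before page scripts run, for example with Puppeteer's page.evaluateOnNewDocument or Playwright's page.addInitScript.

```javascript
// Wrap an object so every string-keyed property read is logged, then passed through.
function trap(target, label, log) {
  return new Proxy(target, {
    get(obj, prop) {
      if (typeof prop === "string") log.push(`${label}.${prop}`);
      // Read from the original object, not the proxy, so that native
      // accessor properties in a browser see the receiver they expect.
      return Reflect.get(obj, prop);
    },
  });
}

// Stand-in for the browser's navigator (hypothetical values).
const fakeNavigator = { webdriver: false, userAgent: "demo-agent", languages: ["en-US"] };

const log = [];
const navigatorTrap = trap(fakeNavigator, "navigator", log);

// Simulated fingerprinting script that probes two properties.
function probe(nav) {
  return { wd: nav.webdriver, ua: nav.userAgent };
}

const collected = probe(navigatorTrap);
console.log(log, collected);
```

The log ends up holding the two accesses in the order they fired, and the probing code still receives the real values. Note that a Proxy can itself be detectable by a script that goes looking for it, which matters when the target is a bot detector.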

Static tells you what could be probed. Runtime confirms what was. The gap between them is where most of the interesting behavior lives: lazy-loaded modules, environment-gated branches, you know the kind.
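
Because both tools emit the same shape, the gap can be computed with plain set arithmetic. This is a sketch under a simplifying assumption: the real Report shape is not shown in the post, so the `{ apis: string[] }` object, the diffReports helper, and the sample values below are hypothetical.

```javascript
// Split API names into: seen by both, static only, runtime only.
function diffReports(staticReport, runtimeReport) {
  const s = new Set(staticReport.apis);
  const r = new Set(runtimeReport.apis);
  return {
    confirmed: [...s].filter((api) => r.has(api)).sort(),
    staticOnly: [...s].filter((api) => !r.has(api)).sort(), // present in source, never fired
    runtimeOnly: [...r].filter((api) => !s.has(api)).sort(), // fired, but not visible statically
  };
}

// Made-up sample reports for illustration.
const staticReport = { apis: ["navigator.webdriver", "screen.width", "navigator.userAgent"] };
const runtimeReport = { apis: ["navigator.webdriver", "navigator.userAgent", "navigator.plugins"] };

const gap = diffReports(staticReport, runtimeReport);
console.log(gap);
```

Here staticOnly would point at code paths that exist but did not execute in this session (an environment-gated branch, say), and runtimeOnly at behavior the static pass could not see, such as code pulled in by a lazy-loaded module or built through eval.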

Live demo

Both run daily against real production loaders at https://richards.foo/tools/bot-detectors. Fresh report every 24 hours, previous hash retained so you can see what each vendor pushed overnight.

Links

Asking

Open to feedback, especially from anyone who's reverse-engineered niche detectors. What's missing from the catalog? Which vendor do you wish was covered first?

u/0day2day — 6 days ago