u/Interesting_Pool5155

WebWright - The browser agent is now live on Chrome 😊

Hey guys!!

WebWright - Built for action, not just browsing - is now live on Chrome.

I've seen many side-panel AI extensions for browsers that can summarize webpages and answer basic questions. They're basically ChatGPT in a browser. I built something that can REALLY WORK. From playing songs for you on YouTube to ordering an iPhone from Amazon, WebWright carries out the agentic tasks you give it, totally FREE OF COST. You don't need to click through anything yourself: you tell WebWright what to do and it does the rest. That's what makes it unique.

All the other AI features, like workflows, chat, and form filling, are included in this project too. This is not just another vibe-coded classroom project. All the features are tested and working as designed.

I researched similar products and found that the Claude extension is competitive, but it's paid only. WebWright is also optimized to use as few tokens as possible. It uses a 4-way selection algorithm. You can check my repo if you're interested.

I have added the user manual and answered all your privacy-related questions on the website.

I am a solo developer working on this project alongside my job, with a vision to make something actually useful with AI rather than another chat wrapper. Please share feedback for version 2, star my repo, and rate it on the Chrome Web Store.

u/Interesting_Pool5155 — 5 days ago
▲ 8 r/MicrosoftEdge+2 crossposts

WebWright - Agentic Extension for your Browser is now live on Chrome 😁

Hey guys!!

WebWright - Built for action, not just browsing - is now live on Chrome.

I've seen many side-panel AI extensions for browsers that can summarize webpages and answer basic questions. They're basically ChatGPT in a browser. I built something that can REALLY WORK. From playing songs for you on YouTube to ordering an iPhone from Amazon, WebWright carries out the agentic tasks you give it, totally FREE OF COST. You don't need to click through anything yourself: you tell WebWright what to do and it does the rest. That's what makes it unique.

All the other AI features, like workflows, chat, and form filling, are included in this project too. This is not just another vibe-coded classroom project. All the features are tested and working as designed.

I researched similar products and found that the Claude extension is competitive, but it's paid only. WebWright is also optimized to use as few tokens as possible. It uses a unique 4-way selection algorithm. You can check my repo if you're interested.

I have added the entire user manual to the website (link above) and answered all your privacy concerns there.

I am a solo developer working on this project alongside my job, with a vision to make something actually useful with AI rather than another chat wrapper. Please share feedback for version 2, star my repo, and rate it on the Chrome Web Store.

u/Interesting_Pool5155 — 5 days ago
▲ 9 r/MicrosoftEdge+1 crossposts

Introducing Research Mode in WebWright for your Edge browser

Gave my open-source Edge extension ONE prompt, "Hanta virus ~ Myths vs Facts", pressed enter and walked away.

2 minutes later: 10 sources visited, each summarized, full report opened in a new tab. Zero clicks from me 💀

Not a chatbot drawing from training data — it's a real agent. Perceives pages (DOM + vision), reasons with any LLM you pick, and actually clicks / types / navigates.

Also does:

  • Agent mode — any web task in plain english
  • Chat mode — talk to any page, optional vision
  • Workflows — record once, replay forever
  • Form filling from a local vault

Open source MIT · No server · Zero telemetry · 8 LLMs · Runs fully local with Ollama · Under 1 MB

GitHub: https://github.com/profoncode-debug/WebWright

Site: https://profoncode-debug.github.io/WebWright/

Star if u feel like it 🩷 AMA below

u/Interesting_Pool5155 — 7 days ago
▲ 63 r/MicrosoftEdge+4 crossposts

Designed a true browser automation agent: WebWright

Every AI extension be like "here's a step-by-step guide on how to do it yourself"

WebWright just does the thing 💀

Watch it click buttons, fill forms, complete tasks — all from one prompt in plain english.

  • open source · MIT
  • zero servers, zero telemetry
  • 8 LLMs supported, bring your own key
  • runs fully local with ollama

github: https://github.com/profoncode-debug/WebWright

site: https://profoncode-debug.github.io/WebWright/

star if u feel like it and give feedback 🩷

u/Interesting_Pool5155 — 8 days ago
▲ 1 r/chrome_extensions+3 crossposts

Most "AI browser extensions" are just LLM wrappers in a sidebar. I built one that's actually an agent.

I built WebWright — a Chrome extension that lives in your sidebar and actually does things on web pages instead of just chatting about them. Open source (MIT), free, works with 8 LLM providers, runs fully local with Ollama.

For the last few months I've been frustrated with every "AI in your browser" tool I tried.

I'd ask ChatGPT to compare three flights — it would explain how to compare flights. I'd ask Gemini to fill out a job application — it would describe what fields the form had. I'd ask Claude to research a topic across 10 websites — it would tell me what would be on those websites if it could see them.

Every one of them stops at the same wall: they can read what you copy in, but they can't reach into the actual browser tab where the work is happening.

So I built WebWright.

It's a Chrome extension that lives in the sidebar. You type a goal in plain English — "open YouTube and search for lofi music", "fill this scholarship form with my saved info", "research the latest breakthroughs in AI agents" — and it actually goes and does it. It clicks. It types. It navigates between pages. It fills forms. It writes you a research report with 10 sources cited. You watch every step in a live action log.

It's not a chat wrapper. It's a real agent loop. Perceive → reason → act → re-perceive. The content script extracts every interactive element on the page, the LLM picks one action as JSON, the action runs, the page state is re-read, repeat. When DOM clicks fail (canvas-based games, weird custom elements), it auto-escalates to vision mode — takes a screenshot, overlays numbered markers on every clickable thing, and asks the LLM "which number do you want to click?" If even that fails it falls back to raw coordinate clicks via the Chrome DevTools Protocol.
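
For readers who want the shape of that loop in code, here is a minimal sketch. It is not taken from the WebWright source: `runAgent`, the `perceive` / `reason` / `act` callbacks, the strategy names, and the choice to drop back to the DOM strategy after a success are all assumptions made for illustration. The callbacks are injected so the control flow can be read (and tested) without a browser or an LLM.

```javascript
// Hypothetical sketch of a perceive -> reason -> act -> re-perceive loop.
// None of these names come from the WebWright repository.
// Escalation order described in the post: DOM click, then numbered vision
// markers, then raw coordinate clicks.
const STRATEGIES = ["dom", "vision", "coordinates"];

async function runAgent(goal, deps, maxSteps = 20) {
  const { perceive, reason, act } = deps; // injected stand-ins for the real work
  const log = [];
  let level = 0; // index into STRATEGIES

  for (let step = 0; step < maxSteps; step++) {
    // Perceive: read the page in the form the current strategy needs.
    const state = await perceive(STRATEGIES[level]);

    // Reason: the model returns exactly one action as a plain object (JSON).
    const action = await reason(goal, state, log);
    if (action.type === "done") return { ok: true, log };

    // Act: try the action with the current strategy.
    const succeeded = await act(action, STRATEGIES[level]);
    log.push({ step, strategy: STRATEGIES[level], action, succeeded });

    if (succeeded) {
      level = 0; // assumption: return to the cheapest strategy after a success
    } else if (level < STRATEGIES.length - 1) {
      level += 1; // escalate to the next strategy
    } else {
      return { ok: false, log }; // every strategy failed
    }
    // The next iteration re-perceives the page before reasoning again.
  }
  return { ok: false, log }; // step budget exhausted
}
```

In a real extension, `perceive` would be backed by a content script or a screenshot, and `act` by DOM events or the Chrome DevTools Protocol; the `log` array plays the role of the live action log.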

What it can do

  • Agent Mode — give it any web task in plain English, it runs autonomously across pages
  • Chat Mode — talk to any page you're viewing, with optional vision (Pro mode attaches a screenshot so the LLM can literally see charts and dashboards)
  • Research Mode — drop in a topic, it visits Google, scrapes the top 10 sources, summarizes each, synthesizes a final HTML report. 2-3 min end to end.
  • Workflows — record any sequence of clicks/typing/navigation, replay with one click
  • Personal Info Vault — save your details once, the agent fills any form on demand
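
As a rough illustration of the Research Mode flow listed above (search, read the top sources, summarize each, build an HTML report), here is a minimal sketch. It is an assumption-heavy stand-in, not WebWright's code: `research`, `escapeHtml`, and the injected `search`, `fetchText`, and `summarize` callbacks are hypothetical, and the final synthesis step is simplified to a numbered list of per-source summaries.

```javascript
// Hypothetical sketch of a research pipeline; names are illustrative only.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

async function research(topic, deps, maxSources = 10) {
  const { search, fetchText, summarize } = deps; // injected stand-ins
  const urls = (await search(topic)).slice(0, maxSources);

  const sections = [];
  for (const url of urls) {
    try {
      const text = await fetchText(url);
      sections.push({ url, summary: await summarize(topic, text) });
    } catch (err) {
      // One unreachable source should not sink the whole report.
      sections.push({ url, summary: `Could not read source: ${err.message}` });
    }
  }

  // Simplified "synthesis": one list item per source, with everything escaped.
  const items = sections
    .map((s) => {
      const link = `<a href="${escapeHtml(s.url)}">${escapeHtml(s.url)}</a>`;
      return `<li>${link}: ${escapeHtml(s.summary)}</li>`;
    })
    .join("\n");
  return `<h1>${escapeHtml(topic)}</h1>\n<ol>\n${items}\n</ol>`;
}
```

Visiting sources one at a time keeps the sketch simple and keeps the report in search order; a real implementation might read several tabs in parallel.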

The stuff I'm actually proud of

  • Zero vendor lock-in. Works with 8 LLM providers: Ollama Cloud (free tier), Ollama Local (free forever, runs on your machine), OpenAI, Anthropic, Gemini, DeepSeek, Grok, plus a custom endpoint slot. Every model field is editable so new models work the day they ship.
  • Open source under MIT. You can read every line. That's how you verify the privacy claims.
  • No developer-controlled server. Like, there literally is no server. Your data lives in chrome.storage.local, your API keys go straight from your browser to whichever LLM you configured. There is nothing I, the developer, could collect even if I wanted to.
  • No telemetry, no analytics, no tracking. Verifiable in the source.
  • Under 1 MB. Zero dependencies. Pure vanilla JS. No build step.
  • Anti-loop detection. The agent watches its own action history — if it repeats the same action three times or oscillates between two elements, it changes strategy on its own.
  • Works on every Chromium browser — Chrome, Edge, Brave, Opera, Vivaldi, Arc.
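
The anti-loop check in the list above can be sketched in a few lines. This is a hypothetical helper, not the project's implementation: the `{ type, target }` shape of a history entry, the `detectLoop` name, and the A, B, A, B window used to spot oscillation are assumptions.

```javascript
// Hypothetical loop detector over an agent's action history.
// Each history entry is assumed to look like { type: "click", target: "#id" }.
function actionKey(action) {
  return `${action.type}:${action.target}`;
}

function detectLoop(history) {
  const keys = history.map(actionKey);
  const n = keys.length;

  // Same action three times in a row.
  if (n >= 3 && keys[n - 1] === keys[n - 2] && keys[n - 2] === keys[n - 3]) {
    return "repeat";
  }

  // A, B, A, B: bouncing between two different actions.
  if (
    n >= 4 &&
    keys[n - 1] === keys[n - 3] &&
    keys[n - 2] === keys[n - 4] &&
    keys[n - 1] !== keys[n - 2]
  ) {
    return "oscillation";
  }

  return null; // no loop detected
}

const a = { type: "click", target: "#next" };
const b = { type: "click", target: "#back" };
console.log(detectLoop([a, a, a]));    // "repeat"
console.log(detectLoop([a, b, a, b])); // "oscillation"
console.log(detectLoop([a, b]));       // null
```

What "changes strategy" means after a hit is left out here; one option would be to feed the result back into the next LLM prompt so the model knows its last approach is not working.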

Honest about what it's not great at

Accuracy depends on the model you point it at and how specifically you prompt it. Frontier models (GPT-4o, Claude Sonnet 4, Gemini 2.0, large Ollama Cloud models) handle long agent loops far more reliably than small local models. Specific goals beat vague ones — "Search Amazon for Sony WH-CH520 headphones sorted by price low-to-high" works much better than "buy headphones."

It is also genuinely an agent. Don't tell it to do anything you wouldn't take responsibility for completing. Treat it as a researcher + form-filler — let it find and prepare, you review and submit. Don't give it a credit card and walk away.

Links

Currently in Chrome Web Store review — those of you who've shipped extensions with the debugger permission know that means manual review and a few days of waiting. For now you can clone from GitHub and load unpacked, takes 30 seconds.

If the idea resonates, a star on the repo helps it surface for other people frustrated with chat-only AI tools. Not asking for money, not running a freemium scheme — there's nothing to monetize here. Just trying to get the project in front of people who'd find it useful.

Happy to AMA in the comments — architecture, prompt engineering, why I chose CDP over content-script clicks, why Ollama Local is the killer feature for privacy people, anything.

u/Interesting_Pool5155 — 10 days ago