u/jeffyaw

Code with Claude landed - Agent View and Dreaming ship - OpenAI Realtime API adopts MCP

Five Claude Code releases landed between Monday and Thursday, v2.1.139 through v2.1.143, turning Code with Claude's May 6 conference announcements into working software. Agent View (claude agents) went from announced to a full flag suite in four days. The /goal command shipped alongside it, Rewind got "Summarize up to here", and Fast mode moved from Opus 4.6 to 4.7. The gap between "announced at conference" and "in your terminal" is now measured in days.

Dreaming is the one worth sitting with. The idea: an agent reviews its own sessions between runs, pulls patterns from what worked and what didn't, and writes new memories the next session inherits. Harvey reported a 6x jump in task completion rates. Outcomes is the grading layer underneath it, a separate evaluator that scores output against a written rubric and tells the agent what to fix, with up to 10.1% improvement on task success. Both are available now: Dreaming in research preview, Outcomes and Multiagent Orchestration in public beta.

From us: typed launched this week, a Claude Code-compatible inference alternative with ~44-67% cheaper overages vs. Claude's Extra Usage rates and monthly billing. Claude Code in Production got a 1.0.3 update covering what actually shipped from Code with Claude. And on mcp.hosting: a post on the unpatched SQL injection in @modelcontextprotocol/server-postgres (21,000+ weekly downloads, read-only bypass), plus a write-up on an alternative to the official AWS MCP server.

tokenlimit.news
u/jeffyaw — 4 days ago
▲ 1 r/mcp+1 crossposts

alternative to the official AWS MCP server, npm-only, local, with a device-code SSO re-login flow

AWS shipped their official MCP server to GA last week (mcp-proxy-for-aws). I'd been building @yawlabs/aws-mcp before that and kept going, because it solves a few things differently. Posting here because if you're pairing AWS with an AI assistant, the tradeoffs are worth knowing.

What @yawlabs/aws-mcp does differently:

- Node/npm-only. No Python, no uv. 'npx -y @yawlabs/aws-mcp' and you're done.

- SSO re-login that works on Windows. When your token expires mid-session, 'aws sso login' tries to pop a browser from a subprocess and on Windows that handoff drops silently. This uses the --no-browser device-code flow: the assistant shows you a URL and a short code, you click once, done.

- Generic CRUD across hundreds of resource types via Cloud Control API, with dry-run diffs before you apply an update.

- Multi-region fan-out in one call.

- IAM pre-flight checks - simulate whether a principal can do an action before you attempt it and eat a 403.
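
To make the dry-run diff idea concrete, here is a minimal sketch of a top-level property diff that emits JSON Patch style operations (Cloud Control's UpdateResource takes a JSON Patch document). This is an illustration only: `diffProperties`, the `PatchOp` type, and the sample bucket properties are hypothetical and not taken from the repo, and a real diff would need to recurse into nested properties.

```typescript
// Hypothetical sketch: not taken from the aws-mcp repo.
type PatchOp =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

type Props = Record<string, unknown>;

// Compares top-level properties only. Nested values are compared by JSON
// serialization, which is key-order-sensitive and only good enough for a sketch.
// Property names are assumed to contain no "/" or "~" (no JSON Pointer escaping).
function diffProperties(current: Props, desired: Props): PatchOp[] {
  const ops: PatchOp[] = [];
  for (const key of Object.keys(desired)) {
    if (!(key in current)) {
      ops.push({ op: "add", path: `/${key}`, value: desired[key] });
    } else if (JSON.stringify(current[key]) !== JSON.stringify(desired[key])) {
      ops.push({ op: "replace", path: `/${key}`, value: desired[key] });
    }
  }
  for (const key of Object.keys(current)) {
    if (!(key in desired)) ops.push({ op: "remove", path: `/${key}` });
  }
  return ops;
}

// Dry run: compute and print the plan, apply it only after review.
const plan = diffProperties(
  { BucketName: "logs", Tags: [{ Key: "env", Value: "dev" }] },
  { BucketName: "logs", Tags: [{ Key: "env", Value: "prod" }] },
);
console.log(JSON.stringify(plan, null, 2));
```

Showing the patch before sending it gives the human (or the assistant) one last chance to catch an unintended `remove`.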

What I borrowed from the official server (credit where due):

- aws_script is the same idea as their run_script - a sandboxed scripting tool for batching N calls into one round-trip. Theirs is Python server-side; mine is JS-native and runs locally.

- aws_docs_search / aws_docs_read exist to match their search_documentation / read_documentation.

Where the official server wins: AWS-team-curated skills, days-fresh API coverage via their hosted endpoint, and a Python sandbox if that's your language.

Repo, with a full comparison table in the README:

https://github.com/YawLabs/aws-mcp

Happy to answer questions or have holes poked.

github.com
u/jeffyaw — 6 days ago
▲ 5 r/lemonsqueezy+1 crossposts

I built a LemonSqueezy MCP server with optional production guardrails (v0.8.1)

I run a small SaaS on LemonSqueezy and got tired of clicking around the dashboard every time I wanted to check MRR, refund an order, or disable a license key. So I built an MCP server that wraps the LemonSqueezy API into 61 tools and exposes them to Claude, Cursor, VS Code, or any other MCP-compatible assistant.

The interesting bit isn't the API mirror -- there are a few of those already. It's the guardrails. Letting an LLM call refund_order against a live store is the kind of thing that gives ops people heart attacks, so the server has opt-in env vars for: a store allowlist, a per-call refund cap (rejects above N cents before the HTTP call), a destructive-call rate limit, and a parent-filter check so a list-tool without store_id can't silently return cross-store data.
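
A minimal sketch of how pre-flight guardrails like these could be structured, assuming the limits have already been parsed from the opt-in env vars. Every name here (`Guardrails`, `checkStore`, `checkRefundCap`, `DestructiveCallLimiter`) is hypothetical and not the server's actual code; the point is only that each check runs before the HTTP call is made.

```typescript
// Hypothetical sketch, not the server's actual code. The limits are passed in
// directly here; the env var names the real server reads are not reproduced.
interface Guardrails {
  allowedStoreIds?: ReadonlySet<string>; // undefined = allowlist disabled
  maxRefundCents?: number;               // undefined = no cap
}

type Verdict = { ok: true } | { ok: false; reason: string };

// Gate for tools that take a store ID directly.
function checkStore(g: Guardrails, storeId: string): Verdict {
  if (g.allowedStoreIds && !g.allowedStoreIds.has(storeId)) {
    return { ok: false, reason: `store ${storeId} is not on the allowlist` };
  }
  return { ok: true };
}

// Per-call refund cap, evaluated before any request is sent upstream.
function checkRefundCap(g: Guardrails, amountCents: number): Verdict {
  if (g.maxRefundCents !== undefined && amountCents > g.maxRefundCents) {
    return { ok: false, reason: `refund of ${amountCents} cents exceeds cap of ${g.maxRefundCents}` };
  }
  return { ok: true };
}

// Sliding-window limiter for destructive calls.
class DestructiveCallLimiter {
  private readonly maxCalls: number;
  private readonly windowMs: number;
  private stamps: number[] = [];

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if it fits in the current window.
  tryAcquire(now: number = Date.now()): boolean {
    this.stamps = this.stamps.filter((t) => now - t < this.windowMs);
    if (this.stamps.length >= this.maxCalls) return false;
    this.stamps.push(now);
    return true;
  }
}

const guardrails: Guardrails = { allowedStoreIds: new Set(["12345"]), maxRefundCents: 5000 };
console.log(checkStore(guardrails, "99999"));
console.log(checkRefundCap(guardrails, 9900));
```

Returning a verdict with a reason, rather than throwing, lets the tool surface a readable rejection to the assistant.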

A few other things I'm proud of:

- Zero runtime dependencies. The published bundle is one file (~1.2 MB), so npx @yawlabs/lemonsqueezy-mcp cold-starts in a second.

- Audit log with four levels (off, error, audit, all), plus the audit trail is exposed as an MCP Resource (lemonsqueezy://audit-log) for clients that prefer structural retrieval over parsing stderr.

- Secret-shaped fields are redacted from audit-log inputs even though no destructive tool today accepts a secret -- defense in depth, costs nothing.

- 401/403 from upstream auto-invalidates the in-process API key cache, so a key rotation takes effect on the next request, not after the 1h TTL.

- npm publish has provenance attestation. Nightly integration tests against a real test store catch upstream schema drift.
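
The cache-invalidation bullet above can be sketched roughly as follows. This is an illustration under assumptions, not the server's code: `ApiKeyCache` and its method names are hypothetical, and the clock is passed in so the behavior is easy to follow.

```typescript
// Hypothetical sketch of a TTL cache that is dropped early on auth failures.
class ApiKeyCache {
  private readonly ttlMs: number;
  private entry: { key: string; expiresAt: number } | null = null;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  set(key: string, now: number = Date.now()): void {
    this.entry = { key, expiresAt: now + this.ttlMs };
  }

  // Returns the cached key, or null if nothing is cached or the TTL has passed.
  get(now: number = Date.now()): string | null {
    if (this.entry && now < this.entry.expiresAt) return this.entry.key;
    this.entry = null;
    return null;
  }

  // Call with every upstream response status. A 401/403 means the cached key
  // may have been rotated or revoked, so forget it instead of waiting for the TTL.
  noteUpstreamStatus(status: number): void {
    if (status === 401 || status === 403) this.entry = null;
  }
}

const cache = new ApiKeyCache(60 * 60 * 1000); // 1h TTL, as in the post
cache.set("key-v1", 0);
cache.noteUpstreamStatus(200);
const stillCached = cache.get(1000);
cache.noteUpstreamStatus(401);
const afterReject = cache.get(2000);
console.log({ stillCached, afterReject });
```

After the 401, the next request has to re-read the key from its source, which is what makes a rotation take effect immediately.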

Install (npm):

npx @yawlabs/lemonsqueezy-mcp

There's also a Dockerfile + Containerfile if you want it bundled.

Honest limits:

- The store allowlist only gates tools that accept storeId directly. Tools that route by their own resource ID (refunds, cancels, archive) still need a LemonSqueezy API key scoped to the right store -- that's the only authoritative boundary. README's "Operating the server unattended" section spells out the recommended layered config.

- Stdio transport only. If you want HTTP/SSE, look at Pipedream or Zapier's hosted MCPs (different deployment model).

- I know LemonSqueezy isn't Stripe-sized -- this is a niche tool for niche users. Posting in case you're one.

Code, README, full tool list, and the SEMVER policy:

https://github.com/YawLabs/lemonsqueezy-mcp

npm:

https://www.npmjs.com/package/@yawlabs/lemonsqueezy-mcp

Happy to take feedback on the guardrail surface, the audit log shape, or anything else. If there's a LemonSqueezy operation I missed (the License API side is in there too: activate/validate/deactivate), open an issue.

github.com
u/jeffyaw — 7 days ago
▲ 8 r/YawLabs+2 crossposts

Five surprises from cataloging the npm registry HTTP API

Spent the last month mapping the registry's HTTP surface into a tool that wraps 64 endpoints. A few things that surprised me:

  1. npm deprecate 422s are almost never about message format. I saw a 422 on a scoped package with "Renamed to u/foo. Install that instead." and assumed period-then-capital was the trigger. Shipped a validator that rejected the shape upfront. Then a user reproduced the 422 with my "safe" em-dash format, and we found the actual cause: their `versions` range matched zero published versions. The registry rejects before it looks at the message. Removed the heuristic. The 1024-char hard limit IS real; the capitalization rule was a ghost.

  2. Unpublish is a five-step dance, and a partial failure leaves the tarball still downloadable. npm unpublish foo@1.2.3 looks atomic but isn't. HTTP flow: GET packument with ?write=true, mutate it (remove version, fix dist-tags, strip _revisions/_attachments -- the registry 422s PUTs that include those), PUT back, GET again for a fresh rev, then DELETE the tarball at {tarball-url}/-rev/{newRev}. If the packument PUT succeeds but the tarball DELETE fails, the version is gone from `npm view foo versions` but the tarball file is STILL fetchable at its direct CDN URL until npm garbage-collects it server-side. Listings gone, bits not. Worth knowing if you're unpublishing for a security reason.

  3. npm owner add needs an email lookup first. You can't just PUT `maintainers: [{name: "bob"}]` -- registry rejects. You have to first GET `/-/user/org.couchdb.user:bob` to fetch the canonical `{name, email}` pair, then PUT that object inside the maintainers array. The CLI hides this; the HTTP API doesn't.

  4. There are two per-package 2FA flags, and the second one is the trap. publish_requires_tfa is the obvious one. automation_token_overrides_tfa is the subtle one: if it's false, your Classic Automation token cannot publish, even though the docs imply automation tokens always bypass 2FA. GET /-/package/{pkg}/access returns both. Check it before debugging EOTP errors in CI.

  5. CouchDB update_seq is opaque in 2.x+. If you're polling replicate.npmjs.com/_changes and doing arithmetic on update_seq to "give me the last N changes", that stopped being a number a while back. Use descending=true&limit=N instead.
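
As an illustration of the "mutate it" step in the unpublish flow from item 2, here is a minimal sketch of the packument edit as a pure function. It covers only that one step (no GET, PUT, or tarball DELETE). The `Packument` shape is trimmed to the fields the post mentions, `withoutVersion` is a hypothetical name, and the fallback for `latest` is an assumption: it picks the last remaining key in the `versions` object, where real code would sort by semver.

```typescript
// Hypothetical sketch: only the fields mentioned in the post are modeled.
interface Packument {
  name: string;
  versions: Record<string, unknown>;
  "dist-tags": Record<string, string>;
  _revisions?: unknown;
  _attachments?: unknown;
  [extra: string]: unknown;
}

// Returns a copy of the packument that is ready to PUT back:
// the version is removed, dist-tags no longer point at it, and the
// _revisions/_attachments fields are stripped.
function withoutVersion(doc: Packument, version: string): Packument {
  const versions = { ...doc.versions };
  delete versions[version];

  const tags: Record<string, string> = {};
  for (const [tag, v] of Object.entries(doc["dist-tags"])) {
    if (v !== version) tags[tag] = v;
  }

  // Assumption: fall back to the last remaining key when "latest" was removed.
  const remaining = Object.keys(versions);
  const fallback = remaining[remaining.length - 1];
  if (!("latest" in tags) && fallback !== undefined) tags.latest = fallback;

  const next: Packument = { ...doc, versions, "dist-tags": tags };
  delete next._revisions;
  delete next._attachments;
  return next;
}

const before: Packument = {
  name: "foo",
  versions: { "1.0.0": {}, "1.2.3": {} },
  "dist-tags": { latest: "1.2.3", beta: "1.2.3" },
  _revisions: {},
};
const after = withoutVersion(before, "1.2.3");
console.log(JSON.stringify(after));
```

Keeping the mutation pure makes it easy to test separately from the network steps, which is where the partial-failure case described above actually occurs.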

Tool is MIT, single file, install with:

npx -y @yawlabs/npmjs-mcp

Distributed as an MCP server for AI assistants (Claude / Cursor / etc.) but the HTTP client underneath is reusable on its own.

https://github.com/YawLabs/npmjs-mcp

What other registry surprises have bitten you?

github.com
u/jeffyaw — 7 days ago

Yaw Terminal - Claude Code, Multi-Pane, Broadcast Mode, Restore Tab

First demo of Yaw Terminal - a desktop terminal built for people who already live in Claude Code (or want to).

What's in the video:
- Multi-pane: one CC session per pane, all visible at once
- Broadcast mode: type once, send the same input to every pane (great for /fast, /model, /clear, or any prompt you want to fan out across parallel agents)
- Restore tab: closed a session by mistake? Bring it back with ease!
- The flow of running 3-5 Claude Code agents in parallel without losing your place

For Claude Code power users who've outgrown one-session-per-tab.

youtube.com
u/jeffyaw — 9 days ago

typed Is Live: Drop-in Claude Code Fallback, Cheaper Overage

Today we're launching typed - an AI coding service that speaks the Anthropic API. If you use Claude Code, Cursor, Cline, or any Anthropic-compatible client, you can switch to typed by setting three environment variables. Pricing matches Claude Pro and Claude Max 5x exactly. Overage costs 44-67% less than Claude's Extra Usage on Sonnet or Opus. Billing is monthly, with no 5-hour reset windows.

typed runs a different model than Claude under the hood. Most coding workflows feel identical; the migration page calls out where they don't.

yaw.sh
u/jeffyaw — 10 days ago

Code w/ Claude doubled Claude Code's rate limits - AWS MCP Server hit GA the same day - we shipped two more Production books

This was a reset week on AI coding infrastructure - the kind that moves defaults, not just feature flags.

Tuesday May 6 was Code w/ Claude in San Francisco. Anthropic doubled Claude Code's 5-hour rate limit for Pro, Max, Team, and Enterprise, and removed the peak-hours penalty entirely. Opus 4.7 was confirmed generally available at the same pricing as 4.6. Managed Agents picked up multi-agent orchestration, outcome targets, and "Dreaming" - a cross-session memory loop where the model reviews finished sessions and writes new memories before the next one starts. The capacity story is bigger than the model story: Anthropic announced it is taking the entirety of SpaceX's Colossus 1 in Memphis, 300MW and roughly 220,000 NVIDIA GPUs, which is what funds the rate-limit doubling.

The same Tuesday AWS announced the AWS MCP Server going GA - free, IAM-gated, with sandboxed Python execution against any AWS API. MCP just crossed from research moment to default cloud primitive. The hygiene chapter showed up alongside it: CVE-2026-33032 (CVSS 9.8) lets unauthenticated attackers take over nginx-UI MCP endpoints, and a separate design flaw in the MCP STDIO transport allows arbitrary OS command execution with up to ~200,000 servers in scope.

tokenlimit.news
u/jeffyaw — 12 days ago