u/Capital-Run-1080 — reddlx

Microsoft just beat Anthropic's best model without having a model. And it used Anthropic's own model to do it.

So this dropped May 12 and I don't think enough people are talking about it in the context of agentic AI.

Microsoft built something called MDASH, an AI security system, and topped a benchmark called CyberGym with 88.45%. Anthropic's Mythos Preview came second at 83.1%, OpenAI's GPT-5.5 third at 81.8%.

Here's the thing. Microsoft doesn't have a frontier model. MDASH runs on publicly available models, including models from the exact companies it just outranked. It assembled 100+ specialized agents across a five-stage pipeline (preparation, scanning, verification, deduplication, proof), with different agents and different models handling each stage: large models for heavy reasoning, smaller distilled models for high-frequency verification. And the whole system is model-agnostic. Swap the underlying model and the pipeline stays.
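To make the "model-agnostic pipeline" idea concrete, here's a rough sketch of how I picture it. Only the five stage names come from the announcement. The `Pipeline` class, the stub models, and the stage-to-model assignment are all made up by me for illustration, and this is not how MDASH is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical: a "model" is any callable that maps a prompt to a response.
Model = Callable[[str], str]

# Stage names are taken from the post; everything else is illustrative.
STAGES = ["preparation", "scanning", "verification", "deduplication", "proof"]


def large_model(prompt: str) -> str:
    """Stand-in for a big reasoning model."""
    return f"large<{prompt}>"


def small_model(prompt: str) -> str:
    """Stand-in for a small distilled model."""
    return f"small<{prompt}>"


@dataclass
class Pipeline:
    models: Dict[str, Model]  # stage name -> model assigned to that stage

    def run(self, target: str) -> List[str]:
        """Pass the target through every stage in order, recording each output."""
        outputs = []
        payload = target
        for stage in STAGES:
            payload = self.models[stage](f"{stage}:{payload}")
            outputs.append(payload)
        return outputs


# Large model for heavy reasoning, small model for high-frequency checks.
assignment = {stage: large_model for stage in STAGES}
assignment["verification"] = small_model
assignment["deduplication"] = small_model

pipeline = Pipeline(assignment)
results = pipeline.run("target.c")

# Swapping a model changes one dictionary entry; the stage order is untouched.
swapped = Pipeline({**assignment, "scanning": small_model})
swapped_results = swapped.run("target.c")
```

The point of the sketch is that the stage list and the model assignment are separate things, so replacing a model never touches the pipeline logic.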

It used someone else's bricks and built a taller building.

And it's not just a benchmark flex. Microsoft used MDASH to find 16 actual Windows 11 vulnerabilities, 4 of them Critical-severity remote code execution bugs. The fixes are in the May Patch Tuesday release. Real CVEs, not demo outputs.

The architecture question this raises is the one I keep thinking about. If a multi-agent system can outperform the single model that powers it, what does that mean for how we build things on top of AI?

And this is where World's AgentKit becomes relevant.

Because as these agentic systems proliferate, the question stops being "can agents do the task" and starts being "who authorized this agent, and is there actually a human behind it." MDASH at 100+ agents is impressive. MDASH at scale, used by people who don't know what they're doing, or used by attackers who do, is a different story entirely. The article itself says this: MDASH uses all publicly available models. There are no exclusive technical barriers. Anyone can build this.

AgentKit is trying to solve the identity layer for exactly this world. Proof of human attached to agentic workflows so you know there's a verified person behind the action, without exposing that person's data. That becomes load-bearing infrastructure as agent systems move from experiment to production.

MDASH is the demo that the production era is already here.


u/Capital-Run-1080 — 5 days ago

▲ 1 r/bots+1 crossposts

The internet's identity layer is quietly being rebuilt

Went down a rabbit hole on this over the weekend.

Online identity is breaking in a measurable way. IBM's 2025 report puts the average breach at $4.44M globally. Stolen credentials show up in 53% of breaches (Verizon). Sumsub clocked a 700% YoY jump in deepfake fraud. Deloitte projects $40B in US generative AI fraud losses by 2027.

Passwords are toast. Document KYC is increasingly spoofable with off-the-shelf AI. Three real replacements are forming in parallel, and most people haven't noticed.
Government digital ID: Aadhaar covers 1.3B people. The EU is rolling out eIDAS 2.0. Mature, state-backed. Doesn't cross borders, and if you're undocumented you're invisible.
Document zero-knowledge proofs: Humanity Protocol, zkPassport. Prove things about yourself without revealing the document. Low friction. Problem is the underlying document still has to be real, and AI fakes are getting good.
Biometric proof of human: World ID is the one I kept circling. A device called an Orb takes images of your face and eyes and converts them to a cryptographic identifier, and the images never leave the device. Around 18M verified across 160 countries. Tinder is piloting it in Japan for age and bot resistance. Most AI-resistant of the three.

My honest read is none of these wins outright. You end up with a stack. Bank uses government ID. Dating app uses biometric proof of human because age verification is legally required in places like Japan and you can't fake an Orb with Midjourney. Forum login uses ZKP because nobody needs nuclear-grade assurance to comment on a recipe.

The real question isn't whether verification gets stronger. It's who owns the verification layer.


u/Capital-Run-1080 — 9 days ago

▲ 1 r/AiForSmallBusiness

Right now when an AI agent books something or makes a purchase on your behalf, the platform receiving that request has no idea if it's coming from one person, one person running a hundred agents, or a bot swarm. They all look the same.

World's AgentKit tries to fix that. You verify as a human once, and that proof travels with any agent you delegate to. The platform gets a yes or no on whether a real unique person is behind the request, without learning who that person is.
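Here's a toy sketch of the "yes or no without learning who" idea. Big caveat: this is my own simplification and not AgentKit's API. I'm assuming a per-platform pseudonym derived with an HMAC and a hypothetical `Verifier` service that keeps a lookup table. A real system would more likely use zero-knowledge proofs than a table like this.

```python
import hashlib
import hmac


def pseudonym(human_secret: bytes, platform_id: str) -> str:
    """Derive a stable per-platform tag that does not reveal the human's identity."""
    return hmac.new(human_secret, platform_id.encode(), hashlib.sha256).hexdigest()


class Verifier:
    """Hypothetical verification service; it only ever stores opaque tags."""

    def __init__(self) -> None:
        self._registered = set()

    def enroll(self, human_secret: bytes, platform_id: str) -> str:
        """Register a verified human for one platform and return their tag."""
        tag = pseudonym(human_secret, platform_id)
        self._registered.add((platform_id, tag))
        return tag

    def is_verified_human(self, platform_id: str, tag: str) -> bool:
        """The platform's only question: is a real enrolled person behind this tag?"""
        return (platform_id, tag) in self._registered


verifier = Verifier()
alice_secret = b"alice-device-secret"  # made-up value, never sent to the platform

# Alice verifies once; every agent she delegates to carries the same tag.
shop_tag = verifier.enroll(alice_secret, "shop.example")
agent_requests = [("agent-1", shop_tag), ("agent-2", shop_tag)]

answers = [verifier.is_verified_human("shop.example", tag) for _, tag in agent_requests]
forged = verifier.is_verified_human("shop.example", "not-a-real-tag")

# The same person gets a different tag elsewhere, so platforms cannot link her.
other_tag = pseudonym(alice_secret, "tickets.example")
```

The platform only sees an opaque tag and a boolean. It never sees `alice_secret` or anything that names her.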

Whether it gets enough adoption to matter is a separate question. But McKinsey has agentic commerce at $3-5 trillion by 2030 and nobody has figured out the trust layer for that yet.

u/Capital-Run-1080 — 17 days ago

▲ 126 r/dogmemes

u/Capital-Run-1080 — 18 days ago

▲ 2 r/AI_Agents

So I've been reading up on the World AgentKit launch from April 17 and figured I'd share what I pieced together.

The basic idea is a verified human delegates their World ID to an agent, and the agent carries cryptographic proof that a real person is behind it. Three capabilities in the toolkit: agent delegation (standing authorization), human in the loop (the agent has to come back for approval on sensitive actions), and a verified-human signature on purchase orders for commerce.
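The human-in-the-loop part is the easiest to picture in code. This is just my sketch of the pattern, not anything from the actual SDK: `Action`, `run_action`, and the dollar threshold for what counts as "sensitive" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    description: str
    amount_usd: float


def run_action(
    action: Action,
    approve: Callable[[Action], bool],
    limit_usd: float = 50.0,
) -> str:
    """Execute small actions under standing authorization; escalate the rest."""
    if action.amount_usd <= limit_usd:
        # Covered by the delegation the human already granted.
        return "executed"
    # Sensitive action: the agent has to go back to the human for a decision.
    return "executed" if approve(action) else "rejected"


def always_deny(action: Action) -> bool:
    """Stand-in for a human who says no."""
    return False


def always_allow(action: Action) -> bool:
    """Stand-in for a human who says yes."""
    return True


small = run_action(Action("renew domain", 12.0), always_deny)
big_denied = run_action(Action("book flights", 900.0), always_deny)
big_allowed = run_action(Action("book flights", 900.0), always_allow)
```

In a real integration the `approve` callback would be whatever prompts the person, for example a push notification or an approval step in a workflow.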

Launch partners were Okta, Vercel, Browserbase, Exa. Vercel shipped an npm package that drops a human-approval step into their Workflow SDK. Browserbase gives agents with a World ID "verified traffic" status so they hit fewer anti-bot blocks. Exa gives verified agents 100 free API calls a month before falling back to x402. There was also a Shopify demo for the commerce flow. One detail I didn't expect: one human can delegate to multiple agents, and that's by design. The website still sees they trace back to the same person, so rate limiting works at the human level, not the agent level.
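The rate-limiting detail is easier to see in code. A minimal sketch of what "limit the human, not the agent" could look like on the website's side, assuming each request arrives with some opaque tag identifying the human behind it. `HumanRateLimiter` and the tag format are my inventions, not part of AgentKit.

```python
from collections import defaultdict


class HumanRateLimiter:
    """Counts requests per human tag, no matter which agent sent them."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self._counts = defaultdict(int)

    def allow(self, human_tag: str, agent_id: str) -> bool:
        # agent_id is accepted for logging but deliberately ignored for counting.
        if self._counts[human_tag] >= self.max_requests:
            return False
        self._counts[human_tag] += 1
        return True


limiter = HumanRateLimiter(max_requests=3)

# Four different agents, all delegated by the same human.
same_human = [limiter.allow("human-A", f"agent-{i}") for i in range(4)]

# A different human starts with a fresh budget.
other_human = limiter.allow("human-B", "agent-9")
```

Spinning up more agents doesn't buy more quota here, because the counter is keyed on the human tag. A production limiter would also need time windows and expiry, which I left out.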

Curious if anyone here has actually integrated it or looked at the SDK. How's the dev experience?


u/Capital-Run-1080 — 19 days ago

▲ 154 r/dogmemes+1 crossposts

u/Capital-Run-1080 — 19 days ago

▲ 44 r/MichaelJackson

I grew up knowing the hits but somehow Ben slipped past me until I was around 14. Found it on some random playlist and just sat there for a while after it ended. He was younger than me when he recorded it and his voice already had something in it I couldn't explain.

The way he sings "friend" in the chorus is the part that got me then and still gets me now. No showing off, no runs, just a kid meaning every word. I remember thinking, how does someone that young sound that sincere?

What was the song that pulled you in when you first found him?

u/Capital-Run-1080 — 19 days ago

▲ 30 r/europrivacy+1 crossposts

Tinder and Zoom are rolling out optional eye-scan verification through World ID to prove users are real humans, not AI bots or deepfakes. The Orb scans your iris, generates a unique code, then deletes the image. You get a "verified human" badge without revealing your identity. Aimed at fighting romance scams on Tinder and deepfake intrusions in Zoom calls. Privacy advocates are split on biometric tradeoffs.

source: bbc.com/news/articles/cp9vppem4evo

u/Capital-Run-1080 — 19 days ago

▲ 76 r/90smemorylane

Julia Roberts and Matthew Perry first crossed paths on the set of Friends in 1995, when Julia guest-starred as his love interest. The two began dating, and their relationship lasted into 1996. After they split, Matthew struggled with addiction for years.

Julia stayed in his corner long after the romance ended and reportedly offered her support during his hardest stretches. Matthew passed away in October 2023, on Julia's birthday.

u/Capital-Run-1080 — 19 days ago

▲ 27 r/90s

Julia Roberts and Matthew Perry first crossed paths on the set of Friends in 1995, when Julia guest starred as his love interest. The two began dating and their relationship lasted into 1996.

Julia stayed in his corner long after the romance ended and reportedly offered her support during his hardest stretches. Matthew passed away in October 2023, on Julia's birthday.

u/Capital-Run-1080 — 19 days ago

▲ 1 r/interestingasfuck

u/Capital-Run-1080 — 20 days ago

▲ 2.7k r/2000smemorylane+3 crossposts

u/Capital-Run-1080 — 19 days ago

▲ 207 r/Funnymemes

u/Capital-Run-1080 — 20 days ago

▲ 1.0k r/Funnymemes

u/Capital-Run-1080 — 23 days ago

▲ 261 r/dogmemes

u/Capital-Run-1080 — 23 days ago

▲ 233 r/funnypets

u/Capital-Run-1080 — 23 days ago

▲ 0 r/technology

u/Capital-Run-1080 — 23 days ago

▲ 11 r/MovieSuggestions

Mine's Spider-Man 2. The first one set everything up well, but the sequel is where it actually clicks. The train scene alone puts it over. What's yours?


u/Capital-Run-1080 — 24 days ago

▲ 55 r/Cinephiles

They've got this weirdly durable on-screen chemistry that I don't think either of them quite hits with anyone else. The Wedding Singer in '98, 50 First Dates in '04, and Blended in '14, almost like they check in on each other every decade.

Mine's The Wedding Singer. I think it's the one where Sandler's sweetness actually has somewhere to go. The 80s setting could've been a gimmick but it ends up doing real work, especially with the music. And the airplane scene at the end still gets me even though I've seen it a dozen times. Billy Idol cameo aside, it's just a really well constructed romantic comedy that doesn't try to be more than it is.

50 First Dates I respect more than I love. The premise is darker than people give it credit for and Sandler plays it straighter than usual. Blended, I'll be honest, I barely remember.

Curious where everyone else lands. And does anyone actually rate Blended? Feel like that one gets dismissed but maybe there's a case for it.

u/Capital-Run-1080 — 24 days ago

▲ 50 r/productivity

For years I had this strange gap where I'd finish a book, feel like I really understood it, and then a week later someone would bring it up and I'd have nothing. Just a vague "yeah it was good" and maybe one quote I half remembered. After a while I started wondering if I was actually reading or just moving my eyes across pages.

A few months ago I tried something simple. I keep a notebook open next to me while I read. Whenever I hit a section that lands, or an idea I want to push back on, or something I don't fully get, I write about it in my own words. Not a summary. More like, what did this just say to me, and what do I think about it.

The rule I gave myself was that if I couldn't write what a chapter was about after finishing it, I hadn't actually read it. I'd just looked at the words. That sounds harsh but it was true more often than I'm comfortable admitting.

Two things changed. The obvious one is retention. I can pull up arguments and ideas from books I read three or four months ago because writing about them once seems to lock things in way harder than rereading or highlighting ever did. I've underlined things in books my whole life and never went back to them. Notes I wrote myself, I remember.

The less obvious thing is that my own thinking got cleaner. When you have to put raw understanding into sentences, you find out fast whether you actually believe something or whether you were just nodding along because the author sounded confident. A lot of stuff I thought I agreed with fell apart the moment I tried to write it out. And some stuff I thought I disagreed with turned out to make more sense than I'd given it credit for.

The downside is it's slower. A book that used to take me a week now takes two or three. But I finish with something I can actually use. Ideas I can talk about, things that change how I work, instead of a fuzzy sense that I learned something.

Curious if anyone else does this. Do you have a system, like a specific notebook or a Notion setup, or do you just freestyle it? And if you've tried it and dropped it, what didn't work?


u/Capital-Run-1080 — 24 days ago