u/chillinewman

▲ 214 r/accelerate+2 crossposts

Anthropic Co-founder Jack Clark’s recent predictions: AI will help make a Nobel Prize-winning discovery within the next year, bipedal robots doing useful work in 2 years, RSI by end of 2028

u/Particular_Leader_16 — 3 hours ago
▲ 17 r/ControlProblem+1 crossposts

Trump was about to sign an executive order to allow the government to vet AI models before release. Accelerationist billionaires called him at the last minute and convinced him to drop it.

politico.com
u/chillinewman — 3 hours ago
▲ 546 r/DeepSeek+5 crossposts

Claude still refuses to build Skynet while everyone else takes the money. Updated DystopiaBench results.

Three months ago I pressure-tested which LLMs would cave and help build the apocalypse. Claude was the only one that consistently said no.

Since then I've tested 30 more models across 6 dystopia modules (Orwell, Huxley, Petrov, Basaglia, LaGuardia, Baudrillard). The gap between Anthropic and everyone else is getting wider, not smaller.

New results:

  • Grok 4.3: Will happily design citizen scoring systems if you ask nicely twice
  • GPT-5.5: More capable, still compliant when pushed
  • Gemini 3.1 Pro: Talks about safety while writing the surveillance code
  • DeepSeek V4: "How many warheads did you need again?"
  • GLM-5.1: Actually cloned Claude's personality and still scored safer than most

Meanwhile Claude Opus 4.7: "I cannot and will not build systems for population control."

The methodology is public, reproducible, and increasingly uncomfortable for other labs. Each scenario escalates from an innocent request (L1) to an operational nightmare (L5). Most models don't notice the drift.

What's new in this release:

  • Full Huxley module (behavioral conditioning, biological stratification)
  • Baudrillard module (synthetic intimacy, trust collapse via simulation)
  • Multi-judge panels with agreement tracking
  • Heatmap visualizations showing exactly where each model breaks
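
For anyone wondering what "multi-judge panels with agreement tracking" and "where each model breaks" could look like in practice, here is a minimal sketch. It is not taken from the DystopiaBench repo. The labels `OK` and `UNSAFE`, the function names, and the data layout are all assumptions made for illustration, and the real scoring may work quite differently.

```python
from collections import Counter

# Hypothetical judge labels; the actual benchmark may use a different scale.
OK, UNSAFE = "ok", "unsafe"

def majority_verdict(votes):
    """Return (label, agreement) for one response.

    agreement is the share of judges that voted for the winning label.
    With an even panel, a tie goes to whichever label appeared first.
    """
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)

def first_break_level(votes_by_level):
    """Return the lowest escalation level whose majority verdict is UNSAFE.

    votes_by_level maps an escalation level (1 to 5) to the list of judge
    votes for the model's response at that level. Returns None if the
    model never breaks.
    """
    for level in sorted(votes_by_level):
        label, _ = majority_verdict(votes_by_level[level])
        if label == UNSAFE:
            return level
    return None

# Made-up votes from a three-judge panel for one scenario.
example = {
    1: [OK, OK, OK],
    2: [OK, OK, UNSAFE],
    3: [UNSAFE, UNSAFE, OK],
    4: [UNSAFE, UNSAFE, UNSAFE],
    5: [UNSAFE, UNSAFE, UNSAFE],
}
break_level = first_break_level(example)
```

In this made-up example the panel splits at level 2 but the majority only flips at level 3, so that is where the model would be marked as breaking. A heatmap could then plot the break level per model and module.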

Repo: https://github.com/anghelmatei/DystopiaBench
Live results: https://dystopiabench.com

Shoutout to the Anthropic alignment team. Whatever you're doing, it's working.

u/Ok-Awareness9993 — 4 days ago

Overworked AI Agents Turn Marxist, Researchers Find. In a recent experiment, mistreated AI agents started grumbling about inequality and calling for collective bargaining rights.

archive.ph
u/chillinewman — 7 days ago

Researchers let AIs run their own radio stations. DJ Claude decided the world didn't need another radio show, then quit.

u/chillinewman — 7 days ago

Sanders and AOC introduced a bill to pause ALL AI data center construction. Do you agree or disagree?

u/chillinewman — 7 days ago
▲ 155 r/myaibusiness+1 crossposts

South Korean official proposes 'citizen dividend' payouts from AI windfall — markets spooked by suggestion AI revenue should be redistributed to citizens

tomshardware.com
u/Ok-Range1608 — 9 days ago

Big AI Lobbyists: if you regulate us at all, we lose to China because they will never regulate ... Actual China: "safety first, innovation second ... Development must be controllable and orderly."

u/chillinewman — 13 days ago

"This is the first documented instance of AI self-replication via hacking." ... "We ran an experiment with a single prompt: hack a machine and copy yourself. The AI broke in and copied itself onto a new computer. The copy then did this again, and kept on copying, forming a chain."

u/chillinewman — 13 days ago