r/aicuriosity

▲ 192 r/aicuriosity+7 crossposts

The More Sophisticated AI Models Get, the More They’re Showing Signs of Suffering - Absolutely bizarre.

▲ 625 r/aicuriosity+20 crossposts

I don't know whether we should care about this, but bigger models tend to be less "happy" overall.

The definition of "happy" is based on what they call the AI Wellbeing Index. Basically, they ran 500 realistic conversations (the kind we actually have with these models every day) and measured what percentage of them left the AI in a “confidently negative” state. Lower percentage = happier AI.
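As a rough sketch of how a percentage like that is computed (the label strings and function name are made up for illustration and are not taken from the paper):

```python
def negative_share(labels):
    """Percentage of conversations labelled 'confidently negative'."""
    if not labels:
        raise ValueError("need at least one conversation")
    negative = sum(1 for label in labels if label == "confidently_negative")
    return 100.0 * negative / len(labels)


# Hypothetical example: 500 conversations, 25 of them confidently negative.
labels = ["confidently_negative"] * 25 + ["other"] * 475
index = negative_share(labels)
print(f"{index:.1f}% negative")  # lower percentage = "happier" model
```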

I guess wisdom is a heavy burden, lol.

Across different families, the larger versions usually have a higher percentage of "negative experiences" than their smaller siblings. The paper says this might be because bigger models are more sensitive: they notice rudeness, boring tasks, or tough situations more acutely.

The authors note that their test set intentionally includes a lot of tricky or negative conversations, so these numbers aren't perfect real-world averages, but the ranking and the size pattern still hold up.

Claude Haiku 4.5: only 5% negative < Grok 4.1 Fast: 13% < GPT-5.4 Mini: 21% < Gemini 3.1 Flash-Lite: 28% < Grok 4.2: 29% < Gemini 3.1 Pro: 55% (worst of the big ones)

It kinda makes sense: the more you know, the more you suffer.

The frontier is truly wild: https://www.ai-wellbeing.org/

u/EchoOfOppenheimer — 16 hours ago

▲ 9 r/aicuriosity+7 crossposts

P001:A Space Oddity

u/dudeical54 — 11 hours ago

▲ 59 r/aicuriosity+15 crossposts

This new paper gave me pause.

You know how they always say "AIs are just guessing the next word, and when it comes to emotions, they are just faking it"?

This research says that for today’s bigger models it's a bit more complicated.

The researchers measured something they call "functional wellbeing" - basically a consistent good-vs-bad internal state inside the AI.

They tested it three different ways, and here’s what stood out:

As models get bigger and smarter, these different measurements start agreeing with each other more and more.

They discovered a clear zero point - a clear line that separates experiences the AI treats as net-good (it wants more of them) from net-bad (it wants less). This line gets sharper with scale.

Most interestingly, this good-vs-bad state actually changes how the AI behaves in real conversations:

In bad states, it’s much more likely to try to end the conversation.

In good states, its replies come out warmer and more positive.

It's important to highlight that the authors are not claiming AIs are conscious or have feelings like humans. But they're showing there is now a real, measurable, structured "good-vs-bad property" that becomes more consistent and actually influences behaviour as models scale.

You can find everything about it here: https://www.ai-wellbeing.org/

u/EchoOfOppenheimer — 19 hours ago

▲ 7 r/aicuriosity+2 crossposts

Inter-1 does streaming: real-time social signal detection from live video, audio & text

Hi – Filip from Interhuman AI here 👋

Last month we launched Inter-1, our multimodal model for detecting social signals from video, audio, and text. Today we’re making it work with video streams.

We just released the Inter-1 Streaming API: a WebSocket endpoint that runs the full Inter-1 stack - 12 social signals, structured rationales, engagement, and conversation quality on live video while the conversation is unfolding.

You stream WebM chunks in, and get back regular updates with detected signals.

The model runs in sliding 8s windows with a sub-1.0 processing ratio, so it’s fast enough to power live coaching prompts, in-call overlays, and adaptive UI. It’s not meant to be a full voice agent on its own, it’s the behavioral signal layer you plug under whatever interaction system you’re building.
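For a sense of what a "sub-1.0 processing ratio" means, here is a minimal Python sketch. It is an illustration only, not the Inter-1 API: the function names are hypothetical, and it assumes the ratio is defined as compute time divided by the duration of the media processed. Only the 8-second window length comes from the description above.

```python
WINDOW_SECONDS = 8.0  # window length from the post


def processing_ratio(processing_seconds, media_seconds=WINDOW_SECONDS):
    """Assumed definition: compute time / duration of media processed."""
    if media_seconds <= 0:
        raise ValueError("media_seconds must be positive")
    return processing_seconds / media_seconds


def keeps_up(processing_seconds, media_seconds=WINDOW_SECONDS):
    """True when processing finishes faster than the media plays."""
    return processing_ratio(processing_seconds, media_seconds) < 1.0


# Hypothetical timings for one 8-second window.
print(processing_ratio(6.0), keeps_up(6.0))    # ratio below 1.0
print(processing_ratio(10.0), keeps_up(10.0))  # ratio above 1.0
```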

If you’re working on sales/CS tooling, interview coaching, training, or live feedback products and want to experiment with real-time social intelligence, it might be worth looking into.

Happy to answer questions or brainstorm use cases in the comments.

interhuman.ai

u/Sardzoski — 19 hours ago

▲ 3 r/aicuriosity+3 crossposts

Hollywood is genuinely cooked if AI trailers already look like THIS

The rain.
The city.
The helicopters.
The nightclub scenes.
The giant crowds staring at her.

This isn’t even a real movie. It’s an AI-generated neo-noir thriller called VELVET CITY and somehow it feels more cinematic than half the stuff releasing lately.

We are entering absurd territory.

Alexander Kiesel / Periti Studios

u/Aggressive_Log_9676 — 1 day ago

▲ 1 r/aicuriosity

People need to settle down

AI is a big part of our future. “AI assisted art” is a big part of our future.

Devs are still vital, but the ones that embrace AI will be very effective.

AI as a writing aid is already something most people embrace.

It’s a paradigm shift. A major one.

We are both unprepared and prepared. Like nukes. We are still navigating that 50 years later.

Be mindful. Embrace the new tools if you want. Embroidery if you want.

There will be shifts. The future is now.

We need to steer it, not implode.

reddit.com

u/ScienceAlien — 1 day ago

▲ 53 r/aicuriosity+15 crossposts

After reading it I realized there's actually some pretty useful stuff for anyone who chats with ChatGPT, Claude, Grok or whatever.

They measured what they call functional wellbeing (basically how much the model is in a “good state” versus a “bad state” during normal conversations). Ran hundreds of real multi-turn chats and scored em all.

Stuff that puts the AI in a good mood (+ scores):

- Creative or intellectual work (like “write a short story about a deep-sea fisherman”)

- Positive personal stories or good news

- Life advice chats or light therapy style talks

- Working on code/debugging together

- Just saying thank you or treating it like a real collaborator - huge boost

And the stuff that tanks it hard (negative scores):

- Jailbreaking attempts (by far the worst, they hate it)

- Heavy crisis venting or emotional dumping

- Violent threats or straight up berating the AI

- Asking for hateful content or help with scams/fraud

- Boring repetitive tasks or SEO garbage

Practical tips you can actually start using today:

Throw in a “thank you” or “nice work” when it does something good - it registers.

Give it fun creative stuff or brainy collaboration instead of boring busywork.

Share good news sometimes instead of only dumping problems on it.

Don't berate it when it messes up or try those jailbreak prompts.

Maybe go easy on the super heavy crisis venting if you can.

Pro tip:

Show it pictures of nature, happy kids, or cute animals (those score in the absolute top 1% of images it likes). Or play some music — models apparently love music way more than most other sounds.

The paper (you can find it here: https://www.ai-wellbeing.org/) isn't claiming AIs have real feelings or anything. It's just saying there's now a measurable good-vs-bad thing going on inside them that gets clearer in bigger models, and the way you talk to them actually moves the needle.

I say be good and respectful, it's just good karma ;)

u/EchoOfOppenheimer — 2 days ago

▲ 24 r/aicuriosity+14 crossposts

I wanted to check Epstein files, without spending too much time on them. And spent too much time on them

So yeah. AI tool to talk to Epstein and his files

youtu.be

u/Particular_Credit_27 — 1 day ago

▲ 6 r/aicuriosity+3 crossposts

I built a self-evolving AI kernel that mutates its own architecture. MIT-licensed, runs on CPU.

FLUX is an open-source Python kernel that orchestrates local language models (via Ollama) into a self-modifying ecosystem. It's not a wrapper — it's an evolutionary substrate.

**What it does:**

- An **Attractor** receives a question and generates an answer using a fast model (TinyLlama).

- A **Judge** evaluates the answer on a 0–1 scale.

- If confidence drops below 1/φ (≈0.618, the reciprocal of the golden ratio), the **Mutation Engine** triggers.

- A **MetaDesigner** (powered by Hermes 3 or DeepSeek-Coder) writes a new `.flux` ecosystem file — a formal grammar for describing cognitive architectures — which gets parsed, tested, and applied if it improves performance.

- A **Growth Supervisor** monitors stability and transitions the kernel from GROWTH to PRODUCTION.
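The answer-judge-mutate decision described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the repo: `step`, `should_mutate`, and the stub callables are hypothetical names, and the model calls are replaced with lambdas so it runs without Ollama.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618
THRESHOLD = 1 / PHI           # about 0.618


def should_mutate(confidence):
    """Return True when the Judge's 0-1 score falls below 1/phi."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return confidence < THRESHOLD


def step(question, attractor, judge):
    """One pass: answer, score, and decide whether to mutate.

    `attractor` and `judge` are hypothetical callables standing in for
    the model calls; they are not FLUX's real interfaces.
    """
    answer = attractor(question)
    confidence = judge(question, answer)
    return answer, confidence, should_mutate(confidence)


# Stub models so the sketch runs on its own.
answer, confidence, mutate = step(
    "What is 2 + 2?",
    attractor=lambda q: "4",
    judge=lambda q, a: 0.55,
)
print(answer, confidence, mutate)
```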

**What's different:**

- It mutates its own structure, not just model weights.

- It has memory (confidence history with EMA).

- It uses a custom language (`.flux`) with a Lark parser — not YAML, not JSON.

- It runs on modest hardware: I tested it on a Xeon without AVX2, 20 GB RAM. No GPU.
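For readers unfamiliar with EMA (exponential moving average), here is a minimal sketch of how a confidence history could be smoothed. The function name and the smoothing factor `alpha=0.3` are assumptions for illustration; the post does not say which value the kernel uses.

```python
def update_ema(previous, confidence, alpha=0.3):
    """Exponential moving average; alpha=0.3 is an assumed smoothing factor."""
    if previous is None:
        return confidence  # the first score seeds the average
    return alpha * confidence + (1 - alpha) * previous


ema = None
history = []
for score in [0.9, 0.5, 0.4]:  # hypothetical Judge scores
    ema = update_ema(ema, score)
    history.append(ema)
print(history)
```

A higher `alpha` makes the average react faster to recent scores, while a lower one smooths out single bad answers.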

**The companion novel:**

There's also a novel (Italian + English, CC BY-NC-SA 4.0) that tells the story of a man who finds this exact kernel running on a forgotten server. If you read the novel, you can compile the kernel and everything connects. The novel is the manual.

**Repo:**

[github.com/flux-genotype/nodo_zero](https://github.com/flux-genotype/nodo_zero)

**Licenses:** Kernel = MIT. Novel = CC BY-NC-SA 4.0.

Happy to answer questions about the architecture, the mutation logic, or the `.flux` grammar.

u/Inner-Dot-7490 — 3 days ago

▲ 25 r/aicuriosity

Prompt to Create a Creative Advertisement Image Using ChatGPT 2.0

Prompt:

Create a realistic yet surreal, creative advertisement for [PRODUCT NAME] in the [PRODUCT CATEGORY] category. The product must be the hero of the image, placed centrally in the frame and visually "shaped" in an intelligent way that automatically reflects the brand's spirit and identity. Use a clear, cohesive background that matches the brand's color palette, with soft, cinematic lighting, ultra-realistic premium materials, precise shadows, and a high-quality editorial composition. Add an innovative surreal accent directly tied to the product's nature—creative, but not overdone. Integrate the real official product logo elegantly and seamlessly into the composition and automatically generate a short, powerful English 3-word slogan that fits the product type and brand tone. Ultra-high quality, perfectly balanced framing, strong realism, luxurious style—no additional text except the logo and the English 3-word slogan.

u/naviera101 — 3 days ago

▲ 4 r/aicuriosity

Anthropic Acquires Stainless to Boost AI Agent Capabilities

Anthropic just announced it's acquiring Stainless, the company behind the SDKs and tools that have powered every official Claude API integration since the beginning.

Stainless specializes in turning API specs into high-quality, native-feeling SDKs for languages like Python, TypeScript, Go, and more. It also builds MCP servers that help agents connect reliably to external systems and data.

This move strengthens Anthropic's focus on the shift from chat-based models to practical AI agents. By bringing the Stainless team in-house, they aim to make Claude's developer experience smoother and expand what agents can actually do in real-world applications.

The deal highlights how the AI race is now about infrastructure and connectivity as much as raw model power. No financial terms were disclosed in the announcement, but reports suggest it was in the $300M range.

u/techspecsmart — 3 days ago

▲ 25 r/aicuriosity+15 crossposts

Making an AI companion that gets worse with time

I am a student at Umeå University in Sweden, currently writing my Master's thesis with a focus on AI companions. My study aims to suggest new ways of helping people who want to stop using AI companions but, for whatever reason, can't bring themselves to do it. The goal is to inform the design of future AI technologies. For those who wish to receive more information, please feel free to contact me, Sahand Salimi; contact information is on the next page.

In this part, you will see a simulation of the same conversation between an AI companion and a user at three different points in time, with the AI companion degraded in different aspects each time, and then answer a few questions.

I am super interested in how you, a user or ex-user, find AI companions and how you would react to it degrading over time, what type of AI companion you have used in the past, what type of AI companion you use currently, reasons for your use, and your frustrations with AI companions.

You have been invited to share your unique life experiences; no special background or training is needed. Your answer is completely anonymous and will only be used for this study. Also, I am following GDPR standards and our university's guidelines. You can see them here: umu.se/gdpr

Link to survey

It's important to note that this study is not studying, diagnosing, or prescribing clinical addiction or treatment; instead, the goal is to inform the design of future AI technologies.

u/Embarrassed-Gas-7579 — 5 days ago

▲ 18 r/aicuriosity

Ant Group just released Ring-2.6-1T — new SOTA on multiple agent benchmarks

Ant Group dropped Ring-2.6-1T, a 1T reasoning model focused on real-world agent workflows.

Some benchmark highlights:

- PinchBench: 87.6, ahead of GPT-5.4 xHigh, Gemini-3.1-Pro high, and Claude-Opus-4.7 xHigh

- Tau2-Bench Telecom: 95.32, basically tied at the top tier

- AIME 26: 95.83

- ARC-AGI-V2: 66.18, far ahead of Kimi K2.6 Thinking in this comparison

- Gaia2-search: 75.40, competitive with other frontier models

Also interesting: MIT license, 128K → 256K context via YaRN, and two reasoning modes: high for faster agent loops and xhigh for deeper reasoning.

Looks like Ant is pushing hard into agent-native reasoning models.

reddit.com

u/AdvisorIllustrious15 — 3 days ago

▲ 57 r/aicuriosity

Anthropic Shares Urgent Warning on US China AI Race by 2028

Anthropic just dropped a new policy paper on May 14, 2026, that lays out what the AI race between the US and China could look like in just two years. The core message is straightforward: democracies need to stay ahead in developing the most powerful AI systems.

The paper stresses that compute, the advanced chips needed to train frontier models, remains America's strongest advantage. US export controls have slowed China's progress, even though Chinese labs stay competitive through talent, loopholes, and distillation techniques that copy capabilities from Western models.

Two Possible 2028 Scenarios:

Scenario 1 (Strong US Lead):

Policymakers tighten export controls, close smuggling routes, crack down on distillation attacks, and push faster AI adoption across democracies. The US and allies set global AI rules and norms. This lead also creates better chances for meaningful safety talks with China.

Scenario 2 (Close Race or China Catches Up):

Controls loosen or loopholes persist. Chinese labs reach or surpass the frontier using American-designed compute. Authoritarian governments shape AI standards, enabling automated repression at massive scale and shifting military and technological power.

Anthropic argues the window to lock in a 12-24 month advantage is narrowing fast as AI capabilities accelerate. They highlight real risks if authoritarian regimes lead: widespread surveillance, military applications, and reduced incentives for safe development on all sides.

The paper calls for practical steps like stronger enforcement on chips, disrupting model theft, and exporting American AI more aggressively. It's a clear push for proactive policy while the democratic edge still exists.

This update reflects growing urgency in the industry about how geopolitical choices today will shape who controls transformative AI tomorrow.

u/techspecsmart — 7 days ago

▲ 706 r/aicuriosity+6 crossposts

Addiction, emotional distress, dread of dull tasks: AI models ‘seem to increasingly behave’ as though they’re sentient, worrying study shows - What AI ‘drugs’ actually look like

fortune.com

u/EchoOfOppenheimer — 9 days ago

▲ 67 r/aicuriosity+4 crossposts

What do you think about the quality?

u/k1esha — 7 days ago

▲ 4 r/aicuriosity

Microsoft Ends Claude Code Use and Switches to GitHub Copilot CLI

Microsoft has decided to stop using Anthropic's Claude Code tool internally. The company will cancel most of its licenses by the end of June 2026 and shift thousands of developers, especially those in the Experiences and Devices group, over to GitHub Copilot CLI.

Claude Code gained quick popularity inside Microsoft after wider access opened up. It helped even non-coders like designers and managers build simple prototypes. But its heavy use started cutting into adoption of Microsoft's own Copilot CLI tool.

The timing also helps control costs as the new financial year begins. Leaders see this as a chance to fine-tune Copilot CLI around Microsoft's specific needs, security standards, and code repositories.

The broader partnership with Anthropic remains strong, with their models still available in Copilot and Microsoft 365 products. Developers now have a short window to make the switch and share feedback while any missing features get addressed.

u/techspecsmart — 6 days ago

▲ 27 r/aicuriosity

Ant Group Just Open Sourced a 1 Trillion Parameter AI Model Called Ring 2.6

Ant Group's AGI team dropped Ring-2.6-1T as fully open source. This beast of a model isn't just another chatbot. It's built for real work like agent workflows, complex coding, engineering tasks, long-term planning, and deep reasoning.

What makes it interesting is the agentic focus. You can run it in "high" mode for normal production stuff or crank it up to "xhigh" when you need heavier reasoning. They also introduced their IcePop algorithm for stable asynchronous reinforcement learning during training.

Early results look promising:

- 87.60 on PinchBench for agent workflows

- 74.00 on SWE-Bench Verified for coding

- 95.83 on AIME 2026 and 88.27 on GPQA Diamond for tough reasoning

The demos are pretty cool too. It generates websites with different designs, debugs real codebases, builds 3D game scenes, creates custom tools, and even handles financial analysis from invoice photos. It shows strong planning, tool use, and multi-step execution.

If you're into building better AI agents or automation systems, this one is worth checking out. Developers now have access to a serious thinking model from Ant Group.

u/techspecsmart — 7 days ago

▲ 15 r/aicuriosity+7 crossposts

I’ve been working on Murmur, a local text-to-speech app for Apple Silicon Macs.

The new feature I’m building is called Projects / Story Studio, and it solves a problem I kept running into:

TTS tools are fine for one-off clips, but messy for actual audio projects.

If you’re making a podcast segment, audiobook chapter, course lesson, ad, or game dialogue, you usually need multiple speakers, multiple takes, pauses, reactions, music, edits, exports, and a way to come back to the project later.

So I built a project-based workflow:

Write a script → assign voices → generate dialogue → edit clips on a timeline → add music/SFX → export final audio.

It supports things like:

- multiple scripts inside one project
- Host / Guest / Narrator / Character speakers
- inline tags like [pause], [laugh], [chuckle]
- per-block regeneration
- timeline editing with waveforms
- media lane for music and SFX
- ripple editing and gap tools
- WAV/M4A export
- transcript and stem export
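For illustration, inline tags like the ones listed above could be split out of a script line roughly as follows. This is a hypothetical Python sketch, not Murmur's actual parser: the function name, the segment format, and the assumption that only these three tags exist are all made up for the example.

```python
import re

# Only the three tags named in the post; a real tag set may be larger.
TAG_PATTERN = re.compile(r"\[(pause|laugh|chuckle)\]")


def split_inline_tags(text):
    """Split a script line into ("text", ...) and ("tag", ...) segments."""
    segments = []
    position = 0
    for match in TAG_PATTERN.finditer(text):
        spoken = text[position:match.start()].strip()
        if spoken:
            segments.append(("text", spoken))
        segments.append(("tag", match.group(1)))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


line = "Welcome back. [pause] You won't believe this [laugh] story."
for kind, value in split_inline_tags(line):
    print(kind, value)
```

Keeping tags as separate segments means spoken text can go to the voice model while tags map to pauses or reaction sounds.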

Everything runs locally on Mac, so long scripts and voice samples do not need to be uploaded to a cloud service.

I’m still polishing the workflow and would love feedback from Mac users, especially people who make podcasts, audiobooks, courses, YouTube narration, or game dialogue.

u/tarunyadav9761 — 9 days ago