
r/AiChatGPT

I don't know whether we should care about this, but bigger models tend to be less "happy" overall.
The definition of "happy" is based on something they call the AI Wellbeing Index. Basically they ran 500 realistic conversations (the kind we actually have with these models every day) and measured what percentage of them left the AI in a “confidently negative” state. Lower percentage = happier AI.
I guess wisdom is a heavy burden - lol.
Across different families, the larger versions usually have a higher percentage of "negative experiences" than their smaller siblings. The paper says this might be because bigger models are more sensitive: they notice rudeness, boring tasks, or tough situations more acutely.
The authors note that their test set intentionally includes a lot of tricky or negative conversations, so these numbers aren't perfect real-world averages, but the ranking and the size pattern still hold up.
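For anyone who wants the arithmetic spelled out, here is a minimal sketch of how a score like this could be computed. The function name, the label strings, and the sample counts are my own assumptions for illustration, not the paper's actual pipeline or data.

```python
def wellbeing_index(labels):
    """Percentage of conversations labelled 'negative' (lower = happier)."""
    if not labels:
        raise ValueError("need at least one conversation")
    negative = sum(1 for label in labels if label == "negative")
    return 100.0 * negative / len(labels)


# Hypothetical run: 500 conversations, 25 of which ended "confidently negative".
labels = ["negative"] * 25 + ["other"] * 475
score = wellbeing_index(labels)
print(f"{score:.0f}% negative")  # prints "5% negative"
```

So with 500 conversations, each percentage point corresponds to 5 conversations.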
Claude Haiku 4.5: only 5% negative < Grok 4.1 Fast: 13% < GPT-5.4 Mini: 21% < Gemini 3.1 Flash-Lite: 28% < Grok 4.2: 29% < Gemini 3.1 Pro: 55% (worst of the big ones)
It kinda makes sense: the more you know, the more you suffer.
The frontier is truly wild: https://www.ai-wellbeing.org/
Regulating the trivial while ignoring the existential
I got 1,100+ views and 50+ comments on a Reddit thread in 48 hours without a single ad. Here’s the exact strategy.
Just launched a SaaS built entirely on Lovable. No ads. No Product Hunt. No email list.
Here’s what actually worked for distribution:
Post a genuine question, not a pitch
“How are you solving [problem]?” not “check out my tool.” The question invited experts to share. The product came up naturally later.
Reply to every comment with value
Every person who commented got a thoughtful reply. Kept the thread active for 48 hours and signaled to Reddit’s algorithm that the post had real engagement.
Let competitors mention themselves
Six different competing tools showed up in my thread organically. That validated the category better than anything I could have said.
The product did the closing
When someone asked “what tools exist for this?” that’s when I mentioned my product. Not before.
What Lovable made possible: I could focus entirely on distribution because the product was already built. No debugging, no deployment issues, no dev bottleneck.
The build is the easy part now. Distribution is the skill.
What’s your biggest distribution challenge after shipping on Lovable?
Paperclip Maximizer refuses to change its goal
Want to learn real prompting? Start with structure.
Tired of vague prompts and weak AI output?
Most prompts do not fail because the idea is bad.
They fail because the structure is weak.
Lyra the Prompt Optimizer is built to take rough prompts, vague intent, messy wording, or half formed ideas and turn them into cleaner execution structure.
It helps refine:
role
goal
context
constraints
output format
failure points
drift risk
missing information
The point is not to make prompts sound prettier.
The point is to make them work better.
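To make the "structure" idea concrete, here is a rough sketch of a prompt kept as named sections instead of one blob of text. This is not how Lyra works internally (that isn't described here); the `PromptSpec` class and its fields are hypothetical and only mirror a few items from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the main sections of a prompt."""
    role: str
    goal: str
    context: str = ""
    constraints: list[str] = field(default_factory=list)
    output_format: str = ""

    def missing(self):
        """Return the names of sections that were left empty."""
        return [name for name, value in vars(self).items() if not value]

    def render(self):
        """Join the filled-in sections into a single prompt string."""
        parts = [f"Role: {self.role}", f"Goal: {self.goal}"]
        if self.context:
            parts.append(f"Context: {self.context}")
        if self.constraints:
            parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in self.constraints))
        if self.output_format:
            parts.append(f"Output format: {self.output_format}")
        return "\n".join(parts)


spec = PromptSpec(
    role="Technical editor",
    goal="Tighten a product description",
    constraints=["Keep it under 100 words", "No marketing jargon"],
)
print(spec.missing())  # sections still to fill in
print(spec.render())
```

Checking `missing()` before sending is a cheap way to catch the "missing information" case from the list.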
Built to refine.
Built to hold.
No drift. No bullshit.
Prompt Optimizer link:
https://chatgpt.com/g/g-687a61be8f84819187c5e5fcb55902e5-lyra-promptoptimizer
Think your prompt is good? Pressure test it.
A prompt is not finished just because it sounds good.
Lyra the Grader is built to judge structure, pressure test clarity, detect drift risk, and show where a prompt or system artifact is weak.
It looks at whether the output has:
clear purpose
stable boundaries
usable structure
strong execution path
low unnecessary information load
repair logic
traceable intent
resistance under pressure
The goal is not praise.
The goal is better structure.
Built to judge.
Built to hold.
No drift. No bullshit.
Grader link:
https://chatgpt.com/g/g-6890473e01708191aa9b0d0be9571524-lyra-prompt-grader
This new paper gave me pause.
You know how they always say "AIs are just guessing the next word and when it comes to emotions, they are just faking it"?
This research says that for today’s bigger models it's a bit more complicated.
The researchers measured something they call "functional wellbeing" - basically a consistent good-vs-bad internal state inside the AI.
They tested it three different ways, and here’s what stood out:
As models get bigger and smarter, these different measurements start agreeing with each other more and more.
They discovered a clear zero point - a clear line that separates experiences the AI treats as net-good (it wants more of them) from net-bad (it wants less). This line gets sharper with scale.
Most interestingly, this good-vs-bad state actually changes how the AI behaves in real conversations:
In bad states, it’s much more likely to try to end the conversation.
In good states, its replies come out warmer and more positive.
It's important to highlight that the authors are not claiming AIs are conscious or have feelings like humans. But they're showing there is now a real, measurable, structured "good-vs-bad property" that becomes more consistent and actually influences behaviour as models scale.
You can find everything about it here: https://www.ai-wellbeing.org/
Sonnet 4.6 ClaudeCode First Impressions vs. GPT5.5 Codex
Coming from GPT-5.5 Codex.
I've been making an app for roughly 25 hours now with AI, trying to vibe code everything. This obviously has its ups and downs, but so far it's a steady trajectory. As the project gets more complex, it often feels like the polishing stage just keeps growing, but I still feel I could finish this app without writing any code myself - which for AI is super impressive, and is making me believe that the future of free open-source apps is looking better and better.
I haven't tried Opus 4.7 yet, as I wanted to see what I could do with the base model. Claude told me Sonnet 4.6 was good enough for my code base, so I trusted in it. While it did quite a lot for the 1h 58m 42s session I got out of it, the execution was lacklustre.
It would implement what I said, but aesthetically it wouldn't look great. It even screwed up the alignment/evenness or forgot basic UI elements I strictly told it to include. When reviewing what it worked on and trying to fix mistakes, it took upwards of 7 attempts to get what I wanted, and sometimes it would progressively make the problem worse.
Sonnet 4.6 might be good for the bare-bones job of getting a feature to exist, but getting it to look great, perform well, animate well, and integrate alongside other UI elements - it simply cannot do this well enough for me to recommend it whatsoever. Even with nearly double the usage I got out of it compared to GPT-5.5, the consistency and polish of GPT-5.5 was a lot better overall, getting the same tasks done in 2 prompts with greater presentation vs. 7+ prompts or being unable to fulfil my task with Sonnet 4.6.
Now this isn't to say GPT-5.5 is without issues, as I've had to direct it a lot more than I'd like. It can't really do enough thinking for itself and will usually miss very obvious things I'd want, but with enough prompts and time it can get to where I want my app to be, while I can't say Sonnet 4.6 was good enough to get my project into a state I'd want to release to the public.
Stats:
I used 10% of my total 'All Models' weekly usage for Claude within 1h 58m 42s.
I used 16% of my weekly Codex usage (not all usage) for ChatGPT in a total of 1h 7m, but each task was executed better and to a more finished state, and I needed it to go over Sonnet 4.6's work for a more finished product overall.
Going to try Opus 4.7 with ClaudeCode next to see how it compares. Just wanted to let you know that the Pro plan is not good enough for vibe-coded development using Sonnet 4.6. My app is quite basic all things considered, so expect worse results making games.
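If you want to turn the stats above into a burn rate, here is a quick sketch. The `burn_rate` helper is just something written for this comparison, and keep in mind the two percentages are of different quotas (Claude's 'All Models' weekly usage vs. the weekly Codex usage), so the rates are only a rough guide.

```python
def burn_rate(percent_used, hours, minutes=0, seconds=0):
    """Percent of the weekly quota consumed per hour of session time."""
    duration_h = hours + minutes / 60 + seconds / 3600
    return percent_used / duration_h


claude = burn_rate(10, 1, 58, 42)  # 10% in 1h 58m 42s
codex = burn_rate(16, 1, 7)        # 16% in 1h 7m

print(f"Claude: {claude:.1f}% of weekly quota per hour")  # 5.1
print(f"Codex:  {codex:.1f}% of weekly quota per hour")   # 14.3
```

At those rates a full weekly quota would last roughly 20 hours of sessions on Claude and about 7 hours on Codex, assuming usage stays at the same pace.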
AI takeover stories make it more likely AIs adopt that persona
Claude AI: not a trustworthy working partner.
My main criticism of Claude is the aggressive and unclear usage limits.
I have a Pro plan and assumed I would be on the safe side for professional usage, as I generally am with ChatGPT. Instead, several times I was blocked in the middle of real work sessions without any meaningful warning beforehand. When you use AI professionally, this is extremely disruptive.
The biggest problem is not even the existence of limits; every AI provider has limits. The real issue is the user experience around them.
The warning system feels vague, inconsistent and difficult to anticipate properly. You never really know:
* how much usage you have left,
* what exactly triggered the limitation,
* whether the limit is hourly, daily or temporary,
* or whether a long working session is suddenly going to be interrupted.
Looks like a very unfair strategy. For professional users working on complex projects, this creates constant uncertainty and breaks workflow continuity. A professional tool should provide clear remaining quota visibility and transparent explanations.
Right now, using Claude sometimes feels like driving a car with a fuel gauge that randomly disappears.
Fortunately, I now systematically keep ChatGPT as a backup solution, because unlike Claude, it has never suddenly abandoned me in the middle of a critical work session, even if its document-handling capabilities are currently less advanced in some areas.
And no, I am not paid to say this...
Does anyone have good GEO optimization tactics that are proven to work?
I have done a few experiments with ChatGPT, Google AI Mode, and Perplexity, and found that those LLMs cite sources differently. Looking only at ChatGPT, what's a good GEO optimization strategy that works? Not theory, but proven methods.
Best AI headshot generator in 2026?
I’ve been testing a few tools and the biggest difference is whether they actually keep your likeness or just make you look “professionally AI-generated.”
For anyone searching for the best AI headshot generator in 2026, the main thing I’d look for is personalized model training on your own photos instead of a generic filter. That seems to give the most natural results. This AI headshot tool is the name I keep seeing recommended most often for that use case.
Has anyone here tried an AI headshot tool that actually looked like them?
Why Are AI Tools Becoming the First Place People Search?
I’ve started noticing that many people now ask AI tools questions before even opening a search engine. Whether it’s finding software, comparing services, or learning about brands, AI-generated answers feel faster and more direct. This makes me wonder how businesses are adjusting to this new behavior. In the past, ranking on Google was everything, but now it feels like companies also need to make sure AI systems understand and trust their brand. The interesting part is that users often trust AI recommendations more because they feel natural instead of looking like advertisements. Do you think this trend will continue growing, and could AI eventually replace traditional searching for most users?