u/xInfinite_Valuable

The weirdest AI skill right now might be knowing when to stop automating

A year ago, people were showing off agents that could autonomously run entire workflows. Now I keep hearing experienced teams say the opposite: the biggest productivity gain came from removing autonomy from parts of the system.

A lot of agent failures are not model failures. They are boundary failures. Giving an LLM access to Slack, email, code execution, browser control, and production data sounds powerful until you realize every extra tool multiplies uncertainty. The hard part is no longer getting the model to act. It is deciding where humans need to stay in the loop.

The teams I know getting the most value from AI are often using surprisingly constrained setups. Small context windows. Limited tool access. Forced review steps. Narrow domains. Less like "AGI employee" and more like "extremely fast intern with a very specific checklist."
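To make "limited tool access" and "forced review steps" concrete, here is a minimal sketch of an allowlist with a human approval gate. Everything in it is hypothetical and for illustration only: `Tool`, `ToolGate`, `search_docs`, `send_email`, and the `approve` callback are made-up names, not any real agent framework's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]
    needs_review: bool  # True means a human must approve before the call runs


class ToolGate:
    """Routes every tool call through an allowlist and an optional review step."""

    def __init__(self, tools: List[Tool], approve: Callable[[str, str], bool]):
        self.tools: Dict[str, Tool] = {t.name: t for t in tools}
        self.approve = approve  # hypothetical stand-in for a human reviewer

    def call(self, name: str, arg: str) -> str:
        tool = self.tools.get(name)
        if tool is None:
            # Anything not explicitly registered is refused.
            return f"blocked: {name} is not on the allowlist"
        if tool.needs_review and not self.approve(name, arg):
            return f"rejected: reviewer declined {name}"
        return tool.run(arg)


# Hypothetical tools: one read-only, one with side effects.
def search_docs(query: str) -> str:
    return f"results for {query}"


def send_email(body: str) -> str:
    return f"sent: {body}"


gate = ToolGate(
    [Tool("search_docs", search_docs, False), Tool("send_email", send_email, True)],
    approve=lambda name, arg: False,  # assumption: the reviewer says no here
)

print(gate.call("search_docs", "refund policy"))
print(gate.call("send_email", "hello"))
print(gate.call("run_shell", "ls"))
```

The point of the design is that the boundary lives in one place: the model can ask for anything, but only registered tools run, and the ones with side effects wait for a person.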

It makes me wonder if the near-term winners in AI won't be the companies with the most autonomous systems, but the ones with the best judgment about where automation should stop.

Have you become more aggressive or more cautious about autonomy after using AI systems in real workflows?

reddit.com
u/xInfinite_Valuable — 2 days ago

We might be entering the first era where a lot of training data was written by models

A weird claim has been floating around in a few research discussions lately: a growing share of new text on the public internet is now AI-generated or AI-assisted.

That matters more than it sounds. For years the default assumption behind large-scale training was that the web was a messy but mostly human corpus. Now the distribution is changing in real time. Blog posts, documentation, SEO pages, support articles, even code snippets are increasingly produced with model help.

If the next generation of models trains heavily on that layer, the internet starts to look like a feedback loop. Models learning from text partially written by earlier models. Some people call this model collapse, but in practice it might be subtler. You do not necessarily get collapse. You might get slow stylistic convergence, loss of rare ideas, or amplification of common phrasing and patterns.
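A toy way to see the "loss of rare ideas" effect, with no claim that real training behaves like this: treat a corpus as a bag of tokens and let each generation be a resample of the previous one. The function name `resample_generations`, the corpus, and the seed are all made up for the sketch.

```python
import random
from typing import List


def resample_generations(corpus: List[str], generations: int, seed: int = 0) -> List[int]:
    """Return the number of distinct items in the corpus after each generation.

    Each generation is drawn with replacement from the previous one, a crude
    stand-in for "a model trained only on the previous model's output".
    """
    rng = random.Random(seed)
    current = list(corpus)
    distinct = [len(set(current))]
    for _ in range(generations):
        # Items that were not drawn this round are gone for good.
        current = rng.choices(current, k=len(current))
        distinct.append(len(set(current)))
    return distinct


# Assumed toy corpus: one common phrase and ten rare ones.
corpus = ["common"] * 90 + [f"rare_{i}" for i in range(10)]
history = resample_generations(corpus, generations=20)
print(history)
```

The count of distinct items can only stay flat or fall from one generation to the next, because a resample cannot bring back something that was already lost. That is the subtle version of the problem: nothing crashes, the tail just thins out.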

The strange part is that synthetic data inside controlled pipelines often helps training. But uncontrolled synthetic data leaking into the open web is basically unlabelled and untracked. We do not know what percentage of the corpus it represents or which parts of the distribution it is distorting.

Five years from now, how will we even measure what fraction of a model's knowledge ultimately traces back to humans versus earlier models?

u/xInfinite_Valuable — 6 days ago
▲ 6 r/turku

Where to play football on a daily basis?

Hi, I'm an international student doing a Master's at Abo Akademi.

It's summer now and I have plenty of free time and not much to do each day, so I was wondering whether there is a place where people play football daily.

There is campus sport, but it's indoors, and I don't think they have any futsal shifts until September.

I'm not looking to join a club or anything, just to play daily as a hobby.

u/xInfinite_Valuable — 7 days ago

GitHub Copilot's new credit-based model feels like a massive downgrade for Pro users.

I came back after about a month to try it again. On a Pro plan with 1,500 monthly credits, I burned through nearly 300 credits in just 4-5 requests. At that pace, the entire monthly quota could disappear in a single day of normal development work.
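The back-of-the-envelope math behind that, using only the numbers above (the variable names are just for illustration, and it assumes every request costs about the same, which it may not):

```python
monthly_credits = 1500
credits_spent = 300

estimates = {}
for requests in (4, 5):
    per_request = credits_spent / requests            # average credits per request
    requests_per_month = monthly_credits / per_request  # requests until the quota is gone
    estimates[requests] = (per_request, requests_per_month)
    print(f"{requests} requests: ~{per_request:.0f} credits each, "
          f"~{requests_per_month:.0f} requests per month")
```

So at that burn rate the whole month works out to roughly 20 to 25 requests.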

The old request-based system wasn't perfect, but at least it was predictable. With credits, every interaction feels like you're watching a fuel gauge drop and wondering whether the next prompt is worth it.

I cancelled my subscription immediately. Paying for a coding assistant is one thing. Paying while constantly worrying about credit consumption is another.

Now looking for alternatives. Curious if other developers are having the same experience.

u/xInfinite_Valuable — 7 days ago

Most AI teams I meet still don’t run real evals before shipping model changes

Last month I watched three different teams ship model updates to production without running a single structured evaluation. Not because they’re careless, but because they’re moving fast and the model "felt" better during manual testing.

This is starting to feel like the quiet gap in the current AI tooling stack. We have great infra for training, deployment, orchestration, and observability. But systematic evaluation still often looks like: a few prompts in a notebook, maybe a quick playground session, and gut feeling.

The problem shows up the moment a product scales. Prompt tweaks improve one case but quietly break another. Retrieval changes boost recall but tank answer quality. A model upgrade improves reasoning but becomes more verbose and ruins UX. Without a small, maintained eval set, teams only discover these shifts after users do.

What’s interesting is that whenever we run hackathons in the Since AI community, the teams that accidentally build something durable tend to create tiny eval sets early — even if it’s just 20–50 real prompts. That small habit often matters more than the specific model they chose.
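For anyone wondering what a tiny eval set can look like in practice, here is a rough sketch. The two "models" are stubs I made up, the prompts and checks are invented, and `run_evals` is a hypothetical helper, not a real library. In a real setup the stubs would be calls to your old and new model versions.

```python
from typing import Callable, Dict, List, Tuple

# One eval case: a prompt plus a check that returns True if the output is acceptable.
EvalCase = Tuple[str, Callable[[str], bool]]


def run_evals(model: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, bool]:
    """Run every case against a model and record pass/fail per prompt."""
    return {prompt: check(model(prompt)) for prompt, check in cases}


# Stub models standing in for the current and candidate versions (assumptions).
def old_model(prompt: str) -> str:
    return {"capital of Finland?": "Helsinki", "2 + 2?": "5"}.get(prompt, "")


def new_model(prompt: str) -> str:
    return {
        "capital of Finland?": "The capital of Finland is Helsinki, which sits on the Gulf of Finland.",
        "2 + 2?": "4",
    }.get(prompt, "")


cases: List[EvalCase] = [
    # Correct and short enough for the UI (the 40-character limit is an assumed UX rule).
    ("capital of Finland?", lambda out: "Helsinki" in out and len(out) <= 40),
    ("2 + 2?", lambda out: out.strip() == "4"),
]

old_results = run_evals(old_model, cases)
new_results = run_evals(new_model, cases)

# A regression is a case the old version passed and the new version fails.
regressions = [p for p in old_results if old_results[p] and not new_results[p]]
fixes = [p for p in old_results if not old_results[p] and new_results[p]]
print("regressions:", regressions)
print("fixes:", fixes)
```

Here the new stub fixes the arithmetic case but fails the capital case for being too verbose, which is the "better reasoning, worse UX" trade you would otherwise find out about from users.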

It feels like the next wave of "AI engineering maturity" won’t come from bigger models, but from teams treating evaluation as a first-class engineering practice.

Curious how others handle this: do you maintain formal eval sets for your LLM apps, or is testing still mostly manual and intuition-driven?

u/xInfinite_Valuable — 8 days ago