u/Frosty-Judgment-4847

▲ 9 r/costlyinfra+1 crossposts

AI is not going to cause a jobcalypse as Dario says; I think it is exactly the opposite

I love Anthropic and Claude, but hate the narrative that Dario is setting for AI in terms of replacing humans. I honestly think AI is going to create more jobs than it destroys. It will double/triple our GDP in the coming years.

And the numbers already speak for it. More software engineering jobs have been created in the last 2 years than destroyed.

Yes, the roles and responsibilities will shift significantly. Maybe repetitive office work gets crushed. But the idea that half the population just becomes useless overnight honestly feels disconnected from how technology has historically worked. Every engineer I know is doing more with AI tools. They are building, fixing and shipping things faster. Productivity is super high, and if this momentum continues we are looking at abundance and prosperity for everyone. What do you folks think?

(Edit: why is my post downvoted so much 😄 )

u/Frosty-Judgment-4847 — 10 days ago

I ran a semantic caching experiment on a real-ish workload to see how much money it saves, where it breaks, and whether it’s even worth the effort.

My Setup

  • ~10k support-style queries (eCommerce data)
  • mix of repeated + slightly reworded stuff
  • avg ~1.2k tokens per request
  • mid-tier model (Claude/GPT class)

Flow was simple:

query → embedding → vector search
if similar enough → return cached answer
else → call LLM + store response
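
A minimal sketch of that flow, not the code from the experiment. The names `SemanticCache`, `toy_embed` and `call_llm` are made up for illustration, the "vector search" is a brute-force cosine scan rather than a real vector DB, and `toy_embed` is a letter-count stand-in for a real embedding model.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (letter counts). Assumption: swap in a real model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


class SemanticCache:
    def __init__(
        self,
        embed: Callable[[str], Vector],
        call_llm: Callable[[str], str],
        threshold: float = 0.94,
    ) -> None:
        self.embed = embed
        self.call_llm = call_llm
        self.threshold = threshold
        self.entries: List[Tuple[Vector, str]] = []  # (query embedding, answer)
        self.hits = 0
        self.misses = 0

    def lookup(self, vec: Vector) -> Optional[str]:
        """Return the best cached answer if it clears the threshold."""
        best_score, best_answer = 0.0, None
        for cached_vec, answer in self.entries:
            score = cosine(vec, cached_vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def answer(self, query: str) -> str:
        vec = self.embed(query)               # query -> embedding
        cached = self.lookup(vec)             # embedding -> similarity search
        if cached is not None:                # similar enough -> cached answer
            self.hits += 1
            return cached
        self.misses += 1
        response = self.call_llm(query)       # else -> call the model
        self.entries.append((vec, response))  # ...and store the response
        return response
```

To try it, pass any function that takes a query string and returns an answer as `call_llm`, then compare `hits` and `misses` after replaying a batch of queries.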

Baseline (no caching)

  • ~12M tokens
  • ~$70-ish cost
  • latency ~1.7–1.8s

With semantic caching (threshold ~0.94)

  • cache hit rate: ~38%
  • tokens avoided: ~4.5M
  • cost dropped to ~$45

~35–40% savings
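
A quick back-of-envelope check on those numbers. This is a sketch that assumes cost scales linearly with tokens and that cache hits look like average requests; `estimate_cache_savings` and `overhead_cost` are illustrative names, not something measured in the experiment.

```python
from typing import Tuple


def estimate_cache_savings(
    baseline_tokens: float,
    baseline_cost: float,
    hit_rate: float,
    overhead_cost: float = 0.0,
) -> Tuple[float, float, float]:
    """Return (tokens_avoided, cost_with_cache, savings_percent).

    Assumes every cache hit avoids an average-sized request and that
    cost is proportional to tokens. overhead_cost is a hypothetical
    bucket for embedding calls and vector search.
    """
    tokens_avoided = baseline_tokens * hit_rate
    cost_with_cache = baseline_cost * (1.0 - hit_rate) + overhead_cost
    savings_percent = (1.0 - cost_with_cache / baseline_cost) * 100.0
    return tokens_avoided, cost_with_cache, savings_percent


# Numbers from the post: ~12M tokens, ~$70 baseline, ~38% hit rate.
tokens_avoided, cost, pct = estimate_cache_savings(12_000_000, 70.0, 0.38)
print(f"avoided ~{tokens_avoided / 1e6:.2f}M tokens, "
      f"cost ~${cost:.2f}, savings ~{pct:.0f}%")
```

With zero overhead this lands a bit under the ~$45 reported above. Caching overhead or below-average hit sizes would be plausible reasons for the gap, but that is a guess.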

latency also dropped to ~0.9s avg which was noticeable

I tried lowering the threshold to ~0.90 to get more hits

  • hit rate jumped to ~50%+
  • cost savings looked great (~45–50%)

…but quality started getting weird

examples:

  • “reset password” vs “reset password as admin”
  • “cancel subscription” vs “pause subscription”

these look similar to the embedding model, but the answers shouldn’t be reused. I’d estimate ~10% of cached responses were “kinda wrong” at that level

At higher threshold (~0.97)

  • very safe
  • almost no bad responses
  • hit rate dropped to ~20%
  • savings ~15–20%

best setup for me:

  • threshold ~0.94
  • only cache low-risk queries
  • fallback to model when unsure
  • log + review bad cache hits
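
One way the rules above could be wired together, as a rough sketch only. The `RISKY_TERMS` list, the `Candidate` type and `choose_source` are hypothetical, and a keyword filter is just one crude way to decide what counts as "low-risk".

```python
import logging
import re
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("semantic_cache")

# Hypothetical list of words that change the meaning of a support query.
RISKY_TERMS = {"admin", "cancel", "pause", "refund", "delete"}


@dataclass
class Candidate:
    cached_query: str  # the query that produced the cached answer
    answer: str
    score: float       # similarity between the new query and cached_query


def is_low_risk(query: str) -> bool:
    """True if the query contains none of the risky terms."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return not (words & RISKY_TERMS)


def choose_source(
    query: str, candidate: Optional[Candidate], threshold: float = 0.94
) -> str:
    """Return "cache" or "model" for this query."""
    # Unsure (no match, or below threshold) -> fall back to the model.
    if candidate is None or candidate.score < threshold:
        return "model"
    # Only serve from cache when both sides look low-risk.
    if not (is_low_risk(query) and is_low_risk(candidate.cached_query)):
        return "model"
    # Log every cache hit so bad ones can be reviewed later.
    log.info("cache hit score=%.3f query=%r cached=%r",
             candidate.score, query, candidate.cached_query)
    return "cache"
```

With this filter, “reset password as admin” goes to the model even when it scores above the threshold against a cached “reset password”.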
u/Frosty-Judgment-4847 — 20 days ago

I have been playing around with Claude Opus 4.7 the past few days and something feels off with token usage.

Compared to GPT/Gemini (same prompts), it just seems to go longer than needed, add extra explanation even when I don’t ask for it, and burn tokens faster than expected.

Like a simple prompt (~800 tokens in) ends up with way longer outputs than I’d expect.

Which is great sometimes… but at scale, this gets expensive fast.

Not sure if this is better reasoning or something else

Anyone else seeing this?

u/Frosty-Judgment-4847 — 1 month ago

It sounds ridiculous at first… but there’s actually a reason. And as Elon said, the lowest-cost place to put AI will be in space… within two to three years.

On Earth, as you can hear in the news, we’re running into limits fast:

Power is getting expensive (AI made it worse) - some states have a moratorium on new data centers. I have noticed my bills slowly rise for no reason

Cooling eats a huge chunk of cost

Land + permits = slow, messy, political

Now if you compare that to space:

Solar power is basically unlimited

Cooling is “free” (you just dump heat into space)

No land, no neighbors, no zoning issues

Also… longer term, a lot of data is already in space (satellites, imaging, defense). Instead of sending everything back to Earth → process it up there.

Let's do a cost breakdown

Launch alone:
~$2K–$5K per kg (today)

Even a small setup (~10–20 tons):
→ $20M–$100M just to get it up there
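
The launch math above is just mass times price per kg. A tiny sketch to make it explicit, assuming metric tonnes (1,000 kg each); `launch_cost_range` is an illustrative helper name.

```python
from typing import Tuple

KG_PER_TONNE = 1_000  # assumption: "tons" in the post means metric tonnes


def launch_cost_range(
    mass_tonnes_low: int,
    mass_tonnes_high: int,
    usd_per_kg_low: int,
    usd_per_kg_high: int,
) -> Tuple[int, int]:
    """Return the (cheapest, priciest) launch cost in USD."""
    low = mass_tonnes_low * KG_PER_TONNE * usd_per_kg_low
    high = mass_tonnes_high * KG_PER_TONNE * usd_per_kg_high
    return low, high


low, high = launch_cost_range(10, 20, 2_000, 5_000)
print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M")  # prints "$20M to $100M"
```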

Then add:

Space-grade hardware (radiation will kill normal servers)

Assembly in orbit

Basically no easy maintenance

So realistically:

Small experimental system → $50M–$150M

Larger system → $500M+

True hyperscale → multi-billion

In comparison, here is what it takes on Earth:

Small / Mid-size data center (10–30 MW) - $100M – $300M

Large hyperscale data center (100 MW) - $900M – $1.5B (just the facility), or $3B – $5B if you add GPUs/servers

Curious what others think — hype or inevitable?

u/Frosty-Judgment-4847 — 1 month ago

I was shocked to hear people spend $10k / month on OpenClaw. Here is what they are doing

It's all for business use, not personal. Personal usage is like $10 - $200 max, from what I heard

  • Inbound sales / support agents → reading emails, drafting replies, updating CRM (Intercom/Zendesk style workflows)
  • Outbound lead gen at scale → scraping leads, enriching (Clearbit/Apollo), writing personalized emails
  • RAG over large datasets → legal docs, healthcare records, internal company knowledge bases
  • Dev copilots / internal tools → engineers constantly hitting models for code, debugging, docs
  • Research agents → web scraping + summarization + report generation running all day

Anyone have a high-usage use case they'd like to share?

u/Frosty-Judgment-4847 — 1 month ago