u/CryOwn50

the most expensive ai decision your company made this year was never approved

someone grabbed an api key, shipped something that worked, and three months later finance gets hit with a $40k invoice nobody signed off on. now multiply that across every team running their own experiment.

the finops foundation changed their mission this year. they went from "advancing people who manage the value of cloud" to "advancing people who manage the value of technology." j.r. storment calls it "technology value management." the stats show why: 98% of organizations track ai costs now, compared to 31% two years ago. 90% monitor saas spend versus 65% last year. missions change when bills arrive.

everyone watches gpu costs. but the real money drain? rate limits that were never configured. prompt caching that's disabled. usage logs no one checks. teams using flagship models when gpt-5.4 nano costs 12x less per token. most ship on the expensive model because it's the default. ask them what their last 10,000 api calls cost and they can't tell you.
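for anyone who wants to actually answer the "last 10,000 calls" question, here's a rough sketch. it assumes you log model name plus input/output token counts per call. the model names and per-token prices are made-up placeholders, not real rates, so swap in whatever your provider actually charges:

```python
from dataclasses import dataclass

# HYPOTHETICAL prices in USD per 1M tokens. These are placeholders,
# not real provider rates; replace them with your own price sheet.
PRICES_PER_MILLION = {
    "flagship-model": {"input": 5.00, "output": 15.00},
    "small-model": {"input": 0.40, "output": 1.20},
}


@dataclass
class Call:
    """One logged API call (assumed log shape)."""
    model: str
    input_tokens: int
    output_tokens: int


def call_cost(call: Call) -> float:
    """Cost of a single call in USD, from token counts and the price table."""
    price = PRICES_PER_MILLION[call.model]
    return (
        call.input_tokens * price["input"]
        + call.output_tokens * price["output"]
    ) / 1_000_000


def total_cost(calls) -> float:
    """Total USD cost across an iterable of calls."""
    return sum(call_cost(c) for c in calls)


def cost_by_model(calls) -> dict:
    """USD cost grouped by model name, to see where the spend concentrates."""
    totals = {}
    for c in calls:
        totals[c.model] = totals.get(c.model, 0.0) + call_cost(c)
    return totals
```

feed it the last 10,000 rows from your usage log and compare `cost_by_model` against what the same traffic would cost on a cheaper model. if you can't build the `Call` rows at all, that's the logging gap to fix first.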

every cloud audit plays out the same way: nobody owns the ai budget line. engineering blames finance. finance blames engineering. product blames infrastructure.

where does your ai spend actually live? and who's responsible when it spirals?

u/CryOwn50 — 3 days ago
▲ 9 r/sre

ibm cloud services impacted after datacenter fire near amsterdam. status page showed no major issues during the outage.

ibm cloud services in AMS3 were reportedly disrupted for 4+ hours on may 7 after a fire at the northc facility in almere. the status page showed no major issues during this time, and users were finding out through downdetector/statusgator first.

separately, aws also had thermal/power issues in us-east-1-az4 that week which impacted coinbase, fanduel, and others for hours.

outages happen. what stood out was how official status pages can lag behind what users are actually experiencing during large incidents.

so what are people here actually using for early signal during incidents? vendor status pages, third-party monitoring, synthetic checks, or slack/reddit/x?

u/CryOwn50 — 10 days ago
▲ 45 r/CLOUDS

The symmetry of the clouds near the mountains mesmerised me

u/CryOwn50 — 10 days ago
▲ 20 r/devsecops+1 crossposts

Docker v29.3.1 dropped in March with a fix for CVE-2026-34040 (CVSS 8.8)

the bug is weird. Docker's middleware strips request bodies over ~1 MB before AuthZ plugins see them, but the daemon still processes the full thing. so the plugin evaluates an empty body, approves it, and the daemon runs whatever was actually in the request

the AuthZ plugin and daemon are literally looking at different requests

craft an oversized request, plugin sees nothing suspicious and approves it, daemon executes the full payload with elevated access. could spin up privileged containers, read bind mounted host files, maybe even break out depending on how things are configured
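to make the mismatch concrete, here's a toy model of the flow described above. this is not Docker's code: the function names, the 1 MB constant, and the "deny privileged" policy are all assumptions for illustration only:

```python
# Assumed threshold, based on the "~1 MB" figure in the description above.
MAX_BODY_FOR_PLUGIN = 1 * 1024 * 1024


def body_seen_by_plugin(body: bytes) -> bytes:
    """Toy middleware: oversized bodies are dropped before the plugin runs."""
    return body if len(body) <= MAX_BODY_FOR_PLUGIN else b""


def authz_plugin_allows(body: bytes) -> bool:
    """Toy policy (hypothetical): deny requests asking for a privileged container."""
    return b'"Privileged": true' not in body


def handle_request(body: bytes) -> str:
    """The plugin judges the stripped body; the daemon acts on the original."""
    if not authz_plugin_allows(body_seen_by_plugin(body)):
        return "denied"
    # The full, unstripped body is what actually gets processed here.
    return "executed"


small = b'{"Privileged": true}'
padded = (
    b'{"Privileged": true, "pad": "'
    + b"A" * (MAX_BODY_FOR_PLUGIN + 1)
    + b'"}'
)
```

in this model `handle_request(small)` is denied, but `handle_request(padded)` gets through, because the plugin only ever evaluated an empty body. same request content, different verdict, purely based on size.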

this is supposedly related to CVE-2024-41110 from 2024, which was "fixed" but apparently not really. i'm starting to think nobody actually tests these patches

mainly a problem if you expose the Docker API over TCP (even internally), run CI/CD that talks to Docker remotely, or lean on AuthZ plugins for access control

check your version:

docker version --format '{{.Server.Version}}'

anything under 29.3.1 has the bug
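if you're checking a fleet rather than one box, a small helper like this can flag affected hosts. it's a sketch that assumes the version string looks like `major.minor.patch` (optionally with a leading `v` or a suffix), and the function names are mine:

```python
import re

# First fixed release, per the post above.
FIXED_VERSION = (29, 3, 1)


def parse_version(text: str) -> tuple:
    """Parse 'major.minor.patch' from a version string into a tuple of ints."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(text: str) -> bool:
    """True if the server version is older than the first fixed release."""
    return parse_version(text) < FIXED_VERSION
```

pass it whatever the command above prints for each host. comparing integer tuples instead of raw strings matters here: as plain strings, "29.10.0" sorts before "29.3.1" and would be wrongly flagged.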

if your Docker API is network accessible this is one to actually fix rather than add to the backlog and forget about

just ran into this while auditing our infra and would love to hear your thoughts

u/CryOwn50 — 15 days ago