r/googlecloud

Stop copy-pasting Terraform modules, I built a tested registry for AWS, GCP, and Azure with Terratest and CI
▲ 0 r/googlecloud+3 crossposts

Disclaimer: I built this project and am sharing it as a free open-source tool.

Every project I join has the same problem: someone copied and pasted a VPC module from a blog post in 2019, nobody tested it properly, and now it's load-bearing infrastructure.

This registry has 9 modules across AWS, GCP, and Azure: VPC/VNet, Kubernetes (EKS/GKE/AKS), and IAM/Workload Identity for each cloud.

Every module has:

- A Terratest that provisions real infrastructure and tears it down (no mocks)

- GitHub Actions CI (fmt, validate, tflint, Checkov)

- Secure defaults with every option exposed as a variable

- Working examples you can run in under 5 minutes

**Module list:**

- modules/aws/vpc: VPC, public/private subnets, NAT gateway, route tables

- modules/aws/eks: EKS cluster, managed node groups, OIDC, IRSA

- modules/aws/iam: roles, policies, IRSA binding

- modules/gcp/vpc: VPC, Cloud NAT, Private Google Access, firewall rules

- modules/gcp/gke: GKE cluster, node pools, Workload Identity

- modules/gcp/iam: service accounts, IAM bindings, WI federation

- modules/azure/vnet: VNet, subnets, NSGs, route tables

- modules/azure/aks: AKS, managed identity, OIDC, Workload Identity

- modules/azure/iam: managed identities, federated credentials, role assignments

**Quick start:**

git clone https://github.com/Cloud-Architect-Emma/terraform-module-registry
cd terraform-module-registry/examples/aws
terraform init && terraform plan

**Or reference directly in your code:**

module "vpc" {
  source = "github.com/Cloud-Architect-Emma/terraform-module-registry//modules/aws/vpc?ref=main"

  name = "production"
  cidr = "10.0.0.0/16"
  azs  = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]

  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
}

⭐ If this saves you time, a star on the repo helps others find it: https://github.com/Cloud-Architect-Emma/terraform-module-registry

PRs welcome. What module would you add first?

u/EmmaOpu — 8 hours ago

I activated the free trial, yet API usage is still being charged instead of using the free credits.

I received a $12 bill yesterday after activating my free trial. That shouldn't happen: the website says I have free credits, yet I got charged anyway. How does the free trial actually work?

u/Clean-Tea-2837 — 8 hours ago

Production Project in Suspension

I just got an email saying my production Firebase project (which has a few real users) has been suspended due to potential abuse caused by leaked credentials or service accounts. I did what I could with the `gcloud` CLI but couldn't get far. Billing hasn't gone up dramatically, so whatever happened was stopped quickly. I already submitted an appeal, but it says it might take up to 2 days. Why on earth would you suspend someone's entire project and then potentially take 2 days to review it? That makes no sense. If I was hacked, why not disable just the principal, the service account or API key that was being abused? Why suspend the entire thing with a potential outage of 2 days? This is absurd. I don't have Cloud Identity or Workspace, so I had not purchased support, which makes this even more frustrating. Is there anything at all that can be done here to get production back up and running in a few hours instead of a few days? I tried contacting support, but without the premium plans nothing gets anywhere. I already submitted the appeal, but is that all? Is there nothing else I can do besides wait?

Thanks.

PS. This might sound like a vibe coder who messed up, but I assure you that is not the case. This isn't a vibe-coded app. This is a genuine request for advice.

u/Psychological-Newt75 — 13 hours ago

Does GCP organization structure eventually become harder to manage than the actual infrastructure?

Now it feels like the cloud resources themselves aren't even the difficult part anymore.

It is more the project structure, IAM permissions, Shared VPC setup, service accounts, and trying to figure out where things are supposed to live long term once more teams start getting involved.

Everything usually makes sense when it is first set up.

But months later, even small changes turn into digging through old docs, tickets, and permission chains to understand why something was configured a certain way in the first place.

It's starting to feel like the organizational side of GCP sometimes scales faster than the infrastructure itself.

u/Belladonna2278 — 13 hours ago

Google is officially replacing Vertex AI with the new "Gemini Enterprise Agent Platform"

Just wanted to share an important update for AI and cloud learners.

Google is shifting from a traditional AI platform toward a complete Agentic AI ecosystem focused on autonomous AI agents and enterprise workflows.

Key highlights:

  • Existing Vertex AI services and workloads will continue to work
  • AI development, orchestration, governance, and security are now unified under one platform
  • New tools introduced for building autonomous AI agents and multi-agent workflows
  • Access to Gemini, Gemma, Claude, and 200+ models remains available

This marks a major shift in Google Cloud’s AI strategy toward Agentic AI and enterprise automation.

If you are currently learning or working with Vertex AI, it’s important to start exploring the Gemini Enterprise Agent Platform moving forward.

I have also seen that the GCP ACE exam is going to be revamped based on this Gemini Enterprise rebranding.

▲ 5 r/googlecloud+1 crossposts

Extremely disappointed with Google Cloud / Firebase suspension: production app offline, zero visibility, no meaningful support

We’re extremely disappointed with the current Google Cloud / Firebase suspension process.

Our production app was taken offline immediately. Access to logs and IAM visibility was removed, leaving us unable to properly investigate, while real users and businesses were impacted.

If suspicious activity is detected, developers need:

  • Clear explanations of what triggered the action
  • Temporary read-only access to logs/IAM for investigation
  • Faster human escalation paths for production outages

We are actively auditing everything on our side. However, the lack of transparency and emergency support during a live outage is deeply frustrating.

After researching, we’ve found many other developers reporting nearly identical Google Cloud suspensions, limited explanations, delayed responses, and little to no meaningful support during downtime.

Abuse prevention is important. But production infrastructure requires production-grade incident handling.

What makes this more concerning is the possibility that issues originating from Google AI Studio may be involved. If that’s the case, developers need clarity urgently.

Has anyone else here experienced similar suspensions?
Would appreciate insights from the community.

(Posting in hopes this reaches the Google Cloud / Firebase teams.)

u/Muhammed_Rahif — 1 day ago

Cloud Run problems since May 12. GCP Says it is solved.

Hi guys,
Since May 12, our Cloud Run services and jobs have been consistently failing. Our services call external endpoints fairly frequently, but the call volume itself hasn't changed. The issue is that we're getting a "network unreachable" error after the service exhausts its configured timeout, and this affects both Cloud Run services and jobs.

To rule out issues on the external endpoint side, we tested the same service on a Hetzner instance and an on-prem server, and both worked without any problems. This points to the issue being specific to our Cloud Run environment.

We checked the GCP status page for our project and found a NAT issue that was marked as resolved, but the problem is still ongoing on our end. On top of that, this has caused our billing to spike significantly.

As a temporary workaround, we switched to a Serverless VPC, which does work, but it's considerably more expensive and still fails occasionally.

Is anyone else facing the same issue?

u/DailyDuino — 1 day ago

API Key abuse: Google offered me a 75% refund - anyone else?

Just wondering about the status of other people's refunds from the recent API key abuse in April.

Has everyone been provided with a percentage refund?

Has anyone received a full refund from the abuse saga?

I've also noticed that the refund they gave me didn't include the +20% VAT, so the claimed 75% refund is actually more like 67%. I'm disputing this with my support engineer at the moment, but it seems a bit sneaky.

I'm happy they provided the 75% refund but the bill is still pretty unaffordable. I will have to empty my bank account to pay it.

u/churro-banana — 1 day ago

My billing account seems to be compromised

We are a very small startup in India with a very tight cloud budget. Today we started receiving a flood of payment mandate requests on my phone. On checking the billing dashboard, I saw transactions totaling more than 64 lakh INR, and the charges are still piling up.

I contacted support, but they said they are unable to help until 32 hours have passed and the data propagates to the console.

Kindly help us 🙏 We are in no position to manage a cash flow of 1 lakh, let alone nearly 1 crore.

I have disabled the Gemini APIs and deleted all API credentials. I also cancelled the mandates and stopped the VMs. But the transactions keep piling up.

[Update]
If anyone has experienced a similar issue, kindly let me know how you dealt with it. I have already raised a support ticket, but they say they are unable to help because no data is recorded in the console until 32 hours have passed. I am really worried.

I checked AI Studio and the usage is finally visible. It started today at 6 AM IST, and there were more than 4 million API calls, mostly to Nano Banana. I have cleared the e-mandates, at least to avoid card charges later this week.

[Support Update]

I talked to support. They have assured me they will raise a readjustment request. Let's see what happens.

▲ 4 r/googlecloud+1 crossposts

200$ credit trial going to expire in 10 days, how to use them?

After signing up and receiving the $300 trial credits, I’ve hardly used them! How could I use them to create something interesting?
The credits will expire on May 30 😞

u/Moalphie — 1 day ago

One of us: $17k fraudulent Gemini API spending spike overnight

Still investigating.

What probably happened:

A project of mine was using an old Google Maps API key. Because the old key lived in the same Google Cloud project, Google's backend infrastructure automatically and silently upgraded the public Maps key to have full access to Gemini.

As described by: http://trufflesecurity.com/blog/google-api-keys-werent-secrets-but-then-gemini-changed-the-rules

The key was probably scraped from the app bundle.
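
For anyone who wants to check their own builds for the same exposure, here is a minimal sketch of scanning bundle text for strings that look like Google API keys. It assumes the heuristic many secret scanners use (keys start with `AIza` followed by 35 URL-safe characters), which is not an official format guarantee. The function name and the sample bundle are made up for illustration, and the key in it is fake.

```python
import re

# Heuristic used by many secret scanners: "AIza" followed by 35 URL-safe characters.
# This is an assumption about key shape, not an official format guarantee.
GOOGLE_API_KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_google_api_keys(text: str) -> list[str]:
    """Return the distinct key-like strings found in text, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in GOOGLE_API_KEY_RE.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)


# Hypothetical bundle contents containing a fake key (not a real credential).
FAKE_KEY = "AIza" + "A" * 35
sample_bundle = f'const cfg = {{ mapsKey: "{FAKE_KEY}", other: "AIzaTooShort" }};'

if __name__ == "__main__":
    for key in find_google_api_keys(sample_bundle):
        # Print only a prefix so the scan output does not leak the key again.
        print(f"possible Google API key: {key[:8]}...")
```

A hit only means a key-shaped string ships in the bundle. Whether that key can reach Gemini depends on the restrictions configured for it in the Cloud project.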

I have already opened a case and am waiting for a response. What would you suggest? I cannot afford the bill. Solo developer.

u/jollyrosso — 3 days ago

europe-west2 ERROR_STOCKOUT (no resources) for SQL?

For half a day I haven't been able to create an Enterprise Plus DB with 4 CPUs, yet I don't see anyone else complaining.

Is there a dashboard or something that shows available capacity?

u/vvolas — 2 days ago
▲ 9 r/googlecloud+1 crossposts

Firestore Enterprise now the default go-to?

Is Firestore Enterprise now what new (startup) projects should be building in?

Assuming documents are small and expensive queries have indexes, is there any reason NOT to choose Firestore Enterprise now when starting a new project?

u/daskalou — 3 days ago

Google Sign-In failing with PlatformException(sign_in_failed, L0.d: 10, null, null) — tried everything...

Hi everyone,

My name is David.

I've been stuck on this for a long time and can't figure it out. My app is built with FlutterFlow + Supabase, and Google Sign-In keeps failing with this error:

`PlatformException(sign_in_failed, L0.d: 10, null, null)`

**What I've already checked/done:**

- SHA-1 fingerprint is registered in Google Cloud Console

- Package name is NOT com.example (already changed to a unique name)

- Web Client ID (not Android) is entered in Supabase

- Redirect URI (https://xxxx.supabase.co/auth/v1/callback) is registered in Google Cloud Console

- Google Provider is enabled in Supabase

- OAuth Consent Screen is set to External and I'm registered as a Test User

- Tested with a fresh APK install (removed old version first)

**My setup:**

- Built with FlutterFlow (via a third-party builder)

- Authentication: Supabase

- Testing on Android device via APK

Has anyone experienced this? What am I missing?

Thanks!

u/Super-Example1765 — 2 days ago
▲ 6 r/googlecloud+1 crossposts

Show your landing page for Google Cloud Startups Grand

If you have applied to the Google Cloud Startup Grand program and were successfully approved, could you please share the landing pages you submitted with the application?

Or maybe you know somebody who has done it and can show their landing pages.

My colleague and I are planning to apply to this program for our startup for the first time and want to look at successful examples. We would appreciate your help.

u/vokruggrizli — 3 days ago
▲ 39 r/googlecloud+14 crossposts

I added dedicated AWS / EKS support to KubeShark.

Mini recap:

KubeShark is my Kubernetes skill for Claude Code and Codex.

It helps AI agents generate, review, and refactor Kubernetes manifests without falling into the usual LLM traps: missing security contexts, deprecated API versions, broken selectors, wildcard RBAC, unsafe probes, missing resource requests, and rollout configs that look okay but fail under real traffic.

The important part is that KubeShark is failure-mode-first. It does not just tell the model “write good Kubernetes”. It forces the model to reason about what can go wrong before it generates YAML, and then return validation and rollback guidance as part of the answer.

That matters a lot with Kubernetes, because many bad manifests are accepted by the API server and only fail later at runtime.

Repo: https://github.com/LukasNiessen/kubernetes-skill

---

Now what’s new:

KubeShark now has special dedicated AWS / EKS support.

When the task involves EKS, AWS, IRSA, EKS Pod Identity, AWS Load Balancer Controller, EBS/EFS CSI, AWS VPC CNI, or Karpenter, KubeShark switches into EKS-aware guidance.

That matters because EKS is “just Kubernetes” until identity, load balancing, storage, pod networking, and node provisioning enter the picture.

Common LLM mistakes include:

  • putting AWS access keys into Kubernetes Secrets
  • mixing IRSA and EKS Pod Identity assumptions
  • using nginx annotations with AWS Load Balancer Controller
  • treating EBS like ReadWriteMany storage
  • recommending Karpenter while omitting resource requests
  • assuming NetworkPolicy works without checking the CNI/policy engine

Example guidance KubeShark now keeps in mind:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/payments-app

It also knows that EBS is usually RWO and zone-sensitive, EFS is the RWX option, and Karpenter depends heavily on good workload requests.
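
To make two of those checks concrete, here is a rough sketch in Python. It is not KubeShark's actual implementation, just an illustration of the idea. Manifests are plain dicts (as a YAML loader would produce), the function names are made up, and the provisioner strings are the EBS CSI driver name plus the legacy in-tree name.

```python
# Provisioners that hand out EBS volumes (CSI driver and legacy in-tree plugin).
EBS_PROVISIONERS = {"ebs.csi.aws.com", "kubernetes.io/aws-ebs"}


def check_pvc_access_mode(pvc: dict, storage_classes: dict[str, str]) -> list[str]:
    """Flag a PVC that asks for ReadWriteMany on an EBS-backed StorageClass.

    storage_classes maps StorageClass name -> provisioner.
    """
    spec = pvc.get("spec") or {}
    class_name = spec.get("storageClassName", "")
    provisioner = storage_classes.get(class_name)
    if provisioner in EBS_PROVISIONERS and "ReadWriteMany" in spec.get("accessModes", []):
        name = (pvc.get("metadata") or {}).get("name", "<unnamed>")
        return [f"PVC {name}: ReadWriteMany requested on EBS-backed class {class_name}"]
    return []


def check_resource_requests(deployment: dict) -> list[str]:
    """Flag containers in a Deployment pod template that set no resource requests."""
    pod_spec = ((deployment.get("spec") or {}).get("template") or {}).get("spec") or {}
    return [
        f"container {c.get('name', '<unnamed>')}: no resources.requests set"
        for c in pod_spec.get("containers", [])
        if not (c.get("resources") or {}).get("requests")
    ]


# Hypothetical inputs for illustration.
classes = {"gp3": "ebs.csi.aws.com", "efs-sc": "efs.csi.aws.com"}
pvc = {
    "metadata": {"name": "shared-data"},
    "spec": {"storageClassName": "gp3", "accessModes": ["ReadWriteMany"]},
}
deployment = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "api", "resources": {"requests": {"cpu": "250m", "memory": "256Mi"}}},
        {"name": "sidecar"},
    ]}}}
}

if __name__ == "__main__":
    for finding in check_pvc_access_mode(pvc, classes) + check_resource_requests(deployment):
        print(finding)
```

Both manifests in this example would be accepted by the API server, which is the point: the problems only show up later, when the volume cannot be mounted on a second node or Karpenter has nothing to size nodes against.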

So instead of generic Kubernetes advice, you get EKS-aware manifest generation and review.

u/trolleid — 4 days ago

MLOps - observability at scale (agentic space)

Hi folks, has anyone here worked in the agentic AI space?

How are you handling observability for AI agents especially around infrastructure, tracing, monitoring, debugging, and reliability at scale?

I’m particularly interested in learning from people who have experience with large-scale agentic deployments in Tier 1 tech companies. Experience from smaller implementations is still useful, but I’m mainly looking for insights from production environments with high scale and complexity.

Any tips, lessons learned, or recommended tooling/frameworks would be appreciated.

u/gringobrsa — 3 days ago

Abnormal Gemini API billing on beta-stage project, Google Cloud review pending for 9 days

I’m a solo developer building a beta-stage research/survey platform.

The product has not been publicly launched yet. It has no paying customers, no production-scale traffic, and no revenue. Normal usage was limited to development, deployment, and a few internal/test survey workflows.

On May 2, 2026, my Google Cloud project generated an abnormal charge of about ₺124,899.88, approximately $2,740. Google Cloud Billing shows that almost all of it, about ₺124,730.55, approximately $2,737, came from Gemini API usage.

Google Cloud’s own cost anomaly alert flagged the spike on the same date, showing an expected cost of about ₺60.79, approximately $1.33, and an actual cost of about ₺92,278.65, approximately $2,025, during the anomaly period, with Gemini API as the top contributor.
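
For a sense of scale, here is a quick back-of-the-envelope check that uses only the lira figures quoted above. The variable names are mine and nothing here comes from the billing export itself.

```python
# Figures quoted in the post, in Turkish lira (TRY).
expected_cost = 60.79        # cost anomaly alert: expected cost
actual_cost = 92_278.65      # cost anomaly alert: actual cost
total_charge = 124_899.88    # total abnormal charge on May 2
gemini_charge = 124_730.55   # portion attributed to Gemini API

anomaly_ratio = actual_cost / expected_cost       # how many times over the expected cost
gemini_share = gemini_charge / total_charge       # fraction of the bill from Gemini API
other_services = total_charge - gemini_charge     # everything that is not Gemini API

print(f"actual vs expected: about {anomaly_ratio:,.0f}x")
print(f"Gemini share of the bill: {gemini_share:.2%}")
print(f"everything else: about {other_services:,.2f} TRY")
```

In other words, the anomaly window cost roughly 1,500 times what the alert expected, and Gemini API accounts for well over 99% of the charge.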

I filed an unauthorized transaction/payment claim and received a standard denial. I then submitted a detailed Google Cloud Billing escalation with service-level billing reports, SKU-level CSV, Google’s cost anomaly alert, payment claim records, and API key remediation evidence.

The case has now been pending for 9 days. While waiting, I added a new card only to avoid further suspension or termination of Google Cloud services, and Google’s automated billing system immediately charged an additional TRY 40,000, approximately $878, threshold payment. I added that payment to the same dispute and clarified that it should not be interpreted as acceptance of the disputed charges.

I’m not trying to avoid legitimate cloud costs. I’m asking for a fair manual technical review of abnormal Gemini API usage that is completely disproportionate to the real activity of a beta-stage project.

Has anyone here successfully escalated a similar Gemini API billing abuse or abnormal usage case with Google Cloud? Any advice on getting this reviewed by the right team would be appreciated.

u/According-Owl6604 — 3 days ago

Gemini/VertexAI Increasingly Failing To Complete Requests?

I have recently been experiencing an issue where requests to Gemini (Google Cloud and AI Studio) end prematurely. The model gets about 15-30 seconds into reasoning and then just stops responding: no errors, nothing about content being flagged, etc.

I primarily use Gemini on Google Cloud, and on there I use various regions (us-central1, us-south1, us-west4, us-east1, global), but I tried the AI Studio API (and the website), and this happens there too.

This is happening frequently, in more than 1 in 10 requests, and it only started within the last 1-2 weeks. Does anyone else have this issue?

u/donde_waldo — 4 days ago

"Watched a video saying ChatGPT can build websites. Spent 5 days doing exactly that. Opened Google billing this morning and nearly had a heart attack"

So I just got back from 6 months backpacking Southeast Asia. No job, almost no money left, but my head is full of ideas from the trip.

I watched a YouTube video showing how the new ChatGPT can basically build entire websites for you. I got excited. I had this idea about connecting hotels and content creators, something I kept thinking about during the trip. So I thought, why not try.

I'm not a developer. I'm just a guy in his room with a laptop and too much free time.

My roommates all got sick so I basically locked myself in my room and just started building. I copy-pasted code from ChatGPT, tweaked things, broke things, fixed things. I added maps, interactive pins, hotel pages, city guides. I had no idea what I was doing but it was actually working and I was genuinely enjoying it.

5 days. I spent 5 full days on this thing. Barely slept. My roommates were sick in the next room and I was in there having the time of my life adding features to a website nobody had ever seen.

Today I got curious about Google Maps costs. I'd added a lot of map features and wanted to understand the pricing before I eventually launched it. So I opened my Google billing account.

8,300 DKK. About $1,300 USD. In 4 days. On a site that was running on my laptop. That nobody visited. Ever.

Apparently the code ChatGPT wrote for me was making hundreds of API calls every time I tested something, places data, photos, nearby searches, directions, and I had no idea each refresh was costing money. No warning. No alert. Nothing.
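
For anyone in the same spot, here is a minimal sketch of the kind of guard that makes this visible during local testing: cache repeated lookups and refuse to go past a fixed number of real calls. `fetch_place` is a hypothetical stand-in for any billable request. No real Maps API is called, and the class and its names are invented for illustration.

```python
from typing import Callable, Dict


class CallBudgetExceeded(RuntimeError):
    """Raised when the wrapped function would exceed its allowed number of real calls."""


class BudgetedCache:
    """Cache results by key and stop making real calls once a budget is used up."""

    def __init__(self, fetch: Callable[[str], dict], max_calls: int) -> None:
        self._fetch = fetch
        self._max_calls = max_calls
        self._cache: Dict[str, dict] = {}
        self.calls_made = 0

    def get(self, key: str) -> dict:
        if key in self._cache:
            return self._cache[key]  # repeat lookups cost nothing
        if self.calls_made >= self._max_calls:
            raise CallBudgetExceeded(f"budget of {self._max_calls} calls used up")
        # Count the call before making it: a request that fails may still be billed.
        self.calls_made += 1
        result = self._fetch(key)
        self._cache[key] = result
        return result


def fetch_place(place_id: str) -> dict:
    """Hypothetical stand-in for a billable API request."""
    return {"id": place_id, "name": f"Place {place_id}"}


if __name__ == "__main__":
    places = BudgetedCache(fetch_place, max_calls=2)
    places.get("a")
    places.get("a")  # served from the cache, no new call
    places.get("b")
    print(f"real calls made: {places.calls_made}")
    try:
        places.get("c")
    except CallBudgetExceeded as exc:
        print(f"blocked: {exc}")
```

This only limits calls made through the wrapper. Budget alerts and quota limits on the cloud project are still the real safety net.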

I have maybe 50€ in my bank account right now.

I've submitted a billing dispute to Google and I'm hoping they'll refund it. From what I've read online they sometimes do for first time cases like this. But man.

I just wanted to build something cool.

Has anyone been through this? Any advice on getting Google to actually refund it?

https://preview.redd.it/3bo3qncgjo1h1.jpg?width=1804&format=pjpg&auto=webp&s=fcdb88dbbaa5268382803c4e2b152a459ba0dce7

u/CameraFederal9599 — 4 days ago