r/Terraform

Stop copy-pasting Terraform modules, I built a tested registry for AWS, GCP, and Azure with Terratest and CI
▲ 0 r/Terraform+3 crossposts

Disclaimer: I built this project and am sharing it as a free open-source tool.

Every project I join has the same problem: someone copied and pasted a VPC module from a blog post in 2019, nobody tested it properly, and now it's load-bearing infrastructure.

This registry has 9 modules across AWS, GCP, and Azure: VPC/VNet, Kubernetes (EKS/GKE/AKS), and IAM/Workload Identity for each cloud.

Every module has:

- A Terratest test that provisions real infrastructure and tears it down (no mocks)

- GitHub Actions CI (fmt, validate, tflint, Checkov)

- Secure defaults with every option exposed as a variable

- Working examples you can run in under 5 minutes

**Module list:**

- modules/aws/vpc: VPC, public/private subnets, NAT gateway, route tables

- modules/aws/eks: EKS cluster, managed node groups, OIDC, IRSA

- modules/aws/iam: roles, policies, IRSA binding

- modules/gcp/vpc: VPC, Cloud NAT, Private Google Access, firewall rules

- modules/gcp/gke: GKE cluster, node pools, Workload Identity

- modules/gcp/iam: service accounts, IAM bindings, WI federation

- modules/azure/vnet: VNet, subnets, NSGs, route tables

- modules/azure/aks: AKS, managed identity, OIDC, Workload Identity

- modules/azure/iam: managed identities, federated credentials, role assignments

**Quick start:**

git clone https://github.com/Cloud-Architect-Emma/terraform-module-registry
cd terraform-module-registry/examples/aws
terraform init && terraform plan

**Or reference directly in your code:**

module "vpc" {
  source = "github.com/Cloud-Architect-Emma/terraform-module-registry//modules/aws/vpc?ref=main"

  name               = "production"
  cidr               = "10.0.0.0/16"
  azs                = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  private_subnets    = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets     = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  enable_nat_gateway = true
}

⭐ If this saves you time, a star on the repo helps others find it: https://github.com/Cloud-Architect-Emma/terraform-module-registry

PRs welcome. What module would you add first?

u/EmmaOpu — 15 hours ago

Better options than Terraform-only workflows for GCP security drift?

we rely on Terraform for most of our GCP infrastructure, but teams still create resources directly through the console or gcloud for quick tests.

those never go through our policies, IAM setup, or org constraints, so drift shows up quickly.

we’ve tried a few approaches. asset discovery tools pick up some of it but miss certain GCP-native services created ad hoc. drift detection tools flag issues, but remediation ends up manual and noisy, especially with short-lived test resources. Config Connector didn’t fit well since not everything runs through Kubernetes.

at this moment we don’t have a reliable way to see what’s out of sync or enforce a baseline once something is created outside Terraform.

what's working to catch and control GCP security drift without slowing teams down?

u/ElectricalLevel512 — 17 hours ago
▲ 4 r/Terraform+1 crossposts

I built an open-source compliance scanner for AI infrastructure on AWS - looking for feedback

What it is: A small CLI (infrarails, Apache-2.0) that reads your Terraform and tells you which EU AI Act, NIST AI RMF, and ISO 42001 controls are passing, failing, or unverifiable, specifically for AWS Bedrock infrastructure. It runs in CI like any other linter and outputs terminal / HTML / PDF / JSON / SARIF.

Why I built it: I work on AI systems and kept noticing during audits that a lot of the auditor observations were things that could have been caught at PR time - missing model invocation logging, log retention too short for post-market monitoring, audit trail buckets without versioning. All declarative, all sitting in Terraform, all verifiable before merge. There are plenty of tools for the broader AI governance picture (model cards, evals, lineage), but I couldn't find one that lived inside the deployment pipeline itself and mapped checks back to the actual framework articles auditors open. So I started building one on weekends.

The interesting design problem: The hardest call wasn't the rules - it was making "we couldn't verify" a first-class verdict alongside PASS / FAIL. Logging often lives in a separate stack, behind a remote module, or in a var with no default. A static scanner that confidently says PASS when it actually has no idea is worse than no scanner. So the third bucket (INCONCLUSIVE, with a machine-readable reason code) became the whole personality of the tool. Strict mode treats it as blocking; --no-strict lets it pass.
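To make the INCONCLUSIVE case concrete, here is a hypothetical snippet of the "var with no default" situation. The resource and variable names are made up for illustration and are not taken from the infrarails rule set. The retention period only exists at plan or apply time, so nothing in the source alone says whether it satisfies a retention requirement:

```hcl
# Assumption: an AWS provider is configured elsewhere in the stack.
variable "log_retention_days" {
  type        = number
  description = "Hypothetical input; set per environment via tfvars or CI."
  # No default on purpose: a static read of this file cannot know the value.
}

resource "aws_cloudwatch_log_group" "model_invocations" {
  name              = "/example/bedrock/model-invocations" # illustrative name
  retention_in_days = var.log_retention_days
}
```

A scanner reading only this file can honestly say neither PASS nor FAIL here, which is the gap the third verdict is meant to cover.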

What I'd love feedback on:

  • Whether the rule severities feel right (FAIL vs WARN vs INCONCLUSIVE)
  • Whether the framework mappings hold up — anyone here familiar enough with NIST AI RMF or ISO 42001 to spot stretches?
  • Other AI/ML platforms worth adding next (Vertex AI, Azure OpenAI, SageMaker?)
  • General "this is a weird side project to spend weekends on" reactions welcome too

Honest scoping note: a passing scan is necessary but not sufficient. Infra is maybe 30% of what these frameworks ask for — governance, data quality, human oversight aren't in Terraform. But automating the 30% felt worth the weekends.

Repo: github.com/policyrails/infrarails
npm: npm install -g infrarails

u/Capable_Influence157 — 21 hours ago

How do you handle security findings that require Terraform changes?

I’m trying to understand how Terraform-heavy AWS teams handle security findings in practice.

Example: Security Hub / GuardDuty / Config flags an issue like public S3 access, overly broad IAM, exposed security groups, missing logging, or drift from expected controls.

How does that usually become a Terraform change?

In teams I’ve seen, the flow is often messy:

- finding appears in AWS

- someone has to decide if it matters

- ownership is unclear

- the actual fix may need Terraform, not a console change

- reviewers need to trust the diff

- compliance/audit needs evidence that it was handled

I’m exploring a workflow where findings are grouped into prioritized actions and turned into human-reviewed PR-style Terraform remediation bundles. No direct cloud changes.

Curious how others do this today:

- Do security findings usually become Terraform PRs?

- Who owns the fix: security, platform, app team, or DevOps?

- Do you allow console fixes, or force IaC-only?

- What would make an auto-generated Terraform fix untrustworthy?

- How do you track exceptions and evidence?

u/MarcoMaher — 1 day ago

LLM API bill

I've been dealing with massive LLM API bills and unpredictable Terraform costs. Provider dashboards only show total spend, which is useless for figuring out which specific features or users are burning tokens. Has anyone else experienced this?


SBOM for Infrastructure as Code

Is anyone generating SBOMs for their IaC repositories? Looking into the best way to accomplish this for compliance and curious if a tool that converts Terraform lockfiles to SPDX would be beneficial?

u/RoseSec_ — 1 day ago

Skipped our planned CLI 1.0 to ship 2.0 designed for AI agents. Who's letting Claude et al. write their Terraform in prod?

5 years ago I shared a project with this group and got lots of good feedback. It was a CLI tool that generated cost estimates for Terraform. Recently, I'd been thinking about a 1.0 release where the CLI would go beyond just cost estimates and show best practices such as previous-generation instances, storage lifecycle policies, and the kinds of issues a thorough PR review would catch.

Then Claude et al. happened, and the more developers I spoke with, the clearer it became that the 1.0 scope was the right idea aimed at the wrong caller. A human reviewer reads a PR comment; an agent runs `infracost inspect --filter` ... and gets the same insight as a tabular row it can pipe into the next step. So I decided to skip our planned 1.0 release and go for 2.0, where I treated agents as first-class users of the CLI.

I'm curious if folks are actually using Claude/Copilot etc to write IaC in production? The repo is here https://github.com/infracost/infracost/ in case people want to test the new version and provide feedback on how to improve it.

u/alikhajeh1 — 2 days ago
▲ 4 r/Terraform+1 crossposts

Lazytf: a terminal UI for reviewing Terraform plans

I’ve been working on lazytf, a terminal UI for reviewing Terraform plans and apply history.

The goal is to make large Terraform plans easier to inspect locally, especially for teams that are not using Terraform Cloud but still want a cleaner diff review flow in the terminal.

It currently supports:

- running plan/apply/init/validate/format flows inside the TUI

- targeted plan and apply workflows

- read-only mode

- piping `terraform plan -no-color` into lazytf

- opening existing saved plan files

- apply history

- workspace and folder environment detection

- YAML, NixOS, and Home Manager configuration

- presets and project overrides

- Terraform and OpenTofu binary selection

- themes and lazygit-style keybindings

GitHub repo: https://github.com/ushiradineth/lazytf
Blog post: https://ushira.com/blog/introducing-lazytf
Demo: https://assets.ushira.com/introducing-lazytf/demo.mp4

I’d especially like feedback from people managing larger Terraform/OpenTofu projects locally.

u/Terrible_Capital789 — 3 days ago

does anyone actually keep their architecture diagrams up to date or have we all just accepted they're going to be wrong

genuine question, been using terraform for deploys and cloudcraft for diagrams and the two have basically never been in sync for more than a week

someone updates the hcl, nobody updates the diagram. onboard a new dev and the diagram is lying to them from day one. tried enforcing it as a rule, lasts one sprint. tried generating diagrams from existing tf state after the fact, it's clunky and lossy

curious if anyone's found a workflow that actually works or if most teams have just quietly given up on diagram accuracy

u/0xArchitectus — 3 days ago
▲ 39 r/Terraform+14 crossposts

I added dedicated AWS / EKS support to KubeShark.

Mini recap:

KubeShark is my Kubernetes skill for Claude Code and Codex.

It helps AI agents generate, review, and refactor Kubernetes manifests without falling into the usual LLM traps: missing security contexts, deprecated API versions, broken selectors, wildcard RBAC, unsafe probes, missing resource requests, and rollout configs that look okay but fail under real traffic.

The important part is that KubeShark is failure-mode-first. It does not just tell the model “write good Kubernetes”. It forces the model to reason about what can go wrong before it generates YAML, and then return validation and rollback guidance as part of the answer.

That matters a lot with Kubernetes, because many bad manifests are accepted by the API server and only fail later at runtime.

Repo: https://github.com/LukasNiessen/kubernetes-skill

---

Now what’s new:

KubeShark now has special dedicated AWS / EKS support.

When the task involves EKS, AWS, IRSA, EKS Pod Identity, AWS Load Balancer Controller, EBS/EFS CSI, AWS VPC CNI, or Karpenter, KubeShark switches into EKS-aware guidance.

That matters because EKS is “just Kubernetes” until identity, load balancing, storage, pod networking, and node provisioning enter the picture.

Common LLM mistakes include:

  • putting AWS access keys into Kubernetes Secrets
  • mixing IRSA and EKS Pod Identity assumptions
  • using nginx annotations with AWS Load Balancer Controller
  • treating EBS like ReadWriteMany storage
  • recommending Karpenter while omitting resource requests
  • assuming NetworkPolicy works without checking the CNI/policy engine

Example guidance KubeShark now keeps in mind:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/payments-app

It also knows that EBS is usually RWO and zone-sensitive, EFS is the RWX option, and Karpenter depends heavily on good workload requests.

So instead of generic Kubernetes advice, you get EKS-aware manifest generation and review.

u/trolleid — 4 days ago
▲ 21 r/Terraform+2 crossposts

Interview Qs for Cloud Engineer Role at FNZ Group.

  1. Can you introduce yourself and explain your Cloud/DevOps experience?
  2. What projects, tools, and Azure services have you worked on?
  3. Do you have cloud migration experience?
  4. What is your latest project and role?
  5. How proficient are you with Terraform?
  6. Write Terraform code to provision two Azure VMs with high availability.
  7. What is Terraform and its alternatives?
  8. How does auto-healing work in Kubernetes?
  9. Explain ingress and egress in Kubernetes.
  10. What is the purpose of Azure Virtual Network (VNet)?
  11. What is a Virtual Machine?
  12. Are containers OS independent in Kubernetes?
  13. What is an Azure Landing Zone?
  14. What does “pre-configured” mean in Azure Landing Zones?
  15. What Azure services and CI/CD tools have you used?
u/ResidentComedian2977 — 5 days ago
▲ 15 r/Terraform+1 crossposts

Transition to Terraform deployments

Hi all! For context, I’m still a relatively new engineer working in a business with a fairly immature Azure environment.

At the moment, we deploy all of our Azure resources manually through the portal, but I’ve been trying to learn more about IaC with the goal of eventually transitioning us towards managing everything through Terraform instead. (Terraform over Bicep, as there are plans to expand into GCP in the future.)

One thing I’m struggling to understand is around our current change/documentation process.

Before creating or modifying resources, we usually produce a spec sheet in Excel. It contains all the Azure resources & settings we commonly use, and we fill in the relevant values/options for the deployment. It works well because:

  • Non-technical staff can easily review and understand it
  • It clearly shows what’s changing
  • It doubles as documentation for future reference
  • It fits nicely into our change management/review process

The problem is that moving to Terraform feels like we’d now be doing the same work twice:

  1. Fill out the Excel spec sheet
  2. Then manually recreate all of that again in Terraform

If we skipped the Excel part and only used Terraform, I think a lot of non-technical stakeholders/reviewers would struggle to understand the changes as easily.

Am I thinking about this the wrong way? How do other businesses manage this?

EDIT: I have thought about using AI to generate the Excel spec sheets based off the Terraform files as well as a general summary of what's new/changing, would be interested to hear if anyone else has done something similar.

u/Teqzahh — 4 days ago

Terraform Notes

I’ve been learning Terraform recently through the KodeKloud course Terraform for Beginners and compiled my study notes into a PDF while going through the concepts.

It covers fundamentals like providers, resources, variables, state, modules, workflows, and other Terraform basics in a concise way.

Thought I’d share it here in case it helps someone else getting started with Terraform or revising concepts.

https://drive.google.com/file/d/12fzoxxRpB_iTfVCPokX9FREN5I9KUCqi/view?usp=drivesdk

u/bilal32600 — 5 days ago
▲ 131 r/Terraform

OpenTofu 1.12 has landed!

Hey! OpenTofu Maintainer here.

OpenTofu 1.12 is out and I just wanted to share here what I think may be useful for some people.

  • prevent_destroy can now reference variables. (prevent_destroy = var.is_prod works now!)
  • tofu init now understands all platform hashes for every platform on its first run. This means you shouldn't have to reach for tofu providers lock anymore for managing multiple architectures.
  • Provider downloads now run in parallel. init should be faster for everybody all around.
  • -json-into=FILENAME lets you send human readable logs that we all love to stdout, and have json readable logs sent off to a different file, pipe, etc. This means you can do some fancy TUI logging alongside your real logs!
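As a quick illustration of the first bullet, here is a minimal sketch of a variable-driven `prevent_destroy`. The variable name follows the `var.is_prod` example above; the `terraform_data` resource is just a provider-free stand-in for something you would actually want to protect, such as a database or bucket:

```hcl
variable "is_prod" {
  type        = bool
  description = "True in production, where destroys should be blocked."
  default     = false
}

# Stand-in resource so the example needs no cloud credentials.
resource "terraform_data" "critical" {
  input = "placeholder for a database, bucket, etc."

  lifecycle {
    # Accepted as of OpenTofu 1.12 per the note above; earlier versions
    # only allowed a literal true/false here.
    prevent_destroy = var.is_prod
  }
}
```

With `-var is_prod=true`, a `tofu destroy` against this resource should be refused, while the default leaves it removable in test environments.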

We have lots more for you to see in our full changelog here: https://github.com/opentofu/opentofu/blob/v1.12/CHANGELOG.md

Or our blogpost here: https://opentofu.org/blog/opentofu-1-12-0/

u/Yantrio — 8 days ago
▲ 1 r/Terraform+1 crossposts

How do you actually catch security issues in Terraform PRs when you're doing solo reviews?

The pattern I keep seeing: security groups too open, S3 buckets publicly accessible, encryption disabled on databases, IAM policies wider than they need to be. I catch some of it in manual review, but I know I'm missing things.

Question for the room: what's actually working for you?

  • Are you using any automated tooling? (Checkov, tfsec, something else?)
  • Has anyone tried running infrastructure changes through ChatGPT or Claude to catch gaps before merge?
  • If you haven't automated this, what's the blocker: company policy, trust in the output, or just not having found the right tool?

Curious what's actually practical at the startup/small-team scale where you can't afford enterprise solutions.
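For the "security groups too open" pattern specifically, one guardrail that needs no extra tooling is a `validation` block on the input that feeds the rule. This is a minimal sketch, assuming a hypothetical variable named `ssh_ingress_cidrs`; it complements scanners like Checkov or tfsec rather than replacing them:

```hcl
variable "ssh_ingress_cidrs" {
  type        = list(string)
  description = "Hypothetical input: CIDR ranges allowed to reach port 22."

  validation {
    # Fail at plan time if the world-open IPv4 range is passed in.
    condition     = !contains(var.ssh_ingress_cidrs, "0.0.0.0/0")
    error_message = "SSH ingress must not include 0.0.0.0/0."
  }
}
```

This only covers values that flow through the variable and only the IPv4 wildcard, so hard-coded rules and `::/0` still need a scanner or a human reviewer.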

u/Status-Direction99 — 7 days ago

Terraform + GitHub Actions + 30+ secrets -> is Vault actually the right solution here?

I have a fairly large Terraform setup that manages servers + DNS and almost all related configuration: Docker setups, service configs, JSON/YAML files, secrets, etc. Server images are built with Packer. Deployments run exclusively through GitHub Actions, and Terraform state is stored in PostgreSQL.

Right now, I pass all secrets through GitHub Actions Secrets and inject them into Terraform variables. It works technically, but it increasingly feels like the wrong approach: I’m now at around 30 secrets just for the pipeline.

I’m trying to understand whether HashiCorp Vault is actually the right solution here or whether I’d just be adding unnecessary complexity. Most Vault explanations feel very abstract to me. What I’m really looking for is a pragmatic setup for:

  • centralized secret management
  • secure usage in GitHub Actions
  • clean Terraform integration
  • avoiding secret sprawl
  • scaling cleanly across many services/hosts

How are people handling this in larger Terraform environments? Are you using Vault, 1Password, SOPS, cloud secret managers, or something else entirely? And at what point does Vault actually become worth it?

EDIT: Servers and most other things run on Hetzner. Other providers in use: Cloudflare (public DNS), cloudinit (server setup, covering as many installs and configurations as possible), and Twingate (ZTNA).
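For anyone who finds the Vault explanations abstract, this is roughly what "clean Terraform integration" could look like for one of the providers mentioned. It is a sketch under assumptions, not a recommendation: the KV v2 mount `secret`, the path `ci/cloudflare`, and the key `api_token` are all hypothetical, and the pipeline is assumed to export `VAULT_ADDR` and a short-lived `VAULT_TOKEN` (for example obtained through Vault's JWT/OIDC auth for GitHub Actions):

```hcl
terraform {
  required_providers {
    vault = {
      source = "hashicorp/vault"
    }
    cloudflare = {
      source = "cloudflare/cloudflare"
    }
  }
}

# Address and token are read from VAULT_ADDR / VAULT_TOKEN in the CI job.
provider "vault" {}

# Hypothetical mount, path, and key names; adjust to the real layout.
data "vault_kv_secret_v2" "cloudflare" {
  mount = "secret"
  name  = "ci/cloudflare"
}

provider "cloudflare" {
  api_token = data.vault_kv_secret_v2.cloudflare.data["api_token"]
}
```

The pipeline then needs one credential (the Vault login) instead of ~30 repository secrets. One caveat: values read through data sources are written to state, so the PostgreSQL backend still has to be treated as sensitive.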

u/PurchasePatient5465 — 8 days ago

Best AWS security controls for preventing console-created resources in 2026?

we’ve got a strict policy that all AWS resources go through Terraform. that broke this week.

a junior dev needed temporary storage for a data export and created an S3 bucket directly in the console. uploaded ~500GB of customer data from a prod RDS replica. bucket ended up public.

we found it when GuardDuty flagged activity on a bucket we didn’t recognize. public access was open for several hours before we caught it. we’ve locked it down now, but there’s no clear way to know who accessed the data during that window.

on top of that, an IAM role from prod with broad read permissions was attached for the export script. so now we’re also dealing with potential exposure through that path.

we’re digging through CloudTrail and access logs to understand scope, but it’s messy.

this wasn’t a tooling gap, it was someone bypassing IaC under time pressure.

for those dealing with AWS security at scale, what actually works to prevent this? not policies on paper, but controls that stop or catch console-created resources fast.
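For reference, two controls that map directly onto this incident can themselves be written in Terraform. The sketch below is hedged: the role name `terraform-ci` and the `workload_ou_id` variable are hypothetical, the SCP has to be applied from the organization's management account, and the public access block applies to whichever account the provider is pointed at:

```hcl
variable "workload_ou_id" {
  type        = string
  description = "Hypothetical: ID of the organizational unit to guard."
}

# Account-level S3 Block Public Access: a console-created bucket
# in this account cannot be made public.
resource "aws_s3_account_public_access_block" "this" {
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# SCP: deny bucket creation unless the caller is the Terraform pipeline role.
resource "aws_organizations_policy" "s3_create_via_terraform_only" {
  name = "s3-create-via-terraform-only"
  type = "SERVICE_CONTROL_POLICY"

  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid      = "DenyCreateBucketOutsidePipeline"
      Effect   = "Deny"
      Action   = ["s3:CreateBucket"]
      Resource = "*"
      Condition = {
        ArnNotLike = {
          # Assumption: the pipeline assumes a role named "terraform-ci".
          "aws:PrincipalArn" = "arn:aws:iam::*:role/terraform-ci"
        }
      }
    }]
  })
}

resource "aws_organizations_policy_attachment" "workloads" {
  policy_id = aws_organizations_policy.s3_create_via_terraform_only.id
  target_id = var.workload_ou_id
}
```

The same deny pattern extends to other create actions, at the cost of needing a break-glass path for genuine emergencies.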

u/Past-Ad6606 — 8 days ago

HashiCorp Terraform Associate (004) - Bryan Krausen's past papers?

Hey all,

Looking to take the 004 exam having completed Bryan Krausen's past papers for practice (on Udemy).

Those of you who're familiar and who've taken the 004 exam - how similar to the exam would you say Krausen's past papers were, and how good of an indicator of performance are scores on those papers?

Thanks!

u/Designer_Canary_7646 — 6 days ago
▲ 2 r/Terraform+3 crossposts

Multi-Cloud Auto-Remediation in a Few Clicks

I am building Zyvoq, and it can delete all your idle resources in just a few steps with simple UI interactions across multiple clouds, including AWS, GCP, and Azure for now.

I read a lot about how deleting resources after getting recommendations becomes messy, and it becomes even more difficult when you are managing multiple clouds.

So, to solve this problem, I am introducing zyvoq.moamir.cloud.

Please share your feedback and opinions: does this solve a real problem or not?

u/mo-amir — 7 days ago
▲ 9 r/Terraform+2 crossposts

OpenDepot - an open-source Kubernetes native module and provider registry

TL;DR: Check out OpenDepot, an open-source Kubernetes-native module and provider registry for OpenTofu and Terraform that I built! OpenDepot Documentation

Deploy your very own local registry in minutes following the Local Quickstart Guide!

If you're still with me, now the full story!

I had tasked my team last year with implementing one of the open-source registry options that were available at the time. They spent months trying to get each one implemented in a manner that we deemed secure and appropriate for production. However, each failed to meet our requirements for safety and soundness. We eventually caved in and went to Artifactory since it had a mature OIDC implementation. However, this came with a high cost.

I soon saw this as an opportunity to leverage my years of experience in the Kubernetes and IaC space to build a registry that was cloud native, easy to deploy, and built with security in mind. From that realization, OpenDepot was born!

OpenDepot is the first completely Kubernetes native registry that implements the Module and Provider registry protocols for both OpenTofu and Terraform. See how it stacks up to other registries! Feature Comparison

With OpenDepot, if you have a Kubernetes cluster, the same auth mechanisms you use to access the cluster can be used to fetch modules and providers. OpenDepot can be set up in minutes, not days, weeks, or months. It's built from the ground up with security in mind: Authentication

OpenDepot got its name from its most prominent feature: the Depot controller. Most registries are push or webhook based; the Depot controller operates differently by providing a pull-based mechanism for modules and providers so you don't have to expose your cluster or open additional ports to ingest your artifacts. The Depot also serves as an easy migration path to OpenDepot: Depot (Pull Based)

My favorite and preferred approach for private modules is using GitOps with ArgoCD. This allows you to add new module versions right alongside the module code itself so your team can approve the module and version in the same Pull Request! GitOps with ArgoCD

OpenDepot currently supports the three major cloud providers AWS, Azure, and GCP. It also supports Filesystem based storage backed by a PVC with a Storage Class that provides ReadWriteMany access. The cloud providers also support pre-signed URLs so large downloads don't add stress to your infrastructure: Storage Backends

OpenDepot also has opt-in scanning for modules, provider binaries, and source code using Trivy: Vulnerability Scanning

Please, feel free to DM me, or post issues, feature requests, or whatever else on GitHub! I'm hoping people out there find this as useful as we did!

tonedefdev.github.io
u/azjunglist05 — 8 days ago