Follow the OpsBerry blog to learn about Terracotta news and engineering.
🌐 Visit Blog.tryterracotta.com
Tags: engineering opsberry terracotta
Your platform team is reviewing more Terraform PRs than they can safely handle. Every week, there are more infrastructure changes, more modules, more environments, and the same three engineers rubber-stamping plans because they don't have time to actually read them. The blast radius gets missed, the drift gets ignored, and the security group that's open to the world gets approved because it's PR number fourteen that day, and everyone's already in the next sprint.
Nobody on your team feels confident about hitting the approve button. They just don't have a choice.
You've been here. Your team has been here. And the incident that finally forces the conversation is always the one that could have been caught in the PR if anyone had the time and context to catch it.
This is what a single Terraform PR looks like after Terracotta AI reviews it:
One PR. One comment. Twelve automated analysis sections covering everything from simulated plan analysis and drift detection to IAM security, cost impact, blast radius, dependency mapping, and a full infrastructure architecture diagram. All of it runs automatically the moment a PR is opened. No Rego. No Sentinel. No custom scripts. No pipeline changes.
Let’s break down what’s happening, section by section.
The first thing your reviewer sees isn’t a raw diff or a wall of HCL. It’s a plain-English summary of what this PR actually does, which resources are being created, modified, or destroyed, and why it matters.
This is the “here’s what you need to know before you look at anything else” section. It gives your team immediate context without parsing the diff line by line. For a senior engineer reviewing their tenth PR of the day, this is the difference between a 30-second triage and a 15-minute deep dive into code they didn’t write.
This is where Terracotta AI diverges from every other code review tool on the market. We don’t just look at the code diff; we simulate what Terraform will actually do when this code runs.
The Simulated Plan Report shows a resource-level breakdown of planned actions: what’s being created, what’s being updated in place, and what’s being destroyed and recreated. Each resource is displayed with its action type, resource address, and the specific attributes being changed. The report includes visual diagrams showing how resources relate to each other within the plan, so you’re not just seeing a list of changes, you’re seeing the shape of the change.
This is the plan output your team would normally only see after running terraform plan in CI. Terracotta AI surfaces it directly in the PR comment, before anyone approves anything, with context and explanation that a raw plan output can never provide.
Line-level code review findings on the Terraform changes in the PR. These aren’t generic linting rules; they’re contextual findings that understand what the code is doing in the context of your existing infrastructure.
Each finding includes severity, the affected resource, a clear explanation of the issue, and a recommended fix. Findings eligible for automated fixes include committable suggestions and one-click fixes that your engineers can apply directly from the PR without leaving GitHub or GitLab.
Your junior engineers get senior-level feedback on every PR. Your senior engineers stop repeating the same review comments across dozens of PRs every week.
Platform teams have standards. You may require encryption on every S3 bucket. Maybe you don’t allow public subnets in production. Maybe your security team mandates specific tagging conventions or instance types for compliance.
Terracotta AI Guardrails let you define these policies in plain English. No Rego. No Sentinel. No OPA. Just write what you want enforced: “All S3 buckets must have server-side encryption enabled” or “No EC2 instances larger than m5.xlarge in staging,” and Terracotta AI automatically evaluates every PR against your policies.
The Guardrail Report shows a pass/fail result for each active policy, with clear explanations for any violations. Guardrails can run in advisory mode (flag but don’t block) or mandatory mode (block the merge until the violation is resolved).
This is policy-as-code without the code. Your compliance team defines the rules. Your engineers see them enforced in real time. Nobody has to learn a new language.
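As an illustration (resource names are hypothetical, not taken from the post), a guardrail like “All S3 buckets must have server-side encryption enabled” would pass on Terraform such as:

```hcl
resource "aws_s3_bucket" "logs" {
  bucket = "example-app-logs" # hypothetical bucket name
}

# In AWS provider v4+, bucket encryption is configured as a separate resource.
resource "aws_s3_bucket_server_side_encryption_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```

A bucket without the encryption configuration would surface as a violation in advisory mode, or a blocked merge in mandatory mode.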
Infrastructure PRs don’t exist in isolation. When two engineers work on overlapping resources in separate PRs, the second to merge can silently break the first one’s assumptions.
The PR Conflict Report identifies potential conflicts between the current PR and other open PRs targeting the same resources or state. This is the kind of issue that causes production incidents on Friday afternoons, and it’s completely invisible in a normal code review workflow.
This is the section that makes infrastructure engineers sit up in their chairs.
Drift happens. Someone makes a manual change in the console. An automated process modifies a resource outside of Terraform. A previous apply partially failed, leaving the state inconsistent. Whatever the cause, your Terraform code no longer matches reality, and your next apply is about to do something you didn’t expect.
Terracotta AI’s drift detection compares your PR against your actual remote state and live infrastructure across 1,618 AWS resource types. It performs field-level comparison, so you don’t just know that a resource has drifted, you know exactly which attributes changed, what the expected value was, and what the current value is.
The Drift Detection Report shows each drifted resource with detailed findings, severity levels, and context for how the drift affects the changes in this PR. If your PR is about to modify a resource that has already drifted, you need to know that before you hit apply, not after.
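To make field-level drift concrete, consider a hypothetical security group rule (an illustration, not from the post) whose CIDR range was widened in the console after the last apply:

```hcl
resource "aws_security_group_rule" "app_ingress" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["10.0.0.0/16"] # value in code and in state
  security_group_id = aws_security_group.app.id
}

# Live infrastructure (changed manually): cidr_blocks = ["0.0.0.0/0"]
# Field-level drift detection reports the exact attribute, the expected
# value from state, and the current live value, so the reviewer knows this
# PR is about to modify a resource that no longer matches the code.
```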
Every Terraform change has a cost implication. Adding a new RDS instance, resizing an EC2 fleet, and enabling a new service all show up on next month’s bill, and the engineer writing the code often has no idea what the number looks like.
The Cost Impact section breaks down the estimated monthly cost delta of the changes in this PR. It shows per-resource cost estimates so your team can see exactly which resources are driving the number. No more surprise bills. No more “who approved that m5.4xlarge?”
This is the conversation your FinOps team has been trying to have with engineering for years. Terracotta AI puts the number in front of the engineer at the exact moment it matters: before they merge.
IAM misconfigurations are the number one cause of cloud security breaches, and they’re the hardest thing to catch in a code review. An overly permissive policy, a wildcard action, or a missing condition key looks fine in a diff but creates real exposure in production.
The IAM Security Analysis evaluates every IAM-related change in the PR against security best practices. It flags overly broad permissions, missing least-privilege constraints, and risky policy patterns with clear explanations of why each finding matters and how to fix it.
For regulated industries (healthcare, financial services, and government), this section alone can justify the tool.
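A sketch of the kind of change this section flags, using a hypothetical policy (the wildcard action, unscoped resource, and missing condition key are exactly the patterns described above):

```hcl
resource "aws_iam_policy" "app" {
  name = "app-data-access" # hypothetical name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "s3:*" # flagged: wildcard grants every S3 action
      Resource = "*"    # flagged: no resource scoping
      # flagged: no Condition block constraining source VPC, IP, or encryption
    }]
  })
}
```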
Tags are the foundation of cloud cost allocation, access control, and compliance reporting. And they’re the first thing engineers forget when they’re shipping fast.
The Tag Compliance Check validates that every resource in the PR meets your organization’s tagging requirements. Missing tags, incorrect formats, and inconsistent naming are all caught before merge, not after your FinOps team sends the quarterly “please tag your resources” email.
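One common way to meet tagging requirements at the provider level (a general AWS-provider pattern, not something the post prescribes) is `default_tags`, which applies a tag set to every taggable resource in the configuration:

```hcl
provider "aws" {
  region = "us-east-1"

  # Applied automatically to every taggable resource managed by this provider.
  default_tags {
    tags = {
      Environment = "staging"       # hypothetical values
      Owner       = "platform-team"
      CostCenter  = "cc-1234"
    }
  }
}
```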
Not all Terraform changes are created equal. Modifying a single security group rule is not the same as replacing a VPC. But in a raw plan output, they can look equally mundane.
The Blast Radius Analysis evaluates the scope and severity of changes in the PR, the number of resources affected, which dependencies could be affected, and the potential downstream consequences. It provides a severity rating and a table breaking down the impact by resource.
This is the analysis a senior infrastructure engineer runs through in their head when reviewing a plan. Terracotta AI does it automatically, consistently, and documents it for every PR.
Understanding what depends on what is the difference between a safe deployment and an outage. The Infrastructure Dependencies section maps the dependency graph of the resources in your PR, showing which resources reference other resources, which modules depend on which outputs, and where a single change could cascade.
This is the context that’s impossible to see in a flat diff. A security group change might look harmless until you realize four auto scaling groups and a load balancer depend on it. The dependency map makes that visible before anyone hits approve.
This is the feature that makes people stop scrolling.
Terracotta AI generates a Mermaid architecture diagram directly in the PR comment, showing how the resources in this PR relate to each other and to the broader infrastructure. You can view the VPC, subnets, security groups, instances, and load balancers as a visual diagram directly in GitHub or GitLab.
For complex PRs touching multiple resources, this is the fastest way to understand what’s actually changing. No more mentally reconstructing the architecture from a list of resource addresses. The diagram shows you.
Every section above runs automatically on every PR. There’s nothing to configure beyond connecting your repo: no scripts to maintain, no policy languages to learn, no pipeline changes to make.
Your platform engineers will no longer be the bottleneck. Your junior engineers get senior-level feedback on every change, while your security team gets visibility without slowing anyone down. Your FinOps team stops finding surprises in the monthly bill. And your compliance team gets enforcement without writing a line of Rego.
This is what an AI-powered infrastructure review actually looks like, not a chatbot that guesses at your code, but a system that understands your live infrastructure, your remote state, your drift, your policies, your architecture, your code, and your proposed changes. This gives your team the context they need to ship faster and with confidence.
Your team deserves better than rubber-stamped plans.
Terracotta AI is built for platform engineering teams managing Terraform at scale. We're SOC 2 Type II certified, HIPAA compliant, and currently running with teams in healthcare, financial services, and insurance.
Terracotta AI is agnostic to your IaC tooling and can integrate with GitHub, GitLab, HCP Terraform Run Tasks, and many other tools - no pipeline changes, no new workflows to learn.
Interested in a technical demo?
18.2.2026 16:36
Anatomy of an AI-Powered Terraform PR Review: What Your Platform Team Gets on Every Pull Request

Terraform has always been an open ecosystem. Some of the most critical infrastructure patterns in the world are built, shared, and improved in public repositories by engineers who care deeply about doing things the right way.
We want to support that.
Today, we’re officially launching Terracotta AI Pro for public and open-source Terraform repositories, free of charge.
Sign up for free here: https://tryterracotta.com/signup and install your public Terraform repos in just 2 clicks.
If your Terraform repository is public or open source, you get full access to Terracotta AI Pro at no cost.
That includes the full Pro feature set. No trial limits. No feature gating. No credit card.
Terracotta AI will only charge when private repositories are attached.
In other words:
If you’re building in the open, you shouldn’t pay just to understand what your Terraform changes are doing.
Terraform is powerful, but it’s also easy to get wrong. Open-source maintainers and contributors shouldn’t have to choose between shipping responsibly and absorbing extra tooling costs.
This is our way of investing back into the Terraform community while keeping our business model aligned with real, private-production use.
If you maintain or contribute to a public Terraform repository, you can connect it to Terracotta AI today and start getting full Pro-level reviews immediately.
Tier management will be automated in the near future, but for now, email our founders at founders@tryterracotta.com with your name and the email you used to create your account, and we’ll get you squared away.
Just better Terraform reviews, where they matter most.
👉 Try Terracotta AI on your public repo today https://tryterracotta.com
6.1.2026 17:34
New Year, New Free Terracotta AI Pro Tier for Open Source Terraform Repos: Supporting Open Source Without Strings Attached

A single failed Terraform pull request can waste 15+ engineering hours and cost over $2,000 in combined platform, developer, and management time. Most of that cost comes from unclear intent, manual reviews, and late-stage fixes. The longer issues survive in the workflow, the more expensive they become.
Terraform self-service is one of those ideas that sounds obvious in theory.
Give developers reusable modules. Add guardrails. Automate plans and apply. Let teams move faster without filing tickets or waiting on a central platform group.
In practice, most organizations find that self-service Terraform shifts the bottleneck rather than removing it.
Platform teams still end up on the hook for outages, cost overruns, misconfigurations, and “how did this get merged?” incidents. The difference is that now those issues are spread across more repos, more teams, and more workflows.
After talking with dozens of enterprise platform teams, clear patterns emerge.
Here are the five most common bottlenecks organizations face when adopting self-service Terraform, along with why these issues persist and are hard to resolve permanently. Understanding and addressing these is key to scalable self-service.
Terraform state is deceptively simple until more than one team touches it.
Once self-service is introduced, questions pile up quickly: Who owns which state file? How should state be split? Who is allowed to touch shared resources?
Without a clear, enforced strategy, teams tend to fall into one of two failure modes. Either they centralize everything into massive, fragile state files that nobody wants to touch, or they fragment state so aggressively that shared resources become impossible to reason about.
Both approaches increase risk. One makes changes terrifying. The other makes drift and duplication inevitable. Platform teams usually end up acting as human state coordinators, reviewing changes not because they want to, but because they’re the only ones who understand the blast radius.
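A middle ground between the two failure modes is one state file per component per environment, with locking so teams don’t trample each other. A minimal sketch (bucket and table names are hypothetical):

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tf-state"                  # hypothetical
    key            = "networking/prod/terraform.tfstate" # one state per component + environment
    region         = "us-east-1"
    dynamodb_table = "tf-state-locks"                    # state locking across teams
    encrypt        = true
  }
}
```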
Every platform team starts with good intentions: build a set of reusable, opinionated Terraform modules that encode best practices for networking, compute, IAM, databases, and more.
Then reality hits.
Modules need versioning. They need documentation. They need clear guidance on what should be overridden and what should never change. And most importantly, they need enforcement.
Developers treat modules like Lego blocks. Platform teams treat them like safety systems. That mismatch creates constant friction. One team overrides a variable to save cost. Another forks the module for a one-off use case. A third pins an old version “just for now” and never upgrades.
Over time, module sprawl takes over. The platform team becomes the only group that understands which modules are safe, deprecated, or quietly breaking standards. Reviews turn into archaeological exercises instead of approvals.
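Version pinning is part of the answer. A constraint like the one below (using the public `terraform-aws-modules/vpc/aws` registry module as an example) allows patch and minor updates while preventing silent major-version bumps:

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # accepts any 5.x, rejects 6.0; upgrades become deliberate

  name = "app-vpc"      # hypothetical inputs
  cidr = "10.0.0.0/16"
}
```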
Terraform self-service works great right up until credentials are involved.
Developers need access to provision infrastructure, but handing out long-lived cloud credentials is a non-starter. The “right” solution usually involves some combination of mechanisms for issuing short-lived, scoped credentials.
Each of these is reasonable on its own. Together, they introduce many moving parts.
When this layer isn’t designed carefully, it can lead to self-service stalls. Developers can write Terraform code, but can’t apply it without platform support. Platform teams end up debugging auth issues, token expiration, or permission errors instead of focusing on architecture and reliability.
The result is a system that looks automated but still depends on humans to keep it running.
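One common pattern for avoiding long-lived credentials (an illustration, not the post’s prescription) is OIDC federation, where CI exchanges a short-lived identity token for a scoped cloud role. Sketched for GitHub Actions and AWS, with a hypothetical repo name:

```hcl
# Trust GitHub's OIDC issuer so CI jobs can request short-lived credentials.
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"] # verify the current thumbprint
}

# Role that only this repo's CI can assume, and only via OIDC.
resource "aws_iam_role" "terraform_ci" {
  name = "terraform-ci" # hypothetical

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:example-org/infra:*" # hypothetical repo
        }
      }
    }]
  })
}
```

Each moving part here, the identity provider, the role, and the trust condition, is a place where token expiration or permission errors can stall self-service, which is exactly the failure mode described above.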
Every organization wants developers to move fast, but not that fast.
Self-service means more people can provision infrastructure, which increases the risk of non-compliant resources, surprise costs, and unintended changes with a wide blast radius. To compensate, teams introduce policy-as-code tools like Sentinel, OPA/Rego, Checkov, or custom scripts.
These tools help, but come at a cost. Writing and maintaining good policies is hard. Explaining failures to developers is often the most challenging part.
When policies are enforced late, after a plan or during apply, they feel punitive. When they’re enforced early without context, they feel arbitrary. Platform teams spend hours explaining why a policy exists, rather than letting the policy explain itself.
The enforcement problem isn’t just technical. It’s communicative. And most setups don’t handle that well.
Terraform pipelines look simple until something goes wrong.
A “proper” pipeline needs to handle many concerns, plan/apply separation among them.

As teams scale, pipelines grow complex. Edge cases pile up. A failed plan can block multiple teams, and a stuck apply requires someone with context to step in.
Platform teams end up maintaining pipeline logic as a product, often without the time or tooling to make it understandable to others. Developers see a red checkmark. Platform engineers know a chain of dependencies that only they can untangle.
At that point, self-service exists in name only.
Even teams that solve the above still struggle with a long tail of persistent challenges. These are the ongoing difficulties of scaling Terraform, and recognizing and addressing them is central to sustainable self-service.
These problems aren’t independent. They share a common root cause.
Self-service increases surface area faster than shared understanding.
Terraform encodes what can be configured, but not why it should be configured a certain way. That intent lives in platform engineers’ heads, in old docs, or in tribal knowledge passed around during reviews.
As more teams adopt Terraform, platform teams become interpreters of intent instead of designers of systems. That’s not sustainable.
Solving self-service Terraform isn’t about adding more tools or more gates. It’s about making intent, context, and impact visible where changes actually happen.
Closing the gap between intent, context, and action is the central challenge. Recognizing this helps teams prioritize where to focus for lasting improvement in self-service Terraform.
Terracotta AI shifts Terraform governance from reactive cleanup to proactive understanding.
Instead of discovering problems after a failed apply, a broken environment, or a long review thread, Terracotta surfaces risk, intent, and impact at the moment a change is proposed. Every Terraform pull request is automatically analyzed in the context of your existing modules, environments, and infrastructure state.
Platform teams get enforceable guardrails without writing brittle policies or maintaining custom CI logic. Developers get clear, human-readable feedback that explains what went wrong, why it matters, and how to fix it without waiting on Slack or blocking reviews.
The result is fewer failed PRs, faster reviews, and infrastructure changes that behave the way the platform team intended. Governance becomes something that enables teams to move faster, not something that slows them down.
As Terraform adoption grows, the problem isn’t tooling; it’s coordination, context, and consistency. Teams that scale successfully tend to follow a few common patterns.
First, treat Terraform changes as decisions, not just diffs. Plans should clearly explain intent, impact, and risk so reviewers don’t need deep context to understand what’s happening.
Second, encode standards where developers already work. Documentation alone doesn’t scale. Best practices need to appear automatically in pull requests, with clear explanations when they deviate.
Third, separate guidance from enforcement. Not every issue should block progress, but every issue should be visible. Advisory feedback builds trust and learning; mandatory enforcement protects production.
Finally, optimize for fewer failed PRs, not faster retries. The real cost of Terraform mistakes isn’t the apply; it’s the hours lost across platform, development, and management when things go wrong.
When teams make intent explicit and feedback immediate, Terraform becomes a system that scales with the organization instead of against it.
Ready to bring clarity and governance to your established Terraform workflows? Terracotta AI helps platform teams explain, enforce, and scale infrastructure standards directly inside pull requests. Book a demo to see it in action.
22.12.2025 19:59
The Top 5 Bottlenecks in Self-Service Terraform (and Why They Keep Coming Back)

Today, we’re excited to announce a new integration between Terracotta AI and HCP Terraform and Terraform Enterprise from HashiCorp, an IBM Company, via Terraform run tasks, bringing earlier insights, clearer intent, and stronger governance into the Terraform lifecycle without disrupting the workflows teams already rely on.
HCP Terraform has long helped platform teams streamline provisioning, collaboration, and approvals for Infrastructure as Code (IaC). It has become the operational backbone for organizations that need predictable, governed infrastructure at scale. Terracotta AI now complements that foundation by helping teams understand Terraform changes, enforce architectural and module intent, and align developers around best practices before a Terraform apply ever runs.
This integration delivers context, visibility, and governance at the exact moment teams need it most: the moment a Terraform plan is created.
HCP Terraform and Terraform Enterprise provide a unified model for consistent and auditable infrastructure delivery. Run tasks extend this model by allowing external tools to participate in key lifecycle stages such as post-plan and pre-apply.
Run tasks allow teams to introduce additional verification steps, including security scanning, compliance checks, cost controls, drift detection, and architectural validation. All of this is done without rewriting pipelines or rearchitecting how Terraform runs. They can be configured as advisory or mandatory, providing organizations with a flexible yet enforceable way to safeguard infrastructure changes before they proceed.
HCP Terraform run tasks create the extensible governance layer that many teams have long wanted. Terracotta AI now fits directly into that layer.
When HCP Terraform generates a plan, it invokes Terracotta AI through the run task callback. Terracotta inspects the plan, interprets the intent behind the infrastructure change, and posts a structured, human-readable report directly back into the HCP Terraform UI.
Instead of asking reviewers to parse raw JSON plans or relying on someone to notice subtle misconfigurations, Terracotta provides a precise, contextual evaluation of every change.
HCP Terraform provides operational consistency. Terracotta adds clarity and enforceable understanding.
The result is more than a plan analysis. It is governance made visible directly inside the Terraform run.
In addition to guardrails enforcement, Terracotta AI provides a dedicated Post-Plan Analysis run task that evaluates every Terraform plan immediately after it is generated. This is the moment where HCP Terraform has assembled the complete set of proposed infrastructure changes, and Terracotta steps in to explain what those changes actually mean.
When the plan finishes, HCP Terraform sends the plan output to Terracotta. Terracotta interprets the changes with full infrastructure context, surfacing issues that would otherwise require manual review or deep Terraform expertise. The analysis includes security misconfigurations, compliance gaps, cost-impacting resources, drift-like inconsistencies, and best practice violations that commonly slip through raw plan diffs.
The results are returned directly into the HCP Terraform run interface. Each finding includes severity, a clear explanation of the issue, the affected resources, and a recommendation for how to fix it. If the workspace is configured in mandatory mode, any blocking issue halts the run. Advisory mode continues execution while still providing detailed guidance.
This Post-Plan step gives teams immediate clarity without slowing deployments and ensures that risks are surfaced before changes move forward in the Terraform workflow.
Beyond standard plan analysis, this integration now introduces a new capability: Terracotta AI Guardrails enforced directly through HCP Terraform run tasks.
Terracotta’s Guardrails let platform teams define architectural intent, usage constraints, and best practices in natural language. These policies are expressed in plain English, not in Rego, HCL, Sentinel, or custom scripts.
With the new Guardrails run task, HCP Terraform sends the plan output to Terracotta. Terracotta evaluates it against all active guardrails. If a policy is violated, Terracotta returns contextual, human-readable reasoning within the HCP Terraform run.
Teams see each violated guardrail alongside its contextual, human-readable reasoning. Mandatory mode blocks the run, while Advisory mode guides without stopping progress.
This brings policy-as-code, architectural consistency, and intent enforcement directly into HCP Terraform without requiring new languages, tooling, or workflow changes.
Governance is often misunderstood as a restriction. In practice, it is about clarity, predictability, and shared understanding.
Most organizations rely on a patchwork of documents, Slack threads, tribal knowledge, and manual PR reviews to communicate the intent behind infrastructure decisions. Terracotta AI helps make this guidance explicit and enforceable without increasing friction.
Instead of developers guessing why a module default exists or the platform team repeatedly explaining it during reviews, Terracotta puts the reasoning in front of them automatically. Instead of governance being reactive and surfacing late, it becomes proactive and supportive.
The Terraform workflow stays the same. The understanding around each change becomes far stronger.
Configuring Terracotta as a run task takes only minutes. After adding the callback URL and signing secret provided by Terracotta, any workspace can be configured to run Terracotta immediately after the plan stage.
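With HCP Terraform managed via the HashiCorp `tfe` provider, the setup sketched above might look like this (the endpoint URL and names are hypothetical, and attribute names can vary slightly between `tfe` provider versions):

```hcl
# Register Terracotta as an organization-level run task.
resource "tfe_organization_run_task" "terracotta" {
  organization = "example-org"
  name         = "terracotta-review"
  url          = "https://api.tryterracotta.com/run-task" # callback URL from Terracotta (hypothetical)
  hmac_key     = var.terracotta_signing_secret            # signing secret from Terracotta
  enabled      = true
}

# Attach it to a workspace, running immediately after the plan stage.
resource "tfe_workspace_run_task" "terracotta" {
  workspace_id      = tfe_workspace.app.id
  task_id           = tfe_organization_run_task.terracotta.id
  stage             = "post_plan"
  enforcement_level = "advisory" # or "mandatory" to block runs on violations
}
```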
From that point forward, every Terraform plan is automatically analyzed by Terracotta AI, and findings appear directly in the HCP Terraform interface. Severity, affected resources, the rationale behind an issue, and step-by-step remediation guidance all surface in context. Mandatory workspaces stop the run when violations appear. Advisory workspaces provide recommendations without halting progress.
Developers do not need to change their workflow. Platform teams do not need to rebuild pipelines. HCP Terraform continues to operate as the backbone, and Terracotta adds clarity and guardrails exactly where they are required.
This integration gives organizations a way to ensure that Terraform changes are safe, compliant, cost-aware, and aligned with internal architectural intent before they reach the apply phase. It helps platform teams scale governance without slowing development. It reduces review fatigue. It clarifies module usage, resource behavior, and best practices in a way that is both automatic and accessible. And it gives developers immediate contextual feedback that helps them ship safer infrastructure with confidence.
The Terraform workflow stays the same. Understanding, safety, and predictability improve dramatically.
Getting started is simple, and these features are available today. Everything you need is outlined in the Terracotta AI docs.
Have questions or want to see a live custom demo? We’d love to hear from you. Reach out and get a demo directly from our founders.
9.12.2025 17:50
Terracotta AI × HashiCorp: New Terraform Run Tasks Help Customers Enforce Intent, Strengthen Governance, and Catch Risks Earlier

Every engineering org eventually hits the same Terraform wall:
As you scale, infrastructure becomes harder to reason about, PR reviews slow down, governance gets reactive, and every team wants “self-service” without the risk. That’s where Terraform workflow platforms come in. These tools don’t replace Terraform; they solve the surrounding lifecycle: workspaces, policies, automation, promotion flows, and team coordination.
Here are the top 5 Terraform workflow engines to know in 2025, how they differ, and when each one actually makes sense.
Official site: https://www.hashicorp.com/products/terraform
Angle: The “official” Terraform workflow engine.
Terraform Cloud (TFC) and Enterprise (TFE) offer the most opinionated, integrated Terraform experience.
Teams that pick TFC/TFE want a unified Terraform lifecycle and are comfortable adopting the HashiCorp workflow model.
Why it matters:
It’s the reference architecture for Terraform workflows, the baseline every other tool is compared against.
Official site: https://spacelift.io
Angle: The most flexible Terraform-native workflow engine.
Spacelift is popular with platform engineering teams because of the flexibility of the workflows it supports.
Developers like it because it feels natural in Git.
Platform teams like it because it provides pipeline power without having to build pipelines manually.
Why it matters:
The pick for teams that want customization without owning the entire automation layer.
Official site: https://www.runatlantis.io/
Angle: The OG of PR-based Terraform automation.
Atlantis is as close to “pure GitOps Terraform automation” as you can get.
If a team says:
“We want full control and don’t want a SaaS workflow engine.”
…they’re usually running Atlantis.
Why it matters:
Still the fastest way to run Terraform securely inside your own walls.
Official site: https://www.env0.com
Angle: Terraform as a self-service platform for developers.
Env0 is built around controlled autonomy.
It treats Terraform like a product that developers can safely consume, with platform teams setting standards and constraints.
Why it matters:
Perfect for teams rolling out developer-facing self-service infrastructure.
Official site: https://www.scalr.com
Angle: Enterprise-first workflows with governance as the priority.
Scalr focuses on governance, compliance, and audit at the organizational level.
Enterprises choose Scalr when they want Terraform workflows without Terraform Cloud but still need strong compliance and audit capabilities.
Why it matters:
A strong fit for companies with heavy governance requirements and complex org structures.
Terracotta AI is not a workflow engine.
It’s the pre-merge intelligence layer that runs before any of the tools above, sanitizing and enforcing best practices, so by the time your workflows trigger, they become sanity checks for your Terraform.
With the Terracotta API, you can feed workflow and test output from your CI tool back into Terracotta, which contextualizes it and generates meaningful insight in the form of a retrospective after CI has run.
With Terracotta AI, you get both pre-flight and post-merge sanity checks across your entire IaC pipeline, giving you real insight and confidence from your IaC workflows.
You can think of it like this: Terracotta plugs directly into GitHub/GitLab and delivers its full review before any pipeline fires.
Terracotta AI is fully CI/CD-agnostic; it works with all five tools because it runs before they do.
If workflow tools answer:
“How should we run Terraform?”
Terracotta answers:
“Is this Terraform safe to run?”
Each Terraform workflow engine solves a different problem:
In 2025, Terraform workflows aren’t solved by choosing one tool.
They’re solved by pairing the right execution engine with the proper pre-merge guardrails and clarity.
19.11.2025 17:29 · Top 5 Terraform Workflow Tools You Should Know in 2025

If you have ever owned Terraform modules in a real engineering organization, you already know something that most companies will not say publicly. The problem is not Terraform. The problem is not even infrastructure complexity. The problem is that once custom modules leave the platform team and fall into the hands of dozens or hundreds of developers with varying levels of context, experience, and priorities, they can become fragmented.
Developers treat modules as Legos. Platform teams treat them as safety systems.
This gap becomes a source of friction, misuse, repetitive Slack questions, late-night incident reviews, and a maintenance burden that grows every quarter.
This article digs into the hidden challenges behind module ownership, explains why the friction grows over time, and shows how Terracotta AI lets platform teams enforce best practices and intent automatically in pull requests. The NAT Gateway example below walks through a real guardrail and a real PR violation, with screenshots to anchor the experience.
Platforms build modules to achieve standardization, safety, and speed. But as organizations scale and new services emerge, your modules slowly become the foundation everyone relies on, yet few actually understand their purpose, their usage, and so forth. Developers tend to treat modules as configurations rather than systems. With enough drift and enough time, the module becomes a black box that only the platform team can truly understand, maintain, and enforce.
Let's break down the core pressures that create this mess.
Platform teams, or any team creating custom modules, carefully choose defaults, guard some settings behind variables, and structure modules to create a predictable infrastructure. Developers see a variable and assume it is safe to override. They often do not understand the cost, redundancy, or compliance assumptions baked into each choice.
A classic example is the NAT Gateway setting.
Production environment expects high availability.
Developers are told to find "cost savings" wherever possible.
One variable can swing both.
Even if you have an internal registry, modules eventually get copied. A team forks yours with a minor tweak, then another team copies their version, then ops discovers three different NAT strategies across four environments.
The platform team always becomes the integration point for these divergent versions.
Terraform modules describe configuration possibilities. They do not explain why something is a default, when a variable should never be touched, or what downstream systems depend on a specific value.
This turns into:
Intent lives in platform engineers' heads, not in code.
Every platform has a document or catalog entry that describes how modules should be used. No one reads it, and if they do, in come the Slack questions. New hires do not even know it exists. Teams use modules inconsistently, and the platform team manually corrects mistakes during PR review. Documentation without enforcement becomes a historical artifact rather than a control mechanism.
Every variable override, every version bump, every weird module instantiation eventually lands in a PR that you are expected to review. You are the stopgap that protects production from misuse.
This is not sustainable at scale.
Customization is not malicious. It is a symptom of scale.
As you grow:
The platform team becomes a bottleneck because the system depends on you to interpret intent for every change. The wider the engineering org becomes and the more democratized Terraform becomes through self-service implementation, the harder it is to maintain consistent infrastructure behavior across teams.
This is the exact moment when module governance turns from a best practice into a constant firefight.
The core issue is simple.
Modules encode configuration, not intent.
The documentation explains the intent, but no one reads it.
PR reviews enforce intent, but reviews do not scale.
Terracotta AI bridges all three by allowing platform teams to define intent in natural-language Policy as Code (i.e., Guardrails) and automatically enforce it in every pull request that touches Terraform or CDKTF.
The platform team writes guardrails like:
If the environment variable is set to prod, make sure the "single_nat_gateway" is set to false. Setting this to false allows multiple NAT gateways, making production NAT instances highly available and redundant.
Terracotta AI interprets this rule and enforces it proactively in PRs, before your CI triggers and runs steps like Validate, Plan, or Apply.
The intent is preserved, enforcement is automatic, and the context is visible to the developer AND the platform engineering team in real time.
This is the workflow platform teams have tried to build manually for years.
To illustrate the workflow, I'd like to walk through a REAL-WORLD example we use to deploy our OWN infrastructure here at Terracotta AI.
Your rule:
If the environment variable is set to prod, make sure the "single_nat_gateway" is set to false. Setting this to false allows multiple NAT gateways, making production NAT instances highly available and redundant.
This instantly becomes a structured guardrail that Terracotta AI can enforce.
No custom code.
No Terraform Cloud policy engine.
No Rego.
No maintenance overhead.
Your variable:
variable "single_nat_gateway" {
description = "Use single NAT gateway to reduce costs (suitable for dev or staging)"
type = bool
default = true
}

In a production context, this default creates a single point of failure. This is precisely the kind of oversight that leads to real outages.
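Under the guardrail above, a compliant module call makes the override explicit per environment. This is a minimal sketch; the `vpc` module source and the `environment` variable are hypothetical names, not part of the original example:

```hcl
# Hypothetical module call; source path and variable names are illustrative.
module "vpc" {
  source = "./modules/vpc"

  environment = var.environment

  # The guardrail's intent made explicit: prod never collapses to a
  # single NAT gateway, while dev and staging keep the cheaper default.
  single_nat_gateway = var.environment == "prod" ? false : true
}
```

Encoding the condition in HCL and enforcing the same rule in the PR gives you belt and suspenders: the code expresses the default, and the guardrail catches anyone who bypasses it.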
The PR shows:
No hunting through logs.
No Slack messages to the platform team.
No re-explaining the module.
This is the part developers love.
Terracotta explains:
This is effectively built-in documentation delivered when a developer needs it.
You simply need a way to take the intent behind your modules and enforce it consistently where developers work: inside pull requests.
Terracotta AI gives platform teams the missing enforcement layer for module governance. It turns module expertise into guardrails. It prevents misuse before deployment. It protects production without slowing development.
This is what Terraform governance should have always felt like.
Interested in a custom demo for you and your team? Head over to https://tryterracotta.com/schedule-demo and let's chat!
Terracotta AI turned Terraform pull requests into intent-driven, natural-language summaries, solving the final frontier of friction in self-service IaC within engineering orgs—the Platform Engineering and Developer dilemma.
Terracotta AI is a Y Combinator-backed and seed-funded startup building deterministic, infra-aware AI for Platform, DevOps, and SRE teams.
19.11.2025 16:18 · Real-Time Natural Language Guardrails for Terraform Modules: Because Every Company Customizes Them and Platform Teams Absorb the Pain

TL;DR:
HashiCorp’s Sentinel is robust but often reactive, enforcing policies only after Terraform plans run. Terracotta AI complements it by bringing those same Sentinel policies into the workflow earlier. In minutes, teams can translate Sentinel rules into natural-language guardrails that run directly in pull requests, giving developers instant, human-readable feedback before CI/CD.
Together, Sentinel and Terracotta create a full-stack governance model: Sentinel remains the enforcement engine in Terraform Cloud, while Terracotta serves as the pre-merge safety layer, catching risks, drift, and policy violations before they ever reach a plan.
Terraform policy enforcement has always walked a fine line, balancing safety with speed. HashiCorp's Sentinel framework was a significant leap forward, giving platform teams a powerful way to codify security, compliance, and cost controls directly into Terraform runs. Sentinel turned governance into code: precise, testable, and repeatable. But like any powerful system, it requires expertise and ongoing maintenance to get right. As Terraform footprints, teams, and repositories grow, so does the friction of managing and scaling those policies consistently across every workspace.
Terracotta AI builds on that foundation. Rather than replacing Sentinel, it brings the same guardrails into the workflow earlier, acting as an intelligent, pre-merge complement that translates Sentinel logic into natural-language guardrails developers actually understand. In under five minutes, teams can sync existing Sentinel policies with Terracotta AI and start running the same enforcement checks directly inside pull requests, long before Terraform Cloud or Enterprise runs the plan. The result is a faster feedback loop, more transparent communication between developers and platform teams, and fewer policy violations reaching CI/CD.
Sentinel is HashiCorp's policy-as-code framework for Terraform Cloud and Enterprise. It lets teams define governance logic programmatically: things like enforcing encryption on S3 buckets, restricting instance types by environment, or ensuring tagging compliance across resources. Each policy is written in Sentinel's domain-specific language (DSL) and evaluated against Terraform plan data, letting platform teams enforce organizational standards in a controlled, versioned way.
A typical Sentinel policy might look like this:
import "tfplan/v2" as tfplan
main = rule {
all tfplan.resources.aws_s3_bucket as bucket {
bucket.applied.acl not in ["public-read", "public-read-write"]
}
}

It's a powerful model, but it's also a precise one, which means teams need to understand Terraform plan schemas, maintain imports across provider versions, and manage policies as infrastructure evolves. Sentinel enforces control during plan evaluation, giving teams confidence that nothing unsafe reaches deployment.
Sentinel's greatest strength, precise enforcement within Terraform Cloud, is also what introduces friction when teams move fast. Because Sentinel policies run during or after plan execution, developers often don't see violations until late in the process. By the time a "policy check failed" message appears, they've already opened the PR, pushed their commits, and kicked off the pipeline. That delay creates a gap between writing infrastructure and understanding what's wrong with it.
It's not a flaw in Sentinel; it's a timing issue. The model was built for centralized teams that manage every Terraform run, but today's teams operate across multiple repos, branches, and CI/CD systems. Governance that lives exclusively inside the plan stage can feel disconnected from where collaboration happens: inside pull requests. Platform teams need a way to bring Sentinel's governance philosophy closer to where developers work, earlier in the feedback loop, with the same enforcement logic but better visibility.
That's precisely where Terracotta AI fits. Terracotta AI brings Sentinel-style governance into the pull request itself, running pre-merge checks, translating Sentinel rules into plain language, and providing actionable context. It connects directly to GitHub or GitLab and automatically scans every Terraform and CDKTF PR before merge, surfacing cost-impact detection and policy violations, all explained in English, not just DSL output.
For example, a Sentinel rule that blocks public S3 buckets translates in Terracotta AI to:
"Block any Terraform change that makes an S3 bucket publicly accessible."
Or a cost policy like restricting small EC2 instance types becomes:
"Flag any EC2 instance smaller than t3.small in production."
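As a concrete illustration, the kind of diff the S3 guardrail above would flag looks like this. This is a sketch; the bucket and resource names are hypothetical:

```hcl
resource "aws_s3_bucket" "assets" {
  bucket = "example-assets" # hypothetical bucket name
}

# This ACL makes the bucket world-readable. The guardrail
# "Block any Terraform change that makes an S3 bucket publicly
# accessible" would call out this line in the PR comment,
# before Terraform Cloud ever runs a plan.
resource "aws_s3_bucket_acl" "assets" {
  bucket = aws_s3_bucket.assets.id
  acl    = "public-read"
}
```

The developer sees the objection next to the offending line in review, rather than as a failed policy check at the end of the pipeline.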
The intent stays the same, but the enforcement moves earlier, saving time. Developers see violations before Terraform Cloud runs a plan. Platform leads can track adherence across all repositories. Sentinel continues enforcing compliance during plan and apply, but now the team sees and understands issues long before that stage.
The point of Terracotta AI isn't to replace Sentinel, it's to extend it. Sentinel remains the definitive enforcement layer inside Terraform Cloud and Enterprise, where actual plan data is validated and blocked if needed. Terracotta AI complements that by acting as a sanity check and collaboration layer at the PR level. It catches potential issues earlier, provides context for developers, and helps teams align on policy intent before the pipeline even starts.
The benefits compound at scale. Platform teams no longer spend cycles explaining Sentinel failures after the fact. Developers no longer have to reverse-engineer policy DSL. Everyone gets faster, more precise feedback, without compromising governance.
And since Terracotta AI supports natural language guardrails, teams can layer in additional pre-merge checks that cover contextual risks Sentinel wasn't designed to handle, such as cost visibility, dependency risk, and drift detection across multiple repos.
Step 1: Open up the Guardrails section in the Terracotta AI app and create a new Guardrail:
Step 2: Give the Guardrail a Name and copy and paste your Sentinel Policies into the Context section.
Here is the example Sentinel policy we are using in the Context area:
# sentinel.hcl — baseline governance for Terraform runs
# --- Tag hygiene: require env/owner/etc on all managed AWS resources
policy "require-tags" {
source = "policies/require-tags.sentinel"
enforcement_level = "soft-mandatory" # start soft; ratchet up later
params = {
required = ["Owner", "Environment", "CostCenter"]
exempt_types = ["aws_iam_*"] # example: often not tagged
}
}
# --- Absolutely block deletes/replaces
policy "deny-destroy" {
source = "policies/deny-destroy.sentinel"
enforcement_level = "hard-mandatory"
}
# --- Cap proposed monthly cost using HCP/TFE cost estimates
policy "limit-monthly-cost" {
source = "policies/limit-monthly-cost.sentinel"
enforcement_level = "hard-mandatory"
params = {
max_monthly = 1000.00
}
}
# --- Allowed AWS regions only (guardrails for drift/accidental regions)
policy "allowed-aws-regions" {
source = "policies/allowed-aws-regions.sentinel"
enforcement_level = "soft-mandatory"
params = {
regions = ["us-east-1", "us-west-2"]
}
}

When you click Create, we will begin parsing and contextualizing the Sentinel policy into enforcement rules.
Step 3: You are done. Yes, that's it. We will parse the Sentinel rules and enforce them when a Terraform PR is created. You can apply these at a Global or Repo level.
Step 4: Open a repo and see Terracotta AI Guardrails in action, enforcing your Sentinel policies before code is merged!
The evolution isn't about choosing one over the other; it's about combining both. Sentinel gives platform teams precision and enforcement at the plan stage. Terracotta AI gives them clarity and context earlier in the workflow. Together, they turn policy-as-code into policy-as-intent: consistent governance, understood by both developers and platform engineers, is applied from the first commit to final deployment.
When a Terracotta AI guardrail flags an issue, it doesn't just say a rule failed; it explains the "why":
"EC2 instance t2.micro exceeds your production performance baseline. Replace with t3.small or larger."
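The corresponding fix is usually a one-line change in the resource. This sketch is illustrative only; the AMI ID and resource name are placeholders, not values from the original post:

```hcl
resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.small"              # meets the production baseline the guardrail enforces
}
```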
That level of transparency makes compliance conversational, not combative. Developers learn as they go. Platform teams enforce standards proactively. Sentinel remains the final enforcement gate, and fewer violations than ever reach it.
HashiCorp set the standard for infrastructure governance with Sentinel. Terracotta AI builds on that foundation, extending its reach to where developers actually collaborate, and translating its rules into a context everyone can understand. The two systems are complementary: Sentinel enforces; Terracotta AI informs. Together, they bring the same rigor and consistency HashiCorp pioneered, but with the speed, clarity, and AI assistance modern platform teams need.
Governance doesn't have to slow you down. It just needs to meet you where you work. Terracotta AI brings Sentinel's power earlier in the lifecycle, keeping infrastructure compliant, auditable, and understandable at every step.
Try Terracotta AI for free and bring your Sentinel policies into every PR in under five minutes.
Interested in a demo of Terracotta AI? Sign up at https://tryterracotta.com/schedule-demo and we'll create a custom demo for you
3.11.2025 20:27 · Enforcing Sentinel Policies Earlier in Your GitOps Pipeline: Migrating Sentinel Policies to AI Guardrails in Under 5 Minutes

TL;DR:
Terraform drift sneaks in quietly and turns "source of truth" into "source of surprise." You ship clean plans, and weeks later, production no longer matches your .tf or state. Now every plan feels suspicious, and applying feels risky.
This guide goes deep on what drift is, why it happens, how to detect and remediate it safely, and how Terracotta AI automates drift detection directly in pull requests before risky changes ever reach CI/CD.
Terraform drift occurs when the real infrastructure in your cloud environment no longer matches the configuration stored in Terraform state and code.
In simpler terms, something changed outside of Terraform's control.
That could mean:
Once that happens, Terraform is no longer operating from a reliable truth.
The next plan output becomes confusing or even destructive because it's trying to reconcile the wrong baseline.
The most common cause.
Someone fixes a production issue in the console, bypassing IaC entirely. The hotfix might work in production, but Terraform's state is now outdated.
Drift also occurs when other tools, such as Ansible, Helm, or CloudFormation, modify resources that Terraform also manages.
Each system assumes ownership, and Terraform can't see what changed.
Using ignore_changes in Terraform can suppress legitimate drift:
lifecycle {
ignore_changes = [tags]
}

That setting hides tag updates from future plans. It's useful for noise, but dangerous for compliance-related metadata.
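A safer pattern, assuming the churn is limited to a known key that some external tool rewrites, is to ignore only that tag rather than all of them. The tag key here is hypothetical:

```hcl
lifecycle {
  # Ignore only the tag an external scanner rewrites on every run,
  # so compliance tags like Owner and CostCenter still surface as drift.
  ignore_changes = [tags["LastScannedAt"]]
}
```

Terraform supports indexing into maps inside ignore_changes, which keeps the suppression as narrow as the actual noise.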
Cloud providers constantly evolve APIs. Defaults shift. New fields appear.
A field Terraform once ignored can suddenly appear in drift results or worse, silently mutate infrastructure behavior.
If multiple users or pipelines apply changes without proper state locking, you end up with competing state updates.
One overwrites another, and Terraform's record diverges from the actual environment.
When you run terraform plan, it compares three things:
1. The desired configuration (your .tf files)
2. The state file (the recorded infrastructure snapshot)
3. The real-world resources (fetched from the provider API)
If something differs, Terraform flags it. But this process is reactive; you must run the command to detect drift.
Terraform detects manual console changes, missing resources, or modified attributes that are modeled in the provider schema.
It misses untracked fields, attributes suppressed with ignore_changes, and mutations made by tools Terraform doesn't manage.
That means drift can persist for weeks without notice until the next plan or failed deployment exposes it.
Use remote backends like S3, GCS, or Terraform Cloud with locking enabled to prevent concurrent state writes.
Example for AWS:
terraform {
backend "s3" {
bucket = "iac-state-prod"
key = "networking/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "iac-state-locks"
encrypt = true
}
}

Lock down console and CLI write access.
Emergency fixes should follow a break-glass policy and must be reconciled back into Terraform code immediately.
Audit your lifecycle rules regularly.
Only ignore attributes that truly change outside your control, such as timestamps or ephemeral metadata.
Provider updates often resolve drift-detection issues and schema mismatches.
Pin and upgrade versions consistently:
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.59"
}
}
}

Schedule drift-detection jobs, or, better yet, integrate them directly into your pull request reviews. Catching drift before the merge keeps every change based on live, accurate infrastructure.
Run drift checks locally.
terraform init
terraform plan -refresh-only -out=drift.plan

This compares your recorded state against live resource data and reports differences without applying changes.
It's a quick way to verify that your local state still matches the cloud state.
Set up a nightly or hourly job to run drift checks:
terraform init -input=false
terraform plan -refresh-only -lock-timeout=10m -out=drift.plan
terraform show -json drift.plan > drift.json

You can then parse drift.json to notify your team in Slack or Jira when drift is detected.
However, CI-based detection comes with challenges:
• You need cloud credentials for every environment.
• Alerts are often noisy.
• The results lack context for what's risky vs. what's benign.
The ideal approach is to detect drift before code merges.
This ensures your pull requests are based on an accurate infrastructure state.
Running drift checks at review time provides developers and reviewers with a clear picture of costs, security, and drift in one place.
terraform import aws_iam_role.app_role app-role
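For terraform import to succeed, a matching resource block must already exist in code. A minimal sketch of the block the command above targets; the trust policy shown is an assumption and should be adjusted to match the real role:

```hcl
# Placeholder block the import populates with real state. After importing,
# run `terraform plan` and fill in attributes until the diff is clean.
resource "aws_iam_role" "app_role" {
  name = "app-role"

  # Assumed trust policy for illustration; replace with the role's actual one.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}
```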
Use create_before_destroy for safer replacements:
lifecycle {
create_before_destroy = true
}

Add guardrails
Set policies for sensitive attributes (CIDRs, IAM permissions, encryption) to prevent recurring risky drift.
Not all drift is dangerous. Some comes from harmless ephemeral fields or evolving APIs.
Common patterns:
• Drift not detected: Upgrade provider or check provider schema coverage.
• Drift keeps returning: Another system (like CloudFormation) is mutating resources.
• Tag drift: Refine tagging policies rather than ignoring all tags.
• Ephemeral attributes: Document and exclude fields like timestamps or last modified dates.
In large orgs, drift compounds across hundreds of workspaces and repos.
• Standardize workspaces - consistent naming, backends, and provider versions.
• Use central identity and state management - OIDC auth for cloud providers.
• Aggregate drift reports - dashboards by team, environment, and severity.
• Codify best practices - use opinionated modules that enforce secure, consistent defaults.
Drift, blast radius, and cost
• Security risk
• Operational risk
• Cost impact
Effective drift detection should answer:
Most teams eventually try to script drift checks, run refreshes, parse JSON, and comment on PRs. It's functional but brittle. What's missing is context and trust.
Terracotta AI integrates directly into your GitOps workflow to automate this entirely.
It analyzes Terraform pull requests with natural-language AI Guardrails, using live infrastructure context to detect and explain:
When drift is detected, Terracotta generates plain-language summaries directly in your PR:
No guesswork. No manual refreshes. No extra CI jobs.
Terracotta fits directly into your existing GitHub or GitLab flow, with no new runners, no custom scripts, and no vendor lock-in.
When Terracotta AI flags drift, you can:
• Simulate how Terraform apply would behave without executing it.
• Identify who caused the drift and when.
• Prevent future drift by enforcing policies before the merge.
Drift goes from an afterthought to a preventable, explainable, and auditable part of your review process.
Terracotta AI is building AI-native IaC guardrails for platform teams and their Infrastructure-as-Code deployment pipelines. Our natural language AI guardrails for platform teams easily create and enforce standards, security, and cost control in every Terraform pull request.
If you want to stop firefighting drift and start governing it:
👉 Start reviewing up to 20 Terraform and CDKTF PRs for free. No CC required.
KubeCon is always a look forward, and in 2025 it's clearer than ever: infrastructure isn't just code anymore. It's behavior, intent, policy, performance, and scale.
That's precisely where we live.
At Terracotta AI, we're building the Terraform-native AI reviewer for modern DevOps teams, and what we're learning here will power tomorrow's GitOps-first automation layer.
Join us at Booth #1851 at the Georgia World Congress Center to see what that looks like in action.
KubeCon + CloudNativeCon is the flagship event of the Cloud Native Computing Foundation (CNCF), the open-source foundation behind Kubernetes, Prometheus, Envoy, OpenTelemetry, and dozens of other projects that form the backbone of modern infrastructure.
It's not just a Kubernetes conference, it's the pulse of the cloud-native ecosystem.
Whether you're running Kubernetes at scale, building internal platforms, automating infrastructure-as-code, or exploring how AI and GitOps will shape infra workflows, this is where the future is being constructed and debated.
Why it's different from most tech conferences:
What to expect:
It's where deep technical talks meet hallway architecture debates.
Where you discover what will shape the next 18 months of platform engineering, often before it hits GitHub.
If your team is betting on infrastructure automation, GitOps, or platform maturity, KubeCon is where the people solving those problems are gathered in one place.
November 10–13, 2025
Georgia World Congress Center (Building B), Atlanta, Georgia
Tip: The All-Access pass includes all co-located events, making it ideal if you're focused on Terraform, GitOps, or platform engineering patterns.
KubeCon isn't just a showcase; it's where the future of cloud-native computing is shaped and challenged.
We're not just talking about AI for Terraform, we're shipping it.
What we're showcasing:
We're currently focused on Terraform because that's where the complexity and risk lie. However, our foundation is designed for more GitOps-native, AI-augmented infrastructure, so you don't have to babysit.
These are in-depth, topic-specific sessions hosted before the main event, featuring some of the best content of the week.
Why it matters
We're particularly excited about these sessions, which are relevant for platform teams, security-minded DevOps, and anyone investing in IaC or GitOps at scale.
Session Speakers With Time & Location
Platform Engineering: Day Zero, The Origin Story - Murriel McCabe, Google · Tues, November 11 · 11:15–11:45 AM · B312–314
No Joke: Two Security Maintainers Walk Into a Cluster - Jackie Maertens & Nilekh Chaudhari, Microsoft · Tues, November 11 · 4:15–4:45 PM · B206
No Chicken Left Behind: Reliability and Observability With Service Mesh at Chick-fil-A - Christopher Lane, Chick-fil-A · Wed, November 12 · 3:00–3:30 PM · B312–314
Keynote: Beyond Operations: Scaling Platform Engineering in the CNCF Community - Abby Bangser, Syntasso · Thurs, November 13 · 9:07–9:23 AM · Exhibit Hall B2
KubeCon + CloudNativeCon North America 2025 is where cloud-native momentum turns into movement.
Terracotta AI is here to help platform teams automate Terraform PRs today, while building the intelligent AI GitOps infrastructure layer they'll depend on next.
Come find us at Booth #1851.
Or skip the wait and get a demo right now.
22.10.2025 15:03 · Terracotta AI's Guide to KubeCon + CloudNativeCon North America 2025

We're excited to announce two new features that expand how you can use Terracotta AI in your Terraform workflows: HCP Terraform Run Tasks and a brand-new Terracotta AI API.
These updates make it easier than ever to integrate Terracotta AI with your existing infrastructure pipeline, whether you're fully committed to HCP Terraform/Enterprise, building internal tooling, or simply seeking more control.
Terracotta AI now integrates directly with HCP Terraform and Terraform Enterprise using Run Tasks.
This means Terracotta AI can run pre-merge checks, such as plan analysis, drift detection, and guardrails, automatically as part of your TFC workflow.
These are IaC guardrails built into your actual HCP Terraform run. Not a separate scanner. Not post-merge. Real enforcement right where you need it.
For teams building internal tooling, developer platforms, or just wanting more control, we’ve launched the Terracotta AI API.
You can now call all our core analysis endpoints (plan, drift, summary, guardrails, and conversations) directly, without requiring GitHub integration.
Authenticate with an x-api-key or Authorization: Bearer header.

This unlocks new use cases, such as custom developer portals, internal drift dashboards, or tightly controlled compliance flows that still benefit from Terracotta AI's Terraform expertise.
Whether you want plug-and-play enforcement in HCP Terraform or programmatic access to everything we do, these updates give you more power, more flexibility, and more confidence at scale.
At Terracotta AI, we built our platform around this principle. Instead of post-merge explanations, we provide pre-merge intelligence.
Terraform changes break things. Our AI catches them in the PR.
We analyze your Terraform changes right inside the pull request, before they hit your pipeline. Our AI understands your code, state, and live infrastructure context to automatically detect drift, missing dependencies, cost spikes, exposed secrets, and blast radius issues.
No workflow changes required. We work with your existing GitHub or GitLab process, providing the contextual intelligence that turns complex infrastructure plans into clear, actionable insights.
Ready to flip your workflow from backwards to intelligent?
👉 Learn more about Terracotta AI here: https://tryterracotta.com
29.9.2025 17:31 · Introducing Two New Ways to Use Terracotta AI across your Terraform Workflows