The Role of Drift Detection in Modern Infrastructure

“Drift detection” sounds like one of those must-have features in modern infrastructure. It’s baked into tools, marketed heavily, and often treated as a best practice, but do we really need it? In most well-structured environments, drift shouldn’t be something you detect, it should be something you prevent.

Drift Is a Symptom, Not the Problem

Terraform drift happens when your real infrastructure no longer matches your code. That’s not the root issue. The root issue is someone (or something) was able to change infrastructure outside of Terraform. If that’s possible in your system, then drift detection is just a monitoring layer over a deeper control problem.

The Better Approach: Eliminate Drift at the Source

The goal should be to stop the ability of things changing the environment. If you design your permissions model properly, drift becomes rare by design.

The goal is simple:

No direct changes in production environments
All infrastructure changes go through Terraform
Terraform runs via controlled pipelines

In this model, Terraform isn’t just a tool but it’s the only path to change. If that’s enforced properly, drift doesn’t happen in normal operation. Therefore do you need to spend resources and time to monitor the drift of Terraform?

“Break Glass” Still Matters

In reality though, even if we want to restrict all changes in environments manually, there will come a time in support you need someone to save the day. To enable this you will need a role that has higher privileges to through them fixes in like updating a Network Security Group, or putting a patch on a Virtual Machine.

This is where you need to be strict with both who and what you give:

Limited access to what it needs using fine grain access.
Used only in emergencies and not just because it is easier.
Assigned to trusted engineers (e.g. SREs), so it doesn’t spread.

Following this allows BAU support to be maintained and keeps it controlled, but this only works if there’s discipline afterward:

Every manual change must be reflected back into Terraform.

If you skip that step, drift becomes permanent and your IaC loses credibility. You still question the effort for drift detection as these changes should happen very rarely and the changes reflected back into your code.

When Drift Detection Does Make Sense

However, I understand that even with strong controls, drift detection can still be useful as a safety net for:

Auditing environments
Catching misconfigurations
Detecting accidental or unauthorized changes
Highlighting gaps in your permission model

But it should be your backup plan, not your primary strategy. This shouldn’t be your first focus and where your design/development activities are put. Below is a simple solution I would use for Drift Detection.

A Simple, Tool-Agnostic Way to Detect Drift

If you do want drift detection, you don’t need anything complex.

A straightforward approach:

Run a scheduled pipeline/workflow (e.g. daily)
Execute terraform plan against your environment
Compare the output to detect unexpected changes

The key detail:
Run the plan against the exact version of code that is currently deployed.

If you don’t, you’ll just detect differences between commits not actual drift.

Something that can help with the detection and putting logic into your drift is using the Terraform output json which I have written about before in Terraform plan output to JSON.

From there, you can:

Inspect the planned changes
Detect additions, updates, or deletions
Trigger alerts if anything unexpected appears

Final Thought

If you rely heavily on drift detection, it’s worth asking why. In most cases, the better investment is:

Stronger permissions, tighter control, and a single path for change.

Drift detection is useful but it’s not a substitute for good system design.

Taking It Further: An Agentic Approach to Drift

If you want to push this further, this is where things get interesting and potential for AI use.

Instead of just detecting drift, you can interpret it.

Using an MCP-style integration (for example, something like the Azure and Terraform Model Context Protocol), you can introduce an agent that sits on top of your Terraform pipeline and adds reasoning to the process.

Conceptually, the flow looks like this:

Scheduled pipeline runs terraform plan
Output is converted to JSON
Data is passed into an MCP-enabled agent
The agent:
- Reviews the changes
- Classifies them (expected vs unexpected)
- Correlates with recent deployments or incidents
- Provides a human-readable report
- Suggests remediation actions

Instead of a binary “drift detected” alert, you get something closer to:

“This change appears to be a manual scaling adjustment on a production resource.”
“This likely originated from a break-glass action during an incident.”
“No corresponding Terraform change exists in the current codebase.”

That’s a completely different level of signal, then you can move it from Detection to Action.

Once you trust that reasoning layer, you can go a step further, so the agent doesn’t just report drift it can propose fixes.

For example:

Generate the required Terraform changes to match reality
Or suggest reverting the infrastructure back to code

In more advanced setups, it could even:

Automatically raise a pull request with the proposed changes
Tag the relevant team
Include a summary of why the change is needed

Now drift detection becomes:

Drift → Analysis → Recommendation → Resolution

This solution is what advances the Drift Detection to make it more purposeful and useful to teams.

The Role of Drift Detection in Modern Infrastructure

The Better Approach: Eliminate Drift at the Source

“Break Glass” Still Matters

When Drift Detection Does Make Sense

A Simple, Tool-Agnostic Way to Detect Drift

Final Thought

Taking It Further: An Agentic Approach to Drift

Published by Chris Pateman - PR Coder

Leave a message please Cancel reply

The Better Approach: Eliminate Drift at the Source

“Break Glass” Still Matters

When Drift Detection Does Make Sense

A Simple, Tool-Agnostic Way to Detect Drift

Final Thought

Taking It Further: An Agentic Approach to Drift

Share this page with everyone:

Related

Published by Chris Pateman - PR Coder

Leave a message please Cancel reply