
How I Rebuilt Cloudbeds’ Cloud Architecture with Terraform and Atlantis: A Practical Guide to Eliminating Drift and Enforcing Infra Discipline
Published by Vladyslav Ratslav · Cloud Architect · March 2026
Also published on LinkedIn.
Cloud infrastructure rarely grows in a straight line. It evolves through urgent fixes, new features, quick experiments, and the occasional “temporary” workaround that somehow survives for years. Over time, production and development environments drift apart, and the organization loses the ability to reason about its own infrastructure with confidence.
At Cloudbeds, I led a full Infrastructure‑as‑Code (IaC) transformation to solve exactly this problem. The goal was ambitious: codify the entire cloud architecture, eliminate drift, enforce a promotion‑based workflow, and give every engineer a safe, automated way to propose and validate infrastructure changes.
This is the story of how I approached it, the principles that guided the work, and the system we built to keep infrastructure consistent, predictable, and scalable.
Why This Project Was Necessary
Cloudbeds operated with separate AWS accounts for production and development. Over time, these environments diverged. Some differences were intentional improvements; others were accidental drift. Without a unified IaC foundation, it was difficult to:
- understand the real state of the infrastructure
- maintain consistency across environments
- track changes or enforce governance
- onboard new engineers without tribal knowledge
The solution was clear: Terraform everything. But doing it right required more than just writing code—it required a strategy.
The Core Principle: Import Production First
The most important decision I made was to adopt a service‑first import strategy.
Instead of trying to import the entire cloud estate at once, I worked microservice by microservice. For each one, I:
- Fully imported the production infrastructure into Terraform. This ensured the Terraform state reflected reality—not assumptions.
- Reconstructed the infrastructure map directly from the imported code. This gave me a precise, code‑driven view of how each service actually worked.
- Designed reusable Terraform modules based on real production patterns. No guesswork, no theoretical abstractions—just clean, accurate modules.
This approach guaranteed that Terraform became the single source of truth for production.
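As a rough illustration of the per-service import step, the sketch below renders Terraform `import` blocks (supported in Terraform 1.5 and later) from a small hand-written inventory. The service name, resource addresses, and IDs are hypothetical placeholders, and the same result can be reached with the `terraform import` CLI command; this is one possible shape, not a record of the exact tooling used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportTarget:
    address: str      # Terraform resource address to import into
    resource_id: str  # provider-specific ID of the existing resource


def render_import_blocks(targets):
    """Render one Terraform `import` block per target, in input order."""
    blocks = []
    for target in targets:
        blocks.append(
            "import {\n"
            f"  to = {target.address}\n"
            f'  id = "{target.resource_id}"\n'
            "}\n"
        )
    return "\n".join(blocks)


# Hypothetical inventory for a single service (names and IDs are made up)
booking_service = [
    ImportTarget("aws_ecs_service.booking", "prod-cluster/booking"),
    ImportTarget("aws_security_group.booking", "sg-0123456789abcdef0"),
]

print(render_import_blocks(booking_service))
```

Keeping the inventory per service mirrors the service-by-service approach: each batch of imports stays small enough to review and plan in isolation.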
Aligning Development with Production
Once a production service was fully codified, I imported the corresponding development environment into the same Terraform configuration. This immediately surfaced every deviation between the two environments.
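One way to make those deviations machine-readable is to export a saved plan with `terraform show -json` and list every resource whose planned action is not a no-op. The sketch below assumes that JSON plan format (a top-level `resource_changes` list with an `address` and `change.actions` per entry); the sample document and resource names are invented for illustration.

```python
import json


def summarize_plan(plan_json: str) -> dict:
    """Return {address: actions} for every resource the plan would change."""
    plan = json.loads(plan_json)
    pending = {}
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        # "no-op" means config and real state agree; "read" is a data source refresh
        if actions not in (["no-op"], ["read"]):
            pending[rc["address"]] = actions
    return pending


# Trimmed, made-up plan document for illustration only
sample = json.dumps({
    "format_version": "1.2",
    "resource_changes": [
        {"address": "aws_ecs_service.booking", "change": {"actions": ["no-op"]}},
        {"address": "aws_security_group.booking", "change": {"actions": ["update"]}},
    ],
})

print(summarize_plan(sample))
```

An empty result would indicate that the imported environment already matches the configuration; anything else is a candidate for the workflow described next.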
Each difference went through a structured workflow:
- Detect — Terraform showed exactly what diverged.
- Investigate — I analyzed whether the difference was intentional or accidental.
- Communicate — I discussed findings with service owners and stakeholders.
- Decide — If it was drift, it was removed. If it was an improvement, it was promoted to production.
This process eliminated drift and ensured both environments remained structurally identical.
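The decision step can be sketched as a small function that records which environment should change once a difference has been classified. Everything here is hypothetical (the type names, the example attribute, and the assumption that drift is resolved by bringing development back in line with production); the classification itself stays a human judgment made with service owners.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    DRIFT = "drift"
    IMPROVEMENT = "improvement"


@dataclass(frozen=True)
class Difference:
    address: str       # Terraform resource address
    attribute: str     # attribute that differs between environments
    prod_value: object
    dev_value: object


def resolve(diff: Difference, verdict: Verdict):
    """Return (environment to change, value both environments should converge on)."""
    if verdict is Verdict.DRIFT:
        # Assumed policy: accidental drift is removed by aligning dev with prod
        return ("development", diff.prod_value)
    # An intentional improvement is promoted from dev to prod
    return ("production", diff.dev_value)


# Made-up example: dev has a longer health check grace period than prod
diff = Difference("aws_ecs_service.booking", "health_check_grace_period_seconds", 60, 120)
print(resolve(diff, Verdict.IMPROVEMENT))
```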
Building a Reusable IaC Framework
With accurate infrastructure maps in place, I created a suite of reusable Terraform modules covering more than 18 core components. These modules standardized how services were deployed and ensured that every team worked from the same architectural patterns. The module library included:
- VPCs and networking
- ECS services
- EC2 servers and configurations
- IAM roles and policies
- Load balancers
- Databases and storage
- Observability components
By basing every module on real production architecture—not assumptions—I ensured that the entire IaC framework was both accurate and scalable. This modular approach reduced duplication, simplified onboarding, and made infrastructure changes predictable and safe.
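To illustrate what "the same architectural pattern" can mean in practice, the following sketch builds a module call in Terraform's JSON syntax for one service in two environments. The module path, input names, and default values are assumptions made for the example, not the actual Cloudbeds module interface.

```python
import json


def module_call(service: str, environment: str, overrides: dict) -> dict:
    """Build a Terraform JSON-syntax module block for one service in one environment."""
    inputs = {
        "source": "./modules/ecs_service",  # hypothetical module path
        "name": service,
        "environment": environment,
        "desired_count": 1,                 # hypothetical default
    }
    inputs.update(overrides)
    return {"module": {f"{service}_{environment}": inputs}}


dev = module_call("booking", "dev", {})
prod = module_call("booking", "prod", {"desired_count": 4})

print(json.dumps(prod, indent=2))
```

Both environments pass the same set of inputs to the same module, so any structural difference has to show up as an explicit, reviewable override rather than as a separate code path.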
Operationalizing IaC with Atlantis on ECS
Codifying infrastructure is only half the battle. The real challenge is operationalizing it in a way that is safe, auditable, and accessible to everyone.
To achieve this, I integrated every Terraform repository with a centralized, private Atlantis deployment running on ECS.
Atlantis became the automation engine behind every infrastructure change:
- It monitored all pull requests.
- It automatically generated Terraform plans.
- It enforced the dev‑first promotion workflow.
- It allowed stakeholders to apply changes to development.
- It enabled validated changes to be promoted to production.
This turned infrastructure into a self‑service, PR‑driven workflow. Engineers no longer needed direct AWS access to propose changes. Everything became transparent, reviewable, and consistent.
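The dev-first rule itself is simple enough to express as a pure function. The sketch below is not Atlantis configuration or an Atlantis API; it only models the gate a promotion workflow needs to enforce, with hypothetical environment names and an assumed requirement that the pull request is approved before any apply.

```python
def can_apply(target_env: str, approved: bool, applied_envs: set) -> tuple:
    """Decide whether an apply to target_env is allowed for a pull request.

    Returns (allowed, reason).
    """
    if not approved:
        # Assumption: every apply requires an approved pull request
        return (False, "pull request is not approved")
    if target_env == "production" and "development" not in applied_envs:
        # Dev-first promotion: production only after development has been applied
        return (False, "change has not been applied to development yet")
    return (True, "ok")


print(can_apply("production", True, set()))
print(can_apply("production", True, {"development"}))
```

In a real deployment, a check like this would typically live in the Atlantis server-side configuration or a custom workflow step rather than in application code.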
The Result: A Drift‑Free, Promotion‑Driven Infrastructure
By the end of the project, Cloudbeds had:
- A fully codified cloud architecture for both production and development
- Reusable Terraform modules powering consistent deployments
- Zero drift between environments
- A clear, enforced promotion workflow
- A self‑service PR‑driven IaC process via Atlantis
- A scalable foundation for future infrastructure growth
This wasn’t just a migration—it was a transformation of how infrastructure was built, managed, and delivered.
What I Learned
A project of this scale teaches you a lot about both technology and people. A few lessons stand out:
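- Importing production first was, in my experience, the most reliable way to build accurate IaC, because the code starts from what actually runs rather than from what people remember.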
- Drift is inevitable unless you enforce a promotion‑based workflow.
- Automation is not optional—it’s the backbone of infrastructure governance.
- Stakeholder communication is as important as Terraform code.
- Self‑service infrastructure empowers teams and reduces operational load.
Final Thoughts
Infrastructure‑as‑Code is not just a toolset—it’s a philosophy. It requires discipline, clarity, and a willingness to rethink how teams interact with the cloud. At Cloudbeds, we built a system that not only solved immediate problems but also created a foundation for long‑term scalability and reliability.
If you’re planning a similar transformation, start with production, build from reality, and automate everything you can. The payoff is enormous.
