
atlantis plan, review the output inline, comment atlantis apply. The full workflow lives in your VCS, with one caveat: you self-host and operate it.
pre_workflow_hooks and terragrunt run-all glue. Pair this guide with our beginner's guide to Terragrunt if Terragrunt itself is new.
terraform binary path or terragrunt-tfpath.
Atlantis is an open-source automation tool designed to streamline Terraform workflows through pull request-based interactions. Rather than requiring developers to execute Terraform commands locally or through separate CI/CD interfaces, Atlantis brings plan and apply operations directly into version control, enabling teams to review, discuss, and approve infrastructure changes within familiar pull request interfaces.
Enhanced Collaboration: Infrastructure changes are visible directly in pull requests, enabling focused team discussions about proposed modifications before they're deployed.
Centralized Execution: All Terraform operations run on a dedicated server rather than individual machines, eliminating "works on my machine" issues and ensuring consistency across your organization.
Improved Governance: Pull requests create a natural audit trail for all infrastructure changes, documenting who proposed what, when, and with what approval.
State Management: Atlantis implements project-level locking to prevent concurrent operations on the same infrastructure, complementing Terraform's backend locking mechanisms.
Productivity: Automation of plan generation and the ability to apply changes through simple PR comments accelerates deployment cycles while maintaining rigor.
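The project-level locking described in this list can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not Atlantis's implementation: the `ProjectLocks` class and its method names are hypothetical, and the lock key of repository, directory, and workspace is assumed from how Atlantis describes its locks.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical lock key: (repository, project directory, workspace)
LockKey = Tuple[str, str, str]


@dataclass
class ProjectLocks:
    """In-memory stand-in for a project lock table (illustrative only)."""

    _held: Dict[LockKey, int] = field(default_factory=dict)

    def try_lock(self, repo: str, directory: str, workspace: str, pull_num: int) -> Optional[int]:
        """Return None if the lock is acquired, else the PR number that holds it."""
        key = (repo, directory, workspace)
        holder = self._held.get(key)
        if holder is None or holder == pull_num:
            self._held[key] = pull_num
            return None
        return holder

    def unlock(self, pull_num: int) -> int:
        """Release every lock held by a PR (as on merge or close); return how many."""
        keys = [k for k, v in self._held.items() if v == pull_num]
        for k in keys:
            del self._held[k]
        return len(keys)
```

In this sketch a second pull request that plans the same directory and workspace is told which PR holds the lock, while a different directory stays available. Terraform's backend lock still guards the state file itself during each individual operation.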
Understanding the strengths and limitations of Atlantis helps teams make informed decisions about adoption and plan for operational gaps.
GitOps for Infrastructure: Atlantis deeply integrates with your version control system (GitHub, GitLab, Bitbucket, Azure DevOps), enabling teams to manage infrastructure changes using familiar pull request workflows with a clear audit trail.
Automation of plan and apply: Automatic plan generation on PR creation and one-command applies reduce manual effort and accelerate deployment cycles.
Consistency and Standardization: Terraform commands run in a consistent server environment, eliminating "works on my machine" issues and standardizing how changes are reviewed and implemented.
Improved Security: Developers may not need direct cloud provider credentials on their local machines, and changes are reviewed and explicitly approved before being applied.
State Locking: Atlantis locks a project (directory and workspace) when a pull request plans it and holds the lock until the PR is merged, closed, or manually unlocked, preventing concurrent pull requests from applying conflicting changes to the same state.
Open Source: Free to use and customizable. Self-hosting gives organizations complete control over the Atlantis instance, its configuration, and credential storage.
Self-Hosting and Maintenance Overhead: Your team is responsible for deploying, maintaining, securing, and upgrading the Atlantis server, which requires ongoing operational effort.
Single Point of Failure: Unless configured for high availability (which adds complexity), the Atlantis server can become a single point of failure that stalls automated workflows.
Concurrency Limitations: By default, Atlantis processes operations sequentially for a given instance. High volumes of concurrent PRs across many teams may lead to queues and delays without scaling strategies such as multiple instances with sharding logic.
Limited Native Integrations Beyond VCS: Compared to commercial platforms (Terraform Cloud, Scalr, env0, Spacelift), Atlantis has fewer built-in integrations for advanced policy checking, cost estimation, or security scanning. These can be added with custom workflows but require more setup.
Basic UI/UX: Atlantis primarily operates via PR comments and has a basic web UI for viewing logs and locks. It lacks the dashboards and reporting features of commercial alternatives.
Workflow Rigidity: The core workflow is tied to pull requests. Complex scenarios not fitting this model may be harder to implement.
No Built-in Advanced RBAC: Role-based access control is primarily managed through VCS repository permissions and Atlantis's own configuration. More granular RBAC within Atlantis is limited.
Without a dedicated automation layer, Terraform teams encounter several challenges:
Atlantis operates as a self-hosted service that you deploy and manage on your infrastructure. This differs from managed solutions and means you retain control over the environment while assuming responsibility for operational maintenance.
Common deployment approaches:
The service listens for webhook events from your version control system and responds to pull request activity and comment commands.
The fundamental Atlantis workflow follows this pattern:
terraform plan
atlantis apply to execute changes
atlantis plan [-d dir] [-w workspace] [-p project_name]: Manually trigger a plan
atlantis apply [-d dir] [-w workspace] [-p project_name]: Apply a planned change
atlantis unlock: Release a stuck lock
atlantis help: Show available commands
Before deploying Atlantis, ensure you have:
Server Infrastructure: A dedicated server, VM, or container cluster with:
Git and Terraform:
Version Control System:
Cloud Provider Credentials:
The quickest way to get started is using Docker:
docker run --name atlantis -d -p 4141:4141 \
  -e ATLANTIS_ATLANTIS_URL="https://atlantis.example.com" \
  -e ATLANTIS_GH_USER="your-github-user" \
  -e ATLANTIS_GH_TOKEN="your-github-pat" \
  -e ATLANTIS_GH_WEBHOOK_SECRET="your-webhook-secret" \
  -e ATLANTIS_REPO_ALLOWLIST="github.com/your-org/*" \
  -v /path/to/atlantis-data:/atlantis-data \
  ghcr.io/runatlantis/atlantis:latest server
For production Kubernetes environments:
# Value names follow the runatlantis Helm chart; check them against your chart version
helm repo add runatlantis https://runatlantis.github.io/helm-charts
helm install atlantis runatlantis/atlantis \
  --set atlantisUrl=https://atlantis.example.com \
  --set github.user=your-github-user \
  --set github.token=your-github-token \
  --set github.secret=your-webhook-secret \
  --set orgAllowlist="github.com/your-org/*"
Atlantis requires GitHub credentials to interact with your repositories. You have two options:
Personal Access Token (simpler but less granular):
repo scope
GitHub App (recommended for production):
/github-app/setup endpoint
For the GitHub App approach:
/github-app/setup endpoint
After deploying Atlantis, configure webhooks in your version control system:
Webhook Settings:
Payload URL: https://your-atlantis-domain.com/events (note the /events suffix)
Content type: application/json
Secret: the same value as ATLANTIS_GH_WEBHOOK_SECRET
The /events suffix is critical; missing it is a common setup error that prevents Atlantis from receiving notifications.
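To see why the webhook secret must match on both sides, here is a minimal Python sketch of the check a webhook receiver performs on GitHub deliveries. GitHub signs each payload with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header as `sha256=<hex digest>`. The function name `signature_is_valid` is hypothetical, and this is not Atlantis's own code.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style X-Hub-Signature-256 header against the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

If the secret configured in the VCS differs from the one the server was started with, every delivery fails this comparison, which typically surfaces as rejected webhook deliveries in the VCS delivery log.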
Every Terraform repository using Atlantis should have an atlantis.yaml file at its root. This file tells Atlantis how to handle infrastructure projects in your repository.
version: 3
automerge: false
parallel_plan: true
parallel_apply: true
projects:
  - name: my-app-staging
    dir: infra/staging
    workspace: staging
    terraform_version: v1.5.0
    autoplan:
      when_modified: ["**/*.tf", "**/*.tfvars", ".terraform.lock.hcl"]
      enabled: true
    apply_requirements: [approved]
Projects Array: Defines Terraform projects Atlantis manages
name: Unique identifier for the project
dir: Directory path (relative to repo root)
workspace: Terraform workspace to use
terraform_version: Pin specific Terraform version
autoplan: Configure automatic planning behavior
apply_requirements: Conditions that must be met before applying
execution_order_group: Numeric priority for execution order
depends_on: List of projects this depends on
Autoplan Configuration: Controls when plans automatically trigger
enabled: Whether autoplan is active
when_modified: File patterns that trigger planning
Apply Requirements: Enforce approval conditions
approved: PR must be approved by a reviewer
mergeable: PR must be mergeable (no conflicts)
undiverged: PR branch must be up-to-date with base branch
Useful when all infrastructure code lives in a single repository:
Configuration for monorepo structure:
version: 3
projects:
  - name: network-dev
    dir: environments/dev/network
  - name: compute-dev
    dir: environments/dev/compute
    depends_on: [network-dev]
  - name: network-prod
    dir: environments/prod/network
  - name: compute-prod
    dir: environments/prod/compute
    depends_on: [network-prod]
Separate repositories for different infrastructure components:
networking-repo/
├── atlantis.yaml
├── modules/
└── environments/
compute-repo/
├── atlantis.yaml
├── modules/
└── environments/
The when_modified setting determines which file changes trigger plans. Poor patterns cause unnecessary operations.
Inefficient Pattern (too broad):
autoplan:
  when_modified: ["**/*.tf"]  # Triggers for any .tf file at any depth
Optimized Pattern (targeted):
projects:
  - name: networking
    dir: networking
    autoplan:
      when_modified:
        - "*.tf"
        - "*.tfvars"
        - "../modules/network/**/*.tf"
Remember that paths are relative to the project's dir, not the repository root.
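The effect of resolving patterns against the project's dir can be checked with a small Python sketch. It is a simplified model rather than Atlantis's matcher: `project_is_modified` and `_glob_to_regex` are hypothetical helpers, and only `*` and `**` are handled (no negation or other ignore-file syntax).

```python
import posixpath
import re
from typing import Iterable


def _glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob where '**' spans directories and '*' stays inside one segment."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything except a path separator
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def project_is_modified(project_dir: str, when_modified: Iterable[str],
                        changed_files: Iterable[str]) -> bool:
    """Return True if any changed file (repo-relative) matches a pattern resolved from project_dir."""
    regexes = [
        _glob_to_regex(posixpath.normpath(posixpath.join(project_dir, pattern)))
        for pattern in when_modified
    ]
    return any(rx.match(path) for path in changed_files for rx in regexes)
```

With `project_dir="networking"`, the pattern `*.tf` matches `networking/main.tf`, and `../modules/network/**/*.tf` matches `modules/network/vpc/main.tf`. A repo-root-style pattern such as `networking/*.tf` resolves to `networking/networking/*.tf` in this model and matches nothing, which is the usual symptom of the mistake.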
While Atlantis includes default plan and apply workflows, custom workflows enable sophisticated automation patterns.
version: 3
projects:
  - name: production
    dir: environments/production
    workflow: prod-workflow
    apply_requirements: [approved]
workflows:
  prod-workflow:
    plan:
      steps:
        - run: terraform fmt -check
        - run: tflint .
        - init
        - plan:
            extra_args: ["-var-file=prod.tfvars"]
    apply:
      steps:
        - apply
        - run: ./scripts/post-deploy-validation.sh
Pre-Deployment Validation:
workflows:
  secure-workflow:
    plan:
      steps:
        - run: tfsec --no-color .
        - run: checkov -d . --quiet
        - init
        - plan
Cost Estimation Integration:
workflows:
  cost-aware:
    plan:
      steps:
        - init
        - plan
        - show
        - run: |
            infracost breakdown --path $SHOWFILE \
              --format json \
              --out-file /tmp/infracost.json
Multi-Module Orchestration with Environment Variables:
workflows:
  multi-env:
    plan:
      steps:
        - env:
            name: AWS_REGION
            value: us-east-1
        - env:
            name: TF_VAR_environment
            value: production
        - init
        - plan
$PLANFILE: Path to generated plan file
$WORKSPACE: Terraform workspace name
$PROJECT_NAME: Project name from atlantis.yaml
$DIR: Project directory
$PULL_NUM: Pull request number
$BASE_REPO_OWNER: Repository owner
$BASE_REPO_NAME: Repository name
Securing the Atlantis server is critical since it executes infrastructure changes with elevated permissions.
Firewall Configuration:
# Allow webhook traffic only from VCS provider IPs
iptables -A INPUT -p tcp -s <GITHUB_IPS> --dport 4141 -j ACCEPT
# Deny all other incoming traffic to Atlantis port
iptables -A INPUT -p tcp --dport 4141 -j DROP
Best Practices:
atlantis server \
  --ssl-cert-file=/path/to/cert.pem \
  --ssl-key-file=/path/to/key.pem \
  --atlantis-url="https://atlantis.example.com"
Requirements:
# Create dedicated non-root user
sudo useradd -r -m -s /bin/false atlantis
# Set restrictive directory permissions
sudo mkdir -p /var/lib/atlantis
sudo chown atlantis:atlantis /var/lib/atlantis
sudo chmod 700 /var/lib/atlantis
docker run --name atlantis \
  --user atlantis \
  --read-only \
  --cap-drop=ALL \
  --security-opt=no-new-privileges \
  --mount type=volume,source=atlantis-data,target=/var/lib/atlantis \
  -p 4141:4141 \
  ghcr.io/runatlantis/atlantis:latest server
# Generate cryptographically secure webhook secret
webhook_secret=$(openssl rand -hex 32)
# Set in Atlantis configuration
export ATLANTIS_GH_WEBHOOK_SECRET="$webhook_secret"
Requirements:
server {
  listen 443 ssl;
  server_name atlantis.example.com;
  # GitHub webhook IP ranges
  allow 192.30.252.0/22;
  allow 185.199.108.0/22;
  allow 140.82.112.0/20;
  deny all;
  location / {
    proxy_pass http://localhost:4141;
  }
}
atlantis server \
  --repo-allowlist="github.com/yourorg/*" \
  --gh-webhook-secret="$WEBHOOK_SECRET"
resource "aws_iam_role" "atlantis" {
  name = "atlantis-terraform-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })
}
resource "aws_iam_role_policy" "atlantis" {
  role = aws_iam_role.atlantis.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:ListBucket"
        ]
        Resource = [
          "arn:aws:s3:::${var.terraform_state_bucket}",
          "arn:aws:s3:::${var.terraform_state_bucket}/*"
        ]
      }
    ]
  })
}
Principles:
For Kubernetes deployments:
# The service-account subject below implies IAM Roles for Service Accounts (IRSA),
# where the OIDC provider is the cluster's issuer, not the Atlantis URL.
# var.eks_oidc_issuer_url is a placeholder variable for that issuer URL.
resource "aws_iam_openid_connect_provider" "atlantis" {
  url             = var.eks_oidc_issuer_url
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["<certificate-thumbprint>"]
}
resource "aws_iam_role" "atlantis_oidc" {
  name = "atlantis-oidc-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRoleWithWebIdentity"
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.atlantis.arn
      }
      Condition = {
        StringEquals = {
          # Condition keys use the issuer host and path without the https:// scheme
          "${replace(aws_iam_openid_connect_provider.atlantis.url, "https://", "")}:sub" = "system:serviceaccount:atlantis:atlantis"
        }
      }
    }]
  })
}
Use external secret managers instead of hardcoding credentials:
provider "vault" {
  address = "https://vault.example.com"
}
data "vault_aws_access_credentials" "aws" {
  backend = "aws"
  role    = "atlantis"
}
provider "aws" {
  access_key = data.vault_aws_access_credentials.aws.access_key
  secret_key = data.vault_aws_access_credentials.aws.secret_key
  region     = var.aws_region
}
Use repos.yaml to enforce organization-wide policies:
repos:
  - id: /.*/  # Applies to all repositories
    allowed_overrides: [workflow]
    allow_custom_workflows: false
    apply_requirements: [approved, mergeable]
    pre_workflow_hooks:
      - run: terraform fmt -check
      - run: tflint
Security Controls:
# Enable basic authentication for web interface
atlantis server \
  --web-basic-auth=true \
  --web-username=admin \
  --web-password=secure-password
server {
  listen 443 ssl;
  server_name atlantis.example.com;
  ssl_certificate /path/to/cert.pem;
  ssl_certificate_key /path/to/key.pem;
  # Security headers
  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
  add_header X-Content-Type-Options "nosniff" always;
  add_header X-Frame-Options "DENY" always;
  add_header Content-Security-Policy "default-src 'self'" always;
  location / {
    proxy_pass http://localhost:4141;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }
}
See "securing Terraform Atlantis in production" for deeper security configurations and advanced threat mitigation.
Terragrunt adds powerful DRY (Don't Repeat Yourself) capabilities to Terraform, and Atlantis integrates well with it for managing complex infrastructure-as-code. New to Terragrunt? Start with our beginner's guide to Terragrunt.
Enhanced Automation: PR-based workflows combined with Terragrunt's modular approach create scalable infrastructure management
DRY Principles: Terragrunt's configuration inheritance prevents repetition across environments while Atlantis automates execution
Dependency Management: Terragrunt explicitly defines module relationships, which Atlantis can honor during planning and applying once they are expressed as depends_on entries in atlantis.yaml
Parallel Execution: Atlantis can execute Terragrunt run-all operations to manage multiple modules efficiently
The default Atlantis image doesn't include Terragrunt, so create a custom image:
FROM ghcr.io/runatlantis/atlantis:latest
ARG TERRAGRUNT_VERSION=v0.55.0
# Recent official images run as the unprivileged atlantis user, so install as root
USER root
RUN curl -Lo /usr/local/bin/terragrunt \
    "https://github.com/gruntwork-io/terragrunt/releases/download/${TERRAGRUNT_VERSION}/terragrunt_linux_amd64" && \
    chmod +x /usr/local/bin/terragrunt
# Drop back to the unprivileged user for runtime
USER atlantis
atlantis server \
  --repo-allowlist="github.com/your-org/*" \
  --atlantis-url="https://your-atlantis-server.com" \
  --gh-user="your-github-user" \
  --gh-token="your-github-token" \
  --gh-webhook-secret="your-webhook-secret" \
  --autoplan-file-list="**/*.tf,**/*.tfvars,**/terragrunt.hcl,**/.terraform.lock.hcl"
Optimal structure for Terragrunt with Atlantis:
.
├── terragrunt.hcl            # Root configuration
├── environments
│   ├── dev
│   │   ├── terragrunt.hcl
│   │   ├── us-east-1
│   │   │   ├── terragrunt.hcl
│   │   │   ├── vpc
│   │   │   │   └── terragrunt.hcl
│   │   │   ├── rds
│   │   │   │   └── terragrunt.hcl
│   │   │   └── eks
│   │   │       └── terragrunt.hcl
│   ├── staging
│   └── prod
└── modules
    ├── vpc
    ├── rds
    └── eks
The terragrunt-atlantis-config tool automatically generates Atlantis configuration from Terragrunt dependencies:
terragrunt-atlantis-config generate --output atlantis.yaml \
  --autoplan --parallel --create-workspace --cascade-dependencies
This generates an atlantis.yaml that respects your Terragrunt dependency tree.
version: 3
projects:
  - name: dev_us_east_vpc
    dir: environments/dev/us-east-1/vpc
    workflow: terragrunt
    autoplan:
      enabled: true
      when_modified:
        - "*.hcl"
        - "*.tf*"
        - "../../../modules/**/*.tf*"
    workspace: dev_us_east_vpc
  - name: dev_us_east_rds
    dir: environments/dev/us-east-1/rds
    workflow: terragrunt
    depends_on:
      - dev_us_east_vpc
    workspace: dev_us_east_rds
workflows:
  terragrunt:
    plan:
      steps:
        - env:
            name: TERRAGRUNT_TFPATH
            command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
        - env:
            name: TF_IN_AUTOMATION
            value: 'true'
        - run: terragrunt plan -input=false -no-color -out=$PLANFILE
    apply:
      steps:
        - env:
            name: TERRAGRUNT_TFPATH
            command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
        - run: terragrunt apply -input=false $PLANFILE
When dependencies haven't been applied yet, use mock outputs:
dependency "vpc" {
  config_path = "../vpc"
  mock_outputs = {
    vpc_id = "mock-vpc-id"
  }
  mock_outputs_allowed_terraform_commands = ["plan", "validate"]
}
workflows:
  terragrunt-run-all:
    plan:
      steps:
        - run: cd $DIR && terragrunt run-all plan -out atlantis.tfplan
    apply:
      steps:
        - run: cd $DIR && terragrunt run-all apply atlantis.tfplan
Configure execution order in atlantis.yaml:
version: 3
parallel_plan: true
parallel_apply: true
projects:
  - name: network
    dir: infrastructure/network
    execution_order_group: 1
  - name: security
    dir: infrastructure/security
    execution_order_group: 2
    depends_on:
      - network
  - name: database
    dir: infrastructure/database
    execution_order_group: 3
    depends_on:
      - security
See the linked spoke article "The Ultimate Guide to Terraform Atlantis with Terragrunt" for comprehensive Terragrunt integration patterns and advanced configurations.
Atlantis enables organizations to implement rigorous cost control by making infrastructure changes visible and reviewable before deployment.
Visibility as a Control Mechanism: Every proposed change appears in the PR with a complete terraform plan output. This visibility acts as a powerful checkpoint for preventing accidental resource deployment.
Mandatory Review: Infrastructure changes require team review before implementation. This human checkpoint catches unjustified resource creation that could lead to unnecessary costs.
Version Control Audit Trail: Git history combined with Atlantis PR logs creates a permanent record of infrastructure modifications, enabling cost tracking and allocation.
Integrate cost estimation into your PR workflow:
workflows:
  terraform-infracost:
    plan:
      steps:
        - init
        - plan
        - show
        - run: |
            infracost breakdown --path $SHOWFILE \
              --format json \
              --out-file /tmp/infracost-$PULL_NUM.json
repos:
  - id: /.*/
    workflow: terraform-infracost
    post_workflow_hooks:
      - run: |
          infracost comment github \
            --path /tmp/infracost-*.json \
            --repo $BASE_REPO_OWNER/$BASE_REPO_NAME \
            --pull-request $PULL_NUM \
            --github-token $GITHUB_TOKEN \
            --behavior update
        commands: plan
This posts estimated cost impact directly in the PR, enabling data-driven decisions before infrastructure is deployed.
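Beyond posting a comment, the JSON report can also feed a simple budget gate. The Python sketch below is one way to do that, under assumptions: the helper names are hypothetical, and the `diffTotalMonthlyCost` field name is an assumption about the Infracost JSON layout that should be checked against the report your Infracost version produces.

```python
import json
from decimal import Decimal


def monthly_cost_increase(infracost_json: str) -> Decimal:
    """Read the projected monthly cost change from an Infracost-style JSON report."""
    report = json.loads(infracost_json)
    # Assumed field name; a missing or null value is treated as no change
    return Decimal(report.get("diffTotalMonthlyCost") or "0")


def exceeds_budget(infracost_json: str, max_increase: Decimal) -> bool:
    """Return True when the projected increase is above the allowed threshold."""
    return monthly_cost_increase(infracost_json) > max_increase
```

A custom workflow step could call a script like this and exit non-zero when `exceeds_budget` returns True, so that a plan with a large projected increase fails visibly in the PR instead of relying on a reviewer noticing the comment.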
The PR workflow ensures resource deletion is as deliberate as creation:
terraform plan shows exactly what will be destroyed
atlantis apply executes the removal
This prevents orphaned resources that silently drain budgets.
Use Atlantis to systematize instance type and capacity adjustments:
# Example: Resize EC2 instance
resource "aws_instance" "app_server" {
  instance_type = var.instance_type  # Change from t3.large to t3.medium
  # ... other configuration
}
With Infracost integrated, the PR immediately shows cost savings, providing confidence for the change.
Automate non-production environment lifecycles using the Atlantis API, which is enabled by starting the server with an API secret (--api-secret):
# Shutdown non-prod at end of day
# Repository, Ref, and PR values below are placeholders for your own setup
curl -X POST https://atlantis.example.com/api/apply \
  -H "X-Atlantis-Token: $ATLANTIS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "Repository": "your-org/your-repo",
    "Ref": "main",
    "Type": "Github",
    "Projects": ["infrastructure"],
    "PR": 12345
  }'
Terraform code defines "shutdown" states (ASG scaling to zero, database pauses), which external schedulers trigger to save on non-production costs.
Consistent Tagging: Tags are critical for cloud billing allocation and cost tracking.
provider "aws" {
  region = "us-east-1"
  default_tags {
    tags = {
      Environment = var.environment
      Project     = "phoenix"
      CostCenter  = "engineering-123"
      ManagedBy   = "Terraform-Atlantis"
      # Avoid timestamp() here: it changes on every run, so every plan would show a tag diff
    }
  }
}
Audit Trail Integration: Correlate Git commit history with Atlantis PR logs to understand cost drivers and allocation.
As organizations explore vendor-neutral infrastructure automation, OpenTofu (the open-source Terraform fork) has emerged as an important alternative. Atlantis fully supports OpenTofu workloads.
Atlantis was designed from inception to be VCS-agnostic and now extends that philosophy to IaC tools. The project supports both Terraform and OpenTofu, allowing organizations to choose their preferred tool without switching automation platforms.
Specify OpenTofu for specific projects in atlantis.yaml:
projects:
  - name: project-terraform
    dir: project-terraform
    terraform_version: 1.5.0
  - name: project-opentofu
    dir: project-opentofu
    terraform_distribution: opentofu
    terraform_version: 1.6.0
Set default distribution via server flags:
# Flag name as of recent releases; older releases used --tf-distribution
atlantis server \
  --default-tf-distribution=opentofu \
  --default-tf-version=1.6.0
Vendor Independence: Use OpenTofu without changing your infrastructure automation tooling
Flexibility: Mix Terraform and OpenTofu projects in the same repository
Future-Proof: OpenTofu's community-driven development ensures long-term support
Cost Control: No licensing fees or commercial vendor lock-in
Provider Ecosystem: Ensure providers you use have OpenTofu versions available
State Compatibility: Terraform and OpenTofu can share state files, enabling gradual migration
Testing: Thoroughly test OpenTofu plans before migrating production workloads
GitHub Actions offers a general-purpose CI/CD platform that can run Terraform, while Atlantis is purpose-built for infrastructure automation. Understanding the tradeoffs helps inform your choice.
| Aspect | Atlantis | GitHub Actions |
|---|---|---|
| Setup Complexity | Host service, configure webhooks | Configure YAML workflows |
| Infrastructure Cost | Server hosting + maintenance | GitHub plan + compute minutes |
| Terraform Integration | Purpose-built, native PR experience | Custom workflow configuration needed |
| State Management | Built-in locking and management | Manual state backend setup |
| Customization | Focused on IaC operations | Extensive third-party action ecosystem |
| Team Learning Curve | Lower for IaC teams | Higher for multi-purpose CI/CD |
Choose Atlantis if: Your team prioritizes a streamlined Terraform workflow, you want centralized execution, or you value native PR integration.
Choose GitHub Actions if: You need multi-purpose CI/CD beyond infrastructure, prefer managed services, or use GitHub exclusively.
The broader ecosystem includes several commercial and open-source platforms:
Each alternative approaches different organizational needs around scale, governance, support requirements, and cost structure.
Store atlantis.yaml in version control to track configuration changes through the same audit process as infrastructure code:
version: 3
automerge: false
parallel_plan: true
parallel_apply: true
projects:
  - name: networking
    dir: infrastructure/networking
    autoplan:
      when_modified: ["*.tf", "*.tfvars", "../modules/network/**/*.tf"]
    terraform_version: 1.5.0
    execution_order_group: 1
  - name: database
    dir: infrastructure/database
    autoplan:
      when_modified: ["*.tf", "*.tfvars", "../modules/database/**/*.tf"]
    terraform_version: 1.5.0
    execution_order_group: 2
    depends_on:
      - networking
Benefits include configuration review through PRs, rollback capability, and historical audit trail.
Always use remote state backends with locking mechanisms:
terraform {
  backend "s3" {
    bucket         = "terraform-state-bucket"
    key            = "path/to/my/key"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock-table"
  }
}
For complex backends, use custom workflows:
workflows:
  custom_backend:
    plan:
      steps:
        - run: rm -rf .terraform
        - init:
            extra_args: [
              "-backend-config=bucket=terraform-state-bucket",
              "-backend-config=key=${WORKSPACE}/state.tfstate",
              "-backend-config=dynamodb_table=terraform-lock-table"
            ]
        - plan
Create specific IAM roles for Atlantis with minimal necessary permissions:
resource "aws_iam_role" "atlantis" {
  name = "atlantis-terraform-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })
}
# Attach only required specific policies, not AdminAccess
resource "aws_iam_role_policy_attachment" "atlantis" {
  role       = aws_iam_role.atlantis.name
  policy_arn = "arn:aws:iam::aws:policy/specific-policy"
}
Maintain a version upgrade strategy:
projects:
  - name: legacy
    dir: legacy
    terraform_version: 0.14.11  # Pinned for backward compatibility
  - name: modern
    dir: modern
    terraform_version: 1.5.0  # Current version
Upgrade Strategy:
Organize code to reduce unnecessary plans and conflicts:
terraform-repo/
├── atlantis.yaml
├── modules/
│   ├── networking/
│   ├── compute/
│   └── storage/
├── environments/
│   ├── dev/
│   │   ├── network
│   │   ├── compute
│   │   └── database
│   ├── staging/
│   └── production/
└── README.md
Benefits: Reduced plan frequency, clearer responsibility boundaries, easier dependency management.
Use targeted patterns to trigger plans only when relevant files change:
projects:
  - name: networking
    dir: networking
    autoplan:
      when_modified:
        # Patterns are relative to the project's dir (networking/)
        - "*.tf"
        - "*.tfvars"
        - "../modules/network/**/*.tf"  # Include related modules
Test patterns with sample PRs to verify correctness.
Integrate validation before Terraform operations:
workflows:
  validate-and-plan:
    plan:
      steps:
        - run: terraform fmt -check
        # terraform validate needs an initialized working directory, so init runs first
        - init
        - run: terraform validate
        - run: tfsec --no-color .
        - run: checkov -d . --quiet
        - plan
Recommended tools:
terraform fmt and terraform validate: Built-in syntax checks
tflint: Extended linting
tfsec: Security vulnerability scanning
checkov: Policy-based security scanning
conftest/OPA: Custom policy enforcement
terrascan: Compliance and security violation scanner
For organization-wide scanning, use the server-side repos.yaml:
repos:
  - id: /.*/
    pre_workflow_hooks:
      - run: terraform fmt -check
      - run: tflint
      - run: tfsec . --no-color
Integrate with policy-as-code tools like OPA/Conftest for custom policy enforcement. Example OPA policy (save as policy/terraform.rego):
package terraform
deny[msg] {
  input.resource.aws_s3_bucket[name].acl == "public-read"
  msg = sprintf("S3 bucket '%v' is publicly readable", [name])
}
deny[msg] {
  input.resource.aws_security_group_rule[name].cidr_blocks[_] == "0.0.0.0/0"
  input.resource.aws_security_group_rule[name].type == "ingress"
  port = input.resource.aws_security_group_rule[name].to_port
  msg = sprintf("Security group rule '%v' allows ingress from internet to port %v", [name, port])
}
Implement comprehensive monitoring by enabling the Prometheus endpoint in the server-side repos.yaml:
metrics:
  prometheus:
    endpoint: "/metrics"
Key Metrics:
atlantis_project_plan_execution_success/error: Plan success/failure
atlantis_project_apply_execution_success/error: Apply success/failure
atlantis_project_plan/apply_execution_time: Execution duration
Alerting:
For Kubernetes deployments, configure health probes:
livenessProbe:
  httpGet:
    path: /healthz
    port: 4141
  initialDelaySeconds: 30
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /healthz
    port: 4141
  initialDelaySeconds: 30
  periodSeconds: 30
Grafana Dashboard Recommendations: Visualize command execution success/failure rates, execution times, project plan/apply success rates, lock statistics, and server resource utilization.
Centralized Log Forwarding: Forward Atlantis logs to a centralized system (ELK, CloudWatch, etc.) for analysis and retention:
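# Filebeat configuration example
filebeat.inputs:
  - type: log
    paths:
      - /var/log/atlantis/atlantis.log
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
Use strong webhook secrets and HTTPS:
# Generate the secret once (for example with openssl rand -hex 32), store it in your
# secret manager, and configure the same value in the VCS webhook settings.
# Generating a new value on every start would stop matching the VCS side.
atlantis server \
  --ssl-cert-file=/path/to/cert.pem \
  --ssl-key-file=/path/to/key.pem \
  --gh-webhook-secret="$ATLANTIS_GH_WEBHOOK_SECRET"
Deploy behind a reverse proxy with additional security headers and IP allowlisting.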
Successful adoption requires team alignment:
Documentation:
Training:
Rollout Strategy:
For complex infrastructures, properly configure execution order:
version: 3
parallel_plan: true
parallel_apply: true
projects:
  - name: network
    dir: infrastructure/network
    execution_order_group: 1
  - name: security
    dir: infrastructure/security
    execution_order_group: 2
    depends_on:
      - network
  - name: database
    dir: infrastructure/database
    execution_order_group: 3
    depends_on:
      - security
  - name: application
    dir: infrastructure/application
    execution_order_group: 4
    depends_on:
      - database
This ensures resources are created in correct sequence while maximizing parallelism where dependencies allow.
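How depends_on relationships translate into ordered groups that can run in parallel is easy to sketch with a topological sort. The Python example below uses the standard library's graphlib (Python 3.9+). The `apply_waves` helper is hypothetical and only illustrates the ordering idea; it is not how Atlantis schedules work internally.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def apply_waves(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group projects into waves; every project in a wave has all dependencies in earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output deterministic
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For a strictly linear chain of network, security, database, and application projects, each wave contains a single project, so nothing can run in parallel. Parallelism only appears when projects share a dependency level, which is a useful check before assigning execution_order_group values by hand.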
Symptoms: AccessDenied, NoCredentialProviders, or VCS authentication errors
Common Causes:
Diagnosis:
# Check recent Atlantis logs (start the server with --log-level=debug for more detail)
docker logs atlantis --tail 100
# Verify credentials in the container
docker exec atlantis aws sts get-caller-identity
docker exec atlantis env | grep AWS
Resolution:
Prevention: Use centralized secret management (Vault, AWS Secrets Manager) and rotate credentials regularly.
Symptoms: Atlantis doesn't comment on PRs or respond to commands
Common Causes:
/events suffix is common)
Diagnosis: Check VCS webhook delivery logs first. Then verify:
# Test webhook connectivity
curl -X POST https://atlantis.yourcompany.com/events \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Event: ping" \
  -d '{"zen": "test"}' -v
# Check Atlantis server logs for webhook activity
docker logs atlantis 2>&1 | grep -i webhook
Resolution:
/events
ATLANTIS_GH_WEBHOOK_SECRET
Symptoms: "Project locked by PR #XYZ" message, operations blocked
Common Causes:
Diagnosis:
Resolution:
# For stale locks, manually unlock via PR comment
atlantis unlock
# For performance issues, consider running Terraform more efficiently
# Refine atlantis.yaml to split broad projects into more granular ones
Symptoms: Atlantis plan differs from local plan, unexpected resource changes
Common Causes:
.terraform.lock.hcl files
Diagnosis:
# Check the Terraform version Atlantis uses (post this as a PR comment)
atlantis version -p <project_name>
# Compare lock files
diff local/.terraform.lock.hcl remote/.terraform.lock.hcl
# Verify environment variables
docker exec atlantis env | grep TF_VAR
Resolution:
atlantis.yaml
.terraform.lock.hcl to version control
Symptoms: Autoplan failures, wrong workflow execution, project not found errors
Common Causes:
when_modified patterns
Diagnosis:
# Validate YAML syntax locally
yamllint atlantis.yaml
# Check server logs with debug level
docker logs atlantis --tail 200 | grep -i error
Resolution:
when_modified paths are relative to project dir
repos.yaml allows desired overrides
Symptoms: Slow plan/apply operations, high server CPU/memory, lock contention
Common Causes:
Diagnosis:
# Monitor server resources
docker stats atlantis
# Check Terraform state size
ls -lh terraform.tfstate
# Enable Atlantis profiling
curl http://localhost:4141/debug/pprof/
Resolution:
--parallel-pool-size based on capacity
--data-dir
Symptoms: Unauthenticated UI, HTTP webhooks, overly broad permissions
Diagnosis: Security audit covering:
Resolution:
--web-basic-auth=true
Symptoms: Apply fails due to state drift, undiverged requirement blocks apply
Causes:
Resolution:
# Configure merge checkout strategy with undiverged requirement
apply_requirements: [approved, undiverged]
Use branch protection rules requiring branches to be up-to-date, and implement automatic PR updates with external tools.
See the linked spoke article "Troubleshooting Common Terraform Atlantis Issues" for more detailed diagnostic procedures and resolution steps.
For teams that want the Atlantis-style PR comment workflow without the self-hosting burden, Scalr provides a managed alternative that reproduces core Atlantis patterns with additional governance features.
In addition to existing VCS features such as "Trigger runs for draft pull requests" and "Send the plan summary back to pull request comments", Scalr offers "Allow triggering plan-only runs from the PR comments" and "Allow triggering apply runs from the PR comments". Once admins enable these, users can trigger runs directly from PR comments.

Connect the VCS you want to use for your Atlantis workflow
Once VCS admins enable end-users to trigger plans and applies from the PR, users can add the following comments to trigger runs in Scalr:
Once the plan is completed, the result (success/failure, resource changes, etc.) is automatically posted as a comment in the pull request thread, keeping your code review and infrastructure workflows fully integrated.

You can now use comments to trigger Terraform / OpenTofu plan and apply
Want to limit which Scalr environments can have runs executed from PR comments? Scalr's integration with Open Policy Agent (OPA) can prevent various run sources, including PR comments. An OPA policy check can deny any run with the source comment-github or deny any run that is not from that source. A common use case is allowing PR-driven runs from lab and development environments but not production.
Avoid State Updates by Unmerged PRs: Scalr displays warnings when changes are attempted from branches with unmerged pull requests and automatically prevents auto-apply operations when the state-generating branch differs from your run's configuration branch.

Prevent Apply from a Non-Mergeable PR: Using the apply-before-merge workflow, the /scalr apply command can be restricted to only execute after a PR is approved and passes branch protection checks, enforced through the merge_error attribute in the run input.
Automatic Base Branch Merge Before Run Execution: VCS-driven workspaces can automatically merge the base branch into the head branch before triggering a run, ensuring runs execute against the latest code and reducing false-positive results.
Terraform Atlantis transforms infrastructure automation by embedding Terraform operations directly into pull request workflows. Its GitOps approach to infrastructure enhances collaboration, governance, and auditability while providing teams with centralized control over infrastructure changes.
The combination of Atlantis with Terragrunt, OpenTofu support, cost optimization integrations, and robust security controls makes it a powerful choice for organizations seeking to scale their infrastructure-as-code practices. When paired with proper team training, documented workflows, and monitoring practices, Atlantis enables teams to manage complex infrastructure reliably and efficiently.
However, managing Atlantis at scale requires attention to operational details: credential management, webhook configuration, version management, and performance optimization. Teams should weigh whether they have the capacity to own these tasks before committing to a self-hosted deployment.
For organizations prioritizing operational simplicity, integrated governance features, or enterprise support, platforms like Scalr offer managed alternatives that abstract away much of Atlantis's operational burden while providing similar workflow capabilities, policy enforcement, and team collaboration features.
Whether you choose Atlantis or explore managed alternatives, the key is establishing GitOps practices that bring infrastructure changes through the same rigorous review and approval processes as application code, ensuring consistency, auditability, and reliability across your infrastructure ecosystem.
This pillar article consolidates comprehensive knowledge about Terraform Atlantis, synthesizing multiple focused guides into a single authoritative reference for infrastructure teams implementing GitOps workflows in 2026.
