Description
TL;DR
CRITICAL BUG: terraform destroy fails catastrophically when enable_aws_load_balancer_controller = true in the EKS Blueprints Addons module, because the controller creates untracked ALBs that block VPC deletion ⛔. This forces multiple destroy attempts and manual AWS CLI intervention to clean up orphaned load balancers, which completely violates Infrastructure-as-Code principles and makes the addons module unsuitable for production use.
- [x] ✋ I have searched the open/closed issues and my issue is not listed.
Versions
- Module version [Required]:
  aws_load_balancer_controller | aws-ia/eks-blueprints-addon/aws | 1.1.1
- Terraform version:
  Terraform: ~> 1.0
- Provider version(s):
  AWS Provider: >= 5.70
  Kubernetes Provider: >= 2.32
  Helm Provider: >= 2.15
  TLS Provider: ~> 4.0
  Local Provider: >= 2.5
  Random Provider: ~> 3.6
  Kubectl Provider: >= 1.19.0
Reproduction Code [Required]
Terraform Configuration
module "eks_addons" {
source = "aws-ia/eks-blueprints-addons/aws"
version = "~> 1.0"
cluster_name = module.eks.cluster_name
cluster_endpoint = module.eks.cluster_endpoint
oidc_provider_arn = module.eks.oidc_provider_arn
cluster_version = module.eks.cluster_version
eks_addons = {
aws-ebs-csi-driver = { most_recent = true }
coredns = { most_recent = true }
kube-proxy = { most_recent = true }
eks-pod-identity-agent = {}
}
# This configuration breaks terraform destroy
aws_load_balancer_controller = {
set = [
{
name = "enableServiceMutatorWebhook"
value = "false"
}
]
}
# This innocent-looking flag breaks everything
enable_aws_load_balancer_controller = var.enable_lb_ctl
tags = var.tags
depends_on = [module.eks]
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
# Standard EKS configuration
}
Required Manual Cleanup Workflow (This should NOT be necessary):
# Step 1: Hunt down the invisible load balancers
aws elbv2 describe-load-balancers --query "LoadBalancers[*].{Name:LoadBalancerName,Type:Type,State:State.Code,DNSName:DNSName}" --output table --profile $PROFILE
# Step 2: Extract load balancer names
alb_name=$(aws elbv2 describe-load-balancers --query "LoadBalancers[*].LoadBalancerName" --output text --profile $PROFILE)
# Step 3: Get ARN for deletion
alb_arn=$(aws elbv2 describe-load-balancers --names $alb_name --query 'LoadBalancers[0].LoadBalancerArn' --output text --profile $PROFILE)
# Step 4: Manually delete what Terraform should have handled
aws elbv2 delete-load-balancer --load-balancer-arn "$alb_arn" --profile $PROFILE
# Step 5: Wait for deletion to propagate
# Step 6: Re-run terraform destroy
# Step 7: Hope nothing else was missed
Expected behavior
When I run terraform destroy, ALL resources should be cleaned up automatically in a single operation. No manual intervention should be required. That's the fundamental promise of Infrastructure-as-Code - declarative, reproducible, hands-off infrastructure management.
Actual behavior
BROKEN DESTROY PROCESS:
- terraform destroy runs
- EKS cluster and most resources get destroyed
- VPC DELETION FAILS with DependencyViolation errors
- Manual detective work required to identify phantom ALBs created by AWS Load Balancer Controller
- Mandatory manual cleanup with multiple AWS CLI commands
- Multiple destroy cycles required to fully clean up infrastructure
- Complete failure of IaC automation
Typical error that forces manual intervention:
Error: error deleting VPC (vpc-xxxxxxxxx): DependencyViolation: The vpc 'vpc-xxxxxxxxx'
has dependencies and cannot be deleted.
Additional context
Root Cause Analysis
When enable_aws_load_balancer_controller = true, the EKS Blueprints Addons module installs the AWS Load Balancer Controller via Helm. This controller then:
- Automatically creates ALBs for Kubernetes services/ingress resources
- Places them in the VPC subnets (creating hidden dependencies)
- Does NOT track them in Terraform state (invisible to dependency graph)
- Blocks VPC deletion during destroy operations
- Requires manual cleanup (violating IaC principles)
The controller operates outside Terraform's knowledge, creating resources that Terraform cannot manage or clean up.
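To make the mechanism concrete, here is a minimal, hypothetical illustration (the resource names and annotations are my own example, not taken from the affected configuration): any Ingress handled by the controller's alb class causes the controller to provision an ALB directly through the AWS API, so the Ingress object is in Terraform state but the resulting load balancer never is.

# Hypothetical Ingress managed by the AWS Load Balancer Controller.
# Terraform tracks this Ingress object, but the ALB the controller
# creates for it exists only in AWS and never enters Terraform state.
resource "kubernetes_ingress_v1" "demo" {
  metadata {
    name = "demo"
    annotations = {
      "alb.ingress.kubernetes.io/scheme"      = "internet-facing"
      "alb.ingress.kubernetes.io/target-type" = "ip"
    }
  }

  spec {
    ingress_class_name = "alb"

    rule {
      http {
        path {
          path      = "/"
          path_type = "Prefix"

          backend {
            service {
              name = "demo-service"
              port {
                number = 80
              }
            }
          }
        }
      }
    }
  }
}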
Impact on Production Operations
- Environment teardown is BROKEN - cannot cleanly destroy test/staging environments
- Development velocity killed - developers waste time on manual cleanup instead of building features
- Cost management nightmare - orphaned resources accumulate charges
- Manual intervention required - completely defeats Infrastructure-as-Code automation
- Risk of incomplete cleanup - hidden dependencies may leave other orphaned resources
Failed Workaround Attempts
- Tried disabling the controller before destroy - doesn't help with already-created ALBs
- Attempted to track ALBs with data sources - controller-created resources appear after apply
- Used destroy-time provisioners (roughly the sketch after this list) - dependency ordering still fails
- Manual pre-destroy cleanup scripts - brittle and defeats IaC principles
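For reference, the destroy-time provisioner attempt looked roughly like the sketch below (a best-effort illustration, assuming the controller tags its ALBs with elbv2.k8s.aws/cluster; the resource name and shell loop are my own). Even when the deletions themselves succeeded, the destroy ordering still failed as described above.

# Hypothetical best-effort cleanup of controller-created ALBs at destroy time.
resource "null_resource" "alb_cleanup" {
  # Record the cluster name so the destroy-time provisioner can read it via self.
  triggers = {
    cluster_name = module.eks.cluster_name
  }

  provisioner "local-exec" {
    when    = destroy
    command = <<-EOT
      for arn in $(aws elbv2 describe-load-balancers --query "LoadBalancers[].LoadBalancerArn" --output text); do
        if aws elbv2 describe-tags --resource-arns "$arn" \
          --query "TagDescriptions[0].Tags[?Key=='elbv2.k8s.aws/cluster' && Value=='${self.triggers.cluster_name}']" \
          --output text | grep -q .; then
          aws elbv2 delete-load-balancer --load-balancer-arn "$arn"
        fi
      done
    EOT
  }
}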